Note: This documentation is for an old version of Webinator. The latest documentation is here.

Using dowalk

Normally a walk is started from the administrative interface. At times, however, it is useful to start a walk by hand from a shell (or command) prompt, or as part of some other automated task. When the administrative interface starts a walk, it shows you the command line to use (using gw is discussed later in this section). The command has the form:

texis profile=PROFILENAME dowalk/dispatch.txt

You may also set the parameter ttyverbose to 1 or higher to make dowalk print status messages to the screen when it is run by hand. The command then has the form:

texis profile=PROFILENAME ttyverbose=1 dowalk/dispatch.txt

Here PROFILENAME is the name of the profile you have configured using the administrative interface. You will need to supply the full path to texis if it is not in your PATH, and the path to the dowalk script if it is not in the current directory when you run the command. For example:

INSTALLDIR/bin/texis profile=PROFILENAME INSTALLDIR/texis/scripts/webinator/dowalk/dispatch.txt

or, on Windows:

INSTALLDIR\texis profile=PROFILENAME INSTALLDIR\Texis\Scripts\Webinator\dowalk/dispatch.txt

The walker will behave the same as it does from the administrative interface. Walk info will be logged to the same files. See section 6.1.
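
As a rough illustration of how the pieces above fit together, the following shell sketch assembles the full-path form of the command and prints it rather than running it. The function name dowalk_cmd, its argument order, and the /path/to/install value are assumptions made for this example only; they are not part of Webinator. Substitute your real INSTALLDIR and profile name.

```shell
#!/bin/sh
# Sketch: print the dowalk command line for a profile instead of running it.
# dowalk_cmd and its arguments are illustrative names, not part of Webinator.
#   $1 = install directory (what the text calls INSTALLDIR)
#   $2 = profile name
#   $3 = ttyverbose level (optional; left out of the command when empty)
dowalk_cmd() {
    installdir=$1
    profile=$2
    verbose=${3:-}
    cmd="$installdir/bin/texis profile=$profile"
    if [ -n "$verbose" ]; then
        cmd="$cmd ttyverbose=$verbose"
    fi
    # The script path follows the Unix layout shown in the text above.
    printf '%s\n' "$cmd $installdir/texis/scripts/webinator/dowalk/dispatch.txt"
}

# Example: placeholder install directory and profile, verbose level 1.
dowalk_cmd /path/to/install myprofile 1
```

Printing the command first makes it easy to check the paths before pasting the line into a shell or a scheduled job.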

There are several other "entry points" that can be used to get different behaviors when starting the walker. They all take the same form as dispatch above, except that dispatch is replaced by the name of the entry point. The entry points are:

  • dispatch
    Start a complete new walk.

  • hold
    Stop a walk that is in progress, create/update the search indices, and make it the live search.

  • stop
    Stop and abandon a walk that is in progress.

  • indexmakelive
    Create/update the search indices on an abandoned walk and make it the live search.

  • refreshnow
    Force the soonest possible refresh of a particular URL. This requires an extra u=THEURL argument to tell it which URL to refresh. The entry point only flags the page for refresh at the next refresh check; it does not refresh anything itself, so you need to have the walk type set to refresh and a schedule set.

    texis profile=PROFILENAME u=THEURL dowalk/refreshnow.txt

  • ifmodified
    Checks the Watch URL. If the watched page has changed, a walk is started; if not, no action is taken. This is generally used on a frequent schedule to automatically rewalk a site when it changes.

  • singles
    Fetches and indexes any single pages specified in the profile that are not yet in the database. You would call this after adding to Single Page, Page File, or Page URL.

  • refresh
    Start a "refresh" walk. This walk will check all pages already in the database and download only changed ones. Missing pages will be deleted. New pages discovered on modified pages will be added.

  • recat
    Recategorize the database based on the current settings of Categories.

  • reindex
    Drop and recreate the Metamorph index on the html table. This would be used after changing the Word Definition expressions.

  • updateindex
    Update the Metamorph index on the html table. This would be used after performing manual SQL operations against the html table.

  • remakeindex
    Drop and recreate all (standard) indices on the database. This has little use except when indices have been corrupted by disk errors or similar problems.

  • checkandbuild
    Ensure that the proper search index exists for the search fields selected in the profile. This is generally called only internally, when the desired fields to search are changed.

  • tsverrors
    Dumps the error table as tab-separated values of Date, Url, Reason. Optional start and end date-times may be specified. Omitting start means start at the beginning; omitting end means continue to the end.

    texis profile=PROFILENAME start="2004-10-01" end="2004-11-01" dowalk/tsverrors.txt

  • convert
    The entry point convert has a different syntax than the others.

    texis v2db=DB v2profile=PROFILE v4profile=PROFILE dowalk/convert.txt

    It is used to convert Webinator 2 profiles to Webinator 4 profiles (as well as possible). Set v2db to the full path to the existing Webinator 2 database containing the profile to convert. Set v2profile to the name of the Webinator 2 profile in the specified database to convert. Set v4profile to the name of the new Webinator 4 profile to create in the global database.

    A walk is NOT started. After conversion you would select the new profile, make any adjustments or fixups, then start a new walk.
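
The naming rule for the entry points listed above can be sketched as a small shell helper. It follows the stated rule that the entry point name replaces dispatch in the script path, checks that refreshnow is given a u= argument, and prints the resulting command rather than running it. The helper name dowalk_entry, the example profile name, and the example URL are assumptions for illustration only. The helper rejects convert because that entry point does not take a profile= argument.

```shell
#!/bin/sh
# Sketch: build (and only print) a dowalk command for a named entry point.
# dowalk_entry is an illustrative helper, not a Webinator command.
#   $1 = profile name, $2 = entry point, remaining args = extra name=value pairs
dowalk_entry() {
    if [ $# -lt 2 ]; then
        echo "usage: dowalk_entry PROFILE ENTRYPOINT [name=value ...]" >&2
        return 1
    fi
    profile=$1
    entry=$2
    shift 2
    case $entry in
        dispatch|hold|stop|indexmakelive|ifmodified|singles|refresh) ;;
        recat|reindex|updateindex|remakeindex|checkandbuild|tsverrors) ;;
        refreshnow)
            # refreshnow needs a u=THEURL argument naming the page to flag
            case " $* " in
                *" u="*) ;;
                *) echo "refreshnow requires u=THEURL" >&2; return 1 ;;
            esac
            ;;
        *)
            # convert lands here too: it uses v2db/v2profile/v4profile instead
            echo "unsupported entry point: $entry" >&2
            return 1
            ;;
    esac
    # Extra arguments go between profile= and the script path, as in the text.
    if [ $# -gt 0 ]; then
        printf '%s\n' "texis profile=$profile $* dowalk/$entry.txt"
    else
        printf '%s\n' "texis profile=$profile dowalk/$entry.txt"
    fi
}

# Examples with a placeholder profile name and URL.
dowalk_entry myprofile refreshnow u=http://www.example.com/page.html
dowalk_entry myprofile tsverrors start=2004-10-01 end=2004-11-01
```

Because the helper only prints the command, it is safe to try without a Webinator installation. To run a walk for real, execute the printed line yourself once the profile name and paths look right.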


Copyright © Thunderstone Software     Last updated: Thu Mar 11 16:13:32 EST 2010