Note: This documentation is for an old version of Webinator. The latest documentation is here.

Enable extra page duplication prevention (-unique)

Syntax: -unique

This option will enable extra checking for duplicate documents. Documents with the same content will only be stored once, even if their URLs are different. This is accomplished by placing a unique index on the id field of the html table and storing a hash code for the HTML source of the document there instead of the normal counter variable. All subsequent walks will perform hashing.
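
The mechanism described above can be illustrated with a minimal sketch. It is not Webinator's implementation: the choice of SHA-256, the use of SQLite, the column names other than id, and the helper names content_hash, make_store and store_page are all assumptions made for illustration. It only shows how a unique index on a hash-valued id field rejects a second document with identical source.

```python
import hashlib
import sqlite3


def content_hash(html_source: str) -> str:
    # Assumption: SHA-256 stands in for whatever hash Webinator really uses.
    return hashlib.sha256(html_source.encode("utf-8")).hexdigest()


def make_store() -> sqlite3.Connection:
    # Hypothetical stand-in for the html table, with a unique index on id.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE html (id TEXT, url TEXT, body TEXT)")
    conn.execute("CREATE UNIQUE INDEX html_id ON html (id)")
    return conn


def store_page(conn: sqlite3.Connection, url: str, html_source: str) -> bool:
    # Returns True if the page was stored, False if identical content
    # (same hash) is already present under any URL.
    try:
        conn.execute(
            "INSERT INTO html (id, url, body) VALUES (?, ?, ?)",
            (content_hash(html_source), url, html_source),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this sketch, storing the same HTML source under two different URLs succeeds the first time and is rejected the second time, so the table ends up holding one row for that content.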

This option should only be used on an empty database, since any existing counter ids would not be valid hash codes. After wiping a database with -wipe, specify -unique again if the feature is still wanted.

NOTE: Dynamic debugging insertions into the HTML source, such as the current time, whether visible or in comments, will change the hash thereby defeating this feature.
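
A short sketch of why this happens, again assuming SHA-256 as a stand-in for the real hash and using a hypothetical render function that embeds a generation time in an HTML comment:

```python
import hashlib


def content_hash(html_source: str) -> str:
    # Assumption: SHA-256 stands in for whatever hash Webinator really uses.
    return hashlib.sha256(html_source.encode("utf-8")).hexdigest()


def render(generated_at: str) -> str:
    # Hypothetical page whose only dynamic part is a timestamp in a comment.
    return f"<html><body>Hello</body><!-- generated {generated_at} --></html>"


# Same visible content, fetched one second apart.
first = content_hash(render("10:58:47"))
second = content_hash(render("10:58:48"))

# The hashes differ, so both copies would be stored as distinct documents.
hashes_differ = first != second

# Without the changing timestamp, the hash is stable across fetches.
stable = content_hash(render("fixed")) == content_hash(render("fixed"))
```

Because the hash covers the entire HTML source, even a one-character change in a comment yields a different id, and the two fetches are treated as different documents.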


Copyright © Thunderstone Software     Last updated: Tue Nov 6 10:58:47 EST 2007