Thunderstone Software Document Retrieval and Management
Webinator 2 Manual

Enable extra page duplication prevention (-unique)

Syntax: -unique

This option enables extra checking for duplicate documents. Documents with identical content are stored only once, even if their URLs differ. This is accomplished by placing a unique index on the id field of the html table and storing a hash code of the document's HTML source there instead of the normal counter value. All subsequent walks will perform the hashing.
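The mechanism described above (a unique index on the id field, holding a hash of the page source) can be sketched in a few lines. This is only an illustration under stated assumptions, not Webinator's implementation: SQLite stands in for the Webinator database, SHA-256 stands in for whatever hash Webinator actually uses, and the names content_id, open_db, and store_page, as well as the url and body columns, are hypothetical.

```python
import hashlib
import sqlite3


def content_id(html_source: str) -> str:
    """Hash the raw HTML source (SHA-256 is an assumption, not Webinator's hash)."""
    return hashlib.sha256(html_source.encode("utf-8")).hexdigest()


def open_db() -> sqlite3.Connection:
    """Create an empty in-memory table with a unique index on its id field."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE html (id TEXT, url TEXT, body TEXT)")
    conn.execute("CREATE UNIQUE INDEX html_id ON html (id)")
    return conn


def store_page(conn: sqlite3.Connection, url: str, html_source: str) -> bool:
    """Store a page; return False if a page with the same content already exists."""
    try:
        conn.execute(
            "INSERT INTO html (id, url, body) VALUES (?, ?, ?)",
            (content_id(html_source), url, html_source),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the row: same content hash, different URL.
        return False


demo = open_db()
page = "<html><body>Same content</body></html>"
print(store_page(demo, "http://example.com/a", page))
print(store_page(demo, "http://example.com/b", page))
```

In this sketch the second call is rejected by the unique index, so the same content reached through a second URL is not stored again.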

This option should only be used on an empty database, since any existing counter ids would not be valid hash codes. After wiping the database with -wipe, the option must be specified again if it is still desired.

NOTE: Dynamic debugging insertions into the HTML source, such as the current time, whether visible or in comments, will change the hash on each fetch and thereby defeat this feature.
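The effect described in the note can be reproduced with a small sketch. It assumes SHA-256 as a stand-in for Webinator's hash, and the helper names content_id and page_with_timestamp are hypothetical. The two pages below differ only in a timestamp inside an HTML comment.

```python
import hashlib


def content_id(html_source: str) -> str:
    """Hash the raw HTML source (SHA-256 is an assumption, not Webinator's hash)."""
    return hashlib.sha256(html_source.encode("utf-8")).hexdigest()


def page_with_timestamp(stamp: str) -> str:
    """Build a page whose only varying part is a timestamp in a comment."""
    return f"<html><body>Same text</body><!-- generated {stamp} --></html>"


first = content_id(page_with_timestamp("10:58:47"))
second = content_id(page_with_timestamp("10:58:48"))

# The visible text is identical, but the hashes differ, so a hash-based
# uniqueness check would treat these as two different documents.
print(first == second)  # False
```

Because the hash covers the full HTML source, a change in a comment is enough to produce a different id.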


Copyright © Thunderstone Software     Last updated: Tue Nov 6 10:58:47 EST 2007
 