Note: This documentation is for an old version of Webinator. The latest documentation is here.

Refresh in version 5 vs. 4

In Webinator version 4 and earlier, the refresh walk checked every page in the database to determine whether it needed updating. Since only changed pages need updating, and those are typically a small percentage of the site, checking for changed pages is faster than doing a complete new walk. However, it is still time-consuming: only the web server can tell Webinator whether a page has changed, so the server must be accessed for every page on the site.

In Webinator version 5 and later, there is an improved refresh process. The walk is adapted to focus on the small but important group of changing pages. As each page is walked, a refresh period is calculated for that individual page. The calculation is based on whether the page has changed since the last time it was fetched, and how long ago that fetch was. This refresh information is used to determine when the page should be checked again. In this way, the walk prioritizes the walking of pages that change often or are new, and it delays the fetch of pages that seldom change.
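
The exact formula Webinator uses is not given here. As a rough illustration only, the following Python sketch shows one way a per-page refresh period could be derived from the two inputs described above: whether the page changed, and how long ago it was last fetched. The function name, the halving and doubling rule, and the minimum and maximum limits are all assumptions made for this example, not Webinator's actual behavior.

```python
# Hypothetical limits, chosen only for illustration.
MIN_PERIOD = 60 * 60            # never check more often than once an hour
MAX_PERIOD = 30 * 24 * 60 * 60  # never wait longer than 30 days


def next_refresh_period(changed: bool, seconds_since_last_fetch: float) -> float:
    """Return how many seconds to wait before checking this page again.

    Assumed rule: if the page changed during the interval since the last
    fetch, check again after half that interval; if it did not change,
    wait twice as long. The result is clamped to [MIN_PERIOD, MAX_PERIOD].
    """
    if changed:
        period = seconds_since_last_fetch / 2
    else:
        period = seconds_since_last_fetch * 2
    return max(MIN_PERIOD, min(MAX_PERIOD, period))


if __name__ == "__main__":
    one_day = 24 * 60 * 60
    # A page that changed since yesterday's fetch is rechecked sooner.
    print(next_refresh_period(True, one_day))
    # A page that did not change is rechecked later.
    print(next_refresh_period(False, one_day))
```

Under a rule of this kind, pages that keep changing drift toward the minimum period, and pages that stay the same drift toward the maximum, which matches the prioritization described above.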

Thus, when a walk (scheduled or manual) takes place, only the pages that are due for a refresh are actually fetched, not every page in the database. The result is a database that is kept up to date by a process that consumes fewer server resources.
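
To make the selection step concrete, here is a minimal Python sketch of choosing which pages to fetch during a walk. It assumes each database record stores the time at which the page should next be checked. The `PageRecord` type, its field names, and the `pages_due` function are hypothetical and are not part of Webinator.

```python
from typing import Iterable, List, NamedTuple


class PageRecord(NamedTuple):
    """Hypothetical stand-in for one page entry in the walk database."""
    url: str
    next_check: float  # time (epoch seconds) when the page is due again


def pages_due(records: Iterable[PageRecord], now: float) -> List[PageRecord]:
    """Return only the records whose next check time has arrived.

    The most overdue pages come first; pages not yet due are skipped,
    so no request is made to the web server for them.
    """
    due = [r for r in records if r.next_check <= now]
    return sorted(due, key=lambda r: r.next_check)


if __name__ == "__main__":
    records = [
        PageRecord("http://www.example.com/news.html", 1000.0),
        PageRecord("http://www.example.com/about.html", 9000.0),
        PageRecord("http://www.example.com/index.html", 500.0),
    ]
    for record in pages_due(records, now=2000.0):
        print(record.url)
```

In this sketch the cost of a walk depends on the number of pages that are due, not on the total number of pages in the database.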


Copyright © Thunderstone Software     Last updated: Thu Mar 11 16:13:32 EST 2010