Note: This documentation is for an old version of Webinator. The latest documentation is here.
Thunderstone Webinator
WWW Site Indexer Version 5.1.87
Thunderstone Software
Contents
Document Conventions
Overview
Features
Obtaining Webinator
Technical Support
Installation
Unix Download and Installation
Windows Download and Installation
Filesystem Layout
File Permissions and OS Specific Notes
Customizing Webinator's Appearance
Operation
Running the Administrative Interface
First Time Run: Quick Start
Step 1: Create an Account
Step 2: Create a Profile
Step 3: Walk the Profile
Last Step: Search
Administrative Interface Overview
Entry
Basic Walk Settings
All Walk Settings
Search Settings
Best Bet Groups
Profile Tools
List/Edit URLs
List Duplicates
Walk Status
Now button
Pause/Auto button
STOP walk button
Pause walk and Make live button
Query Log
Test Search
Live Search
Profiles
Accounts
Add a User
Change Password
Delete
Documentation
Webinator Home
Logout
Basic Walk Settings
Database
Walk Summary
Notes
Base URL
Enterprise
Robots
Extensions
Exclusions
Crawl Delay
Parallelism
Verbosity
Rewalk Type
New
Refresh
Refresh in version 5 vs. 4
Rewalk Type Summary Table
Rewalk Schedule
Action Buttons
Advanced Walk Settings
Watch URL
Notify
Attach Logs
Categories
URL File
URL URL
Single Page
Page File
Page URL
Strip Queries
Ignore Case
Extra Domains
Extra Networks
Extra URLs REX
Exclusion REX
Exclusion Prefix
Exclude by Field
Data from Field
Data From Field Example - Using Description for Title
Data From Field Example - using PublishDate for Modified Date
Data From Field Example - grabbing Price from meta
Data From Field Example - grabbing Price from Text
Required REX
Required Prefix
Max Page Size
Max Pages
Max Bytes
Max Depth
Max URL Size
Max Requests
Max Connection Lifetime
Page Timeout
Meta Tags
Standard Meta
All Meta
Storage Charset
Source Default Charset
XML UTF-8
Keep HTML
Keep Links
Remove Common
Ignore Tags
Keep Tags
Ignore Characters
Plugin Split
Word Definition
Index Fields
Compound Index Fields
Extra Indexes
Spell-check Dictionaries
Primer Type
Primer URLs
Submitting the Form Directly: Custom Primer URL
Filling Out the Form: Custom Primer Variables
Checking for Bad Logins: Bad Login MM Query
Multiple Primers: Base URL MM Query
Login Info
Proxy
Proxy Login Info
Cookie Source Path
Off-Site Pages
Stay Under
Prevent Duplicates
Duplicate Check Fields
All Extensions
Store Refs
Inline Iframes
Max Frames
Execute JavaScript
Fetch JavaScript
JavaScript String Links
Debug JavaScript
JavaScript Memory
JavaScript Timeout
Protocols
SSL Client Protocols
Authentication Schemes
Embedded Security
Entropy Source
Max Redirects
Index Name
DNS Mode
Net Mode
User Agent
Mime Types
Respect Expires Header
Default Refresh Time
Minimum Refresh Time
Maximum Refresh Time
Maximum Process Size
Search Settings
Notes
Query Logging
Rotate Schedule
Email
Result Order
Results Style
Abstract Style
Abstract Length
Max Title Length
Max URL Display Length
Results per Page
Max User Results per Page
Results Width
Box Color
Show Advanced Search
Query Highlighting
PDF Query Highlighting
Font
Display Charset
Top HTML and Bottom HTML
Enable Sherlock
Apply Appearance and Revert Appearance
Top Best Bet Title
Right Best Bet Title
Top Best Bet Group
Right Best Bet Group
Top Best Bet Box Color
Right Best Bet Box Color
Top Best Bet Border Style
Right Best Bet Border Style
Right Best Bet Box Width
Enable Spell Check
Suggest Time Limit
Number of Suggestions
Synonyms
Translate Boolean
Allow the @ Operator
Allow Linear
Allow NOT Logic
Allow Post-Processing
Allow Wildcards
Allow Leading Wildcards
Single-Word Wildcards
Allow WITHIN Operators
Resolve Phrase Noise Words
Keep Noise Words
Noise List
Search Timeout
Show Error Messages
Debug SQL Level
Fast Result Counts
Proximity
Language Characters
Word Forms
Word Ordering
Word Proximity
Database Frequency
Document Frequency
Position in Text
Clicks from Home
Ranked Rows
Phishing Protection
Decode Displayed URLs
Running the Walker by Hand
Using dowalk
Using gw
Running the Search Interface
Procedures and Examples
Searching your Index
Similarity Searching
Page Exclusion, Robots.txt, and Meta-robots
Indexing Other Sites
Indexing Individual Pages
Reindexing on a Schedule
Checking for Web Server Errors
Removing Pages from the Database
Erasing the Entire Database
Using Multiple Databases
Using Best Bets
Quick Creation
Fully Customized
Reference
Database and File Usage
Walk Database Tables and Fields
Options Table Fields
Customizing the Search
Customizing the Walker
Texis ISAPI
Overview
How it Works
Settings for Texis ISAPI
Reading values from texis.cnf
Reading values from the Registry
IIS Manual Configuration
IIS 5.X or earlier
IIS 6 or later
Third-Party Software
Antiword
Aspell
Catdoc xls2csv
Cole library
iconv
ppt2html, msg2html
SSL/HTTPS plugin
unrar
unzip
zlib
SpiderMonkey (JavaScript-C) Engine
PDF/anytotx plugin
thttpd - throttling HTTP server
prngd
GNU General Public License
GNU Lesser General Public License
GNU Library General Public License
Netscape Public License
Search Interface Help
Forming a Query
Query Rules of Thumb
Overview of Query Abilities
Controlling Proximity
Ranking Factors
Keywords Phrases and Wild-cards
Applying Search Logic
Natural Language Query
Using the Special Pattern Matchers
Invoking Thesaurus Expansion
Using Word Forms
Controlling Proximity
Interpreting Search Results
Viewing Match Info
Finding Similar Documents
Showing Document Parents
Copyright © Thunderstone Software
Last updated: Thu Mar 11 16:13:32 EST 2010
Copyright © 2024 Thunderstone Software LLC. All rights reserved.