THUNDERSTONE NEWS
NEW FEATURES
File Splitting.
Some customers have very large documents. A single PDF or word processor
file can hold an entire book. When indexing with Webinator, the default
behavior is to treat each file as a single record. It may be more user-friendly,
however, to split such files into pieces, so that search results point to
the relevant sections of the original document.
The new Plugin Split option makes that easy. For example, you can specify
that PDFs be broken into a separate record for each page, or for every n pages.
Documents without page markers may be split at an arbitrary number of characters.
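To make the two splitting modes concrete, here is a minimal Python sketch of the idea. It is not the anytotx plug-in's implementation or its configuration syntax; the function names split_pages and split_chars are hypothetical and exist only for illustration.

```python
def split_pages(pages, n=1):
    """Group a list of page texts into records of n pages each."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return ["\n".join(pages[i:i + n]) for i in range(0, len(pages), n)]


def split_chars(text, size):
    """Split text that has no page markers into records of at most size characters."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical sample data: a three-page document split two pages per record.
pages = ["page one", "page two", "page three"]
print(split_pages(pages, n=2))
print(split_chars("abcdefghij", 4))
```

Each returned string would become its own search record, so a hit can point at the relevant part of the original document rather than at the whole file.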
The feature is part of the Webinator and Texis
File-Format plug-in (anytotx) from version 4.3 on. Texis
maintenance customers and those with paid versions of Webinator 4.0
or later may request a copy of the new plug-in from Tech Support.
Other customers may obtain the new plug-in by upgrading Webinator or
joining Texis Maintenance.
Lock and Run.
A new utility, lockandrun, is now part of Texis. It lets you lock some or all tables in a database while an arbitrary shell command, such as a backup process, runs.
CUSTOMER SPOTLIGHT: QVC
Most Americans know QVC by its shopping channel on television.
QVC's web site, however, is also one of the most successful
online retailers, bringing in a substantial portion of QVC's $4.4
billion in yearly sales.
Thunderstone's Texis software plays an important role at QVC
in a variety of ways.
Unlike television sales, where each item is on sale for only a
few minutes at a time, QVC's entire inventory is available all
the time on its web site. The online database thus has hundreds
of thousands of products, and requires a robust search engine to
help users find the items that satisfy them.
Technology customers in the U.S. government and defense
sectors are invited to stop by and meet us at the Symposium on
Information Sharing & Homeland Security, June 30 and July
1 in Philadelphia. Approximately 100 other technology vendors also
will be exhibiting. To be admitted to the exhibit hall
without charge, mention that you are a guest of Thunderstone.
FEATURE SPOTLIGHT: The Texis Profiler
The Profiler is one of the under-appreciated capabilities of
Texis. It optimizes the handling of stored queries. Your users
might greatly appreciate having a stored query feature.
Originally the Profiler was developed for monitoring
newswires. However, it could be useful in many other
applications, such as tracking message board content, email, or
new pages found by the Webinator walker.
The Profiler can efficiently power a real-time notification
service, if needed. Even if you just send notifications as email,
using the Profiler ensures that your users always have the most
up-to-date information, whenever they choose to check their mail.
Technically, the essence of the Profiler is that it turns the
standard search model on its head: Queries are stored in a table
and indexed, and a new message or document is turned into a query
against that table.
Suppose you have 10,000 stored queries. With the Profiler,
only one SQL SELECT is needed to test each new item and find
out which of the 10,000 users to notify about it. Compare that
with a batch approach, in which you run all 10,000 queries every so
often. Besides the extra work, a batch approach makes
notifications less timely.
Setting up the Profiler involves these steps:
1. Queries are stored in a profiles table.
2. A Metamorph counter index is created on the query field of the
profiles table.
3. The Vortex INIT command loads the words from the index into
memory.
4. The GET command trims new items (news, messages, etc.) down to
only the words matching at least one query.
5. A query using the trimmed item and LIKEIN is run against the
profiles table.
6. The result is the list of matching profiles or users to
receive notifications.
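The steps above can be sketched in plain Python to show the reversed search model. This is an analogy only, not Vortex or Texis code: the in-memory dictionary stands in for the profiles table and its index, all names are hypothetical, and it assumes the simplest matching rule, where a profile matches when every word of its stored query appears in the new item.

```python
from collections import defaultdict

# Step 1: a hypothetical profiles table, mapping each user to a stored query.
PROFILES = {
    "alice": "texis profiler",
    "bob": "webinator walker",
    "carol": "newswire",
}


def build_index(profiles):
    """Step 2: index the query field, mapping each word to the profiles using it."""
    index = defaultdict(set)
    for user, query in profiles.items():
        for word in query.lower().split():
            index[word].add(user)
    return index


def trim_item(text, index):
    """Steps 3-4: keep only the words of a new item that occur in some stored query."""
    return {word for word in text.lower().split() if word in index}


def match_profiles(text, profiles, index):
    """Steps 5-6: a single lookup pass returns the users to notify."""
    words = trim_item(text, index)
    candidates = set()
    for word in words:
        candidates |= index[word]
    # Assumed rule: every word of the stored query must be present in the item.
    return sorted(
        user for user in candidates
        if set(profiles[user].lower().split()) <= words
    )


INDEX = build_index(PROFILES)
item = "The Texis Profiler now handles newswire feeds"
print(match_profiles(item, PROFILES, INDEX))  # prints ['alice', 'carol']
```

The work per new item depends on the words it shares with the stored queries, not on the total number of profiles, which is the point of indexing the queries rather than the documents.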
More detail is in chapter 16 of the tutorial and in the Profiler
section of the Vortex manual.
Feedback, suggestions and questions are welcome to