Integrates Entity Extraction With Guided Navigation During The Creation Of Search Indexes
Looking for a way to more quickly and accurately access valuable BI data residing in disparate forms and
locations across an enterprise, Thunderstone wanted to improve actionable query results on the Thunderstone
Search Appliance platform by incorporating "entity extraction with guided navigation" technology that automatically
creates metadata based on the content of indexed documents.
So, Thunderstone licensed Inxight Software's SmartDiscovery Awareness Server and ThingFinder Professional
SDK -- which intelligently identifies and extracts key entities (or "things") from any text-data source.
Now Thunderstone customers can add this valuable option to their Thunderstone Search Appliances, providing even
better contextual query results with built-in recognition of “People,” “Places” and
“Organizations” in the searched data.
Unlike others who have licensed similar tools from Inxight Software, Thunderstone has integrated these guided
navigation capabilities into the actual crawling technology of the Thunderstone Search Appliances. Instead of
applying on-the-fly entity extraction to filter the results of searches as queries are executed, Thunderstone Search
Appliances apply the appropriate entity tags to all relevant data during the creation of the search indexes.
By eschewing the post-query "results filtering" approach to entity extraction and opting to take fuller advantage
of automated meta-tagging at the time the indexing process takes place, the Thunderstone Search Appliances enable
users to more rapidly and thoroughly search the entire contents of all included documents.
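The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the extract_entities function is a hypothetical dictionary lookup standing in for a real entity extractor, and the in-memory index is not how the Thunderstone Search Appliance or the Inxight SDK store their tags. It shows entity tags being computed once per document while the index is built, so that filtering by entity at query time becomes a simple lookup.

```python
from collections import defaultdict

# Hypothetical gazetteer standing in for a real entity extractor.
KNOWN_ENTITIES = {
    "Paris": "Places",
    "Acme Corp": "Organizations",
    "Alice Smith": "People",
}

def extract_entities(text):
    """Return a set of (entity_type, entity_name) pairs found in text."""
    return {(etype, name) for name, etype in KNOWN_ENTITIES.items() if name in text}

def build_index(documents):
    """Tag every document once, at index time."""
    tag_index = defaultdict(set)  # (type, name) -> set of document ids
    for doc_id, text in documents.items():
        for tag in extract_entities(text):
            tag_index[tag].add(doc_id)
    return tag_index

def search_by_entity(tag_index, entity_type, name):
    """Query time: a plain lookup, with no extraction work left to do."""
    return sorted(tag_index.get((entity_type, name), set()))

# Made-up sample documents, for illustration only.
documents = {
    1: "Alice Smith joined Acme Corp last spring.",
    2: "The conference will be held in Paris.",
    3: "Acme Corp announced a new product line.",
}
index = build_index(documents)
print(search_by_entity(index, "Organizations", "Acme Corp"))  # [1, 3]
```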
With a scheduled release date of 21 November 2007, this new “Automated Tagging” option has special
pricing available for Thunderstone Search Appliance customers who order it before the end of the calendar year.
Contact us at +1 216 820 2200 for more details.
We are extending our introductory offer of up to 25 percent off on the connectors being offered by Persistent
Systems until the end of 2007 to ensure that you have time to test out these easy-to-use ways to load data into
the Thunderstone Search Appliance. We currently have a test lab set up, and we invite anybody interested in seeing
the connectors in action to contact us at +1 216 820 2200.
Thunderstone Parametric Search Appliance
The NEW Thunderstone Parametric Search Appliance is being offered at an introductory price of 10 percent off
until the end of November. Demonstrations and evaluations of the Parametric Search Appliance are available using
your own data so you can see the immediate impact of having an easy-to-use, complete search.
Entity Extraction with Guided Navigation
The NEW Entity Extraction with Guided Navigation for the Thunderstone Search Appliance is being offered at an
introductory price of 10 percent off until the end of 2007. Demonstrations and evaluations are available using
your own data.
“Ah! on Thanksgiving day, when from East and from West,
From North and South, come the pilgrim and guest,
When the gray-haired New Englander sees round his board
The old broken links of affection restored,
When the care-wearied man seeks his mother once more,
And the worn matron smiles where the girl smiled before.
What moistens the lips and what brightens the eye?
What calls back the past, like the rich pumpkin pie?”
John Turnbull, President & CEO of Thunderstone Software LLC, has accepted an invitation to participate in an
expert panel discussion on "Enterprise Search Platforms" for the upcoming December/January printed and online
editions of Business Management magazine.
http://www.busmanagement.com/pastissue/article.asp?art=272445&issue=235
ARIBA UTILIZES TEXIS AS SEARCH PLATFORM FOR ENTERPRISE-WIDE KNOWLEDGE MANAGEMENT
Ariba, Inc. (http://www.ariba.com) offers the world's leading Spend Management software and services
to a wide range of customers across all industries. It provides a set of both CD-based
and on-demand software, along with services related to sourcing products and commodities, negotiating
contracts, buying against negotiated contracts and other key components of a complete, end-to-end Spend
Management solution. Ariba helps organizations analyze, understand and manage their spending in order to
rapidly achieve sustainable cost savings and to improve business process efficiency.
Ariba initially purchased a Thunderstone Texis license due to the availability of integration code from its
content management system vendor. The integration code turned out to be inefficient and was horribly sluggish for
Ariba's end users. Because Thunderstone provides the ability to completely customize everything from the Texis
databases to Vortex scripts, Ariba's software engineers developed their own integration with the content management
system. The result was several customized interfaces that enabled users of the company's various portals to search
for content items containing full text and file attachments plus all the associated metadata needed for efficient
filtering, security and results sorting.
According to Derek Matthews, Ariba's Lead Knowledge Architect, “I started a project to essentially rip out
all the "Solution Accelerator" integration code that had been done with the content management vendor. And we wrote
our own interface into Thunderstone Texis. It was then that we started to learn the power of Texis and the power of
the Vortex scripting and all the things that we really could do, that we had been wanting to do, but didn't even
realize was possible. At that point we upped our license to fit our growing content and the growing hits we were
having, and we started customizing based on all the security and filtering and other parameters that we wanted to be
able to do. For instance, we have search results screening and we have customized sorting based on various fields
that we track separately - which is very powerful. A lot of vendors don't provide that capability to customize how
the sorting even works on your results from the search engine.
“We started rolling out enhancement wave after enhancement wave and have continued that to this day. One of
the latest enhancements that we've rolled out has been federated searching capabilities, what Thunderstone calls the
metasearching. We actually have portlets that we use on various portals across our extranet that allow people to
pull together incongruent content based on metasearches that they want to perform. It is possible now, within the
last year, for an Ariba employee to do a search and be able to pull a service request from our CRM system, a defect
from our engineering quality system, a marketing presentation from our content management system, as well as
searching any number of other internal web sites that have been indexed. You can search over all of the sources
through one integrated portlet that we built on top of Thunderstone Texis.
“Texis allows Ariba to index widely varying content into a platform that can be securely accessed by users
from various portals for our customers, partners, suppliers, prospects and employees. We are able to use the search
engine to present dynamic, context-sensitive views of content for users to browse with the ability to refine through
full-text searching.
“I am amazed at the whole design of the engine itself. It really is powerful. Thunderstone doesn't get
enough credit, and I've never quite understood why. When it comes to customizing the database, customizing the
indexing and how that works, and customizing the user interface, those three things, I have not seen any of
Thunderstone's competitors demonstrate the ability to hit all three of those the way Thunderstone does and with the
depth that Thunderstone does.”
To find out more about Texis and how you can use it to enhance your business, or to get the complete case study,
contact Thunderstone Software at +1 216 820 2200.
CaseCentral, the leader in discovery lifecycle management solutions, uses Texis as a core technology and says
the following on their partner page:
“Texis integrates relevance-ranked full-text indexing with real-time SQL operations. This provides
advanced search logic within a versatile system that easily adapts to new requirements, and scales to
industry-leading levels of documents.”
When displaying search results to users, it is often helpful to split them into pages with a
small number of results on each page, say 10 to 20, while still letting the user know how many results there were
overall.
Texis provides a count of how many records were selected by the most recent index operation, which can be used
to provide an estimate of how many total results there are.
Accessing indexcount
This number can be accessed from Vortex using the $indexcount variable, from tsql, which displays it when given
the -n -n options, or from the C API using the n_getindexcount() function. It is accessible after fetching the
first result row.
Uses
The most common use for indexcount is to provide a total count of search results, as well as to provide next
page links. Vortex sets $rows.max to $indexcount, and $rows.max is used by default with the Vortex
function to generate the page navigation links automatically.
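The arithmetic behind page navigation can be sketched as follows. This is a minimal illustration in Python, not the Vortex implementation: page_links is a hypothetical helper, and its total argument simply stands for whatever count is available, such as the value reported through indexcount.

```python
import math

def page_links(total, page_size, current_page):
    """Return (page_count, prev_page, next_page) for 1-based page numbers.

    prev_page and next_page are None when there is no such page.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Always show at least one page, even when there are no results.
    page_count = max(1, math.ceil(total / page_size))
    # Clamp the requested page into the valid range.
    current_page = min(max(1, current_page), page_count)
    prev_page = current_page - 1 if current_page > 1 else None
    next_page = current_page + 1 if current_page < page_count else None
    return page_count, prev_page, next_page

# 137 matching records, 10 per page, viewing page 3.
print(page_links(137, 10, 3))  # (14, 2, 4)
```

Because the total may only be an estimate, a display built this way is best worded as approximate when the query is not fully resolved by an index.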
Limitations
This number is generally accurate if the query is fully resolved by an index, or will indicate the number of
records that will be processed to find the answer if the query is not fully indexed.
Since indexcount uses the most recent index operation, if the SQL statement you are executing contains a join
of multiple tables, indexcount is not likely to be useful, as it may correspond to only one of the
tables.
Alternatives
If the limitations on indexcount prevent you from using it and you need an accurate result count, the best
solution is to issue a SQL count query. Simply replace the fields you want in the SELECT clause with count(*).
If you are using Vortex, you probably want to rename the result so you can access it in a Vortex variable. Do not
rename it to indexcount directly, though, as that will be overridden. For example, if the SQL you are using for results
starts:
SELECT Title, Author, PubDate FROM articles WHERE ...
you would convert it to:
SELECT count(*) numresults FROM articles WHERE ...
keeping the rest of the SQL the same (although you can drop any ORDER BY), and use $numresults instead of
$indexcount.
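The same pattern can be tried outside Texis. The sketch below uses Python's built-in sqlite3 module purely as a stand-in database; the table contents and the Author filter are made up for illustration, and nothing here reflects Texis or Vortex behavior. It shows the results query and the count query sharing the same FROM and WHERE clauses, with the selected fields replaced by count(*) and the ORDER BY dropped.

```python
import sqlite3

# In-memory stand-in database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (Title TEXT, Author TEXT, PubDate TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [
        ("Indexing basics", "Lee", "2007-10-01"),
        ("Paging results", "Lee", "2007-11-05"),
        ("Join queries", "Kim", "2007-11-12"),
    ],
)

# Hypothetical filter shared by both queries.
where = "WHERE Author = ?"
params = ("Lee",)

# Results query: the selected fields plus an ordering.
rows = conn.execute(
    f"SELECT Title, Author, PubDate FROM articles {where} ORDER BY PubDate", params
).fetchall()

# Count query: same FROM and WHERE, fields replaced by count(*), ORDER BY dropped.
(numresults,) = conn.execute(
    f"SELECT count(*) AS numresults FROM articles {where}", params
).fetchone()

print(numresults, len(rows))  # 2 2
conn.close()
```

Keeping the WHERE clause in one place, as the where variable does here, helps ensure the count always matches the results query it describes.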
Feedback, suggestions and questions are welcome to