Within Metamorph, a "set" can be any one of four types of
text data:
-
The set of words or phrases that mean the same thing.
-
The set of text patterns that match a regular expression.
-
The set of text patterns that are approximately the same.
-
The set of quantities that are within some range.
There are three types of operations that can be used in
conjunction with any set:
INCLUSION > The set must be present.
EXCLUSION > The set must not be present.
PERMUTATION > X out of Y sets must be present.
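
The three operations above can be sketched in a few lines of
Python. This is an illustration only, not Metamorph's
implementation or API: the names set_present and matches are
hypothetical, a set is modelled as a plain list of strings matched
by case-insensitive substring, and "X out of Y" is assumed to mean
"at least X".

```python
def set_present(text, members):
    """Return True if any member of the set occurs in text (case-insensitive)."""
    lowered = text.lower()
    return any(m.lower() in lowered for m in members)


def matches(text, include=(), exclude=(), permute=(), required=0):
    """Apply inclusion, exclusion, and permutation logic to one span of text.

    include  -- sets that must all be present
    exclude  -- sets that must all be absent
    permute  -- sets of which at least `required` must be present (assumption)
    """
    if not all(set_present(text, s) for s in include):
        return False
    if any(set_present(text, s) for s in exclude):
        return False
    found = sum(1 for s in permute if set_present(text, s))
    return found >= required


# Hypothetical example sets: each list stands for words that mean the same thing.
power = ["power", "energy"]
struggle = ["struggle", "conflict"]
peace = ["peace"]

both_required = matches("A power struggle broke out.", include=[power, struggle])
two_of_three = matches("energy and conflict", permute=[power, struggle, peace], required=2)
```

Here both_required and two_of_three are both true; excluding the
struggle set from the first query would reject the same sentence.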
The set logic operations are performed within two boundaries:
-
A starting delimiter (e.g., the beginning of a sentence).
-
An ending delimiter (e.g., the end of a sentence).
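
To show what the delimiters do, the sketch below cuts a text into
sentence-sized spans and applies the set logic inside each span
separately. It is a simplified illustration, not Metamorph's
behaviour: the names spans and hits are hypothetical, one regular
expression serves as both the starting and the ending delimiter,
and each set is reduced to a single lowercase word.

```python
import re


def spans(text, delimiter=r"[.!?]"):
    """Yield the stretches of text that lie between consecutive delimiters."""
    for piece in re.split(delimiter, text):
        piece = piece.strip()
        if piece:
            yield piece


def hits(text, required, forbidden=()):
    """Return the spans that contain every required word and no forbidden word."""
    found = []
    for span in spans(text):
        lowered = span.lower()
        if all(w in lowered for w in required) and not any(w in lowered for w in forbidden):
            found.append(span)
    return found


sample = "The treaty was signed. Trade talks stalled after the treaty. Trade resumed."
result = hits(sample, required=["trade", "treaty"])
```

Only the second sentence is returned, because it is the only span
in which both words fall between the same pair of delimiters.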
Each type of set plays an important role in the real-world use of
a text retrieval tool:
-
The word-list pattern matcher can locate any word
form of an entire list of English words and/or
phrases.
-
The regular-expression pattern matcher allows the
user to search for things like dates, part numbers,
social security numbers, and product codes.
-
The approximate pattern matcher can search for
things like misspellings, typos, and names or
addresses that are similar.
-
The numeric/quantity pattern matcher can look for
numeric values that are present in the text in
almost any form and allows the user to search for
them generically by their value.
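
The four roles above can be approximated with the Python standard
library. This is a rough sketch under stated assumptions, not how
Metamorph's matchers work: all four function names are
hypothetical, word forms are reduced to a few English suffixes,
approximate matching uses difflib's similarity ratio with an
arbitrary 0.8 threshold, and quantities are limited to digits with
optional commas and a decimal part.

```python
import re
from difflib import SequenceMatcher


def word_list_match(text, words):
    """Return the listed words found in text, allowing a few common suffixes."""
    found = []
    for w in words:
        if re.search(r"\b" + re.escape(w) + r"(s|es|ed|ing)?\b", text, re.IGNORECASE):
            found.append(w)
    return found


def regex_match(text, pattern):
    """Return every substring of text that matches the regular expression."""
    return re.findall(pattern, text)


def approx_match(text, target, threshold=0.8):
    """Return the tokens whose similarity to target meets the threshold."""
    return [tok for tok in re.findall(r"\w+", text)
            if SequenceMatcher(None, tok.lower(), target.lower()).ratio() >= threshold]


def quantity_match(text, low, high):
    """Return the numeric values in text that fall within [low, high]."""
    numbers = re.findall(r"\d[\d,]*(?:\.\d+)?", text)
    values = [float(n.replace(",", "")) for n in numbers]
    return [v for v in values if low <= v <= high]


text = "Part AB-77 arrived 2013-03-17; we recieved 1,500 units in 42 crates."
words = word_list_match(text, ["part", "crate", "truck"])
dates = regex_match(text, r"\d{4}-\d{2}-\d{2}")
typos = approx_match(text, "received")
amounts = quantity_match(text, 1000, 1600)
```

On this sample the word list finds "part" and "crate", the regular
expression finds the date, the approximate matcher picks up the
misspelling "recieved", and the quantity search finds 1,500 by its
value rather than by its written form.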
The Metamorph search engine optimizes every search it performs to
minimize CPU utilization and maximize search throughput. At the
heart of the engine lie seven highly efficient pattern matchers
for locating items within text. With the exception of the
Approximate Pattern Matcher, all of these pattern matchers use a
proprietary algorithmic technique that is guaranteed to outperform
any other published pattern-matching algorithm (including those
described by Boyer-Moore-Gosper and Knuth-Morris-Pratt).
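
For readers unfamiliar with the published algorithms named above,
the sketch below shows Boyer-Moore-Horspool, a well-known
simplification of Boyer-Moore. It is included only as a reference
point for what a published pattern matcher looks like; it is not
Metamorph's proprietary technique, and the function name
horspool_find is hypothetical.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For each character except the pattern's last, record how far its
    # rightmost occurrence is from the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        # Skip ahead based on the text character aligned with the pattern's end;
        # characters absent from the table allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1


haystack = "the end of a sentence"
index = horspool_find(haystack, "sentence")
```

The speed of this family of algorithms comes from the shift table,
which lets the search skip several characters at a time instead of
examining every position.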
Providing set logic to manipulate combinations of these set types
gives users the ability to search for just about anything they
might want to find in their textual information. A query can be
as simple or as sophisticated as the user wishes, the simplest
being a plain natural-language question.
Copyright © Thunderstone Software Last updated: Sun Mar 17 21:14:49 EDT 2013