By default, Texis considers the entire field to be a hit when the
full text is retrieved.
If you want your search items to occur within a more tightly
constrained proximity range, you can adjust this. If you are using
Vortex, you must first allow within operators, which are disabled by
default because of the extra processing they require.
Add a "within" operator to your query:
"w/line" indicates a line;
"w/para" indicates a paragraph;
"w/sent" indicates a sentence;
"w/all" indicates the entire field;
"w/#" indicates # characters.
The default proximity is "w/all".
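To make the operator list concrete, here is a minimal Python sketch of how a query string could be separated into search terms and a within specification. It is an illustration only, not Texis or Metamorph code; the function names `parse_within` and `classify` are hypothetical.

```python
def parse_within(query, default="all"):
    """Split a query into (search terms, within spec).

    Hypothetical helper: any token starting with "w/" is treated as the
    within operator; everything else is a search term. If no operator is
    present, the default "all" (entire field) is used.
    """
    terms = []
    within = default
    for token in query.split():
        if token.startswith("w/") and len(token) > 2:
            within = token[2:]
        else:
            terms.append(token)
    return terms, within


def classify(within):
    """Classify a within spec as a named unit, a character count, or REX."""
    if within in {"line", "para", "sent", "all"}:
        return "named"
    if within.isdigit():
        return "chars"
    # Anything else is assumed to be a REX expression.
    return "rex"
```

With this sketch, `parse_within("dog control w/sent")` yields the terms `dog` and `control` with the spec `sent`, and a query without an operator falls back to `all`.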
Example:
Using the legal ordinance text, we are searching the full text bodies
of those ordinances for controls issued about dogs. The following
query uses sentence proximity to qualify its hits.
WHERE BODY LIKE 'dog control w/sent'
This sentence qualifies as a hit because "control" and "dogs" are
in the same sentence.
Ordinances provide that the animal CONTROL officer takes
possession of DOGS which are free of restraint.
Add a within operator to the Metamorph query to indicate both stated
search items must occur within a single line of text, rather than
within a sentence.
WHERE BODY LIKE 'dog control w/line'
The retrieved concept group has changed from a sentence to a line, so
"dog" and "control" must occur in closer proximity to each other.
Now the line, rather than the sentence, is the hit.
CONTROL officer takes possession of DOGS
Expanding the proximity range to a paragraph broadens the allowed
distance between located search words.
WHERE BODY LIKE 'dog control w/para'
The same query with a different "within" operator now locates this
whole paragraph as the hit:
The mayor, subject to the approval of the city council,
shall appoint an animal CONTROL officer who is qualified to
perform the duties of an animal control officer under the
laws of this state and the ordinances of the city. This
officer shall take possession of any DOG which is free of
restraint in the city.
The words "control" and "dog" span different lines and different
sentences, but are within the same paragraph.
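The difference between the three ranges can be emulated with a short Python sketch. This is not how Texis is implemented: the splitting rules are naive assumptions (a sentence ends at `.`, `?` or `!`, a paragraph at a blank line), matching is a plain case-insensitive substring test rather than Metamorph word processing, and the names `split_units` and `hits` are hypothetical.

```python
import re

# The sample paragraph from the example above, with its line breaks kept.
PARAGRAPH = """The mayor, subject to the approval of the city council,
shall appoint an animal CONTROL officer who is qualified to
perform the duties of an animal control officer under the
laws of this state and the ordinances of the city. This
officer shall take possession of any DOG which is free of
restraint in the city."""


def split_units(text, unit):
    """Split text into lines, sentences, paragraphs, or one whole unit."""
    if unit == "line":
        return text.splitlines()
    if unit == "sent":
        # Join wrapped lines, then break after ., ? or ! (naive rule).
        flat = " ".join(text.split())
        return re.split(r"(?<=[.?!])\s+", flat)
    if unit == "para":
        # Assume paragraphs are separated by a blank line.
        return re.split(r"\n\s*\n", text)
    return [text]  # "all": the entire field


def hits(text, terms, unit):
    """Return every unit that contains all terms (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    return [u for u in split_units(text, unit)
            if all(t in u.lower() for t in lowered)]
```

Under these assumptions, `hits(PARAGRAPH, ["dog", "control"], "line")` and the same call with `"sent"` both return nothing, while `"para"` returns the whole paragraph, which mirrors the behavior described above.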
These "within" operators for designating proximity are also referred
to as delimiters. You can define your own delimiter by writing a
regular expression in REX syntax after the "w/". Anything
following "w/" that is not one of the special delimiters defined
above is assumed to be a REX expression. For example:
WHERE BODY LIKE 'dog control w/\RSECTION'
What follows the "w/" is now a user-designed REX expression for
sections. This works on text in which each section begins with a
capitalized "SECTION" header.
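The idea of an expression-defined unit can be sketched in Python. Python's `re` module stands in for REX here, and the two syntaxes differ, so the pattern below is not a REX expression. The sample text and the function name `split_on_header` are made up for illustration.

```python
import re

# Invented sample text: each section starts with a capitalized header.
TEXT = (
    "SECTION 1. Every dog shall be licensed.\n"
    "SECTION 2. The animal control officer may impound any dog "
    "found free of restraint.\n"
    "SECTION 3. Fees are set by the council.\n"
)


def split_on_header(text, header="SECTION"):
    """Split text into units that each begin with the given header.

    A zero-width lookahead keeps the header inside the unit it opens.
    Matching is case-sensitive, so lowercase "section" does not split.
    """
    pattern = r"(?=^" + re.escape(header) + r")"
    parts = re.split(pattern, text, flags=re.MULTILINE)
    return [p for p in parts if p.strip()]


sections = split_on_header(TEXT)
# Keep only the sections where both search terms occur.
matching = [s for s in sections
            if "dog" in s.lower() and "control" in s.lower()]
```

In this sample only the second section contains both "dog" and "control", so it is the only unit that would qualify as a hit.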
Delimiters can also be expressed as a number of characters forward and
backward from the located search items. For example:
WHERE BODY LIKE 'dog control w/500'
In this example "dog" and "control" must occur within a window of
500 characters forwards and backwards from the first item located.
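A character window can be emulated as follows. This is a rough sketch under stated assumptions, not the Texis algorithm: it anchors the window on occurrences of the first term in the query, uses case-insensitive substring matching, and the function name `within_chars` is hypothetical.

```python
def within_chars(text, terms, n):
    """Return the first window of n characters on either side of the
    first term that also contains every other term, or None."""
    lowered = text.lower()
    first, *rest = [t.lower() for t in terms]
    start = lowered.find(first)
    while start != -1:
        lo = max(0, start - n)           # n characters backward
        hi = start + len(first) + n      # n characters forward
        window = lowered[lo:hi]
        if all(t in window for t in rest):
            return text[lo:hi]
        # Try the next occurrence of the anchor term.
        start = lowered.find(first, start + 1)
    return None
```

For example, if "control" begins 600 characters after "dog", a call with `n=500` finds no hit, while `n=700` does.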
More often than not, the beginning and ending delimiters are the same.
Therefore, if you do not specify an ending delimiter (as in the above
example), the one you specify is used for both. If two expressions
are specified, the first is the beginning delimiter and the second is
the ending delimiter. Specifying both is most often needed for
messages or sections that follow a prescribed format.
Another factor to consider is whether you want the expression defining
the text unit to be included inside that text unit or not. For
example, the ending delimiter for a sentence obviously belongs with
the hit. However, the beginning delimiter is really the end of the
last sentence, and therefore should be excluded.
Inclusion or exclusion of beginning and ending delimiters with the hit
has been thought out for the defaults provided with the program.
However, if you are designing your own beginning and ending
expressions, you may wish to specify this.
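The effect of including or excluding each delimiter can be seen in a small Python sketch. It uses Python regular expressions rather than REX, and the function `extract_unit` with its `include_start` and `include_end` flags is a hypothetical illustration, not a Texis setting.

```python
import re


def extract_unit(text, pos, start_pat, end_pat,
                 include_start=False, include_end=True):
    """Return the text unit around position pos.

    The unit runs from the last start-delimiter match before pos to the
    first end-delimiter match at or after pos. The flags decide whether
    each delimiter's own text is kept in the result.
    """
    begin = 0
    for m in re.finditer(start_pat, text[:pos]):
        # The loop ends on the last match before pos.
        begin = m.start() if include_start else m.end()
    m = re.compile(end_pat).search(text, pos)
    if m is None:
        end = len(text)
    else:
        end = m.end() if include_end else m.start()
    return text[begin:end]


text = "The officer was appointed. Any dog may be impounded. Fees apply."
pos = text.index("dog")
# Sentence defaults: drop the previous sentence's period, keep our own.
sentence = extract_unit(text, pos, r"[.?!]\s+", r"[.?!]")
# Including the start delimiter drags in the previous sentence's period.
with_start = extract_unit(text, pos, r"[.?!]\s+", r"[.?!]",
                          include_start=True)
```

Here `sentence` is "Any dog may be impounded." while `with_start` begins with the stray ". " left over from the preceding sentence, which is why the beginning delimiter is normally excluded for sentences.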
Copyright © Thunderstone Software Last updated: Sun Mar 17 21:14:49 EDT 2013