These settings affect the way that text searches are performed. Setting
one is equivalent to changing the corresponding parameter in the profile,
or to calling the Metamorph API function that sets it (if there is an
equivalent). They are:
- minwordlen
- The smallest length a word can be reduced to by suffix and prefix
removal. Removal of a trailing vowel or double consonant can make it one
letter shorter than this. Default 255.
- keepnoise
- Whether noise words should be kept in the query and index rather than
stripped. Default off (noise words are stripped).
- suffixproc
- Whether suffixes should be stripped from the words to
find a match. Default on.
- prefixproc
- Whether prefixes should be stripped from the words to
find a match. Turning this on is not suggested when using a Metamorph
index. Default off.
- rebuild
- Make sure that the word found can be built from the root
and appropriate suffixes and prefixes. This increases the accuracy of
the search. Default on.
- useequiv
- Perform thesaurus lookup. If this is on, the word and all of its
equivalences are searched for. If it is off, only the query word is
searched for. Default off. Also known as keepeqvs in version
5.01.1171414736 20070213 and later.
- inc_sdexp
- Include the start delimiter as part of the hit. This is not
generally useful in Texis unless hit offset information is being
retrieved. Default off.
- inc_edexp
- Include the end delimiter as part of the hit. This is not
generally useful in Texis unless hit offset information is being
retrieved. Default on.
- sdexp
- Start delimiter to use: a regular expression to match
the start of a hit. The default is no delimiter.
- edexp
- End delimiter to use: a regular expression to match
the end of a hit. The default is no delimiter.
- hyphenphrase
- Controls whether a hyphen between words searches
for the phrase of the two words next to each other, or searches for
the hyphen literally. The default value of 1 will search for the two
words as a phrase. Setting it to 0 will search for a single term
including the hyphen. If you anticipate setting hyphenphrase to 0 then
you should modify the index word expression to include hyphens.
- wordc
- Defines which characters constitute a word during linear
(non-index) searches. When a match is found, the hit is expanded to
include all surrounding word characters, as defined by this setting.
The value is specified as a REX character set. The default setting
is [\alpha\'], which corresponds to all letters and apostrophe. To
exclude apostrophe and include digits, you could say:
set wordc='[\alnum]'
Added in version 3.00.942260000. Note that this setting is for linear
searches: what constitutes a word for Metamorph index searches is
controlled by the index expressions (see the addexp property).
- langc
- Defines which characters make a search term a language
query. A language query term has prefix/suffix processing applied
(if enabled), and forces the use of wordc to qualify the hit (during
linear searches). Normally langc should be set the same as wordc, with
the addition of the phrase characters space and hyphen. The default is
[\alpha\' \-]. Added in version 3.00.942260000.
- withinmode
- A space- or comma-separated unit and optional type for the
"within-N" operator (e.g. w/5). The unit is one of:
  - char for within-N characters
  - word for within-N words
The optional type determines what distance the operator measures.
It is one of the following:
  - radius (the default if no type is specified when set) indicates
    all sets must be within a radius N of an "anchor" set, i.e. there
    is a set in the match such that all other sets are within N units
    right of its right edge or N units left of its left edge.
  - span indicates all sets must be within an N-unit span.
Added in version 4.04.1077930936 20040227. The optional type was
added in version 5.01.1258712000 20091120; previously the only
type was implicitly radius. In version 5 and earlier the
default setting was char (i.e. char radius); in
version 6 and later the default is word span.
- phrasewordproc
- Which words of a phrase to do suffix/wildcard processing on. The
possible values are:
  - mono to treat the phrase as a monolithic word (i.e. only the last
    word is processed, but the entire phrase counts towards minwordlen)
  - none for no suffix/wildcard processing on phrases
  - last to process just the last word
Note that a phrase is multi-word, i.e. a single word in double-quotes
is not considered a phrase, and thus phrasewordproc does not apply.
Added in version 4.03.1082000000 20040414. Mode none
supported in version 5.01.1127760000 20050926.
- mdparmodifyterms
- If nonzero, allows the Metamorph query parser to modify search terms
by compression of whitespace and quoting/unquoting. This is for
back-compatibility with earlier versions; enabling it will break the
information from bit 4 of mminfo() (query offset/lengths of sets).
Added in version 5.01.1220640000 20080905.
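The difference between the radius and span types of withinmode can be easier to see with numbers. The following is a minimal Python sketch of the two distance tests as they are described above; it is not Texis code and makes no claim about how Texis implements them. The function names within_radius and within_span, the (start, end) representation of a matched set, and the inclusive handling of the edges are all assumptions made for illustration.

```python
from typing import List, Tuple

# Each matched set is modeled as a (start, end) pair of unit positions
# (word indexes or character offsets), with end exclusive. This layout is an
# assumption for illustration, not Texis's internal representation.
Hit = Tuple[int, int]


def within_span(hits: List[Hit], n: int) -> bool:
    """span: every set must fit inside a single window of n units."""
    leftmost = min(start for start, _ in hits)
    rightmost = max(end for _, end in hits)
    return rightmost - leftmost <= n


def within_radius(hits: List[Hit], n: int) -> bool:
    """radius: some anchor set has every other set within n units of its edges."""
    for i, (a_start, a_end) in enumerate(hits):
        others = hits[:i] + hits[i + 1:]
        # Other sets may extend at most n units left of the anchor's left
        # edge and at most n units right of the anchor's right edge.
        if all(start >= a_start - n and end <= a_end + n for start, end in others):
            return True
    return False


# Three one-word sets at word positions 0, 4 and 8, tested with N = 5.
hits = [(0, 1), (4, 5), (8, 9)]
print(within_radius(hits, 5), within_span(hits, 5))  # prints: True False
```

With the middle set as the anchor, both neighbours lie within 5 words of it, so the radius test passes even though the whole match covers 9 words. The span test rejects the same match because it does not fit in 5 words. Under this model a radius match can be roughly twice as wide as a span match for the same N.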
Copyright © Thunderstone Software Last updated: Sun Mar 17 21:14:49 EDT 2013