Thunderstone Software Document Retrieval and Management
Vortex Manual

Query Protection

 

The following apicp settings control which query syntax and features are allowed. Metamorph has a powerful search syntax, but poorly constructed queries, whether written carelessly or inadvertently, can take a long time to resolve. In a high-load environment such as a Web search engine, this can bog down a server, slowing all users for the sake of one bad search.

Therefore, Vortex is by default highly restrictive of the queries it will allow, denying some specialized features for the sake of quicker resolution of all queries. By altering these settings, script authors can "open up" Texis and Metamorph to allow more powerful searches, at the risk of higher load for special searches.

  • alpostproc (boolean, off by default in Vortex, on by default in tsql)
    If on, post-processing of queries is allowed when needed after an index lookup, e.g. to resolve unindexable terms such as REX expressions, or to resolve LIKE queries against a non-inverted Metamorph index. If off, some queries are faster, but may not be as accurate if they aren't completely resolved. The error message "Query would require post-processing" may be generated by such queries if alpostproc is off.

  • allinear (boolean, off by default in Vortex, on by default in tsql)
    If on, an all-linear query (one without any indexable "anchor" words) is allowed. A query like "/money #million", where all the terms use unindexable pattern matchers (REX, NPM or XPM), is an example. Such a query requires that the entire table be linearly searched, which can be very slow for a table of significant size.

    If allinear is off, all queries must have at least one term that can be resolved with the Metamorph index, and a Metamorph index must exist on the field. Under such circumstances, other unindexable terms in the query can generally be resolved quickly, if the "anchor" term limits the linear search to a tiny fraction of the table. The error message "Query would require linear search" may be generated by linear queries if allinear is off.

    Note that an otherwise indexable query like "rocket" may become linear if there is no Metamorph index on its field, or if an index for another part of the SQL query is favored instead by Texis. For example, with the SQL query "select Title from Books where Date > 'May 1998' and Title like 'gardening'" Texis may use a Date index rather than a Title Metamorph index for speed. In such a case it may be necessary to enable linear processing for a complicated query to proceed, since part of the table is being linearly searched.

  • alequivs (boolean, off by default in Vortex, on by default in tsql)
    If on, allows equivalences in queries. If off, only the actual terms in a query will be searched for; no equivalences. This is regardless of ~ usage or the setting of keepeqvs. Note that the equivalence file will still be used to check for phrases in the query, however. Turning this on allows greater search flexibility, as equivalent words to a term can be searched for, but decreases search speed.

  • alintersects (boolean, off by default in Vortex, on by default in tsql)
    If on, allows use of the @ (intersections) operator in queries. Queries with few or no intersections (e.g. @0) may be slower, as they can generate a copious number of hits.

  • alnot (boolean, on by default)
    If on, allows "NOT" logic (e.g. the - operator) in a query.

  • alwithin (boolean, off by default in Vortex, on by default in tsql)
    If on, "within" operators (w/) are allowed. These generally require a post-process to resolve, and hence can slow searches. If off, the error message "'delimiters' not allowed in query" will be generated if the within operator is used in a query.

  • alwild (boolean, on by default)
    If on, wildcards are allowed in queries. Wildcards can slow searches because potentially many words must be looked for.

  • qminprelen (integer, 2 by default in Vortex, 1 by default in tsql)
    The minimum allowed length of the prefix (non-* part) of a wildcard term. Short prefixes (e.g. "a*") may match many words and thus slow the search.

  • qminwordlen (integer, 2 by default in Vortex, 1 by default in tsql)
    The minimum allowed length of a word in a query. Note that this is different from minwordlen, the minimum word length for prefix/suffix processing to occur.

  • qmaxsets (integer, 100 by default)
    The maximum number of sets (terms) allowed in a query. Added in version 2.6.934800000 19990816. Note: also settable as qmaxterms for back-compatibility with earlier versions.

  • qmaxsetwords (integer, 500 by default in Vortex, unlimited by default in tsql)
    The maximum number of search words allowed per set (term), after equivalence and wildcard expansion. Some wildcard searches can potentially match thousands of distinct words in an index, many of which may be garbage or typos but still have to be looked up, slowing a query. If this limit is exceeded, a message such as "Max words per set exceeded at word `xyz*' in query `xyz* abc'" is generated, and the entire set is considered a noise word and not looked up in the index. A value of 0 means unlimited. Added in version 2.6.934900000 19990817.

    In version 3.0.947600000 20000110 and later, the set may only be partially dropped (with the message "Partially dropping term `xyz*' in query `xyz* abc'") depending on the setting of dropwordmode (which must be set with a SQL set statement). If dropwordmode is 0 (the default), the root word, valid suffixes, and more-common words are still searched, up to the qmaxsetwords limit if possible; the remaining wildcard matches are dropped. If dropwordmode is 1, the entire set is dropped as if a noise word.

    Note that qmaxsetwords is the max number of search words, not the number of matching hits after the search. Thus a single but often-occurring word like "html" counts as one word in this context.

  • qmaxwords (integer, 1100 by default in Vortex, unlimited by default in tsql)
    The maximum number of words allowed in the entire query, after equivalence and wildcard expansion. If this limit is exceeded, a message such as "Max words per query exceeded at word `xyz*' in query `xyz* abc'" is generated, and the query cannot be resolved. 0 means unlimited. Added in version 2.6.934900000 19990817. Like qmaxsetwords, this is distinct search words, not hits. dropwordmode also applies here.

  • denymode (string or integer; warning by default)
    What action to take when a disallowed query is attempted:

    • silent or 0
      Silently remove the offending set or operation.

    • warning or 1
      Remove the term and warn about it with a putmsg-catchable message.

    • error or 2
      Fail the query.
    A message such as "'delimiters' not allowed in query" may be generated when a disallowed query is attempted and denymode is not silent.

  • texisdefaults
    Restore Texis (as opposed to Vortex) default values: enable linear searches, post-processing, within operators, etc. Note: this will permit some queries to run that can potentially take an inordinate amount of time, even with a Metamorph index. Use with caution.

  • defaults
    Restore all settings to Vortex default values.
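
How a few of the settings above interact with denymode can be sketched in Python. This is only an illustrative model of the behavior described on this page, not Texis or Vortex code: the names QueryProtection, violation, screen and QueryDenied are hypothetical, the reason strings are not real Texis messages, and the query is split on whitespace, which ignores phrases and the rest of the Metamorph syntax. It also assumes that qminwordlen is checked only for non-wildcard words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryProtection:
    """Hypothetical container for a subset of the settings listed above."""
    alnot: bool = True
    alwild: bool = True
    qminprelen: int = 2
    qminwordlen: int = 2
    denymode: object = "warning"  # string or integer, as documented above


# Documented defaults for the two environments (only the modelled settings).
VORTEX_DEFAULTS = QueryProtection()
TSQL_DEFAULTS = QueryProtection(qminprelen=1, qminwordlen=1)

# Integer forms of denymode map to the named forms.
DENYMODES = {0: "silent", 1: "warning", 2: "error"}


class QueryDenied(Exception):
    """Raised by this model when denymode is error (2)."""


def violation(term, s):
    """Return a short reason if the term breaks a setting, else None."""
    negated = term.startswith("-")
    word = term[1:] if negated else term
    if negated and not s.alnot:
        return "NOT logic disabled (alnot off)"
    if "*" in word:
        if not s.alwild:
            return "wildcards disabled (alwild off)"
        prefix = word.split("*", 1)[0]  # the non-* part before the wildcard
        if len(prefix) < s.qminprelen:
            return "wildcard prefix shorter than qminprelen"
        return None
    if len(word) < s.qminwordlen:
        return "word shorter than qminwordlen"
    return None


def screen(query, s):
    """Apply the settings to each term; return (kept_terms, warnings)."""
    mode = DENYMODES.get(s.denymode, s.denymode)
    kept, warnings = [], []
    for term in query.split():
        reason = violation(term, s)
        if reason is None:
            kept.append(term)
        elif mode == "error":
            raise QueryDenied(f"{term}: {reason}")
        elif mode == "warning":
            warnings.append(f"{term}: {reason}")
        # silent: the term is dropped without any message
    return kept, warnings
```

Under this model, screen("a* rocket x", VORTEX_DEFAULTS) keeps only "rocket" and reports two warnings, while the same query with TSQL_DEFAULTS keeps all three terms.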

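The qmaxsetwords and dropwordmode behavior described above can be sketched the same way. Again this is a simplified model under stated assumptions, not the Texis implementation: limit_set_words is a hypothetical name, the expansion list with per-word frequencies is invented input, "more-common" is taken to mean higher frequency, and the handling of valid suffixes is left out.

```python
def limit_set_words(expansions, qmaxsetwords, dropwordmode=0):
    """Model the per-set word limit.

    expansions: list of (word, frequency) pairs for one set after wildcard
    and equivalence expansion, with the root word first (an assumption).
    Returns the words that would still be looked up in this model.
    """
    # A limit of 0 means unlimited; otherwise check the expanded word count.
    if qmaxsetwords == 0 or len(expansions) <= qmaxsetwords:
        return [word for word, _ in expansions]
    if dropwordmode == 1:
        # The entire set is dropped, as if it were a noise word.
        return []
    # dropwordmode 0: keep the root word, then the most common remaining
    # words, up to the qmaxsetwords limit; the rest are dropped.
    root, rest = expansions[0], expansions[1:]
    rest = sorted(rest, key=lambda pair: pair[1], reverse=True)
    return [root[0]] + [word for word, _ in rest[:qmaxsetwords - 1]]
```

Note that the limit counts distinct search words in the set, not hits: a single frequent word still counts as one.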
Copyright © Thunderstone Software     Last updated: Thu Mar 11 17:19:03 EST 2010
Copyright © 2010 Thunderstone Software LLC. All rights reserved.