Thunderstone Software Document Retrieval and Management
Vortex Manual

Metamorph Hit Mark-up

 

The %s, %H, %V and %v <fmt> codes can execute Metamorph queries on the string argument and mark up the resulting hits. An m flag to these codes indicates that Metamorph hit mark-up should occur; the Metamorph query string is then taken to be the next argument (before the normal string argument to be searched and printed). The m flag and its sub-flags are valid only for these codes.

Following the m flag can be any of the following sub-flags. These must immediately follow the m flag, as some letters have other meanings elsewhere:

  • I for inline stylesheet (<span style=...>) highlighting with different styles per term

  • C for class (<span class=...>) highlighting with different classes per term

  • b for HTML bold highlighting of hits

  • B for VT100 bold highlighting of hits

  • U for VT100 underline highlighting of hits

  • h for HTML HREF highlighting (default)

  • n indicates that hits that overlap tags should not be truncated/moved

  • p for paragraph formatting: print "<p/>" at paragraph breaks

  • P same as p, but use a REX expression (the next additional argument) to match paragraph breaks. If given twice (PP), use another additional argument after the REX expression as the replacement string, instead of "<p/>". PP was added in version 6.

  • c to continue hit count into next query call

  • N to mark up NOT terms as well

  • e to mark up the exact query (no processing)

  • q to mark up the query itself, not the text, eg. as a legend

For example, to highlight query terms from $query in the text contained in $buffer in different colors, insert paragraph breaks, and escape the output to be HTML-safe, use:

  <fmt "%mIpH" $query $buffer>

Each hit found by the query has each of its sets' hits (eg. each term) highlighted in the output. With I and/or C highlighting, if there are delimiters used in the query, the entire delimited region is also highlighted. The Metamorph query uses the same APICP defaults and parameters as SQL queries. These can be changed with the apicp function.

If a width is given for the format code, it indicates the character offset in the string argument to begin the query and printing (0 is the first character). Thus a large text argument can be marked up in several chunks. Note that this differs from the normal behavior of the width, which is to specify the overall width of the field to print in. The precision is the same - it gives the maximum number of characters of the input string to print - only it starts counting from the width.
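As an illustrative sketch of chunked mark-up, a long $buffer might be printed in two calls. The 2000-character chunk size is arbitrary, and two points here are assumptions rather than documented behavior: that the width and precision are written after the m sub-flags (the usual printf-style ordering), and that omitting the width starts at offset 0.

```vortex
<!-- Chunk 1: no width given (assumed to mean offset 0); print at most 2000 characters -->
<fmt "%mb.2000s" $query $buffer>
<!-- Chunk 2: width 2000 is the starting offset; precision 2000 is the character limit -->
<fmt "%mb2000.2000s" $query $buffer>
```

Bold highlighting is used here so that hit numbering across the two calls is not a concern.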

The h flag sets HREF highlighting (the default). Each hit becomes an HREF that links to the next hit in the output, with the last hit pointing back to the first. In the output, the anchors for the hits are named hitN, where N is the hit number (starting with 1).

Hits can be bold highlighted in the output with the b flag; this surrounds them with <b> and </b> tags. b and h can be combined; the default if neither is given is HREF highlighting. In version 5.01.1212100000 20080529 and later, the B and U flags may be given, for VT100-terminal bold and underline highlighting; this may be useful for command-line scripts.
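A brief sketch of both cases follows. The variable $line is hypothetical and is assumed to hold plain text destined for a VT100-compatible terminal:

```vortex
<!-- HTML output: each hit is bold and is also an HREF to the next hit -->
<fmt "%mbhH" $query $buffer>
<!-- Terminal output: VT100 bold on hits, no HTML tags or escaping -->
<fmt "%mBs" $query $line>
```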

In version 6 and later, the I or C flags may be given, for inline styles or classes. This allows much more flexibility in defining the markup, as a style or class for each distinct query term may then be defined. The styles and classes used can be controlled with <fmtcp>.

In version 5.01.1223065000 20081003 and later, the q flag may be given, to highlight the query itself, instead of the following text buffer (which must still be given but is ignored). This can be used at the top of a highlighted document to give a highlighting "legend" to illustrate what terms are highlighted and how. The n and e flags are also implicitly enabled when q is given.
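A legend followed by the highlighted document might be sketched as below, reusing the $query and $buffer variables from the earlier example. The text argument is passed to the first call only because one is required; with q it is ignored.

```vortex
<!-- Legend: prints the query itself, with each term styled as it will be in the text -->
<fmt "%mIqH" $query $buffer>
<!-- Document: the same query marked up in the text, with paragraph breaks -->
<fmt "%mIpH" $query $buffer>
```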

Normally, hits that overlap HTML tags in the search string are truncated or moved to appear outside the tag in the output, so that the highlighting tags don't overlap them and muddle the HTML output. The n flag indicates that this truncation should not be done. (It is also not done for the %H (HTML escapement) format code, since the tags in the string will be escaped already.)

The p and P flags do paragraph formatting as documented previously.

The c flag indicates that the hit count should be continued for the next query. By default, the last hit marked up is linked back to the first hit. Therefore, each %-code query markup is self-contained: if multiple calls are made, the hit count (and resulting HREFs) will start over for each call, which may not be desired. If the c flag is given, the last hit in the string is linked to the "next" hit (N+1) instead of the first, and the next query will start numbering hits at N+1 instead of 1. Thus, all but the last query markup call by a script should use the c flag.
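For instance, a script that marks up three separate fields with one continuous chain of hit links could be sketched as follows. The variables $title and $summary are hypothetical; only the final call omits c, so its last hit links back to the first hit of the first call.

```vortex
<!-- Hits numbered from 1; the last hit links forward to hit N+1 -->
<fmt "%mhcH" $query $title>
<!-- Numbering continues from the previous call, and continues again -->
<fmt "%mhcH" $query $summary>
<!-- Final call: no c flag, so the last hit links back to the first -->
<fmt "%mhH" $query $body>
```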

The e flag indicates that the query should be used exactly as given. Normally, some processing is done to the query to help ensure that all occurrences of its terms are highlighted: in effect, "w/." is prefixed and "@0 w/." is appended to the end. Otherwise redundant hits may be ignored, eg. if "within-document" is set (no w/ in the query). With e set, such processing is not done, and some apparent hits may be left unhighlighted. This processing and the e flag were added in version 2.00.897097720 19980605.
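A minimal sketch of the difference, using the same $query and $buffer as above; which hits actually differ depends on the query and the current APICP settings:

```vortex
<!-- Default: the query is adjusted internally so all term occurrences are highlighted -->
<fmt "%mbH" $query $buffer>
<!-- With e: the query is used exactly as given; some apparent hits may stay unhighlighted -->
<fmt "%mbeH" $query $buffer>
```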

The following example marks up each $body value from a table that matches the user's submitted $query string. The hits are marked with HREFs to link them, and the $body text is HTML-escaped:

<SQL MAX=10 "select body from data where body like $query">
  <fmt "%mhH" $query $body>
</SQL>


Copyright © Thunderstone Software     Last updated: Thu Jul 1 14:19:39 EDT 2010
 
Copyright © 2010 Thunderstone Software LLC. All rights reserved.