10.4 Fast Value Lookup - xtree
The <xtree> function in Vortex is a fast way to maintain a temporary
list of strings and to search that list quickly. Each value is stored
along with the number of times it was inserted, which makes <xtree>
useful for histogram operations.
For example, we could find out the most common words in a
large chunk of English text like this:
<SCRIPT LANGUAGE=vortex>
<A NAME=main PUBLIC>
<FORM METHOD=post ACTION=$url/search.html>
Text:<BR>
<TEXTAREA NAME=text ROWS=10 COLS=60>$text</TEXTAREA><BR>
<INPUT TYPE=submit>
</FORM>
</A>
<A NAME=search PUBLIC>
<main>
<lower $text>
<rex ROW "\alnum+" $ret>
<xtree INSERT $ret>
</rex>
<xtree SKIP=0 DUMP></xtree>
<sort $ret.count DESC $ret>
Top 10 words are: <P>
<LOOP MAX=10 $ret $ret.count>
<B>$ret</B> occurred $ret.count times <BR>
</LOOP>
</A>
</SCRIPT>
Here we ask for a chunk of text, lower-case it for case insensitivity,
and use <rex> in a loop to pull out every word, one at a time.
We insert each word into the xtree, which stores only the unique
words and counts the duplicates.
Then we DUMP the entire set of unique words, along with their
corresponding counts: <xtree> sets the special variables $ret.count
and $ret.seq in addition to the usual $ret. The dump returns the words
in sorted order; we want to know the most frequent, so we <sort> the
list by frequency. Then the top 10 are listed.
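For readers more familiar with general-purpose languages, the same
steps (lower-case, split into words, count unique values, dump in
sorted order, re-sort by count, keep the top few) can be sketched in
Python. This is only a rough analogue and not Vortex code: the name
top_words, its limit parameter, and the sample text are hypothetical,
the pattern [a-z0-9]+ is an ASCII-only stand-in for the \alnum+
expression, and the alphabetical ordering of ties is a choice made in
this sketch, not documented <sort> behavior.

```python
import re
from collections import Counter


def top_words(text, limit=10):
    """Return up to `limit` (word, count) pairs, most frequent first."""
    # Lower-case first so that "The" and "the" count as the same word.
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Counter stands in for the tree: unique keys, each with a count.
    counts = Counter(words)
    # "Dump" the unique words in sorted key order ...
    dumped = sorted(counts.items())
    # ... then stable-sort by descending count, so ties stay alphabetical.
    dumped.sort(key=lambda pair: pair[1], reverse=True)
    return dumped[:limit]


sample = "the cat and the dog and the bird"
for word, count in top_words(sample, limit=3):
    print(f"{word} occurred {count} times")
```

The sketch keeps the counting and the ranking as separate steps, as
the Vortex example does: the counts are collected once, and the sort
by frequency is applied only to the dumped list.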
If we run the Vortex example above with the Gettysburg Address, we see
(next page):