A schema file contains the basic information necessary to create a
table or tables of a particular type. Supporting information is
included in this file as comments. Information that TIMPORT acts
upon directly is listed as keywords with assigned values.
The simplest kind of table you could create is one where the full
text of each file is loaded into the table as a text field, and
statistics about each file are captured in their own fields.
This requires virtually no study of text content, so we'll use it as
the first example.
Thunderstone's older indexing program, 3DB, indexed text files into a
database. TIMPORT can create this general kind of table with the
provided sample schema file, called 3db.sch. The content of
this schema file follows:
#
# create a 3DB style Texis table with no extra info
#
database /tmp/testdb
table threedb
stats
# create table threedb(id counter,File varind,Fsize long,Ftime date);
To make sense of this file, read it with the following rules in mind,
which apply to all schema files:
Preliminary Schema File Format Rules
- The Texis server must be running for client/server data loading.
- Comment lines start with a # character.
- Blank lines are ignored.
- Syntax is "keyword value(s)", where any number of spaces and/or
tabs separate the keyword from its value(s), or
"keyword=value(s)".
- Each line should be terminated with a newline.
- Order is not important, except that fields must be listed in the
order in which they appear in the CREATE TABLE statement, and
fields should be listed last.
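The format rules above can be illustrated with a small parser. This is
only a sketch of the rules as stated here, not TIMPORT's own parser:
the function name parse_schema is hypothetical, and it assumes that
leading whitespace before a # or a keyword is tolerated, which the
rules do not say.

```python
import re


def parse_schema(text):
    """Parse schema text into a list of (keyword, value) pairs.

    Hypothetical helper for illustration; not part of TIMPORT.
    """
    entries = []
    for line in text.splitlines():
        stripped = line.strip()  # assumption: surrounding whitespace is ignored
        # Blank lines and comment lines (starting with '#') are skipped.
        if not stripped or stripped.startswith("#"):
            continue
        # Split the keyword from its value(s) at the first '=' or the
        # first run of spaces/tabs, whichever comes first.
        parts = re.split(r"\s*=\s*|[ \t]+", stripped, maxsplit=1)
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        entries.append((keyword, value))
    return entries


example = """\
#
# create a 3DB style Texis table with no extra info
#
database /tmp/testdb
table threedb
stats
# create table threedb(id counter,File varind,Fsize long,Ftime date);
"""

entries = parse_schema(example)
print(entries)
# [('database', '/tmp/testdb'), ('table', 'threedb'), ('stats', '')]
```

Entries are kept as an ordered list rather than a dictionary because
the rules say that the order of field lines matters.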
The first 3 lines of the example file begin with a #, as does the last.
These are comment lines, but they carry information that is important
for creating the table. The first comment describes what this schema
file is for. The last comment gives the exact CREATE TABLE
command to run in Texis first, before running TIMPORT on this
schema file.
The remaining 3 lines, which are not comments, are the keywords and
their values that TIMPORT will act on to create the table. This is
the minimum requirement for a schema file:
- A database must always be listed; in this case it is named
/tmp/testdb. The keyword is database, separated with one
or more spaces or tabs from its value /tmp/testdb.
- A table (or tables) must always be listed; in this case it
is named threedb. The keyword is table, separated with
one or more spaces or tabs from its value threedb.
- Some information for the table's fields must be listed. In this
case that is done by using the keyword stats, which requires no
value.
The stats keyword automatically captures the file size and date. Field
information can alternatively be supplied by listing the fields
individually, as will be shown in later examples. Where no
fields have been defined and stats is used, it will also
automatically load the full text of the file as an indirect field.
You can also use the keyword stats together with specified
fields to capture file size and date. To load the full text of
the file as well where additional fields have been specified, you
would specify it as a field within the schema file.
In data type terms, stats adds the fields "Fsize long"
and "Ftime date" and fills them in with the file's info for
each file. It will also add "File varind" if no fields have
been defined. Refer to the Texis manual for a more complete
understanding of data types.
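As a rough illustration of the per-file information that stats
captures, the sketch below gathers a file's size and date in Python.
It is not how TIMPORT is implemented: the function name file_stats is
hypothetical, and it assumes that the "date" stored in Ftime is the
file's last-modification time, which this section does not state.

```python
import os
from datetime import datetime


def file_stats(path):
    """Return size and date for one file, keyed by the stats field names.

    Hypothetical helper for illustration; not part of TIMPORT.
    """
    info = os.stat(path)
    return {
        # Fsize: file size in bytes.
        "Fsize": info.st_size,
        # Ftime: assumed here to be the last-modification time.
        "Ftime": datetime.fromtimestamp(info.st_mtime),
    }
```

Calling file_stats on each file to be loaded would give one pair of
Fsize and Ftime values per row, which mirrors the two fields that
stats fills in for each file.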
Copyright © Thunderstone Software Last updated: Sun Mar 17 21:14:49 EDT 2013