Expressions are composed of characters and operators. Operators
are characters with special meaning to REX. The following
characters have special meaning: "\=+*?{},[]^$.-!" and must
be escaped with a `\' if they are meant to be taken literally.
The string ">>" is also special; to match it literally,
write "\>>".
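As a minimal sketch of the escaping rule above, the following Python
helper prefixes each listed special character with a `\' and breaks up
any literal ">>". The name escape_rex_literal is hypothetical and not
part of REX. For runs such as ">>>" the sketch assumes that "\>" matches
a plain ">", which follows from the rule that a `\' before an ordinary
character matches that character.

```python
# Characters the text lists as special to REX.
REX_SPECIALS = set("\\=+*?{},[]^$.-!")

def escape_rex_literal(text):
    """Return text with REX special characters escaped (illustrative helper)."""
    out = []
    for i, ch in enumerate(text):
        if ch in REX_SPECIALS:
            out.append("\\" + ch)
        elif ch == ">" and text[i + 1:i + 2] == ">":
            # Escape a '>' that is followed by another '>' so that no
            # unescaped ">>" pair remains; ">>" becomes "\>>".
            out.append("\\>")
        else:
            out.append(ch)
    return "".join(out)

print(escape_rex_literal("1.5+2"))  # 1\.5\+2
print(escape_rex_literal("a>>b"))   # a\>>b
```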
A `\' followed by an `R' or an `I' means to begin
respecting or ignoring alphabetic case distinction, respectively.
(Ignoring case is the default.) These switches do not apply inside
range brackets.
A `\' followed by an `L' indicates that the characters
following are to be taken literally up to the next `\L'. The
purpose of this operation is to remove the special meanings from
characters.
A subexpression following `\F' (followed by) or `\P'
(preceded by) can be used to root the rest of the expression to which
it is tied. It means to look for the rest of the expression only where
it is followed by (`\F') or preceded by (`\P') that subexpression; the
designated subexpression itself is excluded from the located match.
A `\' followed by one of the following `C' language character
classes matches that character class: alpha, upper, lower,
digit, xdigit, alnum, space, punct,
print, graph, cntrl, ascii.
A `\' followed by one of the following special characters
will assume the following meaning: n=newline, t=tab,
v=vertical tab, b=backspace, r=carriage return,
f=form feed, 0=the null character.
A `\' followed by Xn or Xnn, where each n is a
hexadecimal digit, matches the character with that hexadecimal code.
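The two escape rules above can be sketched as a small lookup in Python.
This is an illustration only: decode_rex_escape is a hypothetical name,
its argument is the text after the `\', and accepting a lowercase `x' is
an assumption based on the "\x0A" spelling used elsewhere in this
section.

```python
import string

# Single-character escapes listed in the text, keyed by the character
# that follows the backslash.
SIMPLE_ESCAPES = {
    "n": "\n",  # newline
    "t": "\t",  # tab
    "v": "\v",  # vertical tab
    "b": "\b",  # backspace
    "r": "\r",  # carriage return
    "f": "\f",  # form feed
    "0": "\0",  # the null character
}

def decode_rex_escape(body):
    """Return the character denoted by an escape body such as 'n' or 'X0A'."""
    if body in SIMPLE_ESCAPES:
        return SIMPLE_ESCAPES[body]
    # Xn or Xnn: one or two hexadecimal digits after the X.
    if (len(body) in (2, 3) and body[0] in "Xx"
            and all(c in string.hexdigits for c in body[1:])):
        return chr(int(body[1:], 16))
    raise ValueError("not a simple or hexadecimal escape: " + repr(body))

print(repr(decode_rex_escape("X0A")))  # '\n'
```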
A `\' followed by any single character (not one of the
above) matches that character. Escaping a character that is
not a special escape is not recommended, as the expression
could change meaning if the character becomes an escape in a
future release.
The character `^' placed anywhere in an expression (except after a
`[') matches the beginning of a line (same as \x0A in Unix or
\x0D\x0A in Windows).
The character `$' placed anywhere in an expression
matches the end of a line (\x0A in Unix, \x0D\x0A in Windows).
The character `.' matches any character.
A single character not having special meaning matches that character.
A string enclosed in brackets [] is a set, and matches
any single character from the string.
Ranges of ASCII character codes may be
abbreviated as in [a-z] or [0-9]. A `^'
occurring as the first character of the set will invert the
meaning of the set. A literal `-' must be preceded by a
`\'. The case of alphabetic characters is always respected
within brackets.
A double-dash ("--") may be used inside a bracketed set
to subtract characters from the set; e.g. "[\alpha--x]"
for all alphabetic characters except "x". The
left-hand side of a set subtraction must be a range, character
class, or another set subtraction. The right-hand side of a set
subtraction must be a range, character class, or a single
character. Set subtraction groups left-to-right. The range
operator "-" has precedence over set subtraction.
Set subtraction was added in version 6.
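The set semantics described above can be modeled with ordinary Python
sets. This is a sketch of the arithmetic only, not of REX itself: it
assumes \alpha covers just the ASCII letters, and the expression
"[a-z--a-f--x]" is a made-up example built from the stated rules.

```python
import string

def char_range(lo, hi):
    """Characters from lo to hi inclusive, like the range a-z in a set."""
    return {chr(c) for c in range(ord(lo), ord(hi) + 1)}

# "[\alpha--x]": all alphabetic characters except "x". Case is respected
# inside brackets, so uppercase "X" stays in the set.
alpha_minus_x = set(string.ascii_letters) - {"x"}

# "[a-z--a-f--x]": ranges bind first, then subtraction groups
# left-to-right, i.e. ((a-z) minus (a-f)) minus x.
grouped = (char_range("a", "z") - char_range("a", "f")) - {"x"}

print("x" in alpha_minus_x, "X" in alpha_minus_x)  # False True
print("".join(sorted(grouped)))                    # ghijklmnopqrstuvwyz
```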
The `>>' operator in the first position of a fixed expression
will force REX to use that expression as the "root" expression
off which the other fixed expressions are matched. This operator
overrides one of the optimizers in REX. This operator can
be quite handy if you are trying to match an expression
with a `!' operator or if you are matching an item that
is surrounded by other items. For example: "x+>>y+z+"
would force REX to find the "y's" first then go
backwards and forwards for the leading "x's" and trailing
"z's".
The `!' character in the first position of an expression means
that it is not to match the following fixed expression.
For example: "start=!finish+" would match the word "start"
and anything past it up to (but not including) the word "finish".
Usually operations involving the "!" operator involve knowing
what direction the pattern is being matched in. In these cases
the `>>' operator comes in handy. If the `>>' operator is
used, it comes before the `!'. For example:
">>start=!finish+finish" would match anything that began
with "start" and ended with "finish". The "!"
operator cannot be used by itself in an expression, or as the root
expression in a compound expression. NOTE: This "!" operator
"nots" the whole expression rather than its sequence of characters,
as in earlier versions of REX.
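For readers more familiar with other regular-expression dialects, the
two examples above behave roughly like the following patterns in
Python's re module. This is only an analogy under stated assumptions:
Python's engine is unrelated to REX, the negative lookahead stands in
for the "!" operator, and the flags imitate REX's default of ignoring
case and of `.' matching any character.

```python
import re

# REX ignores case by default; DOTALL lets '.' match any character.
FLAGS = re.IGNORECASE | re.DOTALL

# Rough analogue of "start=!finish+": "start", then one or more
# characters at which the word "finish" does not begin.
up_to_finish = re.compile(r"start(?:(?!finish).)+", FLAGS)

# Rough analogue of ">>start=!finish+finish": the same, but the match
# also consumes the closing "finish".
through_finish = re.compile(r"start(?:(?!finish).)+finish", FLAGS)

text = "Please start the job and finish it."
print(repr(up_to_finish.search(text).group()))    # 'start the job and '
print(repr(through_finish.search(text).group()))  # 'start the job and finish'
```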
Note that "!" expressions match a character at a time,
so their repetition operators count characters, not
expression-lengths as with normal expressions.
E.g. "!finish{2,4}" matches 2 to 4 characters, whereas
"finish{2,4}" matches 2 to 4 repetitions of "finish"
(12 to 24 characters).