Other Format Codes

In addition to the standard printf() formatting codes, other <fmt> codes are available:

  • %t, %T strftime()-style output of a date or counter field (see above)

  • %L Output of a latitude, longitude, or location (geocode); see above

  • %H

    Prints its string (e.g. varchar) argument, applying HTML escape codes where needed to make the string "safe" for HTML output (", &, <, >, DEL and control chars less than 32 except TAB, LF, FF and CR are escaped). With the ! flag, decodes instead (to ISO-8859-1); see also the l (el) flag under Extended Flags. The j flag may be given for newline translation. When decoding with !, characters outside the ISO-8859-1 range are output as ?; to decode HTML to UTF-8 instead, use %hV.

  • %U

    Prints its string argument, encoding for a URL, i.e. using %-codes. With the ! flag, decodes instead. In compatibilityversion 8 and later, the characters colon, tilde, exclamation point, dollar sign, single quote, left and right parenthesis, asterisk, and comma are left as-is (earlier versions percent-encoded them), and space is encoded as %20 (instead of +), since that encoding is safe in both path(-like) and query parts of a URL.

    With the p (path) flag, spaces are encoded as %20 instead of +, and in compatibilityversion 8 and later &+;= (query-relevant characters) are left as-is (since they should not need to be encoded in the path part of the URL) and + is not decoded (when the ! flag is also given).

    With the q (query) flag, in compatibilityversion 8 and later + is used instead of %20 for space (since it is more compact and readable, and safe in the query part of the URL) and colon is percent-encoded. With the q flag in compatibilityversion 7 and earlier, slash (/) and at-sign (@) are encoded as well (or only unreserved/safe chars are decoded, if ! is given too). See Extended Flags.

  • %V (upper-case vee)

    Prints its string argument, encoding 8-bit ISO-8859-1 chars for UTF-8 (compressed Unicode). With the ! flag (see Extended Flags), decodes instead (to ISO-8859-1). Illegal, truncated, or out-of-range sequences are translated as question marks (?); this can be modified with the h flag. The j flag may be given for newline translation. Added in version 3.01.970000000 20000926.

  • %v (lower-case vee)

    Prints its UTF-8 string argument, encoding to UTF-16. With the ! flag (see Extended Flags), decodes to UTF-8 instead. Illegal, truncated, or out-of-range sequences are translated as question marks (?); this can be modified with the h flag.

    The < (less-than) flag forces UTF-16LE (little-endian) output (encode) or treats input as little-endian (decode). The > flag forces UTF-16BE (big-endian) output (encode) or treats input as big-endian (decode). The default endian-ness is big-endian; for decode, a leading byte-order-mark character (hex 0xFEFF) will determine endian-ness if present. The _ (underscore) flag skips printing a leading byte-order-mark when encoding; when decoding, the _ flag saves (does not delete) a leading byte-order-mark in the input.

    The j flag may be given for newline translation. Added in version 4.03.1049741744 20030407.

  • %B

    Prints its string argument, encoding to base64. If a non-zero field width is given, a newline is output after every "width" bytes output (absolute value, rounded up to a multiple of 4) and at the end of the base64 output. Thus "%64B" would format with no more than 64 bytes per line. This is useful for encoding into a MIME mail message with line-length constraints. A ! flag indicates that the string is to be decoded instead of encoded. The j flag (see Extended Flags) may be given to set the newline style, though it only applies to soft (output) newlines; input CR/LF bytes are never modified since base64 is a binary encoding.

    The p flag may be given in version 8.01.1677102000 20230222 and later to use the base64url (URL/path-safe, RFC 4648) charset instead of the base64 charset (i.e. use "-_" instead of "+/"). The _ (underscore) flag may be given in version 8.01.1677102000 20230222 and later to skip (not print) any "=" padding normally called for at the end.

    Note that in version 8.01.1680035921 20230328 and later, decoding is more strict: characters encountered other than those in the requested (base64/base64url) charset, =, or whitespace cause an error (suppressible with <urlcp charsetmsgs>) and ? to be output. Previous versions silently ignored such out-of-domain characters. This change helps detect corrupt base64 data, as well as cases where the p flag is inadvertently forgotten (or used).

    The %B code was added in version 3.01.984400000 20010312.

  • %Q

    Prints its string argument, encoding to quoted-printable (per RFC 2045). If a non-zero field width is given, a newline is output after every "width" bytes output (absolute value, rounded up where needed). A negative field width or - flag indicates "binary" encoding: input CR and LF bytes are also hex-encoded; normally they are output as-is (or subject to the j flag) and are therefore subject to possible newline translation by a mail transfer agent etc. A ! flag indicates that decoding instead of encoding is to be done (and the field width and negative flag are ignored). The j flag (see Extended Flags) may be given for newline translation.

    If an underscore (_) flag is given, "Q" encoding (per RFC 2047) is used instead of quoted-printable: it is similar, except that U+0020 (space) is output as underscore (_), no whitespace is ever output (e.g. tab/CR/LF are hex-encoded, and the field width is ignored), and certain other special characters are hex-encoded that normally would not be (e.g. dollar sign, percent, ampersand etc.). With the underscore flag, the resulting output is safe for all RFC 2047 "Q" encoding contexts.

    Added in version 4.03.1051320912 20030425.

  • %W

    Prints its UTF-8 string argument, encoding linear-whitespace-separated tokens to RFC 2047 encoded-word format (i.e. "=?...?=" mail header tokens) as needed. Tokens that do not require encoding are left as-is. A ! flag indicates that decoding instead of encoding should be done. A q flag for %W indicates that only the "Q" encoding should be used for encoded words; normally either Q or base64 is used, whichever is shorter. The hh, hhh, j, ^ and | flags are respected. In version 7.02.1421703000 20150119 and later, the h flag is supported for %!W. If a non-zero field width is given, it is used as the desired maximum byte length of encoded words: if an encoded word would be longer than this, it is split atomically into multiple words, separated by newline-space. Added in version 6.00.1283370000 20100901.

  • %z

    Prints its argument, encoded (compressed) in the gzip deflate format. The ! flag will decode (decompress) the argument instead. A precision value will limit the output to that many bytes, as with %s; this can be used to "peek" at the start of compressed data without decoding all of it (and consuming memory to do so). Added in version 7.05.1457041000 20160303.

    In version 7.07.1579815000 20200123 and later, for either encode or decode, a single l flag may be given to indicate zlib deflate format instead, or a double ll to indicate raw deflate format instead. All variants use the same deflate algorithm, but gzip adds (typically) 18 bytes of headers/footers, zlib 6, and raw none. Additionally in this version and later, decoding with %!z (no flags) will accept any of the three variants.

  • %b Binary output of an integer.

  • %F Prints a float as a fraction: whole number plus fraction.

  • %r Lowercase Roman numeral output of an integer.

  • %R Uppercase Roman numeral output of an integer.

  • %/

    Print platform-specific directory separator, e.g. "/" for Unix and "\" for Windows. No argument. Added in version 5.01.1131507000 20051108. With an l (el) flag, the code instead prints a REX character class (bracketed expression) to match all valid directory separators, e.g. "[/]" for Unix, and "[\\/]" for Windows; this behavior was added in version 7.00.1352409000 20121108, which also added the ! flag to negate the expression. The REX repetition operator is omitted for user flexibility in adding one.

  • %:

    Print platform-specific search path separator, e.g. ":" for Unix and ";" for Windows. No argument. Added in version 5.01.1131507000 20051108. With an l (el) flag, the code instead prints a REX character class (bracketed expression) to match all valid search path separators, e.g. "[:]" for Unix, and "[;]" for Windows; this behavior was added in version 7.00.1352409000 20121108, which also added the ! flag to negate the expression. The REX repetition operator is omitted for user flexibility in adding one. (See also %/ footnote.)

All the standard flags, as well as the extended flags (below), can be given to these codes, where applicable. Examples:

<fmt "Year %R %H %R" 1977 "<" 1997>
  Year MCMLXXVII &lt; MCMXCVII
<fmt "%F" 5.75>
  5 3/4
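
The path versus query distinction described for %U (space as %20 or as +, and whether + is decoded) can be illustrated outside of Vortex. The following is a minimal sketch using Python's standard urllib.parse module as a stand-in; it is not the Texis implementation, and the set of characters it leaves unencoded is an assumption chosen only to mirror the &+;= list mentioned for the p flag.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "a b&c=d"

# Path-style encoding: space becomes %20. The characters &=;+ are passed
# as "safe" here (an assumption for illustration) so they are left as-is.
path_style = quote(text, safe="&=;+")

# Query-style encoding: space becomes "+", while & and = are percent-encoded.
query_style = quote_plus(text)

print(path_style)   # a%20b&c=d
print(query_style)  # a+b%26c%3Dd

# Decoding: a path-style decoder leaves "+" alone,
# while a query-style decoder turns it into a space.
print(unquote("a+b%20c"))       # a+b c
print(unquote_plus("a+b%20c"))  # a b c
```

The same input therefore round-trips differently depending on which part of the URL it is meant for, which is why the flag must match on both the encode and decode side.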

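The %B options described above (line width, the base64url charset, and dropping "=" padding) can be sketched with Python's standard base64 module. The helper b64_wrapped is hypothetical and exists only for this illustration; in particular, rounding the width up to a multiple of 4 is an assumption about how the width is applied, and Python's strict decoder is stricter than the behavior described above in that it also rejects whitespace.

```python
import base64
import binascii

def b64_wrapped(data: bytes, width: int = 0, urlsafe: bool = False,
                pad: bool = True) -> str:
    """Hypothetical helper approximating the width, p and _ behaviors."""
    enc = base64.urlsafe_b64encode(data) if urlsafe else base64.b64encode(data)
    text = enc.decode("ascii")
    if not pad:
        text = text.rstrip("=")          # drop trailing "=" padding
    if width == 0:
        return text
    step = (abs(width) + 3) // 4 * 4     # assumed: round up to a multiple of 4
    lines = [text[i:i + step] for i in range(0, len(text), step)]
    return "".join(line + "\n" for line in lines)

sample = bytes([0xFB, 0xFF, 0xFE])
print(b64_wrapped(sample))                      # +//+
print(b64_wrapped(sample, urlsafe=True))        # -__-
print(b64_wrapped(b"hi"))                       # aGk=
print(b64_wrapped(b"hi", pad=False))            # aGk
print(repr(b64_wrapped(b"x" * 12, width=6)))    # 'eHh4eHh4\neHh4eHh4\n'

# Strict decoding with the wrong charset fails instead of silently
# skipping the out-of-charset characters.
try:
    base64.b64decode("-__-", validate=True)
    strict_failed = False
except binascii.Error:
    strict_failed = True
print(strict_failed)                            # True
```

The last step shows why strict decoding is useful: base64url data fed to a plain base64 decoder is reported rather than quietly mangled.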

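The overhead figures given for %z (18 bytes for gzip, 6 for zlib, none for raw deflate) can be checked with any zlib binding. The sketch below uses Python's standard zlib module as a stand-in, not Texis itself; the deflate helper and the sample data are hypothetical, and the wbits values are Python's way of selecting the container format.

```python
import zlib

def deflate(data: bytes, wbits: int) -> bytes:
    """Compress data with the container format selected by wbits."""
    comp = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()

data = b"hello hello hello hello " * 20

gz = deflate(data, 31)    # gzip container
zl = deflate(data, 15)    # zlib container
raw = deflate(data, -15)  # raw deflate, no header or trailer

print(len(gz) - len(raw))  # 18
print(len(zl) - len(raw))  # 6

# wbits=47 auto-detects gzip or zlib on decode; raw must be requested.
assert zlib.decompress(gz, 47) == data
assert zlib.decompress(zl, 47) == data
assert zlib.decompress(raw, -15) == data

# Limiting the output size allows a "peek" at the start of the data
# without inflating all of it, similar in spirit to a precision value.
peek = zlib.decompressobj(wbits=47).decompress(gz, 5)
print(peek)                # b'hello'
```

Because all three variants carry the same deflate stream, the size differences come entirely from the container headers and trailers.
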
Copyright © Thunderstone Software     Last updated: Apr 15 2024