HTLD - the hypertext linker


HTLD does to hypertext files what the dynamic linker does to dynamically linked binaries. It resolves references to parts stored in independent files or variables, inserting them at the specified locations into the output stream.
It is in no way confined to "hypertext", or even to text; this is merely the typical use, and there is special support for URL and HTML encoding.

There is a standalone htld binary, which is typically the interpreter of executable htld documents (just as the dynamic linker and friends are the interpreters that run dynamically linked binaries).
Engines for dynamic web content may also contain functions to resolve htld files.

variables and encoding

Htld reads each variable from its first occurrence in the environment variable QUERY_STRING.
Htld URL-decodes variables, refusing malformed %xx escapes and any (C0) control character (code less than 32) except HT, CR and LF.
When a value is included, htld can re-encode it in one of two ways:
  • as "HTML", by replacing <, >, & and " with the entities lt, gt, amp and quot
  • as "URL", by passing alphanumerics and "!'()*,-._~" unchanged, encoding blank as + (violating RFC 2396 for aesthetic reasons) and everything else as (lowercase) %02x
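
The rules above can be sketched in a few lines of Python. This is only an illustration, not htld's own code: the function names are made up, and it assumes that pairs in QUERY_STRING are separated by '&', that '+' decodes to a blank, that a missing variable yields an empty value, and that the control character check applies to literal as well as %xx-escaped bytes.

```python
SAFE = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!'()*,-._~"
HEX = b"0123456789abcdefABCDEF"
ALLOWED_CONTROLS = {9, 10, 13}  # HT, LF, CR


def url_decode(raw: bytes) -> bytes:
    """Decode '+' and %xx; raise ValueError on bad escapes or control chars."""
    out = bytearray()
    i = 0
    while i < len(raw):
        c = raw[i]
        if c == 0x2B:  # '+' stands for a blank (assumption)
            c = 0x20
            i += 1
        elif c == 0x25:  # '%' must be followed by two hex digits
            pair = raw[i + 1:i + 3]
            if len(pair) != 2 or any(h not in HEX for h in pair):
                raise ValueError("malformed %xx escape")
            c = int(pair, 16)
            i += 3
        else:
            i += 1
        if c < 32 and c not in ALLOWED_CONTROLS:
            raise ValueError("control character refused")
        out.append(c)
    return bytes(out)


def get_var(query: bytes, name: bytes) -> bytes:
    """Return the decoded value of the FIRST name=value pair, b'' if absent."""
    for pair in query.split(b"&"):
        key, _, value = pair.partition(b"=")
        if key == name:
            return url_decode(value)
    return b""


def html_encode(value: bytes) -> bytes:
    """Minimal HTML encoding: only amp, lt, gt and quot."""
    return (value.replace(b"&", b"&amp;").replace(b"<", b"&lt;")
                 .replace(b">", b"&gt;").replace(b'"', b"&quot;"))


def url_encode(value: bytes) -> bytes:
    """Pass the safe set, turn blank into '+', everything else into %xx."""
    return b"".join(
        bytes([c]) if c in SAFE else b"+" if c == 0x20 else b"%%%02x" % c
        for c in value)
```

Because only the first occurrence counts, get_var(b"a=1&a=2", b"a") yields b"1"; this is what makes appending defaults and prepending overrides work.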

the htld file format

A htld file consists of a textual header, a blank line and a body. Header lines and the closing blank line are terminated by linefeed (10).
The first line typically specifies the htld interpreter, like '#!/bin/htld'. Other header lines contain a linking instruction (an ASCII digit 0-7) followed by parameters as tabulator (9) separated fields. Offsets are relative to the body and must be in ascending order.
  • 0 _header_
    declares additional header to send
  • 1 _ctype_
    declares content type
  • 2 _ttl_
    declares cache time to live (Expires: now+_ttl_) in seconds. Use a single '-' for no cache.
  • 3 _defaults_
    declares a URL-encoded string to append to QUERY_STRING, effectively providing parameter defaults for the instructions that follow.
  • 4 _offset_ _var_
    include the plain value of var at offset (after applying any URL decoding)
  • 5 _offset_ _var_
    include the URL encoded value of var at offset (with URL decoding and encoding)
  • 6 _offset_ _var_
    include the value of var at offset (with URL decoding and minimal HTML encoding of lt, gt, amp and quot)
  • 7 _offset_ _path_[?_query_]
    include the contents of file _path_ at offset. If the file starts with a #! line containing "htld", it is linked up recursively. See below for details.
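
To illustrate how the header drives the linking, here is a minimal Python sketch that resolves instructions 4, 5 and 6 only. It is not the htld implementation. It assumes that the instruction digit is itself separated from its first parameter by a tabulator, that offsets count bytes, and that variable values arrive already URL-decoded in a dictionary; the names link, values and ENCODERS are hypothetical.

```python
SAFE = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!'()*,-._~"


def html_encode(value):
    # Minimal HTML encoding of amp, lt, gt and quot.
    return (value.replace(b"&", b"&amp;").replace(b"<", b"&lt;")
                 .replace(b">", b"&gt;").replace(b'"', b"&quot;"))


def url_encode(value):
    # Safe characters pass, blank becomes '+', the rest lowercase %xx.
    return b"".join(
        bytes([c]) if c in SAFE else b"+" if c == 0x20 else b"%%%02x" % c
        for c in value)


# Instruction digit -> encoding applied to the variable's value.
ENCODERS = {b"4": lambda v: v, b"5": url_encode, b"6": html_encode}


def link(doc, values):
    """Insert variable values into the body at the precomputed offsets."""
    header, _, body = doc.partition(b"\n\n")  # header, blank line, body
    out = bytearray()
    pos = 0
    for line in header.split(b"\n"):
        fields = line.split(b"\t")
        encode = ENCODERS.get(fields[0])
        if encode is None:
            continue  # the "#!" line and instructions 0-3 and 7 are skipped
        offset, var = int(fields[1]), fields[2]
        if offset < pos:
            raise ValueError("offsets must be in ascending order")
        out += body[pos:offset]  # copy the body up to the insertion point
        out += encode(values.get(var, b""))
        pos = offset
    out += body[pos:]  # copy whatever follows the last insertion point
    return bytes(out)


doc = b"#!/bin/htld\n1\ttext/html\n6\t10\tname\n\n<p>Hello, !</p>\n"
print(link(doc, {b"name": b'Ann & "Bob"'}).decode())
```

Note that the body is never scanned: the value lands at byte 10 simply because the header says so.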

An include _path_ is handled by first replacing any $varname in _path_ by varname's URL encoded value. A value starting with '.' is refused if the variable stands at the beginning of the path or directly after a '/' (a '/' in the value itself is encoded as %2f, so a value cannot introduce new path components).
Then anything following a question mark is stripped and prepended to QUERY_STRING during the include (effectively overriding vars; the original path, however, is untouched during recursive includes). The remaining path is used to stat the file to include, so it should usually be relative to the webroot.
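
The path handling just described might look roughly like this in Python. It is a sketch under assumptions rather than htld's actual code: variable names are taken to match [A-Za-z0-9_]+, the stripped query is joined to QUERY_STRING with '&', and lookup is a hypothetical callable returning a variable's already URL-encoded value.

```python
import re

VAR = re.compile(r"\$([A-Za-z0-9_]+)")  # assumed syntax of $varname


def resolve_include(path, query, lookup):
    """Return (file path to stat, QUERY_STRING to use during the include)."""

    def substitute(match):
        value = lookup(match.group(1))  # URL encoded, so it contains no '/'
        at_component_start = (match.start() == 0
                              or path[match.start() - 1] == "/")
        if at_component_start and value.startswith("."):
            raise ValueError("value starting with '.' refused")
        return value

    expanded = VAR.sub(substitute, path)
    file_path, has_query, extra = expanded.partition("?")
    if has_query:
        # Prepending makes these values win, since the first occurrence counts.
        query = extra + "&" + query if query else extra
    return file_path, query


encoded = {"lang": "en", "evil": "..%2fetc"}
print(resolve_include("inc/$lang/nav.htld?active=home", "x=1",
                      encoded.__getitem__))
```

With the same lookup, a path such as "inc/$evil" raises ValueError, because the value would start a path component with a dot.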
Offset may be 'H' or 'F' for headers and footers, respectively. Header and footer instructions are ignored on recursive includes. The header file is typically used to include everything up to the opening body tag, so that the htld doc's body is a proper HTML fragment suitable for inclusion elsewhere.

Unlike server side includes as featured by Apache and other webservers, htld neither parses the body nor replaces anything in it, but requires all locations for linking to be precomputed. Some ht compiler may be used to turn SSI-style comments in an HTML file into a htld linkable object.
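
As an illustration of such precomputation, the following Python sketch of a hypothetical compiler strips SSI echo comments from an HTML file and records their positions as instruction 6 lines. Nothing here is part of htld: the function name, the choice of the echo directive, the byte-based offsets and the tabulator between the instruction digit and its first parameter are all assumptions.

```python
import re

# SSI-style echo comment, e.g. <!--#echo var="name" -->
SSI_ECHO = re.compile(rb'<!--#echo var="([^"]+)" -->')


def compile_ssi(html, interpreter=b"#!/bin/htld"):
    """Turn SSI echo comments into a header of HTML-encoding includes."""
    header = [interpreter]
    body = bytearray()
    pos = 0
    for match in SSI_ECHO.finditer(html):
        body += html[pos:match.start()]  # keep the text before the comment
        # The offset is the length of the body so far, comment removed.
        header.append(b"6\t%d\t%s" % (len(body), match.group(1)))
        pos = match.end()
    body += html[pos:]
    # Header lines, the closing blank line, then the untouched body.
    return b"\n".join(header) + b"\n\n" + bytes(body)


source = b'<p>Hello, <!--#echo var="name" -->!</p>\n'
print(compile_ssi(source).decode())
```

Since the comments are found in document order, the offsets come out in ascending order, as the file format requires.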

$Id: htld.txt,v 1.1 2005/07/31 11:21:33 krip Exp $