About cases and trunks: La Maleta and the Malete Object Model.
DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT

This document describes two data structures:
  • la maleta (suitcase, or Malete Array, MA) is a flexible and lightweight two-dimensional array. It can be represented (stored, exchanged) as a Malete Record and provides an interface to one.
  • el maletero (car boot, trunk, or Malete Object, MO) is an extended maleta, supporting a DOM-style tree of contained "objects". The term "object" here, as in the somewhat mislabeled DOM, relates to structure, not to behaviour (methods).


la maleta - the Malete Array

While the actual implementation of a maleta (e.g. by means of an actual two-dimensional array) is not part of this specification, the concepts are probably easiest to understand by thinking in terms of a Malete Record as described in RecStruct.
In the first dimension, there is a list of fields (pairs of tags and values). Every field's value is typically structured into subfields. The value of the first field (index 0), the header, is considered special: it typically contains the record's id and/or control information. The other fields (the body) constitute the record's contents.
A maleta is considered to be a field (its field 0) augmented by a body.
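
As a rough illustration only (the specification deliberately leaves the implementation open), the header/body split could be modelled like this in Python. The class name, the header tag 0, the '^' subfield delimiter and the sample data are all assumptions made for the sketch, not part of the model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Field = Tuple[int, str]  # (tag, value)


@dataclass
class Maleta:
    """Illustrative maleta: fields[0] is the header, fields[1:] the body."""
    fields: List[Field] = field(default_factory=lambda: [(0, "")])

    @property
    def header(self) -> str:
        # Value of field 0: record id and/or control information.
        return self.fields[0][1]

    @property
    def body(self) -> List[Field]:
        # All remaining fields: the record's contents.
        return self.fields[1:]


# Made-up sample record; the '^' subfield delimiter is an assumption.
rec = Maleta([(0, "42"), (24, "^aTitle^bSubtitle"), (70, "^aAuthor")])
```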

Like any array-like data structure, the maleta uses an index expression to address its parts for either reading them or assigning to them. Only values can be assigned to; tags can only be inserted or deleted.
Like the PHP array, it has a built-in cursor (in the first dimension) for the concept of a current field.
Like the Perl array, it supports slices (addressing multiple parts at once) in both dimensions.

Here we describe the textual representation of an index, which implementations will typically parse into an internal representation.
All parts of an index are optional, but those present must appear in this order:
  • a field spec selecting one or more fields by tag or position
  • a subfield spec selecting one or more subfields by id or position
  • a range spec selecting an offset and length
  • a key spec selecting a key to match
Every part has an operator and a value (id). An index may address multiple fields or multiple subfields. Selecting both at once depends on the implementation supporting nested lists. Implementations may ignore whitespace in index expressions.

The field spec uses a numerical value as tag or position. Addressing a single field sets the cursor to that field:
  • '-' first:
    reset to head and move to next (having tag=id, if given).
  • '+' next:
    move to the next element (having tag=id, if given) without resetting
  • (none) current:
    no change if no id is given or if the cursor is already on a field with tag=id; otherwise behaves like first.
  • '@' index:
    selects the ith element, using id as index (0 is head).

Addressing multiple fields:
  • '--' loop:
    loop elements having id, returning a list of the individual results. Without an id, the list contains alternating tags and values.
  • '++' end:
    loop at end; useful to append fields
  • '@@' values:
    loops, returning a list of values.
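
The cursor behaviour of the field spec can be sketched in Python as follows. This is a minimal sketch, not the actual implementation: the class and method names are made up, '++' is left out, and two points the text leaves open are decided arbitrarily here (a failing next leaves the cursor where it was, and loops cover the body only, not the head).

```python
from typing import List, Optional, Tuple

Field = Tuple[int, str]  # (tag, value)


class Cursor:
    """Hypothetical cursor over a maleta's fields; position 0 is the head."""

    def __init__(self, fields: List[Field]):
        self.fields = fields
        self.pos = 0

    def next(self, tag: Optional[int] = None) -> Optional[Field]:
        """'+': move to the next field (having this tag, if given)."""
        for i in range(self.pos + 1, len(self.fields)):
            if tag is None or self.fields[i][0] == tag:
                self.pos = i
                return self.fields[i]
        return None  # assumption: cursor stays put when nothing is found

    def first(self, tag: Optional[int] = None) -> Optional[Field]:
        """'-': reset to the head, then move to next."""
        self.pos = 0
        return self.next(tag)

    def current(self, tag: Optional[int] = None) -> Optional[Field]:
        """(none): stay if no tag is given or the cursor is on it, else first."""
        if tag is None or self.fields[self.pos][0] == tag:
            return self.fields[self.pos]
        return self.first(tag)

    def index(self, i: int) -> Optional[Field]:
        """'@': select the ith field (0 is the head)."""
        if 0 <= i < len(self.fields):
            self.pos = i
            return self.fields[i]
        return None

    def loop(self, tag: Optional[int] = None) -> list:
        """'--': values of all fields with this tag; without a tag,
        alternating tags and values (body only: an assumption)."""
        out: list = []
        for t, v in self.fields[1:]:
            if tag is None:
                out += [t, v]
            elif t == tag:
                out.append(v)
        return out
```

For example, on a record with two 24 fields, first(24) followed by next(24) visits both, and a third next(24) returns nothing.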

The subfield spec defaults to none, selecting the entire field value.
  • '^' subfield:
    selects the value (with id cut) of the subfield with this id. Id may also be the pseudo subfield '&', selecting the tag, or '@', selecting the cursor position.
  • '?' test:
    returns boolean 0/1 whether the field and/or subfield (with id) exists.
  • '!' break:
    returns the field or (with id) subfield value if it exists; otherwise breaks processing.
  • '#' position:
    with a number, selects the ith subfield value, including any id.

Addressing multiple subfields:
  • '^^' subfields:
    returns a list of subfield values (with id cut) for the given id or all. Without an id, the list contains alternating ids and values.
  • '##' position:
    returns a list of unmodified values (i.e. a split on subfield delimiter).
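
The four subfield selectors might look like this as plain functions on a field value. This sketch assumes a CDS/ISIS-style layout in which each subfield starts with a delimiter character followed by a one-character id, and in which any text before the first delimiter has no id; the delimiter '^' and the function names are assumptions, the real delimiter is whatever RecStruct defines.

```python
from typing import List, Optional

DELIM = "^"  # assumed subfield delimiter


def split_subfields(value: str) -> List[str]:
    """'##': unmodified parts, i.e. a plain split on the delimiter."""
    return value.split(DELIM)


def subfield_at(value: str, i: int) -> str:
    """'#': the ith part, including any id; empty if there is none."""
    parts = value.split(DELIM)
    return parts[i] if 0 <= i < len(parts) else ""


def subfield(value: str, sid: str) -> str:
    """'^': value (id cut) of the first subfield with this id."""
    for part in value.split(DELIM)[1:]:
        if part[:1] == sid:
            return part[1:]
    return ""


def subfields(value: str, sid: Optional[str] = None) -> List[str]:
    """'^^': values (id cut) for this id; without an id,
    alternating ids and values."""
    out: List[str] = []
    for part in value.split(DELIM)[1:]:
        if sid is None:
            out += [part[:1], part[1:]]
        elif part[:1] == sid:
            out.append(part[1:])
    return out
```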

A range spec can have one or both, in that order, of the following:
  • '*' offset:
    cuts the first offset bytes (not characters)
  • '.' length:
    cuts to the first length bytes
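
Because offset and length count bytes, a range can cut through a multi-byte character. A small Python sketch (the function name and the UTF-8 default are assumptions) that therefore works on, and returns, bytes:

```python
from typing import Optional


def apply_range(value: str, offset: int = 0,
                length: Optional[int] = None,
                encoding: str = "utf-8") -> bytes:
    """Apply '*offset' and then '.length' to the encoded value."""
    data = value.encode(encoding)[offset:]  # '*': cut the first offset bytes
    if length is not None:
        data = data[:length]                # '.': keep the first length bytes
    return data
```

With this, a fixed two-byte prefix such as MARC indicators would be read as apply_range(value, 0, 2).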

A key spec is part of setting the cursor: it performs a next while the selected data does not match the specified key. When used with a test or break, it applies to the data (not the boolean result) and, with an empty field spec, returns false or breaks instead of moving to the next field.
  • '==' exact:
    checks for exact match
  • '=%' prefix:
    checks for prefix match
  • '=:' contains:
    checks for substring
  • '=~' expr:
    evaluate key as regular expression (optional extension)
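
The four key operators, and the "next while no match" behaviour, can be sketched like this. The function names are hypothetical, the match is applied to the whole field value for simplicity, and Python's re syntax stands in for whatever regular expression dialect an implementation of the optional '=~' extension would use.

```python
import re
from typing import List, Optional, Tuple

Field = Tuple[int, str]  # (tag, value)


def key_matches(op: str, data: str, key: str) -> bool:
    """Evaluate one key spec against the selected data."""
    if op == "==":
        return data == key
    if op == "=%":
        return data.startswith(key)
    if op == "=:":
        return key in data
    if op == "=~":
        return re.search(key, data) is not None
    raise ValueError(f"unknown key operator: {op!r}")


def seek(fields: List[Field], tag: int, op: str, key: str,
         start: int = 0) -> Optional[int]:
    """Position of the first field after start with this tag whose
    value matches the key, or None if there is none."""
    for i in range(start + 1, len(fields)):
        t, v = fields[i]
        if t == tag and key_matches(op, v, key):
            return i
    return None
```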

Index expressions are independent of any metadata. Especially they do not know anything about fixed subfields, but only check for the delimiter character. Fixed subfields may be accessed using ranges.
However, a helper procedure can be set to rewrite expressions that are not valid indexes, e.g. turning field and subfield names into tags, subfield identifiers and ranges.
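
Such a helper could be as simple as a lookup in name tables. In the sketch below the tables and the function name are hypothetical, and only the name^name form is handled; the single mapping shown (td to 100, width to w) is the one used in the Tcl examples of this document.

```python
import re

# Hypothetical metadata: field names to tags, (tag, name) to subfield ids.
FIELD_NAMES = {"td": 100}
SUBFIELD_NAMES = {(100, "width"): "w"}

NAME_EXPR = re.compile(r"([A-Za-z_]\w*)(?:\^([A-Za-z_]\w*))?")


def rewrite(expr: str) -> str:
    """Turn 'name' or 'name^subname' into a plain index expression.
    Anything else is returned unchanged."""
    m = NAME_EXPR.fullmatch(expr)
    if m is None or m.group(1) not in FIELD_NAMES:
        return expr
    tag = FIELD_NAMES[m.group(1)]
    out = str(tag)
    if m.group(2) is not None:
        # Unknown subfield names are passed through as they are.
        out += "^" + SUBFIELD_NAMES.get((tag, m.group(2)), m.group(2))
    return out
```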

Minimal implementation requirements:
  • tags may be limited to the range 0 to 65534, inclusive
  • position ('#'), offset ('*') and length ('.') may be limited to the range 0 to 255, inclusive
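
Putting the parts together, an index expression could be split with a single regular expression, enforcing the minimal limits along the way. This is a sketch under assumptions: the exact grammar is not fixed by this draft, so the pattern below (single-character subfield ids, no whitespace handling, no helper rewriting) and the names INDEX_RE, LIMITS and parse_index are illustrative only. A real implementation may of course allow larger ranges than the minimum.

```python
import re

INDEX_RE = re.compile(r"""
    (?P<fop>--|\+\+|@@|-|\+|@)?          # field operator
    (?P<tag>\d+)?                        # tag or position
    (?:(?P<sop>\^\^|\#\#|\^|\?|!|\#)     # subfield operator
       (?P<sid>\d+|[A-Za-z&@])?)?        # subfield id or position
    (?:\*(?P<offset>\d+))?               # range: offset
    (?:\.(?P<length>\d+))?               # range: length
    (?:(?P<kop>==|=%|=:|=~)(?P<key>.*))? # key operator and key
    """, re.VERBOSE)

# Minimal limits from the list above.
LIMITS = {"tag": 65534, "offset": 255, "length": 255}


def parse_index(expr: str) -> dict:
    """Split an index expression into its parts (None where absent)."""
    m = INDEX_RE.fullmatch(expr)
    if m is None:
        raise ValueError(f"not an index expression: {expr!r}")
    parts = m.groupdict()
    for name, limit in LIMITS.items():
        if parts[name] is not None:
            parts[name] = int(parts[name])
            if parts[name] > limit:
                raise ValueError(f"{name} {parts[name]} exceeds {limit}")
    if parts["sop"] == "#" and parts["sid"] is not None:
        # '#' takes a position, which must be a number up to 255.
        if not parts["sid"].isdigit() or int(parts["sid"]) > 255:
            raise ValueError(f"bad position: {parts['sid']!r}")
    return parts
```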

array operators

Basic operations on maletas are:
  • getting a single index
    returns the value or list (empty value or list if not found)
  • getting multiple indexes
    A failing test or break stops processing. A positive test does not produce an output value. Returns a list of the values returned by each index (unless there is only one non-test index).

In Tcl, get is the default operation. Examples, assuming a maleta called v:
 v 24 ;# select the current (or first) field 24
 v 24^a ^b ;# list of a and b subfield of current field 24
 foreach {i v} [v ^^] { puts "$i=$v" } ;# list all subfields of current
 v --24 ;# list of all 24 values
 v td^width ;# helper should rewrite this to 100^w
 v -24?a=:foo .2 ;# the MARC indicators of first 24 field where ^a contains foo
 $v->get("-24?a=:foo", ".2"); # more verbose in PHP, Perl
 v.get("-24?a=:foo .2"); // no varargs in Java, split at blanks

Assignment (set), like retrieval, takes any number of string parameters. Implementations should also support passing multiple values in one parameter as a list, maleta or serialized record. Depending on the environment, this may require a different or overloaded method.
An index addressing a single value (i.e. not a test) takes the next parameter as new value. If the addressed item does not exist, it is created. Assigning no value (there is no next parameter) deletes an item.
If multiple items are addressed, all following parameters (or the elements of a single list parameter) are applied in turn. Excess parameters create new items; if there are fewer parameters than items, the remaining items are deleted.

Tcl uses a '=' parameter as assignment operator, '=@' to assign from a list. Examples:
 v ^a = $a ^b = $b ;# set current subfields a and b to the variables
 v --24 = foo bar baz ;# rec now has exactly three 24 fields
 v --24 =@ {foo bar baz} ;# same
 $v->set("--24", "foo", "bar", "baz");
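
The rule for assigning to multiple items (apply values in turn, create on excess, delete on shortage) could be implemented along these lines. The function name is made up, new fields are simply appended at the end, and the header is assumed to be present and left untouched.

```python
from typing import Iterable, List, Tuple

Field = Tuple[int, str]  # (tag, value)


def set_all(fields: List[Field], tag: int,
            values: Iterable[str]) -> List[Field]:
    """Assign values in turn to the body fields with this tag."""
    pending = list(values)
    out = [fields[0]]                    # header stays as it is
    for t, v in fields[1:]:
        if t != tag:
            out.append((t, v))           # other fields are kept
        elif pending:
            out.append((t, pending.pop(0)))
        # else: no value left for this field, so it is deleted
    out.extend((tag, v) for v in pending)  # excess values create fields
    return out
```

After set_all(fields, 24, ["foo", "bar", "baz"]) the result has exactly three 24 fields, whatever the number was before.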

Insertion is a variant of assignment addressing newly created items.

el maletero - the Malete Object

A maletero (or MO) is a maleta where every field is itself a maletero, i.e. can have a body. Its body fields are called children.
A maletero corresponds to a region (contiguous sequence of fields) in a plain Malete record by means of counted or delimited structures.

Maleteros come in three flavours:
  • list (plain vanilla):
    The maletero behaves exactly like a maleta. All children are treated as simple fields, regardless of their tag. The MO maps one-to-one to its record. This is the most efficient mode where no complex children are needed.
  • struct (+ strawberry, chocolate):
    Children with non-negative tags are treated as simple. A child with a negative tag -n corresponds to a region spanning n fields. This includes one field for the child's tag and header and any fields its children correspond to in turn. When looping or setting the cursor, counted subrecords are recognized and their body is skipped over.
  • mom (with fruit and liquor):
    In this DOM-style mode, only fields with tag 0 are simple (text nodes). Every child with a positive tag corresponds to a delimited structure. An implementation may or may not support counted structures in mom mode.
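
For the struct flavour, skipping over counted subrecords amounts to advancing by the span encoded in a negative tag. A minimal Python sketch (the generator name and the sample data are made up; delimited structures of the mom flavour are not covered):

```python
from typing import Iterator, List, Optional, Tuple

Field = Tuple[int, str]  # (tag, value)


def children(fields: List[Field], start: int = 1,
             end: Optional[int] = None) -> Iterator[Tuple[int, int, str, int]]:
    """Yield (position, tag, value, span) for the direct children found
    in fields[start:end], skipping the bodies of counted subrecords."""
    if end is None:
        end = len(fields)
    i = start
    while i < end:
        tag, value = fields[i]
        # A tag of -n spans n fields, including the child's own field.
        span = -tag if tag < 0 else 1
        yield i, tag, value, span
        i += span
```

The children of a counted child at position p with span n are then found by calling children(fields, p + 1, p + n).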

A maletero provides object handles to its parent and children, either by modifying the current handle or by creating new handles. New handles can be based on a copy of the corresponding record or use a region in the same record. Implementations may not support the latter, or may make such objects immutable to avoid conflicting concurrent modifications.

implementations

A basic implementation may provide only a maleta, which is sufficient for traditional CDS/ISIS style record access.
A complete implementation may provide only a maletero, which can be used as a maleta (as in English, where trunk means both car boot and suitcase).
A particularly efficient implementation may provide both separately.

The initial implementation is a Tcl extension (written in C), optionally augmented by a Tcl module (written in Tcl). The abstract model, however, can be similarly implemented in other languages.

$Id: MOM.txt,v 1.3 2004/05/03 13:04:36 kripke Exp $