IIF, MARC and Z39.50
IIF is the "Information Interchange Format", a record serialization format
specified in ISO standard 2709, also published as ANSI
IIF is mostly a plaintext format, in that almost all information is encoded
using ASCII characters (no binary numbers) and the only control characters
used are byte values 29 (record terminator RT), 30 (field terminator FT)
and 31 (subfield delimiter).
MARC ("MAchine Readable Catalogue") is actually a family of largely incompatible
standards (USmarc, UKMARC, ...) that evolved from MARC I (1965).
While the main concern of the MARC standards is to specify actual data models
(assigning tags and subfield codes, which can be used perfectly well in
Malete, CDS/ISIS or other databases), they also specify a variant of IIF as
a suggested common format for data exchange, which we here refer to as "MARC".
(This file syntax seems to be mostly the same for all MARC standards.)
Z39.50 is a network protocol to search and retrieve records.
It supports various query "languages", the most commonly used of which
is called Type-1 query. Type-1 is similar to the queries supported
by Malete and CDS/ISIS, but much more general and complex.
Terms can be searched for in any indexed field or with restriction
to one or more "attributes".
Attributes are basically the tags used in the index, which are almost always
different from those used in records. While it is common for records to use
any of the various MARCs or even completely different formats, the attributes
used in bibliographical systems are typically those specified by the Bib-1
attribute set (e.g. assigning 4 to title).
Z39.50 allows a client to select a record format from various conversions
supported by a server. When a MARC format is selected,
the data is actually transmitted serialized according to IIF.
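
To make the role of attributes concrete, here is a minimal sketch that
builds Type-1 style queries in PQF, the prefix query notation used by
Index Data's tools. PQF is not mentioned above and is only an assumption
about what your client accepts. The title value 4 is the Bib-1 use
attribute cited above; the author value 1003 is also taken from Bib-1.
The helper names pqf_term and pqf_and are hypothetical.

```python
# Bib-1 "use" attributes (attribute type 1); extend as needed.
BIB1_USE = {"title": 4, "author": 1003}


def pqf_term(term, use=None):
    """Return a PQF term, optionally restricted to one Bib-1 use attribute."""
    if '"' in term:
        raise ValueError("double quotes are not handled in this sketch")
    quoted = '"' + term + '"'
    if use is None:
        return quoted  # search without attribute restriction
    return "@attr 1=%d %s" % (BIB1_USE[use], quoted)


def pqf_and(left, right):
    """Combine two PQF sub-queries with boolean AND (prefix notation)."""
    return "@and %s %s" % (left, right)


query = pqf_and(pqf_term("malete", "title"), pqf_term("kripke", "author"))
```

Note that the attribute numbers say nothing about the record tags: the
server decides which record fields feed the index for attribute 4.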
| IIF and MARC serialized records |
IIF specifies a serialization for records. Like the Malete record data file,
an IIF file is simply a stream of such records; there is no additional
A record has
- a 24 byte leader, containing 16 bytes structural data
and 8 bytes application data (x, imported as "MARC leader").
The format for MARC is LLLLLxxxxx22BBBBBxxx4500.
The Ls and Bs are total record length (including leader and a terminating RT)
and start of data (field values, after an FT terminating the dictionary).
The first '2' denotes that every field starts with two indicator bytes,
the second is the subfield identifier length including the delimiter char.
- a "dictionary" array with one entry per field containing 3 bytes tag,
and n and m bytes for length and offset.
n and m are digits at leader offset 20 and 21, MARC uses 4 and 5.
In general IIF, leader byte 22 may specify a number of implementation
defined entry bytes.
- the actual field values, each terminated by the FT character.
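
The layout above can be sketched as a small parser. This is a minimal
illustration, not the Malete implementation: the names parse_record and
split_subfields are hypothetical, and it assumes that the length stored in
a dictionary entry includes the terminating FT (as is the case for MARC)
and that subfield identifiers are two bytes long (delimiter plus one char).

```python
RT, FT, SF = b"\x1d", b"\x1e", b"\x1f"  # byte values 29, 30, 31


def parse_record(rec):
    """Split one IIF record (bytes) into (leader, [(tag, value), ...])."""
    leader = rec[:24]
    total = int(leader[0:5])     # LLLLL: total record length
    base = int(leader[12:17])    # BBBBB: start of data
    n = int(leader[20:21])       # digits used for field length (MARC: 4)
    m = int(leader[21:22])       # digits used for field offset (MARC: 5)
    extra = int(leader[22:23])   # implementation defined entry bytes
    if len(rec) != total or rec[-1:] != RT:
        raise ValueError("record length or terminator mismatch")
    if rec[base - 1:base] != FT:
        raise ValueError("dictionary is not terminated by FT")
    size = 3 + n + m + extra     # bytes per dictionary entry
    fields = []
    for pos in range(24, base - 1, size):
        entry = rec[pos:pos + size]
        tag = entry[:3].decode("ascii")
        length = int(entry[3:3 + n])
        offset = int(entry[3 + n:3 + n + m])
        value = rec[base + offset:base + offset + length]
        if value.endswith(FT):
            value = value[:-1]   # drop the field terminator
        fields.append((tag, value))
    return leader, fields


def split_subfields(value):
    """Split a non-control field into (indicators, [(code, data), ...])."""
    parts = value[2:].split(SF)[1:]
    return value[:2], [(p[:1], p[1:]) for p in parts]
```

Since the total length is stored in the first five bytes, a reader can
walk a file record by record without scanning for RT.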
Contrary to folklore, MARC does NOT use a '$' as subfield delimiter,
nor a '#' for unused indicators. Rather, the examples in the specs
use a '$' to REPRESENT the subfield delimiter control character 31 (^_),
and a '#' to REPRESENT a blank. The RT (29, ^]) is sometimes represented as '\'
and the FT (30, ^^) as '^' or '@'.
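
The display convention can be sketched as a one-way rendering function.
The name to_display is hypothetical, and the choice of '^' for FT is just
one of the conventions mentioned above. The result is for human eyes only:
it cannot be converted back reliably, because real data may itself
contain '$' or '#'.

```python
def to_display(value, has_indicators=True):
    """Render raw field bytes in the printable notation used by the specs."""
    text = value.decode("ascii", "replace")
    if has_indicators:
        # Only blanks in the two indicator positions are shown as '#'.
        text = text[:2].replace(" ", "#") + text[2:]
    return (text.replace("\x1f", "$")    # subfield delimiter (31)
                .replace("\x1e", "^")    # field terminator (30)
                .replace("\x1d", "\\"))  # record terminator (29)
```

For control fields (tags 00*), pass has_indicators=False, since they carry
neither indicators nor subfields.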
| Malete IIF import and export |
The malete tool provides two rather simplistic commands,
iifimp and iifexp.
The command specific options are:
specify full filename for the IIF files.
Default is the basename of the Malete database with extension .iif.
On UNIX, a filename '-' selects stdin/out.
- Nomarc (literally)
do not assume the MARC structure 22/450 on import. Requires proper IIF data.
on export, prepend indicators ii and, where needed, subfield c.
A single -P uses two blanks as indicators and subfield '0'.
Suggested to produce at least syntactically correct MARC.
- Rid (literally)
on import, use a numeric control number (1st field, if it has tag 1)
as record id. Note that on export, the record id is always used as
control number unless the record already has one,
since this is specified as a must not only by MARC, but by IIF.
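
What such an export has to produce can be sketched as follows. This is an
illustration under stated assumptions, not the code of iifexp: the function
name export_record is hypothetical, the application-defined leader bytes
are simply filled with blanks, "where needed" is read as "where the value
does not already start with a subfield delimiter", and fields longer than
9999 bytes are not handled.

```python
RT, FT, SF = b"\x1d", b"\x1e", b"\x1f"  # byte values 29, 30, 31


def export_record(rid, fields, ind=b"  ", sub=b"0"):
    """Serialize (int tag, bytes value) pairs as one MARC-style IIF record."""
    fields = list(fields)
    # The record id becomes the control number unless there already is one.
    if not any(tag == 1 for tag, _ in fields):
        fields.insert(0, (1, b"%d" % rid))
    directory, data = b"", b""
    for tag, value in fields:
        if tag >= 10:  # fields tagged 00* get no indicators or subfields
            if not value.startswith(SF):
                value = SF + sub + value  # prepend subfield where needed
            value = ind + value           # prepend the two indicators
        raw = value + FT
        # Dictionary entry: 3 bytes tag, 4 digits length, 5 digits offset.
        directory += b"%03d%04d%05d" % (tag, len(raw), len(data))
        data += raw
    base = 24 + len(directory) + 1        # leader + dictionary + FT
    total = base + len(data) + 1          # ... + field data + RT
    leader = (b"%05d" % total + b" " * 5 + b"22"
              + b"%05d" % base + b" " * 3 + b"4500")
    return leader + directory + FT + data + RT
```

The defaults mirror the single -P case described above: two blanks as
indicators and subfield '0'.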
| creating proper IIF from WinIsis |
In Database-Export, set the subfield separator to \031 and
output line length to 0.
If the fields do not contain valid MARC data, use a reformatting FST like
001 0 MFN
044 0 |00^a|,v44
024 0 |00^a|,v24
026 0 |00|,v26
070 0 (|00^a|,v70/)
Make sure that
- the first output field is tag 1 containing some unique id
- every field (other than tagged 00*) starts with two indicator characters
(really should be blank, but that would be stripped during export)
- the indicators are followed by a delimiter and subfield identifier
Still the output is not 100% correct, since WinIsis sets
number of indicators and identifier length to 0, where MARC specifies 2.
However, many other MARC processors, including zebraidx, ignore these settings.
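
The checklist for a WinIsis export can be automated with a few lines. The
following is a minimal sketch with hypothetical helper names (check_fields,
fix_leader); it works on (tag, value) pairs as they appear in the record,
and on the raw record bytes for the leader. Patching the leader is only
needed for processors that actually look at the indicator count and
identifier length.

```python
SF = b"\x1f"  # subfield delimiter, byte value 31


def check_fields(fields):
    """Return a list of problems for [(str tag, bytes value), ...]."""
    problems = []
    if not fields or fields[0][0] != "001":
        problems.append("first field is not tag 001")
    for tag, value in fields:
        if tag.startswith("00"):
            continue  # control fields have no indicators or subfields
        # Expect two indicators, then delimiter and subfield identifier.
        if len(value) < 4 or value[2:3] != SF:
            problems.append(tag + ": missing indicators or subfield")
    return problems


def fix_leader(rec):
    """Set number of indicators and identifier length (bytes 10, 11) to 2."""
    return rec[:10] + b"22" + rec[12:]
```

fix_leader does not change the record length, so the length and offset
values in the leader and the dictionary stay valid.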
| making MARC data available via Z39.50 |
MARC records can be made easily available using indexdata's zebra.
If records in your IIF file use tags and subfields conforming to, say, USmarc,
simply check out the test/usmarc example in the zebra distribution.
Put your data in the records subdir and run "zebraidx update records; zebrasrv".
If your data was exported from WinIsis, you may want to put a line
"encoding Cp850" in the .abs file.
You must use recordType: grs.marc.something, meaning that it's general
structured data in some marc file format.
The sample usmarc.abs uses the "marc usmarc.mar" statement,
and usmarc.mar (in the zebra/tab directory) contains "reference USmarc",
stating that the marc input actually IS in USmarc.
This need not be the truth, it just means that the records will be served
as is, if a client asks for USmarc.
However, only the tags listed in "elm" statements in the .abs files
will be indexed.
Note that zebra's indexing support is not as flexible as that of CDS/ISIS:
you can only select fields or subfields to be indexed in one of a couple
of modes (like word or phrase). To take full advantage of sophisticated
CDS/ISIS FSTs, include them in your export reformatting FST.
Use some otherwise unused field tags to hold the index terms and "elm"
statements to map them to bib-1 attributes.
Omit those fields from the display mapping.
To keep the data in its native format (say CDS), change the elm
statements so that the fields to be indexed map to the corresponding
bib-1 attributes for searching, e.g. "elm 024 Conference-name !",
and, instead of using the "marc usmarc.mar" statement,
create one or more maptabs to map the full record to one or more
USmarc and/or other presentation formats as applicable.
Check out the gils-usmarc.map example in the zebra/tab directory.
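
If many fields need to be mapped, the elm statements can be generated
rather than typed. The sketch below only reproduces the statement form
quoted above ("elm 024 Conference-name !"). The function name elm_lines
and the mapping of tag 900 are hypothetical, and every attribute name is
assumed to be defined in the attribute set files of your zebra
installation.

```python
def elm_lines(mapping):
    """Turn {tag: attribute name} into 'elm' lines for a .abs file."""
    return ["elm %03d %s !" % (tag, name)
            for tag, name in sorted(mapping.items())]


# Tag 24 as in the example above; tag 900 is a made-up field that would
# hold title terms produced by the export reformatting FST.
lines = elm_lines({24: "Conference-name", 900: "Title"})
```

Paste the resulting lines into the .abs file in place of the elm
statements of the sample.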
| links |
$Id: IIF.txt,v 1.6 2005/05/24 16:44:06 kripke Exp $