XX:<CHIVES.BETA.DOCUMENTATION>USRDEF.DOC.2, 24-Dec-87 23:54:41, Edit by SRA
This document is very out of date and needs to be updated at some
point.  The real documentation resides in the code and in the DSYMS
file (USRDEF.D) that defines the data structures and tokens (q.v.).
================================================================

Rough draft of user message protcol (analog to CMU "RED" protocol).

Protocol version "01" (see word zero protocol field, below).

----------------------------------------------------------------

Messages come in page sized segments.  For convenience in writing the
C code, a chunk is one Twenex page (size matters for 20x IPCF, doesn't
for ITS USR: device mapping).

----------------------------------------------------------------

The first four words of each page are always the same.

word 0:	sixbit "DOMdnn"

	"d" is one of:
		"Q"	query packet/state
		"R"	response packet/state
		"W"	wait state (ITS only, resolver has received
			message but hasn't answered yet)
		"E"	special response indicating that the protocol
			version didn't match; in this case, the client
			code can check the "nn" field to see what
			version of the protocol the resolver speaks.

	"nn" is a two character version encoding.  This document
	defines version "01" of the protocol.

	The idea is that the resolver just check for DOMQnn in the
	first word, where nn is the protocol version number the
	resolver was compiled with.  Likewise the user side code can
	check for DOMAnn.  For this reason, the first word must remain
	the version word, even in subsequent revisions of this
	protocol.

word 1:	page_count ,, page_this

	page_count is the number of pages in this message.
	page_this  is the number of this page in the message.

	This word is used to reassemble long messages.  Normal queries
	and responses should fit on a single page, but some types of
	queries (QCLASS=* QTYPE=MG QNAME="BUG-SCHEME.MC.LCS.MIT.EDU")
	may return very long answers.

words 2 & 3:	unique id

	These two words are a timestamp/id doubleword.  This can be
	anything, its purpose is to give each query a unique id that
	will not come around again for a long time.  On twenex a good
	pair of values to use here might be the results of GTAD% and
	HPTIM%.  On ITS, .GENSYM and .CALL RQDATE might provide
	suitable values.  The only things the resolver will ever do
	with this field is copy it from query to answer and compare
	two instances of this field for equality. 

----------------------------------------------------------------

The rest of each page is message data.  A sequence of pages is viewed
as an address space, with the first page being at offset zero, the
second page at offset 01000, etcetera.  Because of the four word
header at the top of each page, addresses have to skip from 0777 to
1004 and so on.  Tough.

The message data is laid out in a format similar to the network
message format used in RFC883, except that there are no string
compression crocks and integer values are not in the depths of byte
strings. 

Strings and domain names are represented by (18 bit, message relative)
pointers.  All other values appear inline as specified below.

----------------------------------------------------------------

The first four words of the message data specify the query and the
number of answering RRs.  Offsets here are relative to page zero,
since this is still fixed position data appearing in every message.

word 4:	sixbit query operation (one of "QUERY", "IQUERY", "CQRYU",
	"CQRYM", "HOSTS", "HOSTS3", maybe others), or sixbit ".CTL."
	for a control message (defined elsewhere if at all).

word 5:	flags or error code ,, pointer to QNAME.

word 6:	QTYPE ,, QCLASS.

word 7:	count of answering RRs following.

For queries, LH(word 4) contains flags, which are used to control
resolver searches (local data only, must be authoritative, etcetera),
word 6 is zero, and this is the end of the message (except for the
QNAME string, which presumably follows word 6).

For answers, LH(word 4) contains zero of the query was successful,
else an error code [yet to be defined] indicating why the query lost.
Depending on the type of error, word 6 may or may not be zero; I
haven't thought of any reasons why the resolver should return data
while signalling an error, but it can be done if somebody thinks of a
reason.

The query operation is an extension of the opcode used in the domain
spec.  QUERY, IQUERY, CQRYU, and CQRYM correspond to the opcodes
defined in RFC883.  HOSTS and HOSTS3 are provided for kludges that use
the old host tables (in whatever format) as a source of information.
More opcodes may be defined here as needed.

For ".CTL." messages, only word 4 is defined, the rest of the message
is arbitrary and available to the resolver and the program issuing the
control program, by mutual agreement.  At present there are no control
messages defined, but there may be some use for them in the future.
----------------------------------------------------------------

The rest of the message is divided into two portions: RRs and strings.
All the RRs come first, and the strings are just tossed in as
convenient after all the RRs.  Each string may be pointed to by one or
more RRs; how much sharing of identical strings goes on is left up to
the resolver code, and no assumptions should be made about it.  The
strings have to be either 8 bit or nine bit bytes; these are equally
easy for assembly language and 9 is much easier for KCC, so we use 9
bit characters.  Kind of odd, I know, so what?

The first three words of each RR are the same; subsequent words are
dependent on the class and type of RR in question.  It is assumed that
the client code knows how to decipher RRs it is interested in.  [A
listing of the expansions of all known types should be included in
this document; right now the reader is refered to the RDEXPN.C module,
which does the actual expansion from the network format.]

word 0:	rr_length
	the number of items in the rdata portion of the RR can be
	calculated by subtracting the length of the fixed header (3)
	from this value.

word 1:	type ,, class

word 2:	ttl

word 3+: rdata, type and class dependent.

----------------------------------------------------------------

I think that's it.