XX:USRDEF.DOC.2, 24-Dec-87 23:54:41, Edit by SRA

This document is very out of date and needs to be updated at some
point. The real documentation resides in the code and in the DSYMS
file (USRDEF.D) that defines the data structures and tokens (q.v.).

================================================================

Rough draft of user message protocol (analog to CMU "RED" protocol).
Protocol version "01" (see word zero protocol field, below).

----------------------------------------------------------------

Messages come in page sized segments. For convenience in writing the
C code, a chunk is one Twenex page (size matters for 20x IPCF, doesn't
for ITS USR: device mapping).

----------------------------------------------------------------

The first four words of each page are always the same.

word 0: sixbit "DOMdnn"
        "d" is one of:
          "Q"   query packet/state
          "R"   response packet/state
          "W"   wait state (ITS only, resolver has received message
                but hasn't answered yet)
          "E"   special response indicating that the protocol version
                didn't match; in this case, the client code can check
                the "nn" field to see what version of the protocol
                the resolver speaks.
        "nn" is a two character version encoding. This document
        defines version "01" of the protocol.

        The idea is that the resolver just checks for DOMQnn in the
        first word, where nn is the protocol version number the
        resolver was compiled with. Likewise the user side code can
        check for DOMRnn. For this reason, the first word must remain
        the version word, even in subsequent revisions of this
        protocol.

word 1: page_count ,, page_this
        page_count is the number of pages in this message.
        page_this is the number of this page in the message.
        This word is used to reassemble long messages. Normal queries
        and responses should fit on a single page, but some types of
        queries (QCLASS=* QTYPE=MG QNAME="BUG-SCHEME.MC.LCS.MIT.EDU")
        may return very long answers.

words 2 & 3: unique id
        These two words are a timestamp/id doubleword.
        This can be anything; its purpose is to give each query a
        unique id that will not come around again for a long time.
        On Twenex a good pair of values to use here might be the
        results of GTAD% and HPTIM%. On ITS, .GENSYM and .CALL RQDATE
        might provide suitable values. The only things the resolver
        will ever do with this field are to copy it from query to
        answer and to compare two instances of this field for equality.

----------------------------------------------------------------

The rest of each page is message data. A sequence of pages is viewed
as an address space, with the first page being at offset zero, the
second page at offset 01000, etcetera. Because of the four word
header at the top of each page, addresses have to skip from 0777 to
01004 and so on. Tough.

The message data is laid out in a format similar to the network
message format used in RFC883, except that there are no string
compression crocks and integer values are not in the depths of byte
strings. Strings and domain names are represented by (18 bit, message
relative) pointers. All other values appear inline as specified
below.

----------------------------------------------------------------

The first four words of the message data specify the query and the
number of answering RRs. Offsets here are relative to page zero,
since this is still fixed position data appearing in every message.

word 4: sixbit query operation (one of "QUERY", "IQUERY", "CQRYU",
        "CQRYM", "HOSTS", "HOSTS3", maybe others), or sixbit ".CTL."
        for a control message (defined elsewhere if at all).

word 5: flags or error code ,, pointer to QNAME.

word 6: QTYPE ,, QCLASS.

word 7: count of answering RRs following.

For queries, LH(word 5) contains flags, which are used to control
resolver searches (local data only, must be authoritative, etcetera),
word 7 is zero, and this is the end of the message (except for the
QNAME string, which presumably follows word 7).
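
As a rough illustration of the fixed words described so far, the
following C sketch fills in words 0 through 7 of a single-page query
on a modern host. It is not taken from the resolver sources. The type
word36, the helper names (halves, left_half, right_half, sixbit,
next_addr, build_query), the use of a 64-bit integer to carry a 36-bit
word, and the zero-based page_this value are all assumptions made for
the example. Flags go in the left half of word 5 and the RR count in
word 7, following the word-by-word listing.

```c
#include <assert.h>
#include <stdint.h>

/* A 36-bit word carried in the low bits of a 64-bit integer
 * (an assumption so the sketch runs on a modern host). */
typedef uint64_t word36;

#define HALF_MASK 0777777ULL   /* 18 bits */
#define PAGE_SIZE 01000u       /* words per Twenex page */
#define PAGE_HDR  4u           /* words 0-3 of every page */

/* Build "lh ,, rh" from two 18-bit halves. */
static word36 halves(uint32_t lh, uint32_t rh)
{
    return ((word36)(lh & HALF_MASK) << 18) | (rh & HALF_MASK);
}

static uint32_t left_half(word36 w)  { return (uint32_t)((w >> 18) & HALF_MASK); }
static uint32_t right_half(word36 w) { return (uint32_t)(w & HALF_MASK); }

/* Pack up to six characters as left-justified SIXBIT (ASCII minus 040).
 * Characters outside 040..0137 cannot be represented and trip the assert. */
static word36 sixbit(const char *s)
{
    word36 w = 0;
    int i;

    for (i = 0; i < 6; i++) {
        unsigned c = 0;                  /* pad with SIXBIT space */
        if (*s) {
            c = (unsigned char)*s++;
            assert(c >= 040 && c <= 0137);
            c -= 040;
        }
        w = (w << 6) | c;
    }
    return w;
}

/* Step to the next message-relative data address, hopping over the
 * four header words at the top of each page (0777 -> 01004). */
static uint32_t next_addr(uint32_t addr)
{
    addr++;
    if ((addr % PAGE_SIZE) < PAGE_HDR)
        addr = (addr / PAGE_SIZE) * PAGE_SIZE + PAGE_HDR;
    return addr;
}

/* Fill in words 0-7 of a single-page query. qname_ptr is the
 * message-relative address where the caller has put the QNAME string. */
static void build_query(word36 *page, word36 id_hi, word36 id_lo,
                        uint32_t flags, uint32_t qname_ptr,
                        uint32_t qtype, uint32_t qclass)
{
    page[0] = sixbit("DOMQ01");          /* protocol version 01, query */
    page[1] = halves(1, 0);              /* page_count ,, page_this (zero-based: assumed) */
    page[2] = id_hi;                     /* unique id doubleword; the resolver */
    page[3] = id_lo;                     /*   only copies and compares it */
    page[4] = sixbit("QUERY");           /* query operation */
    page[5] = halves(flags, qname_ptr);  /* flags ,, pointer to QNAME */
    page[6] = halves(qtype, qclass);     /* QTYPE ,, QCLASS */
    page[7] = 0;                         /* no answering RRs in a query */
}
```

A client would then place the QNAME string at the address given in
the right half of word 5 and hand the page to the resolver; checking
the reply amounts to comparing words 2 and 3 against the values sent.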

For answers, LH(word 5) contains zero if the query was successful,
else an error code [yet to be defined] indicating why the query lost.
Depending on the type of error, word 7 may or may not be zero; I
haven't thought of any reasons why the resolver should return data
while signalling an error, but it can be done if somebody thinks of a
reason.

The query operation is an extension of the opcode used in the domain
spec. QUERY, IQUERY, CQRYU, and CQRYM correspond to the opcodes
defined in RFC883. HOSTS and HOSTS3 are provided for kludges that use
the old host tables (in whatever format) as a source of information.
More opcodes may be defined here as needed.

For ".CTL." messages, only word 4 is defined; the rest of the message
is arbitrary and available to the resolver and the program issuing
the control message, by mutual agreement. At present there are no
control messages defined, but there may be some use for them in the
future.

----------------------------------------------------------------

The rest of the message is divided into two portions: RRs and
strings. All the RRs come first, and the strings are just tossed in
as convenient after all the RRs. Each string may be pointed to by one
or more RRs; how much sharing of identical strings goes on is left up
to the resolver code, and no assumptions should be made about it.

The strings have to be either 8 bit or 9 bit bytes; these are equally
easy for assembly language and 9 is much easier for KCC, so we use 9
bit characters. Kind of odd, I know, so what?

The first three words of each RR are the same; subsequent words are
dependent on the class and type of RR in question. It is assumed that
the client code knows how to decipher RRs it is interested in.
[A listing of the expansions of all known types should be included in
this document; right now the reader is referred to the RDEXPN.C
module, which does the actual expansion from the network format.]

word 0: rr_length
        The number of items in the rdata portion of the RR can be
        calculated by subtracting the length of the fixed header (3)
        from this value.

word 1: type ,, class

word 2: ttl

word 3+: rdata, type and class dependent.

----------------------------------------------------------------

I think that's it.
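
For readers who want to see the RR layout in action, here is a small
C sketch that steps through RRs using the fixed three-word header. It
is only an illustration under stated assumptions: the names word36,
rr_view and rr_decode are made up for the example, each rdata item is
assumed to occupy one word, and the RRs are assumed to sit
contiguously in one flat array (a multi-page message would also have
to hop over the four header words at the top of each page, which this
sketch does not do).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 36-bit word carried in the low bits of a 64-bit integer
 * (an assumption so the sketch runs on a modern host). */
typedef uint64_t word36;

#define HALF_MASK 0777777ULL   /* 18 bits */
#define RR_FIXED  3u           /* rr_length, type ,, class, ttl */

/* Decoded view of one RR; rdata points into the caller's buffer. */
struct rr_view {
    uint32_t      type;        /* LH of RR word 1 */
    uint32_t      rr_class;    /* RH of RR word 1 */
    word36        ttl;         /* RR word 2 */
    const word36 *rdata;       /* RR word 3 onward */
    size_t        rdata_len;   /* rr_length minus the fixed header */
};

/* Decode the RR starting at words[pos]. Returns the index of the word
 * just past this RR, or 0 if the RR does not fit in the buffer or its
 * rr_length is smaller than the fixed header. */
static size_t rr_decode(const word36 *words, size_t nwords, size_t pos,
                        struct rr_view *out)
{
    word36 len;

    if (pos >= nwords || nwords - pos < RR_FIXED)
        return 0;
    len = words[pos];
    if (len < RR_FIXED || len > nwords - pos)
        return 0;

    out->type      = (uint32_t)((words[pos + 1] >> 18) & HALF_MASK);
    out->rr_class  = (uint32_t)(words[pos + 1] & HALF_MASK);
    out->ttl       = words[pos + 2];
    out->rdata     = &words[pos + RR_FIXED];
    out->rdata_len = (size_t)len - RR_FIXED;
    return pos + (size_t)len;
}
```

A client would call rr_decode once per answering RR (the count is in
word 7 of the message), feeding each return value back in as the next
pos and stopping early if it ever returns 0. What the rdata words
mean still depends on the type and class, as noted above.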