XX:DSYMS.DOC.3, 9-Sep-88 01:34:08, Edit by SRA Pseudo-language for constant and structure definitions. The pseudo-language described in this document is subject to change without notice. Go read the source code for the macro packages if you want to know exactly what's what at any given moment. There are some parameters in the code that need to be assembled/compiled into both assembly code and C code. Rather than use the kludge that JEEVES used (a PASCAL program that dumped out a MACRO source file), we have a common definition file that can be read by both languages. Due to the fact that all three common PDP-10 assemblers and KCC all have macro preprocessors, this is not that difficult. Some care and a restricted set of commands takes care of the rest. All pseudo-ops should be in uppercase, to humor C coding conventions. Any pseudo-op that takes arguments should have them surrounded by parenthesese. Numbers are assumed to be decimal. In contexts where it makes sense, a leading zero means octal, as in C. The character set for symbol names is as in C; the assembler macro packages convert "_" to "." per the usual PDP-10 convention for C code. There are two broad catagories of operators: general and special purpose. General are used to handle comments, simple constants, and structures. Special purpose are used to define tokens with associated text strings, and to define the layout of the RDATA portion of RRs. Symbol names may be arbitrarily long, but only the first six characters are significant. For structure fields only five characters are significant, and there had better not be any conflicts between structure field names (this is due to limitations of the MACSYM DEFSTR macros which are used to implement DSYMS structures in MACRO-10). Only one operator is allowed per line. There is no line continuation, but lines may be of arbitrary length. There is no ";" or other terminator character following an operator. First the general stuff. Comments are kind of gross. Syntax is: COMMENT /* */ Any kind of whitespace is permitted after the "/*", but only space and tab may occur between the "COMMENT" and the "/*". No "/" characters may occur in the comment text. Eg: COMMENT /* * This is a somewhat prettified comment. * Note the odd syntax. */ Numeric constants are defined with the CONST macro. In C these translate to blank enums (no tag name for the enum type, that is). Note that this means that you can't do hairy preprocessor operations with constants defined via DSYMS. Sorry. Limitations of the C preprocessor. Example: CONST(FOO,79) CONST(BAR,0377) Structures. Bracketed by BSTRUCT and ESTRUCT. BSTRUCT takes the name of the structure (for C typing stuff). Defining operators are DFIELD (defines a bitfield, takes field name and width), DFILL (fills out a null bitfield, takes width argument), and DWORD (defines a word aligned entry, args are name and length in words). All fields are unsigneds in C. Bitfields are defined for assembler using the MACSYM DEFSTR macro. Word aligned fields are just word offset symbols. Automatic filling at the end of bitfields is partially supported: it'll round to the next word if the next operator is a DWORD, but won't detect overflow in a DFIELD (DEFSTR probably will, though). DFILL does handle this correctly, including a width of zero (like C). The terms "word" and "halfword" are intentionally left fuzzy. A "word" is a unit of storage sufficient to represent at least 32 bits. A "halfword" is sufficient to represent at least 16 bits. The exact definition is left up to the specific implementation; the assumption is that these quantities will be aligned in whatever way makes sense for efficient use of the hardware. On the PDP-10 these are of course 36 and 18 bits respectively. If this code is ever ported to a machine that uses 32 and 16 bit words and halfwords, watch out for sign extension in the code! To maximize portability, DWORD and DHALF should be used whenever possible, rather than specifying an exact number of bits. BSTRUCT(name) begins a structure definition named "name". "name" is the structure tag name in C, is ignored in assembler. ESTRUCT(name,length) ends a structure named "name", and defines a constant "length" equal to the size of a (struct "name") in "words". This is used extensively by both sides of the code that implements the user interface protocol (USRxxx.C, GTDOM.MAC, whatever). DWORD(name) defines a normal field a "word" long. DWORDS(name, length) defines an array of "words" length long. DHALF(name) defines a "halfword" field (see above). DFILL(length) fills a bitfield (or halfword) in the same way that an unnamed bitfield does in C. Exact behavior is implementation dependent, is assumed to do the "obvious" thing for the hardware. DFILL(0) rounds to the next "word" boundary, as you'd expect. DFIELD(name,length) defines a bitfield "length" bits long. Example: BSTRUCT(foobar) DHALF(a) DFILL(1) DFIELD(b,3) DWORD(c) DHALF(d) DFILL(0) DWORDS(e,1) ESTRUCT Special purpose macros. These have multiple definitions; one set is in DSYMS1.* and defines the "normal" action (usually just generates a named symbol, like CONST(), and ignores the rest of the data), other definitions are in specific modules which need to get at other parts of the data. This technique should be familiar to anybody who has programmed extensively in MACRO-10, eg, see the way JSYS numbers and error values are laid out in MONSYM.MAC. BTUPLE(name) starts the definition of a table of TUPLEs. As currently used, "name" is the name of the struct tblook_table in which the TUPLE strings are stored. ETUPLE ends a TUPLE table. TUPLE(name,value,string) defines a token with an associated numeric value and an associated text string. This is used for a number of things such as tokens for the config file and zone file parsers. ATUPLE(name,string) defines an alias to an existing token. Ie, it associates an additional text string with a token that has already been defined with TUPLE(). This is primarily used to define nicknames for the keywords used in RESOLV.CONFIG. BRDATA(qclass,type) begins the definition of the RDATA portion of RRs matching "qclass" and "type". "qclass" may be QT_ANY ("*"), type must be a real type, not a special qtype. See RFCDEF.D for more details. This is used to keep all the definitions of offsets and RDATA formats in a single place. ERDATA(name) ends an RDATA definition and defines a constant with value equal to the number of fields in this RDATA definition. RDATA(name,format) define an RDATA field with an associated format character (specified in single quotes, eg 'd'). See RRFMT.C for details on what this does. The "normal" macro definitions in DSYMS1 treat an RDATA operation like an entry in an enum list, so with the "normal" macro definitions this defines an offset into the rdata arrays used internal by the CHIVES code and in the messages sent to GTDOM% and other clients. ARDATA(name,nickname) defines a second token "nickname" with value equal to a token "name" that was declared with an RDATA() line. This is ugly but necessary; the intent is to keep the nomaclature consistant within the C code while providing symbol names that are unique within the six-character limit imposed by PDP-10 assemblers. The macro package for C completely ignores ARDATA() lines. That's all for now. More may be added as needed for new features or ports to other systems.