
Appendix 629

1. Data-types

1.1 We first give a general definition of data-types (1.2), and then two shorter notations, which are the ones commonly used: i-units (1.3) and data-type-names (1.4).

1.2 data-type := (

referent: entity;

referent-expression;

* component-list;

component: data-type;

carrier: i-unit;

format: (component: memory-expression)-list;

information-content: [i] )

A data-type specifies the encoding of a meaning into an information medium. The meaning of the data-type (that which it designates or refers to) is called its referent (or value). The referent may be any entity, ranging from highly abstract (the uninterpreted bit) to highly concrete (the payroll account for a specific type of employee). The encoding of this referent either is directly understood (as when a bit encodes a bit) or must be given by the referent expression in terms of the component data-types.

EXAMPLE binary-floating-point-number := data-type(

referent: number;

component-list: mantissa, exponent;

referent-expression: mantissa × 2^exponent)

COMMENT Note that in the referent expression the component data-types are taken to designate their values, i.e., the mantissa is a signed fraction and the exponent is an integer. This avoids a clumsier notation in which one would write:

referent(mantissa) × 2^referent(exponent).
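The referent expression can be illustrated with a minimal Python sketch. The class name and the use of exact rationals are assumptions made for the illustration, not part of the notation; the only thing taken from the example is that the value is the mantissa (a signed fraction) times 2 raised to the exponent (an integer).

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class BinaryFloatingPointNumber:
    """Hypothetical model of the example data-type (names are assumptions)."""

    mantissa: Fraction  # component: a signed fraction
    exponent: int       # component: an integer

    def referent(self) -> Fraction:
        # referent-expression: mantissa x 2^exponent, evaluated exactly
        return self.mantissa * Fraction(2) ** self.exponent


three_quarters_times_eight = BinaryFloatingPointNumber(Fraction(3, 4), 3)
minus_half_times_half = BinaryFloatingPointNumber(Fraction(-1, 2), -1)
```

Here the components stand directly for their values, as in the COMMENT above: the method reads `self.mantissa`, not a separate `referent(mantissa)`.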

Associated with every data-type is an i-unit, called its carrier, into which all its component data-types can be mapped. The carrier is used in storing the data-type in memories and in transmitting it over links. It must be extensive enough to hold all the component data-types, but it may be larger (having error-checking and error-correcting bits, or even unused bits). It need not hold disjointly all the carriers of the component data-types, since packing may occur. However, the component data-types must all have their relative structures preserved (or they cannot be processed). The mapping of the component data-types into the carrier is called the format. It is given as a list that associates with each component a memory expression involving the carrier (see ISP 2 for the definition of memory-expression).

EXAMPLE floating-point-number := data-type (

component-list: mantissa, exponent;

mantissa : 23 b; exponent : 9 b;

carrier: word, 32 b/w;

format: (mantissa: word<0:22>, exponent: word<23:31>))
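The format in this example can be sketched as a pair of pack and unpack routines. This is only an illustration under stated assumptions: bit 0 is taken to be the leftmost (most significant) bit of the 32-bit word, the components are handled as raw unsigned bit patterns (the example does not say how the sign is encoded), and the names `FORMAT`, `field`, `pack`, and `unpack` are hypothetical.

```python
WORD_BITS = 32  # carrier: word, 32 b/w

# format: component -> (first bit, last bit) of the carrier word.
# Assumption: bit 0 is the most significant bit of the word.
FORMAT = {"mantissa": (0, 22), "exponent": (23, 31)}


def field(word, first, last, width=WORD_BITS):
    """Extract word<first:last> as an unsigned integer."""
    length = last - first + 1
    shift = width - 1 - last  # distance of the field from the low-order end
    return (word >> shift) & ((1 << length) - 1)


def pack(components, fmt=FORMAT, width=WORD_BITS):
    """Map each component's bit pattern into its place in the carrier."""
    word = 0
    for name, (first, last) in fmt.items():
        length = last - first + 1
        value = components[name]
        if not 0 <= value < (1 << length):
            raise ValueError(f"{name} does not fit in {length} bits")
        word |= value << (width - 1 - last)
    return word


def unpack(word, fmt=FORMAT, width=WORD_BITS):
    """Recover every component's bit pattern from the carrier."""
    return {name: field(word, first, last, width)
            for name, (first, last) in fmt.items()}


sample = {"mantissa": 5, "exponent": 3}
packed = pack(sample)
```

The two fields are 23 and 9 bits long, so together they fill the 32-bit carrier exactly, and `unpack(pack(x))` returns `x` for any components that fit their fields.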

The five parameters (referent, referent-expression, component-list, carrier, and format) determine a data-type. The information content is simply a useful redundant parameter, which gives the amount of variety of the data-type. An upper bound, of course, is the amount of information in the carrier. A better estimate is the sum of the contents of the component data-types. A true value must take into account the dependencies between components. The efficiency of encoding (under the constraint that the encoding must be into the carrier and that all possible values must be represented, no matter how low their probability of occurrence) is the ratio of the information content to the carrier content.
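The three estimates of information content can be compared numerically. The sketch below assumes that information content is measured in bits as the base-2 logarithm of the number of distinct values, all counted as equally possible; the calendar-date data-type (a month and a day packed into a 9-bit carrier) and the function names are hypothetical and chosen only for illustration.

```python
import math


def information_bits(n_values):
    """Information content, in bits, of a data-type with n_values values."""
    return math.log2(n_values)


def efficiency(information_content, carrier_bits):
    """Ratio of information content to carrier content."""
    return information_content / carrier_bits


# Hypothetical data-type: a date with components month (12 values)
# and day (31 values), stored in a 9-bit carrier (4 b + 5 b).
carrier_bits = 9                                             # upper bound
sum_of_components = information_bits(12) + information_bits(31)
true_content = information_bits(366)   # only 366 month/day pairs exist

date_efficiency = efficiency(true_content, carrier_bits)

# A single decimal digit held in a 4-bit carrier.
digit_efficiency = efficiency(information_bits(10), 4)
```

The true content is smaller than the sum over the components because the day depends on the month (there is no 31st of February), and both are below the carrier's 9 bits.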

1.3 data-type := i-unit
The simplest data-types are i-units. An i-unit as a data-type implicitly determines the five defining parameters given in ISP 1.2. The referent is the uninterpreted i-unit itself (i.e., a word is to be handled only as an uninterpreted unit of information). There is no need for a referent expression. The carrier is the i-unit itself, if it is an i-unit capable of independent storage and transmission in the system. If not, then the carrier is the smallest such i-unit that contains the given i-unit. The component data-types are the first sublevel of structures of the i-unit. There are no components if the i-unit is a base-unit (bit or undecomposable character). If the i-unit is the carrier, no format is needed. If a larger carrier is required, then a mapping is usually implicit (e.g., 1 bit in a word goes into the low-order position; 1 word in a block goes into the first word, etc.). If not, a format must then be given in the regular way.
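The rule for the implicit carrier can be sketched as follows. The table of independently storable i-units (an 8-bit byte and a 32-bit word) is a hypothetical system invented for the example, as are the names `STORABLE_UNITS` and `implicit_carrier`; sizes are compared in bits.

```python
# Hypothetical system: the i-units that can be stored and transmitted
# independently, with their sizes in bits.
STORABLE_UNITS = {"byte": 8, "word": 32}


def implicit_carrier(size_bits, storable=STORABLE_UNITS):
    """Return (name, bits) of the smallest storable i-unit that holds size_bits."""
    candidates = [(bits, name) for name, bits in storable.items()
                  if bits >= size_bits]
    if not candidates:
        raise ValueError("no storable i-unit is large enough")
    bits, name = min(candidates)  # smallest unit that still contains the i-unit
    return name, bits
```

With this table, a single bit is carried by a byte, an 8-bit i-unit is its own carrier (so no format is needed), and a 9-bit i-unit needs a word.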

1.4 data-type := data-type-name

data-type-name := i-unit-name | simple-name |

component-name . length-type | precision . data-type-name | component . component . . .

length-type := array / a | string / st | vector / v

precision := + integer | multiple / m | quadruple / q | triple / t | double / d | *single / s | half / h | fractional / fr

A naming scheme is provided for data-types, which can be used as a basis for abbreviations. Some data-types have arbitrary simple names (e.g., character, floating-point number); others are named by their value (e.g., integer). Data-types that are iterations of a basic component can be named by the component suffixed by a length-type. The length-type can be array/a, implying a multidimensional array of fixed but unspecified dimensions; string/st, implying a single sequence of variable length (on each occur-
