is physically maintained. The user can choose between virtual and
derived data as a result of considering trade-offs based on: estimated
cost of calculation; frequency of update; estimated cost of storage; and
frequency of access. For example, suppose a file contains a list of
budgets for various projects in a department. The departmental budget
can be calculated as a function of the individual project budgets. This
information might be defined as derived data since it is expected to be
updated infrequently (e.g., once a year), while it is expected to be
accessed relatively often.
Options will be provided which give the user control with regard to when
the calculation of derived data is to be done. These options will be
similar to those provided for control of data validity operations. The
data validation and derived data concepts are similar in that some
operation must be performed on related data. In the case of data
validation, the information derived is the condition of data.
3.9 Internal Representation
To this point, we have discussed only the high level, logical, aspects
of data. Since data, at any given time, must reside on some physical
device a representation of the data must be chosen. In some cases it is
appropriate to leave this choice to the datacomputer system. For
example, the representation of information which is used in the process
of transmitting other data, but which itself resides solely at the
datacomputer may not be of any concern to the user.
However, it is important that the user be capable of controlling the
choice of representation. In any application which requires mostly
transmission of data rather than interpretation of the data by the
datacomputer, the data should be maintained in a form consistent with
the system which communicates with the datacomputer. With respect to
basic types of data, datalanguage will provide most representations
commonly used in systems with which it interacts. For some types (e.g.,
fixed point) this will be accomplished by providing for parametric
(e.g., sign convention, size) description of the representation. In
other cases (e.g., floating point) specific representations will be
offered (e.g., system 360 short floating point, system 360 long floating
point, pdp-10 floating point, etc.).
Another aspect of the internal representation problem regards aggregate
structures. The method chosen to represent aggregate structures may
largely affect the cost of manipulating the data. The user must have
control over this representation since only he has any idea of how the
data is to be used. Datalanguage will provide a variety of
representational options which will allow for efficient implementation
of data structures. This includes the availability of auxiliary
structures, automatically maintained by the data computer system. These
structures can be used to effect efficient retrieval of subsets of data
collections based on the contents of the members (i.e. the common
concept of indices), efficient maintenance of orderings on a collection
of data, maintenance of redundant information for the purpose of data
integrity, and efficient handling of shared data whose behavioral
characteristics are dependent on the path of access. It should be noted
here that, the datalanguage design effort, will attempt to provide
methods whereby the data user can describe the expected use of his data,
so that details of internal representation can be left to the
3.10 Data Attributes and Data Classes
The type of an item determines the operations which are valid on that
item and what they mean. _Data_attributes_ are refinements on the type
of data. The data attributes affect the meaning of operations. For
example, we would like to provide for the option of defining fixed point
items to be scaled. The scale factor, in this case, would be an
attribute of fixed point data. It effects the meaning of operations on
that data. The attribute concept is useful in that it allows information
concerning the manipulation of an item to be associated with the item
rather than with the invocation of all operations on that item.
The attribute concept can be applied to aggregate as well as basic data.
For example, one attribute of a list could define where a new member is
to be inserted. Options might be: insert at the beginning of the list;
insert at the end of the list; or insert in some order based on the
contents of the member. Adding a new member to a list with one of the
above attributes could be done by issuing a simple insert request
without having to specify where the new member is to be inserted.
The _data_class_ concept is actually the inverse of the data attribute
concept. A data class is a collection of data types. The data class
concept allows for definition of operations, independent of specific
type of an item. For example, by defining the data class arithmetic to
be composed of fixed point and floating point types of data, the
comparison operators (_equal_, _less_than_, etc.) can be defined to
operate on arithmetic data, independent of whether it is fixed or
floating point. Also the concept of data aggregate can be seen as a
class encompassing directories, lists, etc. As there are operations
defined on arithmetic data, there are also operations defined on
The inverse relationship between data classes and data attributes is
very strong. For example, the concept of list can be seen as a data
class, encompassing all types of lists (e.g., lists of integers, lists
of character strings, etc.), independent of the types of their members.
The type of a list's members (e.g., integer, character string, etc.) are
then seen as attributes. Data attributes and classes are also relative
concepts. While the concept of list can be viewed as a data class, it
can also be seen as an attribute, relative to the concept of data
3.11 Data Description
A _data_description_ is a statement of the properties (see discussion of
attributes) of a data item. Examples of properties which are recorded
in a description are: the name of an item; its size; its data type; its
internal representation; privacy information; etc.
Datalanguage will contain mechanisms for specifying data descriptions.
These descriptions will be processed by the data computer, and used
whenever the data item is referenced. The user will be able to
physically create data only by first specifying their descriptions. The
properties of a description can be divided into groups according to
their function. Some have the function of specifying details of
representation, which will not be of interest to most users, while
others, such as the name are of almost universal interest.
All user data is a part of some larger (user or system) data structure.
The structures containing data establish a path of access to the data.
In the process of following this path the datacomputer system must
accrue a complete description of the data item. For example, the
description of a data item of a directory may be found associated with
that node of the directory. Members of a list or array are described as
part of the description of the list or array. We must dispose of two
seeming exceptions. First, while aspects of data may (on user request)
be left to the system, those aspects are still described, they are
described by the system. As discussed above, some data will be, to some
degree, self describing (e.g. members of mixed lists). However, it is
fully described in some encompassing structure, in that a method of
determining the full description is described.
It is worth noting here that the sooner a complete description is found
in the path of access, the more effective the datacomputer is likely to
be in processing requests which manipulate a data item. However, the
ability to have data whose complete description does not exist at high
levels of the access path provides greater flexibility in the definition
of data structures.
3.12 Data Reference
Data cannot be manipulated unless it can be referenced. In the same way
that data cannot exist without its being described, it cannot exist
unless there is a path of access to the data. The method of data
reference is to define the path of access to the data. As mentioned
above, there is a method of referencing any item relative to the data
aggregate which contains it. Nodes of directories and components of
structs are referenced via the name associated with the node or
component. Members of arrays are referenced via the index associated
with the member. Members of lists are referenced via some method of
specifying the position of the member or by uniquely identifying the
member by content. To reference any arbitrary data item the path of
access must be fully defined by either explicit or implicit definition
of each link in the chain. In the case of virtual data there is an
extra implicit link in the chain, that being the method employed to
obtain the data from other data items. It should be noted also that if
pointers are provided (see discussion on general relational
capabilities) they can also serve as a link in the chain of access to an
The design of datalanguage will ease the problem (and reduce the cost)
of referencing data items by providing methods whereby part of the
access path can be implicitly defined. For example, datalanguage will
provide a concept of "context". During the course of interacting with
the datacomputer, levels of context can be set up so that data can be
referenced directly, in context. For example, on initiating a session
the user may (in fact will probably be required to) define a directory
which will be the context of that session. All items subordinate to
this directory can be referenced directly in this context. Another
feature will be partial qualification. Each level of struct need not be
mentioned in order to reference an item embedded in a deep nest of
structs. Only those intermediate levels which are sufficient to
uniquely identify the item need be specified.
In this section we discuss the builtin functions of datalanguage which
are of central importance in manipulating data. Functions which operate
on items, functions which operate on aggregates, primitive functions and
high-level functions are discussed.
Of the primitives which operate on items, those of most interest are
assignment, comparisons, logicals, arithmetics and conversion functions.
Primitive assignment transfers a value from one item to another; these
items must be of the same type. When they are of different types,
either conversion must be performed, or some non-primitive form of
assignment is involved.
The comparison operators accept a pair of items of the same type, and
return a boolean object which indicates whether or not a given condition
obtains. The type determines how many different conditions can be
compared for. A pair of numeric items can be compared to see which is
greater, while a pair of uninterpreted items can be compared only for
equality. In general, a concept of "greater than" is builtin for a
datatype only if it is a very widely applied concept. The comparison
operators are used in the construction of inclusion conditions when
defining subsets of aggregate data.
The result of a comparison operation is a boolean item: one whose value
is either TRUE or FALSE. Logical primitives are provided and
generalized boolean functions can be constructed from them. With
logical and comparison operators, complex conditions for inclusion of
objects in sets can be specified.
Arithmetic operators will be available for the manipulation of numeric
data. Here, we are not interested in generalized computation, but in
applications of arithmetic in data selection, space allocation,
subscript calculation, iteration control, etc.
Conversion is an important part of generalized data translation, and we
are interested in providing a substantial builtin conversion facility.
In particular, we will want to provide an efficient system routine for
each "standard" or widely-used conversion function. Of particular
importance are conversions to and from character string data; in
character string representation of, for example, numeric items, there
are many possible formats corresponding to a single data type.
Conversion between character sets and dealing with padding and
truncation are viewed as conversion problems.
There are two principal classes of primitive operators defined on
aggregates: those related to data reference (see previous section) and
those which add and delete components. Changing an existing component
is accomplished through assignment, and is an operation on the
component, not the aggregate.
Addition and deletion of components is defined only for aggregates which
are not inherently static in composition. Thus one can add a component
to a LIST, but not to an ARRAY. To specify deletion it is necessary to
specify which component is to be deleted, and from which aggregate (in
the case that it is shared). Addition requires specification of new
component, aggregate, and sometimes auxiliary information. For example,
some aggregate types would permit addition of new components anywhere in
the structure; in these a position must be indicated, relative to any
Often it is desirable to operate on some of the members of a list, or to
treat a group of members as a list in its own right. For example, it
might be common to transmit to a remote program for analysis, the
medical history of patients developing heart disease before the age of
30. These may be just a few of the members of a large list of patients.
In this case, the operation to be performed is transmission to the
remote system; this operation is performed on several members of the
list of patients. The ones to be transmitted are thought of as a _set_;
the set is specified as containing all the members of a given list
satisfying two conditions: (1) age less than 30, and (2) has heart
Sets can be defined explicitly, or implicitly simply with appropriate
reference mechanisms. _Definition_ of a set is distinct from
_identification_of_membership_, which is distinct from
_access_to_membership_. Definition involves specifying the candidates
for set membership and specifying a rule by which members of the set can
be distinguished from non-members; for example, an inclusion condition
such as "under 30 with heart disease". Identification involves
effective application of the rule to all candidates for membership.
When the membership has been identified, it can be counted, but the data
itself has not necessarily been accessed. When a member is accessed, its
contents can be operated on.
Primitives to accomplish each of these operations on a set will be
provided; however, it will ordinarily be optimal for the datacomputer to
determine when each step should be performed. To enable users to
operate at a level at which the datacomputer can optimize effectively,
higher-level operators on sets will be provided. Some of these are
logical operators, such as union and intersection. These input and
output sets. Also available is an operator which complements a set
(since the definition establishes all possible candidates, a set always
has a well-defined complement).
These higher level operators can be applied to any defined set; the set
members need not be identified or accessed. The system will perform
such operations without actually accessing members if it can.
Some of the other operators on sets are counting membership,
partitioning a set into a set of sets, uniting a set of sets into a set.
A set can be used to reference another set, providing there is a well-
defined way to identify members of the second set given the first set.
For example, a set C may contain all the children doing poorly in
school. A set F may be defined, where the members of F are the records
about families having a child in set C.
Some other useful operations on sets are: adding all the members of a
set to an aggregate, deleting all the members of a set (frequently such
a massive change can be performed far more efficiently than the same set
of changes individually requested), changing all the members of a set in
a given way.
A set can be made into a list, by actually accessing each member and
physically collecting them.
Some of the operations on lists are: concatenation of lists into larger
lists, division of a list into smaller lists, sorting a list, merging a
pair of ordered lists (preserving order).
This is not intended to be a full enumeration of high-level operations,
but to be suggestive. We are planning to build in high-level functions
for operations which are used very commonly, and can be implemented
within the system significantly better than they can be implemented by
users in the language. For most of the functions mentioned here,
considerable knowledge is accumulated on good implementations. In
particular, the techniques used for inverted file access provide many
set operations to be performed without actual access to the data.
The control features of datalanguage are to the basic operations as data
aggregates are to the basic data items. Control features are used to
create complex requests out of the basic requests provided by
Conditional requests allow the user to alter the normal request flow by
specifying that certain requests are to be executed under certain
conditions. In general datalanguage will provide the ability to chose
at most one of a number of requests to be made based on some set of
conditions or the value of some item. In its simplest form the
conditional allows for optional execution of a given request.
Iterative requests cause a request (called the body) to be executed a
fixed or variable number of times or until a given condition is met.
Datalanguage will provide iterative requests that will allow for similar
manipulations to be performed on all members of some aggregate structure
as well as the standard type of iterative request based on counters. By
providing a capability of directly expressing manipulations on
aggregates which require processing all of the items subordinate to the
aggregate, the datacomputer can be more efficient in processing user
requests. For example, a user defined conversion process which operates
on character strings, can be implemented far more efficiently if the
datacomputer is explicitly informed that the process requires sequential
processing of the characters. Datalanguage will also provide for
parallel iteration. For example, the user will be able to specify
operations which require sequencing through two or more lists in
parallel. This would be done if the contents of one file were to be
updated based on a file of correction information.
Compound requests are collections of requests which act as one. They
are primarily provided to allow for the conditional performance of or
iteration on more than one statement. Compound requests also provide
request reference points which can be used to control the request
processing flow. That is, compound requests can be "named". The
datalanguage user will be able to specify control information which will
conditionally cause a compound request to be exited. By providing
naming, the user may cause any number of previously entered compound
requests to be exited.
We do not intend to provide the traditional _goto_ capability. By not
including a goto request, the chances for efficient operation (via
optimization) of the datacomputer are increased. We also hope, in this
way, to force the datalanguage user to specify his data manipulations in
a clear sty1e.
Two forms of the compound request will be provided, ordered and
unordered. In the unordered case the user is informing the datacomputer
that the requests can be performed in any order. This should allow the
datacomputer to perform more efficiently and might even allow for
During a session with the datacomputer it is likely that a user will
find a need for temporary data. That is, data which is used to
remember, for a short term, information which is needed for the
processing of requests. This short term might be a session or a small
part of a session. Datalanguage will provide a temporary data facility.
Temporary data will be easy to create, use and dispose of. This will be
accomplished by allowing the system to (optionally) make many decisions
regarding the data. For example the representation of a temporary
integer item will often be of no concern to the user. Some features
which are provided for permanent data will be deemed irrelevant with
regard to temporary data.
Temporary data will be associated with a collection of requests in what
will be called a block. A block will be no different than a compound
request with the exception that data is defined with the requests which
compose it and is automatically created on entrance to the block and
destroyed on exiting the block.
The goals of datalanguage are to provide facilities of data structure at
two levels. At one level the user may take advantage of high level data
capabilities which will do much of his data management work
automatically and which allows for the data computer to operate more
effectively in some cases since it has been given control of the data.
At another level, however, features are provided which allow the user to
describe his application in terms of primitive concepts. In this way
the datacomputer user may compose a large variety of data constructs and
has great flexibility with respect to the manipulations he can perform
on his data. Also by interacting with the datacomputer at the primitive
level, the user can exercise a good deal of control over the methods
employed by the datacomputer which may result in more effective usage of
resources for non-standard applications. Datalanguage will provide
features which allow the user to create an environment whereby the
datacomputer system appears to provide features especially tailored to
The control features discussed above allow the user to extend the
operations available on data by appropriate composition of the
operations. Datalanguage will provide a method of defining a composite
request to be a new request (called a _function_). In this way a new
operation on specific data can be defined once and then used repeatedly.
In order that the user may define general operations, datalanguage will
provide functions which can be parameterized. That is, functions will
not only be able to operate on specific data but may be defined to work
on any data of a specific type. This capability will not be limited to
basic data types (e.g. integers) or even specific aggregate types (e.g.
array of integers) but will also include the ability to define functions
which operate on classes of data. For example, functions can be defined
which operate on lists independent of the type of the list members.
Also provided, will be the ability to expand and modify existing
functions as well as creating new functions. This includes expanding
the types of data for which a function is defined or modifying the
behavior of a function for certain types of data.
As with operations, the data aggregates discussed above allow the user
to extend the primitive data types by appropriate composition. For
example, a two dimensional array of integers can be created by creating
an array of arrays of integers. The situation for data types is
analogous to that of operations. Datalanguage will provide the ability
to define a composition of data to be a new data type. Also the
capability of defining general data structures will be provided by
essentially parameterizing the new data definition. This would allow
the general concept of two dimensional array to be defined as an array
of arrays. Once defined, one could create two dimensional arrays of
integers, two dimensional arrays of booleans, etc. As with functions
there is also a need to expand or modify existing data types. One might
want to expand the attributes which apply to a given data type, in that
he might want to add new attributes, or add new choices for the existing
The control features can be extended also. Special control features
might be needed to process a data structure in a special way or to
process a user defined data structure. For example, if a tree type data
structure has been defined in terms of lists of lists, the user might
like to define a control function which causes a specified operation to
be performed on each item of a specified tree. As with data types and
functions, there is a need to be able to modify and extend existing
control features as well as the ability to create new ones.
Datalanguage will provide the ability to treat data descriptions and
operations in much the same way that data is treated. One can describe
and manipulate descriptions and operations in the same way that he can
describe and manipulate data. It is impossible to talk about data types
without consideration of operations and equally as impossible to talk
about operations without an understanding of the data types they operate
on. In order for the user to be able to effect the behavior of the
datacomputer system, the design of datalanguage will include a
definition of the operational cycle of the datacomputer. Precise
definitions of all aspects of data (data attributes, data classes,
relationship of aggregates to their subordinate items, etc.) in terms of
their interaction with datalanguage operations will be made. In this
way the datacomputer can offer tools which will give the datacomputer
user the ability to be an active participant in the design of the
datalanguage which he uses.
4. A Model for Datalanguage Semantics
For the purpose of defining and experimenting with language semantics
and with language processing techniques, we are developing a model
The principal elements of the model are the following:
(1) A set of primitive functions
(2) An environment in which data objects can be created, manipulated and
deleted, using the primitives
(3) A structure for the representation of collections of data values,
their descriptions, their relationships, and their names.
(4) An interpreter which executes the primitives
(5) A compiler which inputs requests in a very simple language, performs
binding and macro expansion operations, and generates calls to the
internal semantic primitives.
If our modeling efforts are successful, the model will evolve until it
accepts a language like the datalanguage whose properties we have
described in sections 2 and 3 of this paper. Then the process of
writing the final specification will simply require reconciliation of
details not modeled with structure that has been modeled. One rather
large detail which we may never handle within the model is syntax; in
this case reconciliation will be more involved; however, we firmly
believe that the semantic structure should determine the syntax rather
than the opposite, so we will be in the proper position to handle the
By constructing a model for each of the elements listed above, we are
"implementing" the language as we design it, in a very loose sense. In
effect, we work in a laboratory, rather than working strictly on paper.
Since we aren't concerned with the performance or usability of the
datacomputer we are building in the laboratory, we are able to build
without becoming involved with some of the most time-consuming concerns
of an implementor. However, because we are building and tinkering,
rather than simply working on paper, we do get some of the advantages
that normally come with the experience of implementing one's ideas.
The model datacomputer is a program, developed in ECL, using the EL1
language. Presently we are interested in the process of developing the
program, not running it. Our primary requirement is to have, in advance
of the existence of datalanguage, a well-defined and flexible notation
in which to specify data structures, function definitions and examples.
EL1 is convenient for this. Having a program which actually works and
acts like a simple datacomputer is really a by-product of specifying
semantics in a programming language. It is not necessary for the
program to work, but it does provide some nice features. It enhances the
"laboratory" effect, by doing such things as automatically compiling
strings of primitives, displaying the state of the environment in
complicated examples, automatically discovering inconsistencies (in the
form of bugs), and so on.
There are two major reasons that EL1 is a convenient notation for
specifying datalanguage semantics. One is that the languages have a
certain amount in common, in both concepts and in goals in data
description. (In part, this is because EL1 itself has been a good
source of ideas in attacking the datalanguage problem). Both languages
emphasize operations on data, independent of underlying representation.
A second reason that EL1 is a convenient way to specify datalanguage, is
that EL1 is extensible; in fact, many primitive functions could be
embedded directly into EL1 by using the extension facilities. At times,
we have chosen to embed less than we could, to expose problems of
interest to us.
So far, the model has been useful primarily in exposing design issues
and relationships between design decisions. Also, because it includes
so many of the elements of the full system (compiler, interpreter,
environment, etc.), it encourages a fairly complete analysis of any
In presenting the model in this section, we have chosen to emphasize
ideas and examples, rather than formal definitions in EL1. This is
because the ideas are more permanent and relevant at this point (the
formalisms are changing rather frequently) and because we imagine people
reading the formal definitions only to get at the ideas. The formal
definitions may be interesting in themselves when the language is
complete; at this point they are probably of interest only to us.
The section is organized into a large number of sub-sections. The first
few are concerned with the basic concepts of data objects, descriptions,
and relationships between objects. We then discuss primitive semantic
functions and present informal definitions and examples in sections 4.7
and 4.8. Section 4.9 is a brief discussion of compilation,
interpretation and the execution cycle. Section 4.10 provides a fairly
elaborate example of how primitive functions can be combined to do
something of interest: a selective retrieval by content. The last two
sections wrap up with discussions of high-level functions and some
An _object_ has a name, a description, and a value. It can be related to
The _name_ is a symbol, which can be used to access the object from
The _description_ is a specification of properties of the object, many
of which relate to the meaning or the representation of the value.
The _value_ is the information of ultimate interest in the object.
The relationships between objects are hierarchical. Each object can be
related directly to at most four other objects, designated as its
_parent_, its _child_, its _left_sibling_, and its _right_sibling_.
This specific concept of relationship is all that has been built in to
the model to date. One of our primary objectives in the future is to
experiment with more general relationships among objects.
A description has the components _name_, _type_ and _type-
dependent_parameters_. It can be related hierarchically to other
descriptions, according to a scheme similar to the one described for
objects in 4.1.
The _name_ has a role in referencing, as in the case of objects.
_Type_ is an undefined, intuitive idea for which we expect to develop a
precise meaning within datalanguage(see section 3.10 for some of the
ideas about this). In terms of the present model, it simply means one
of the following: LIST, STRUCT, STRING, BOOL, DESC, DIR, FUNC, 0PD.
Each of these refers to a sort of value corresponding to common ideas in
programming (with the exception of OPD, which is explained in section
4.7), and on which certain operations are defined.
Examples of _type-dependent_parameters are the two items needed to
define a STRING: size option and size. A STRING is a sequence of
characters; the size of the STRING is the number of characters in it.
If a STRING has a fixed size, then size option is FIXED and size is the
number of characters it always contains. If a STRING has a varying
size, then size option is VARYING, and size is its maximum (clearly, it
might also have a minimum in a more refined scheme).
When the description of an object has a type of STRING, it is commonly
said that the object is a STRING.
The value is the data itself.
An object of type BOOL can have only either the value TRUE or the value
An object of type STRING has values such as 'ABC', 'JOHN', or 'BOSTON'.
Each value has a representation, in bits. Thus a BOOL is represented by
a single bit, which will be a 'one' to represent TRUE and a 'zero' to
4.4 Some examples
Here are some examples of structures involving objects, descriptions,
and values. In these explanations and drawings, the objective is to
convey some ideas about these primitive structures; considerable detail
is omitted in the drawings in the interest of clarity.
Figure 4-1 shows two objects. X is of type string and has value 'ABC'.
Y is of type bool and has value TRUE.
The idea of a dir is similar to the idea of a file directory in most
systems. A directory is a place where one can store named objects,
freely adding and deleting them. The entries in the directory are all
objects whose parent is that directory. Figure 4-3 shows a more rigidly
structured group of objects. Here we have R, a struct, and A and B, a
pair of strings. Note that the boxes labeled 'object' in figure 4-3
bear precisely the same relationships to one another as those labeled
'object' in 4-2. However, there are two conditions which hold for 4-3
but do not hold for 4-2: (1) the value of R contains the values of A and
B, and (2) the descriptions of R, A and B are all related.
Structs have the following properties: (1) name and description of each
component in the struct is established when the struct is created, and
(2) in a value of the struct, the order of occurrence of component
values is fixed.
We could imagine an 'object' box for each string in L. Each of these
boxes would point to its respective string and to the common description
of these strings. Instead, we think in terms of creating such boxes as
we need them.
4.5 Definitions of types
Following are some more precise definitions of types, in terms of the
present model. These serve the purpose of establishing more firmly the
semantics of our structure of objects, descriptions and values; however,
they should not be thought of as providing a definition for the
completed language specification.
An object of type STRING has a value which is a sequence of characters
An object of type BOOL has a value which is a truth value (TRUE or FALSE
-- figure 4-1).
An object of type DIR has subordinate objects, each having its own
description and value. Subordinate objects can be added and deleted at
will (figure 4-2).
An object of type STRUCT has subordinate objects, each of which has a
description which is subordinate to the STRUCT's description, and a
value contained in the STRUCT's value. The number, order and
description of components is fixed when the STRUCT is created (figure
An object of type LIST may be thought of as having imaginary subordinate
objects, whose existence is simulated by the use of appropriate
techniques in processing the LIST. Each of these has the same
description, which is subordinate to the description of the LIST. Each
has a distinct value, contained in the value of the LIST. In fact, only
the LIST object, the LIST and component descriptions, and the values
exist (figure 4-4).
An object of type DESC has a description as its value. This value is
the same sort of entity which serves as the description of other
An object of type FUNC has a function call as its value. We will be
able to say more about this after functions have been discussed.
An object of type OPD has an operation descriptor as its value. (see 4.7
4.6 Object environment
There are three categories of objects in the model datacomputer. These
are p/objects, t/objects, and i/objects.
P/objects are permanent objects created explicitly with language
functions. They correspond to the idea of stored data in the real
datacomputer. There are three special objects. These are special only
in that they are created as part of initializing the environment, rather
than as the result of executing a language function. These are named
STAR, BLOCK and TOP/LEVEL. All three are of type DIR.
An object is a p/object if it is subordinate to STAR; it is a t/object
if it is subordinate to BLOCK. TOP/LEVEL is subordinate to BLOCK. (see
figures 4-5 and 4-6).
| _____________ |
| | STAR | |
| |_____________| |
| NAME | ____________
| _____________ | | ________ |
| | ____|_|______\| | DIR | |
| |_____________| | /| |________| |
| DESCRIPTION | | TYPE |
| _____________ | |____________|
| | | | DESCRIPTION
| |_________|___| |
| CHILD | |
STAR and p/objects
T/objects are temporary objects, also created explicitly with language
functions. However, these correspond to user-defined temporaries, both
local to requests and "top-level" (i.e. not local to any request, but
existing until deletion or logout.)
I/objects are internal, system-defined objects whose creation and
deletion is implicit in the execution of some language function.
I/objects are hung directly off of function calls (objects of type
FUNC), and are always local to the execution of such function calls.
They correspond to the notions of (1) literal, and (2) compiler- or
4.7 Primitive Language Functions
Here we discuss the primitive language functions presently implemented
in the model and likely to be of most interest. In this section, the
emphasis is on relating functions to one another. Section 4.8 contains
more detail and examples.
_Assign_ operates on a pair of objects, called the target and the
source. The value of the source is copied into the value of the target.
Figure 4-7 shows a pair of objects, X and Y, before and after execution
of an assignment having X as target and Y as source. Presently,
assignment is defined only for objects of type BOOL and objects of type
STRING. The objects involved must have identical descriptions.
A class of primitive functions for manipulating LISTs is defined. These
are called _listops_. All listops input a special object called an
_operation_descriptor_ or OPD.
To accomplish a complete operation on a LIST, a sequence of listops must
be executed. There are semantic restrictions on the composition of such
sequences, and it is best to think of the entire sequence as one large
operation. The state of such an operation is maintained in the OPD.
Refer back to figure 4-4. There is one box labeled "object" in this
picture; this box represents the list as a whole. To operate on any
given member we need an object box to represent that member. Figure 4-8
shows the structure with an additional object box; the new box
represents one member at any given moment. Its value is one of the
components of the LIST value; its description is subordinate to the LIST
description. In 4-8, the name of this object is M.
In 4-8 we have enough structure to provide a description and value for
M, and this is sufficient to permit the execution of operations on M as
an item. However, there is no direct link between object M and object
L. The structure is completed by the addition of an OPD, shown in
The OPD establishes the object relationship, and contains information
about the sequence of primitive listops in progress. When sufficient
information is maintained in the OPD, we have in 4-9 a structure which
is adequate for the maintenance of the integrity of the LIST and of the
global list operation. In addition to LIST and member pointers, the OPD
contains information indicating: (1) which suboperations are enabled for
the sequence, (2) the current suboperation, (3) the instance number of
the current LIST member, (4) an end-of-list indicator. The
suboperations are add/member, delete/member, change/member and
get/member. All apply to the current member. Only suboperations which
have been enabled at the beginning of a sequence may be executed during
that sequence; eventually, the advance knowledge of intentions that is
implied by this will provide important information for concurrency
control and optimization.
Presently, an OPD relates a single member object to a single LIST
object. This imposes an important restriction on the class of operation
sequences which can be expressed. Any LIST transformation requiring
simultaneous access to more than one member must be represented as more
than one sequence. (And we do not yet solve the problems implied in
concurrent execution of such sequences, even when both are controlled by
Any transformation of a LIST can still be achieved by storing
intermediate results in temporary objects; however, it is certainly more
desirable to incorporate the idea of multiple current members into the
semantics of the language, than it is to use such temporaries. An
important future extension of the listops will deal with this problem.
There are six listops: listop/begin, listop/end, which/member,
end/of/list, open/member and close/member.
Listop/begin and listop/end perform the obvious functions of beginning
and terminating a sequence of listops. Listop/begin inputs LIST and
member objects, an OPD, and a specification of suboperations to enable.
It initializes the OPD, including establishment of the links to LIST and
MEMBER objects. After the OPD-LIST-member relationship has been
established, it is only necessary to supply the OPD and auxiliary
parameters as input to a listop in the sequence. From the OPD everything
else can be derived.
Listop/end clears the OPD and frees any resources acquired by
Which/member establishes the current member for any suboperations. This
is either the first LIST member, the last LIST member, or the next LIST
member. This listop merely identifies which member is to be operated
on; it does not make the contents of the member accessible.
Open/member and close/member bracket a suboperation. The suboperation
is indicated as an argument to open/member. Open/member always
establishes a pointer from the member object to the member value;
close/member always clears this pointer. In addition, each of these
listops may take some action, depending on the suboperation.
The details of the action would be dependent on the representation of
the LIST in storage, the size of a LIST member, and choices made in
Between execution of the open/member and the close/member, the data is
accessible. It can always be read; in the case of the add/member and
change/member suboperations, it can also be written into.
End/of/list tests a flag in the OPD and returns an object of type BOOL.
The value of the object is the same as the value of the flag; it is TRUE
if a get/member, change/member or delete/member would be unsuccessful
due to a which/member having moved "beyond the end". T his listop is
provided so that it is possible to write procedures which terminate
conditionally when all members have been processed.
Get/struct/member provides the ability to handle STRUCTs. Given a
STRUCT object which points to the STRUCT value, it will establish a
pointer from a given member object to the member value. (The pointer it
establishes is represented by a dashed line in figure 4-10).