[SCL] Representing constraints that go beyond EBNF

John F. Sowa sowa at bestweb.net
Sat Nov 15 12:14:16 CST 2003


Murray,

EBNF has been designed to represent ordered context-free languages,
which include many important computer languages.  But we also have
to represent languages that may have unordered constituents (graphs)
and languages with context-sensitive constraints (first-order logic
and all Algol-like languages).

MA> ... Without boring you all with the details, a Topic Map is
 > not a linear sequence; it's very much a complex, unordered graph.
> We were still able to use EBNF to specify the XML syntax, and I'm
> sure that the graph theoretic model behind that (which is called
> the Reference Model, a current ISO project) is also capable of
> being expressed in EBNF. So I'm really not certain where you'd
> have to rely on "tricks" to get EBNF to do what you want with the
> SCL abstract syntax, or XCL/CLML for that matter.

In mathematics, a common representation for a graph consists of
a set of nodes and a set of arcs.  Since sets are by definition
unordered, no further comment is needed.

If EBNF is used to specify graphs, some additional statement is
necessary to ensure that some (perhas not all) of the ordering
imposed by EBNF is ignored.  One appraoch is to say "The implicit
ordering in the following sequence has no semantic relevance"
or "All permutations of the following sequence are considered
equivalent."  I noticed that the ISO EBNF document does not have
such a provision, and we may need to add one.

For common programming languages, the most important context
sensitive constraint is the association between the declaration
of a name (of a variable, constant, relation, or other entity)
and the use of that name in some subsequent statement.  That
constraint is usually enforced by means of symbol tables.

For first-order logic (in KIF, CG, or predicate calculus notation),
the quantifiers have an effect that is similar to the declarations
in a programming language.  And the rules for scope of quantifiers
happen to create a nested structure that is isomorphic to the
rules for scope of variables in Algol-like languages.  Similar
problems arise in both kinds of languages, and similar techniques,
such as symbol tables, can be used to address them.

When a compiler finds a declaration (implicit or explicit) for
any named entity, it adds the name and associated information
about the type of entity to a symbol table, which is tied to
the current level of nesting.  On each subsequent occurrence
of the name within the scope defined by the nesting, it checks
the symbol table to ensure that the current use is consistent
with the declaration.

Since EBNF does not make provision for a symbol table, we can adopt
the same approach that is used to define programming languages:

  1. Use EBNF to specify the constext-free constraints.

  2. State additional context-sensitive constraints in statements
     that are outside the EBNF specifications.

These issues are very similar to the problems that came up when
we were talking about defining an XML version of CGIF.  There is
no problem with defining the context-free issues, but CGIF has
the same context-sensitive constraints as FOL and Algol.  Those
issues arise in relating the defining occurrence of a coreference
label to the bound occurrences of that label.

Since XSLT, like EBNF, does not make any provision for enforcing
or even specifying such constraints, the same issues come up
in MathML.  Following is a web page that discusses them:

    http://www.w3.org/TR/2003/NOTE-mathml-bvar-20031110/
    Bound Variables in MathML

John



More information about the SCL mailing list