[SCL] Representing constraints that go beyond EBNF
John F. Sowa
sowa at bestweb.net
Mon Nov 17 18:11:21 CST 2003
Murray,
That was the whole point of my previous note (copy below):
MA> I'm guessing I'm missing something here. What is it that EBNF
> can't provide?
EBNF can only specify context-free languages, and all versions of
FOL (including KIF, CGIF, & SCL) are context-sensitive languages.
And they are context sensitive for exactly the same reason that
all Algol-like languages (i.e., Pascal, PL/I, C, Java, etc.) are
context sensitive.
The basic problem is the bound variables (which are the same
things that cause problems for MathML). You can't specify bound
variables in either EBNF or XSLT. You can identify certain things
as variables, but you can't state or check the constraints on how
bound variables are associated with their defining occurrences.
Note that other people have discovered the limitations of EBNF (or
the XSLT version of it) because they have been defining Schematron
and XMLCL (XML Constraint Language) to get around the limitations.
My recommendation is to use SCL to define both EBNF and the SCL
constraints that are not expressible in EBNF. Then the entire SCL
definition would be formally defined in SCL.
John
___________________________________________________________________
Murray,
EBNF has been designed to represent ordered context-free languages,
which include many important computer languages. But we also have
to represent languages that may have unordered constituents (graphs)
and languages with context-sensitive constraints (first-order logic
and all Algol-like languages).
MA> ... Without boring you all with the details, a Topic Map is
> not a linear sequence; it's very much a complex, unordered graph.
We were still able to use EBNF to specify the XML syntax, and I'm
sure that the graph theoretic model behind that (which is called
the Reference Model, a current ISO project) is also capable of
being expressed in EBNF. So I'm really not certain where you'd
have to rely on "tricks" to get EBNF to do what you want with the
SCL abstract syntax, or XCL/CLML for that matter.
In mathematics, a common representation for a graph consists of
a set of nodes and a set of arcs. Since sets are by definition
unordered, no further comment is needed.
If EBNF is used to specify graphs, some additional statement is
necessary to ensure that some (perhas not all) of the ordering
imposed by EBNF is ignored. One appraoch is to say "The implicit
ordering in the following sequence has no semantic relevance"
or "All permutations of the following sequence are considered
equivalent." I noticed that the ISO EBNF document does not have
such a provision, and we may need to add one.
For common programming languages, the most important context
sensitive constraint is the association between the declaration
of a name (of a variable, constant, relation, or other entity)
and the use of that name in some subsequent statement. That
constraint is usually enforced by means of symbol tables.
For first-order logic (in KIF, CG, or predicate calculus notation),
the quantifiers have an effect that is similar to the declarations
in a programming language. And the rules for scope of quantifiers
happen to create a nested structure that is isomorphic to the
rules for scope of variables in Algol-like languages. Similar
problems arise in both kinds of languages, and similar techniques,
such as symbol tables, can be used to address them.
When a compiler finds a declaration (implicit or explicit) for
any named entity, it adds the name and associated information
about the type of entity to a symbol table, which is tied to
the current level of nesting. On each subsequent occurrence
of the name within the scope defined by the nesting, it checks
the symbol table to ensure that the current use is consistent
with the declaration.
Since EBNF does not make provision for a symbol table, we can adopt
the same approach that is used to define programming languages:
1. Use EBNF to specify the constext-free constraints.
2. State additional context-sensitive constraints in statements
that are outside the EBNF specifications.
These issues are very similar to the problems that came up when
we were talking about defining an XML version of CGIF. There is
no problem with defining the context-free issues, but CGIF has
the same context-sensitive constraints as FOL and Algol. Those
issues arise in relating the defining occurrence of a coreference
label to the bound occurrences of that label.
Since XSLT, like EBNF, does not make any provision for enforcing
or even specifying such constraints, the same issues come up
in MathML. Following is a web page that discusses them:
http://www.w3.org/TR/2003/NOTE-mathml-bvar-20031110/
Bound Variables in MathML
John
More information about the SCL
mailing list