[SCL] Re: CSL
Pat Hayes
phayes at ihmc.us
Mon Feb 16 11:13:38 CST 2004
>Here are some more comments on the syntax.
Many thanks.
>
>================
>
>In "1. Character classification"
>replace
> S = { space | tab | line | page | return }
> lexbreak = S | "(" | ")"
>by
> white = space | tab | line | page | return
> lexbreak = white | "(" | ")"
>and in "3. Lexical syntax"
>add
> S = white, {white}
>just before the first use of S in the definition of "open".
>
>Rationale: The present definition of "S" is wrong for two reasons:
>(1) it is not a character class (it is a sequence of characters);
>(2) it includes the empty sequence of white-space characters, which
>defeats the purpose.
Sound of hand smacking forehead. Will do.
> So, instead define "white" as the class
>of white-space characters and correctly define "S" as nonempty
>white-space in the lexical syntax section.
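The difference between the two definitions of S can be checked with a rough regular-expression sketch. It assumes that "line", "page" and "return" correspond to newline, form feed and carriage return; the names S_OLD, S_NEW and is_separator are illustrative only and are not part of the SCL syntax.

```python
import re

# Assumed mapping: space | tab | line | page | return
WHITE = r"[ \t\n\f\r]"

S_OLD = re.compile(f"{WHITE}*")  # { white }: zero or more, so "" matches
S_NEW = re.compile(f"{WHITE}+")  # white, {white}: at least one character


def is_separator(pattern, text):
    """Return True if the whole of text is matched by the given S pattern."""
    return pattern.fullmatch(text) is not None


print(is_separator(S_OLD, ""))     # the old S accepts the empty sequence
print(is_separator(S_NEW, ""))     # the corrected S does not
print(is_separator(S_NEW, " \t"))  # one or more white characters is fine
```

Under the old definition an "S" between two tokens can be empty, so it does not actually force any separation.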
>
>================
>
>In "2. Lexical analysis"
>delete
> seqvarchar = alpha | digit | other | nonascii
>
>Rationale: seqvarchar is not being used (it's probably a relic of syntax
>development).
Done. Right, it was a relic.
>================
>
>In "3. Lexical Syntax"
>replace
> name = specialname | ( ( alpha | other ), {wordchar} ) - reservedElement
>by
> name = specialname | ( (namechar - digit - "@"), {namechar} ) - reservedElement
>or equivalently by
> name = specialname | ( ( alpha | other | nonascii ), {namechar} ) - reservedElement
>
>Rationale: wordchar was not consistently replaced by namechar as it should
>have been. nonascii is added as a possibility for the first character of
>a name; without this, for example, Greek characters or Japanese words
>cannot be used as constant or variable names without prepending an ASCII
>character.
Right, excellent point. The syntax began as an ASCII syntax and was
adapted to Unicode, obviously with bugs.
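A small sketch of the first-character check, before and after the change. The contents of OTHER are a hypothetical stand-in (the real "other" class is defined in the draft, not here), and nonascii is approximated as any code point above 127.

```python
import string

ALPHA = set(string.ascii_letters)
OTHER = set("_-+*")  # hypothetical stand-in for the draft's "other" class


def old_first_ok(ch):
    """Old rule: the first character of a name is alpha | other."""
    return ch in ALPHA or ch in OTHER


def new_first_ok(ch):
    """Proposed rule: alpha | other | nonascii (digits and "@" still excluded)."""
    return old_first_ok(ch) or ord(ch) > 127


for ch in ["a", "α", "日", "7", "@"]:
    print(ch, old_first_ok(ch), new_first_ok(ch))
```

With the old rule a name such as "αβγ" is rejected at its first character; with the proposed rule it is accepted, while "7" and "@" remain excluded as first characters.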
>
>================
>
>In "4. Core expression syntax"
>replace
> termseq = {S?, term}, S?, seqvar?
>by
> termseq = {S, term}, (S, seqvar)?
>also, correct the repeated definition of termseq in section "SCL Kernel"
Whoops.
>
>Rationale: You did not make the replacement I suggested and your substitution
>is wrong in some cases; it allows terms that are names to be adjacent to
>each other without intervening white-space. Likewise, seqvar must be
>separated from a preceding term that's a name.
Indeed. (Hmm, maybe it would have been easier to put explicit commas
into the syntax, as Tanel suggested. Any observations on that idea?)
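To make the adjacency problem concrete, here is a toy enumeration of the ways a string can be read as a term sequence. It is a simplification under stated assumptions: terms are only names, names are runs of letters, white-space is the space character, and seqvars are left out. The function name readings is illustrative.

```python
def readings(text, sep_required, i=0):
    """List every way to read text[i:] as a sequence of names.

    sep_required=False models {S?, term}, S?  (separator optional)
    sep_required=True  models {S, term}       (separator mandatory)
    """
    j = i
    while j < len(text) and text[j] == " ":
        j += 1                      # consume a run of white-space
    if j == len(text):
        # End of input: the stricter rule has no trailing S.
        if sep_required and j > i:
            return []
        return [[]]
    if sep_required and j == i:
        return []                   # a term must be preceded by S
    results = []
    k = j
    while k < len(text) and text[k].isalpha():
        k += 1                      # try every possible end of this name
        for rest in readings(text, sep_required, k):
            results.append([text[j:k]] + rest)
    return results


print(readings(" ab", sep_required=False))  # two readings: a b, and ab
print(readings(" ab", sep_required=True))   # only the single name ab
print(readings(" a b", sep_required=True))  # the two names a and b
```

With the optional separator, " ab" can be read either as the two names "a" and "b" or as the single name "ab"; with the mandatory separator only the second reading survives.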
>
>================
>
>My reaction to the white-space problem is also a sigh.
:-)
>There's a policy decision to be made (or already made?) about whether
>(1) to be very precise in the formal syntax about white-space, which is
>painful, detailed work; the syntax will be rather ugly and difficult to
>understand (with extra nonterminals to indicate whether white-space may be
>required to their left and/or right)
>or
>(2) specify a nice syntax with terms always separated by white-space
>and insist that white-space be used even when its absence would not
>result in ambiguity
>or
>(3) specify the same nice syntax with terms always separated by white-space
>but specify in the text that it is allowed to squeeze out some of the
>white-space; however, this would preclude using this syntax specification to
>automatically generate parsers that forgive white-space omission.
There is also the option I used originally, which is
(4) to have a 'low-level' syntax which just carves up the character
stream into a stream of lexical items, then define the logical syntax
with lexical items (words and parentheses, basically) as the
primitive items. But then the whole thing isn't written in EBNF (each
half is, but ...) and some of the group feel strongly that we should
give a strict ISO EBNF syntax, and I guess I kind of agree that would
be best.
One option is to do 2 or 4 in the text and 1 in an appendix. Two sighs.
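For concreteness, option (4) might look roughly like the following two-level sketch. It is only an illustration: quoted strings are ignored, the white-space set is assumed to be space, tab, newline, form feed and carriage return, and lex and parse are hypothetical names rather than anything in the draft.

```python
WHITE = " \t\n\f\r"  # assumed white-space characters


def lex(text):
    """Level 1: carve the character stream into words and parentheses."""
    tokens, word = [], ""
    for ch in text:
        if ch in "()" or ch in WHITE:   # a lexbreak character ends a word
            if word:
                tokens.append(word)
                word = ""
            if ch in "()":
                tokens.append(ch)
        else:
            word += ch
    if word:
        tokens.append(word)
    return tokens


def parse(tokens):
    """Level 2: build nested lists, treating lexical items as primitives."""
    pos = 0

    def expr():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            raise SyntaxError("unexpected ')'")
        if tok != "(":
            return tok
        items = []
        while pos < len(tokens) and tokens[pos] != ")":
            items.append(expr())
        if pos >= len(tokens):
            raise SyntaxError("missing ')'")
        pos += 1                        # consume the closing parenthesis
        return items

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return result


print(lex("(R a(f b))"))
print(parse(lex("(R a(f b))")))
```

Because the lexer has already dealt with white-space, the second level never mentions it, so "(R a(f b))" and "(R a (f b))" produce the same structure. The cost is the one noted above: the two levels together are not a single EBNF grammar.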
>
>================
>
>quotedstrings and numerals are specialnames, which are names, and thus can be used
>to construct terms like ('abc' x) and (123 x y). Is this a feature or a
>bug?
Ugly feature, was the idea. I.e., deliberate, but only in order to keep
the syntax simpler. However, now that you point out the possibility, it's
obviously a bug, since it is ambiguous with the wrapper syntax.
>
>There are syntax rules that include
> term = ('quoted-string' term)
>and
> sentence = ('quoted-string' sentence)
>which appear to be wrappers for documenting terms and sentences.
Yes, they are, but there seems to be no reason to have separate
syntax classes for wrapped sentences and terms.
>
>In the current syntax,
> ('abc' x)
>can be parsed as either
> open quoted-string, term close -> term
>or
> open term, term close -> term
Right, I need to fix this. The second should be ruled out, or else we
need an explicit 'comment' marker for the wrapper syntax. I'm leaning
to the latter, since the former means that we would have to give
strings a special status of 'non-relation special names' which is
really against the spirit of the syntax.
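The two readings can be listed mechanically. This is a toy sketch, not the draft's grammar: it looks only at the items between one pair of parentheses, and the flag strings_are_names is a hypothetical switch standing for the first fix (ruling out quoted strings in relation position).

```python
def is_quoted(tok):
    """True for a single-quoted string token such as 'abc'."""
    return len(tok) >= 2 and tok[0] == "'" and tok[-1] == "'"


def parses(items, strings_are_names=True):
    """Return the readings available for the items inside ( ... )."""
    head, args = items[0], items[1:]
    found = []
    if is_quoted(head) and len(args) == 1:
        found.append("wrapper")       # open quoted-string, term close
    if strings_are_names or not is_quoted(head):
        found.append("application")   # open term, term ... close
    return found


print(parses(["'abc'", "x"]))                           # both readings
print(parses(["'abc'", "x"], strings_are_names=False))  # wrapper only
print(parses(["f", "x"]))                               # application only
```

An explicit comment marker would remove the ambiguity in a different way, by making the wrapper form syntactically distinct so that quoted strings can stay ordinary special names.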
Pat
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32501 (850)291 0667 cell
phayes at ihmc.us http://www.ihmc.us/users/phayes