[SCL] Approach to a concrete syntax

Murray Altheim m.altheim at open.ac.uk
Wed May 21 19:41:11 CDT 2003


pat hayes wrote:
>>  >> Murray Altheim wrote:
>>  >> In the concrete syntax of XCL right now there are three separate
>>  >> elements
>>  >>
>>  >>   <quantifier />
>>  >>   <connective />
>>  >>   <predicate />
>>  >>
>>  >> that do to some extent the same thing, which is to take each other
>>  >> and <term> elements as content.
>>  >
>>  > ? Quantifiers require a bound variable (or list of them), connectives
>>  > apply to sentences, and predicates apply to terms (but in a semantically
>>  > distinct way, which needs therefore to be syntactically marked in some
>>  > way.)
>>
>> Understood. So we probably need distinct element types to enforce
>> what we declare as content.
>
> Not sure if I follow that way of putting it, but I think yes.

Well, you seem to disagree with this further on, so I'll assume you
didn't understand what I meant, or that I was unclear, or both. (Logical?)

>>  >> Each is typed by a 'name' attribute,
>>  >> which should perhaps be a 'type' attribute (I used to call it this).
>>  >> A good reason to not call it 'type' is that this might be confused
>>  >> with the typing of the variable (where I'm talking here about the
>>  >> typing of the quantifier, e.g., "forall" vs. "exists").
>>  >
>>  > I am finding this very hard to follow, I confess. Familiar words seem to
>>  > be morphing new meanings in every sentence.
>>
>> Pat, remember that not everyone uses language the same way as you do.
> 
> Well, I'm using it in the way that is familiar to anyone who knows a 
> modicum of basic logic. SCL is aimed primarily at such people.

Then you're writing the SCL spec for yourselves, which would be pretty
useless. I thought this was aimed at providing a framework for things
like the "Semantic Web" and for XML applications and... well, many of
those folks are like me (but haven't even bothered to RTFM as I'm
trying to do). All I'm asking is that you please note that I'm not
unusual, you are. And if I'm unusual, it's in my jumping head first
into your world. The TimBLs and the Dan Connollys are likely just like
me -- we had some background (perhaps) in logical training, but we
didn't study it in depth. I know logic from TTL and CMOS circuitry
I was playing with in the late 70's. I had a section on it in CompEng
before I decided to study art. Et cetera.

SCL needs to be steeped in the expertise of you guys, as formally
specified as you feel it needs to be. But it should probably be aimed
at people you think will actually use it. Sorry for the marketing talk,
but if this is only for logicians talking to logicians, I'm in the wrong
room.

>>  I'm
>> new to the field, and still learning. Words that are "familiar" to you
>> do not necessarily have the same meanings to me as they do to you.
>> They're not morphing, we're just learning to communicate with each other,
>> i.e., I'm trying to learn your meanings. This is part of communication, eh?
>
> Fair enough. OK, let me list some distinctions.
> Logic consists of sentences. They fall into a few basic types, depending 
> on what their primary syntactic form is. They can be atomic or boolean 
> or quantified. Atoms consist of a relation symbol and a sequence of 
> terms. (Terms have a similar syntax: they are either names or variables 
> or consist of a function symbol plus a sequence of terms.) Booleans are 
> boolean combinations of other sentences joined by and, or, not, implies, 
> iff. Quantified sentences consist of a quantifier, a set of bound 
> variables and another sentence. (This ignores a few complications 
> involving sequence variables and quantifier restrictions, but it gets 
> the basic idea.) None of these is anything like any of the others, but 
> their syntax is mutually recursive.

Yes, thanks. This is the basic understanding I have, though it's nice
to see it all in one place. John's KR book and "Intelligent Knowledge-
Based Systems" by I.S. Torsun have also been helpful.

>>  >> So we have things like:
>>  >>
>>  >>   <quantifier name="forall" />
>>  >>   <connective name="implies" />
>>  >>   <predicate name="equals" />
>>  >>
>>  >> The set of known types for these are called "reserved words", i.e,
>>  >> if somebody sees one of them in an XCL document we all know what
>>  >> they mean.
>>  >
>>  > But what about all the predicates (relations would be better) which are
>>  > not reserved? Equals is a rare exception (there will only be a few of
>>  > these in SCL).
>>
>> That's the purpose of having the PSIs. By defining a language that has
>> "plug-in slots" for concepts,
>
> An SCL language *is* a set of names (which I think is what you mean by 
> concepts). So I don't see the purpose of having slots to put them in.

Concepts exist in our heads. Names are identifiers for concepts. I
used "concepts" because "names" is too general to convey what I was
trying to say.

>>  we define "reserved words" or PSIs for
>> those we consider canonical parts of SCL, and allow other people to
>> create their own. So if you want meet(i,j), you create a PSI
>
> No!  We should not require users to restrict themselves to any 
> particular syntax for logical names. If they want to use PSIs, then 
> obviously we can allow that; but we must not require it. Similarly for 
> URIs.

I think we're talking past each other, AFAIK. I'm not trying to restrict
anything, I'm trying to open things up.

>> such as
>>
>>     http://pathayes.org/psi/myPredicates.html#meet
>>
>> and publish a short document describing the PSI in human-readable
>> language (letting the world know what it means)
> 
> It is unlikely that any such document can possibly capture the formal 
> meaning. Again, English comments are a useful feature which we should 
> allow as an option, but not require.

The document at that URL contains an ID "meet". When you point your
web browser at that URL, the page loads and the document scrolls
down to something called "meet", where there is a definition of what
the predicate meet(i,j) means. A formal definition. The URL is used
as an identifier for the concept "meet(i,j)". If you as the publisher
want to say that that URL is stable, i.e., for other people to use
as an identifier too, then you call that URL a PSI. Then when two
people want to refer to "meet(i,j)", they use

     http://pathayes.org/psi/myPredicates.html#meet

>> . Then use it as:
>>
>>      <predicate>
>>         <type xlink:href="http://pathayes.org/psi/myPredicates.html#meet"/>
>>         ...
>>      </predicate>
>>
>>  >> For interchange, each has an associated PSI, since
>>  >> "exists" and "forall" are names within the XCL namespace.
>>  >
>>  > What sense of 'namespace' is this?
>>
>> Not the XML Namespace (which I'll endeavour to always capitalize), but
>> the set of names used in the XCL markup language, i.e., the specific
>> elements and attributes, plus any reserved words or PSIs (they are
>> semantically identical in XCL, the former shorthands for the latter,
>> for ease of use. In RDF there are no shorthands; you always use the
>> full URLs).
>>
>>  >>  It would
>>  >> be poor practice to mix datatypes
>>  >
>>  > ??When did datatypes come into the discussion??
>>
>> When we began talking about XML markup. The string "exists" is a string,
>> whereas "http://purl.org/xcl/1.0/#exists" should not be interpreted as
>> a string, but rather as a URI.
>
> I have it on good W3C authority that URIs are strings. They certainly 
> seem to be strings to me: RFC 2396 defines URIs as character sequences 
> and XML Schema Part 2 defines XSD strings the same way.

Well, admittedly URIs are not cacti either. They may be composed of
strings, as are numerals, but they're *interpreted* as URIs. In the
specific context of an XML document, one needs to know whether to
interpret any specific string as a decimal number, a hexadecimal
number, a date, a URI, etc.

I'm looking at XSD and I see a lot of different datatypes. If I have
an XML attribute that has some content, say

    <a b="d3" />

I need to be able to know how to interpret 'd3'. 'd3' may be a hex
number, it may be a filename. That's what I meant by datatype. If you
look at what XSD was trying to accomplish, it was to type data in XML
markup.
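
To make the ambiguity concrete, here is a minimal sketch in Python. It is
not part of any XCL or SCL proposal, and the function name
`interpretations` is made up for illustration. It only shows that the same
attribute value gives different results depending on which datatype the
processor assumes:

```python
def interpretations(value):
    """Return what one attribute value means under several assumed datatypes."""
    readings = {"string": value}
    try:
        # Read the value as a hexadecimal number, if it parses as one.
        readings["hex number"] = int(value, 16)
    except ValueError:
        pass
    try:
        # Read the value as a decimal number, if it parses as one.
        readings["decimal number"] = int(value, 10)
    except ValueError:
        pass
    return readings


print(interpretations("d3"))
```

Without a declared datatype the processor has no basis for choosing among
the readings it gets back.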

DTDs weren't originally designed to constrain document content, but
rather markup, so they have few features for constraining content.
That's where RELAXNG and XML Schema come in. But absent such features,
we (in the HTML WG) still *specified* what kind of content each
attribute was supposed to have. It is typed by either a "notation"
or a "datatype", depending on how it is to be used in the markup.

   http://www.w3.org/TR/xhtml-modularization/dtd_module_defs.html#a_module_XHTML_Datatypes

What you see there is a poor man's version of XSD (I was the poor man).

>> That's all I meant, just that if you use
>> XML attributes to contain things, they should only contain one kind of
>> thing, one kind of datatype. Requiring processors to figure out what
>> kind of thing is in an attribute is very poor practice.
> 
> We seem to be in different wavelengths. Are you using 'datatype' in the 
> XML Schema part 2 sense, or some other sense? If the former, I cannot 
> make sense of what you are saying here. Right now, SCL has NO datatypes 
> in it, anywhere. We are here talking about syntax, not datatypes.

I am talking about syntax. If I have

    <quantifier type="d3">
      ...
    </quantifier>

I need to know the datatype of "d3". Is it a string or is it a URI? I
don't want it to be both. I am designing markup and I need to be
very specific. Since on the one hand it might be:

    <quantifier type="forall">
      ...
    </quantifier>

and on the other it might be

    <quantifier type="http://purl.org/cl/scl/1.0/#forall">
      ...
    </quantifier>

I'm trying (not very well) to say that not specifying the datatype
of the 'type' attribute would be very poor practice. It can't be
both a string and a URI. SCL processors wouldn't know how to
interpret/process it. So what I'm suggesting is that rather than
use an attribute, we break it off into an element, well, actually
two elements (showing a more appropriate expression of the previous
two examples):

    <quantifier>
      <type>forall</type>
      ...
    </quantifier>

    <quantifier>
      <typeRef xlink:href="http://purl.org/cl/scl/1.0/#forall"/>
      ...
    </quantifier>

I can design an XML schema (a DTD, RELAXNG grammar or XML Schema
document) to define this easily. There's no ambiguity. In DTD
syntax this would be (ignoring the "..." content for now):

   <!ELEMENT quantifier  ( type | typeRef ) >
   <!ELEMENT type  ( #PCDATA )* >
   <!ELEMENT typeRef  EMPTY >
   <!ATTLIST typeRef
       xlink:href    CDATA    #REQUIRED
   >

This declares three element types, <quantifier>, <type>, and <typeRef>.
The content of <quantifier> is one <type> OR one <typeRef>. The former
contains character data (PCDATA), the latter is a linking element
that has an attribute containing a URI. If we were using RELAXNG or
XML Schema this would look a hell of a lot more complicated, but we
could constrain the contents of <type> and <typeRef> so that a validator
would flag an error if it found the wrong datatype. Actually, there's
a trick in DTDs called "enumerated values" where we could change <type>
from having element content to having an attribute, then constrain the
attribute to having a fixed set of name tokens, such as "forall" or
"exists". Then even XML 1.0 validators using DTDs could flag errors.

I'm guessing this is more than you wanted to know, but I'm trying to
be clear.
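
As a rough sketch of what this buys a processor, the Python below resolves
either form of a <quantifier> to a single PSI. Several things here are
assumptions for illustration only: the function name `quantifier_psi`, the
`RESERVED` table, and the PSI for "exists" (guessed by analogy with the
"forall" one). It also assumes the document declares the standard XLink
namespace for the `xlink` prefix, which an XML parser needs:

```python
import xml.etree.ElementTree as ET

# Expanded name of xlink:href under the standard XLink namespace.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical reserved-word table; the "exists" PSI is assumed by analogy.
RESERVED = {
    "forall": "http://purl.org/cl/scl/1.0/#forall",
    "exists": "http://purl.org/cl/scl/1.0/#exists",
}


def quantifier_psi(xml_text):
    """Return the PSI named by a <quantifier>, via <type> or <typeRef>."""
    elem = ET.fromstring(xml_text)
    type_elem = elem.find("type")
    ref_elem = elem.find("typeRef")
    # Mirror the content model ( type | typeRef ): one or the other.
    if (type_elem is None) == (ref_elem is None):
        raise ValueError("expected exactly one of <type> or <typeRef>")
    if type_elem is not None:
        word = (type_elem.text or "").strip()
        if word not in RESERVED:
            raise ValueError("unknown reserved word: %r" % word)
        return RESERVED[word]
    href = ref_elem.get(XLINK_HREF)
    if href is None:
        raise ValueError("<typeRef> is missing xlink:href")
    return href


by_name = "<quantifier><type>forall</type></quantifier>"
by_ref = (
    '<quantifier xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<typeRef xlink:href="http://purl.org/cl/scl/1.0/#forall"/>'
    "</quantifier>"
)
print(quantifier_psi(by_name) == quantifier_psi(by_ref))
```

The processor never has to guess whether a value is a reserved word or a
URI, because the element name tells it which one it is looking at.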

> <snip>
> 
>>  > Why do you need to get into this complexity of naming a
>>  > quantifier? I would be quite happy to have
>>  > (forall (?x)(P ?x))
>>  > rendered into XML as something like
>>
>> [I've rewritten the whitespace so I can see it better.]
>>
>>     <forall>
>>       <boundvar>?x</boundvar>
>>         <body>
>>           <atom>
>>             <rel>P</rel>
>>               <termseq>?x</termseq>
>>           </atom>
>>         </body>
>>      </forall>
>>
>> That's fine, and you might be happy, but what about those people
>> who want more than two quantifiers? There are more than two, right?
>
> Wrong. In SCL syntax there are two.

In the base SCL syntax there are two, but if you're creating an XML
syntax for a language designed to be extended into other forms of
expression, other forms of logic, there are many more than two. I'm
suggesting a means of allowing constraints that would allow a
restriction to the exact set of quantifiers, connectives and
predicates in SCL, but also allow it to be extended. SCL processors
could have two modes (easily), one that flagged an error if it
came across a name it didn't know, and one that allowed other names.
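
A minimal sketch of those two modes, in Python. The function name
`check_quantifiers` and the `strict` flag are made up for illustration,
and the element names are the ones from my examples rather than anything
agreed:

```python
import xml.etree.ElementTree as ET

# The quantifier names of base SCL, as discussed in this thread.
KNOWN_QUANTIFIERS = {"forall", "exists"}


def check_quantifiers(xml_text, strict=True):
    """List quantifier names; in strict mode reject any name outside base SCL."""
    root = ET.fromstring(xml_text)
    names = []
    for quantifier in root.iter("quantifier"):
        name = (quantifier.findtext("type") or "").strip()
        if strict and name not in KNOWN_QUANTIFIERS:
            raise ValueError("unknown quantifier: %r" % name)
        names.append(name)
    return names


extended = "<quantifier><type>thereExistsAtMost</type></quantifier>"
print(check_quantifiers(extended, strict=False))
```

The same document passes in the permissive mode and is flagged as an error
in the strict one, with no change to the markup.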

>> There's #$thereExistsAtMost and perhaps dozens of others.
> 
> No, none of these are quantifiers in SCL. We may add numerical 
> quantifiers, but then there will still only be four at most.

I'm designing an XML syntax that supports SCL but can be extended.
I'm trying to provide a means of extension that isn't ugly and
doesn't require people to create their own markup in order to
extend the base language.

>>  So what
>> I'm proposing is not hardwiring the quantifier type as an element
>> name, and having it specified either as a "reserved name" or a PSI.
>> The idea of using "reserved names" is not some XML thing, it's just
>> an idea of a shorthand for SCL since I'm guessing you all would
>> rather not be typing long URLs all the time, and there are a small
>> set of initial types.
>>
>> So, these two examples would be identical semantically. You probably
>> don't need the "?" in front of the x in <boundvar> since it can be
>> implied in the context.
> 
> A distinctive prefix for variables is a widely used convention. I would 
> abandon it only very reluctantly and under the pressure of a very cogent 
> reason.

The reason is that the notation is not classical FOL, it's XML. In
XML, if you have an "x" inside of a <boundvar> element, you always
know it's preceded in the classical FOL expression by a "?". If
someone wrote a translator to go between classical and SCL, they'd
have no difficulty with this: you add the [implied] question mark
in going to classical, you remove it in SCL since it's implied by
the containing element, <boundvar>. Anything in a <boundvar> would
have a "?" in front of it, correct? Then it's unnecessary.
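
Here is roughly what such a translator would do in one direction, as a
Python sketch. The function name `to_classical` is made up, and it only
handles a single quantifier over one flat atom, which is far less than a
real translator would need:

```python
import xml.etree.ElementTree as ET


def to_classical(xml_text):
    """Render one <quantifier> over a single flat <atom> as a classical expression."""
    q = ET.fromstring(xml_text)
    kind = q.findtext("type").strip()
    # Bound variables are stored bare; tolerate a leading "?" if one was kept.
    bound = [b.text.strip().lstrip("?") for b in q.findall("boundvar")]
    atom = q.find("body/atom")
    rel = atom.findtext("rel").strip()
    terms = atom.findtext("termseq").split()

    def show(term):
        # A term already marked with "?" is passed through unchanged.
        if term.startswith("?"):
            return term
        # Otherwise the "?" is restored only for names bound by this quantifier.
        return "?" + term if term in bound else term

    variables = " ".join("?" + v for v in bound)
    arguments = " ".join(show(t) for t in terms)
    return "(%s (%s) (%s %s))" % (kind, variables, rel, arguments)


doc = """
<quantifier>
  <type>forall</type>
  <boundvar>x</boundvar>
  <body>
    <atom>
      <rel>P</rel>
      <termseq>x</termseq>
    </atom>
  </body>
</quantifier>
"""
print(to_classical(doc))  # prints (forall (?x) (P ?x))
```

Going the other way is the mirror image: strip the "?" when writing the
content of <boundvar>.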

>> I prefer your use of an element for bound
>> variables since using attributes requires you define a fixed set
>> of them, whereas you can always just add another <boundvar> element
>> if you're using elements. The specific PSIs and element names below
>> aren't anything I'm married to, just using for the examples:
>>
>>     <quantifier>
>>       <type>forall</type>
>>       <boundvar>x</boundvar>
>>         <body>
>>           <atom>
>>             <rel>P</rel>
>>               <termseq>?x</termseq>
>>           </atom>
>>         </body>
>>     </quantifier>
>>
>>     <quantifier>
>>       <typeRef xlink:href="http://purl.org/cl/scl/1.0/#forall"/>
>>       <boundvar>x</boundvar>
>>         <body>
>>           <atom>
>>             <rel>P</rel>
>>               <termseq>?x</termseq>
>>           </atom>
>>         </body>
>>     </quantifier>
>>
>> My idea in XCL was to have no external links in Level 1 (as in
>> the first example), and enable external links in Level 2 (as in
>> the second example).
> 
> What levels are you referring to here?

Have you read or skimmed the XCL specification I posted?

   http://purl.org/xcl/1.0/

Level 1 is what you might think of as "pure" SCL. It has no linking
features and can't be extended. Level 2 includes some simple linking
features, allowing the use of PSIs so that SCL can be extended. They're
in the same spec as two "levels" because I want it to be clear that
there's a very clear path from SCL Level 1 to SCL Level 2, that
Level 2 processors can process Level 1 documents, and a Level 1
processor will know what kind of failures might happen when
accidentally encountering a Level 2 document. It's easy then for
developers to write error handling, etc. It's a clear extension
path, whereas the W3C approach is usually "design your own custom
markup and put it in the document wherever you like". So we'd have
SCL and then suddenly there's some weird element or attribute that
the processor (and the user) has no idea what it means. Like if you
save to HTML from MS Word and look at what it produces. Same problem.
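
A sketch of the kind of error handling I mean, in Python. It assumes that
<typeRef> is the only Level 2 feature, which is a simplification of the
XCL draft, and the names `Level2FeatureError`, `required_level` and
`process_level1` are invented for the example:

```python
import xml.etree.ElementTree as ET


class Level2FeatureError(Exception):
    """Raised when a Level 1 processor meets a Level 2 linking feature."""


def required_level(xml_text):
    """Return 2 if the document uses <typeRef> anywhere, otherwise 1."""
    root = ET.fromstring(xml_text)
    return 2 if next(root.iter("typeRef"), None) is not None else 1


def process_level1(xml_text):
    """Parse a document as Level 1, failing in a known way on Level 2 input."""
    if required_level(xml_text) > 1:
        raise Level2FeatureError("document uses <typeRef>, a Level 2 feature")
    return ET.fromstring(xml_text)


level1 = "<quantifier><type>forall</type></quantifier>"
level2 = (
    '<quantifier xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<typeRef xlink:href="http://purl.org/cl/scl/1.0/#forall"/>'
    "</quantifier>"
)
print(required_level(level1), required_level(level2))  # prints 1 2
```

The point is that the failure is one the developer could anticipate from
the spec, not an unknown element turning up from nowhere.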

>> And then associate the reserved word "forall"
>> with the PSI "http://purl.org/cl/scl/1.0/#forall" so that
>> Level 1 and 2 have the same interpretation.
>>
>> Now, I realize we're not talking Level 1 and 2 here, but the idea
>> can be the same. I think SCL should enable external links so that
>> people can define their own quantifiers, connectives and predicates
>> by defining their own PSI sets. This would enable the entire world
>> of other logics to be built upon SCL by use of PSI sets, which all
>> would come *later*. We'd just define the initial set for SCL v1.0.
> 
> You seem to be engaged in an entirely different project, to provide a 
> notational framework for all the world's logical syntaxes. That might be 
> an interesting thing to tackle, but it is not the SCL project.

No, I'm designing a very simple syntax for SCL that has a clear path
for being extended. I'm not doing any extending. I'm not sure where
that's in conflict with what you think SCL should be, but if SCL
can only be extended by custom XML syntax then it'll be a mess. I'm
just trying to minimally plan for the known future. If you don't want
any planned, known future beyond SCL-as-XML, no SCL-for-the-web, then
I'll bow out of the project. You don't need me. Chris has it well
in hand, the SCL proposal he has now will do just fine. From John's
admonition for me to halt all neural activity, I probably should just
take his advice and just leave.

>> BTW, why do we need <body>? Couldn't it be safely eliminated? We
>> can enforce element order in XML, so <atom> elements always must come
>> after <boundvar>, etc.
> 
> Well, true, but then we can eliminate almost all the XML tags, for that 
> matter. All we actually need are the LISP-style brackets. I thought the 
> whole idea of putting these tags in was to enable an engine to extract 
> the relevant syntactic parts without doing any parsing. (?) It is often 
> important to be able to identify the body of a quantified statement.

In LISP there are things that could be considered extraneous too. I'm
only suggesting that in XML you don't need container elements for
things that always occur, or always occur in a specific order. The
constraint languages for XML won't need <body> if the contents of
<body> are always a fixed set of elements. Now, I'm not
necessarily advocating getting rid of <body>, just questioning whether
it's actually necessary in that context. I'd have to look at Chris'
proposal again. I think his syntax made pretty good sense.

> <snip>
> 
>>  >
>>  > I would be happier with an XML syntax which followed, rather than
>>  > re-defined or re-conceptualized, the SCL abstract syntax. The primary
>>  > purpose of concrete syntaxes is to adequately express the categories and
>>  > relationships which the spec attaches, in terms of a model theory, to
>>  > the abstract syntax. The general form of the syntax is a fairly
>>  > well-developed logical flower at this stage, both in the generalizations
>>  > it makes and the ones it does not make. It would be possible to
>>  > conceptualize a quantifier as a kind of predication on a set, for
>>  > example, with an expression as a parameter; or as a functional operator,
>>  > or in many other ways; but each of these would require re-thinking the
>>  > entire SCL semantic apparatus in a new and alien way. We have done that
>>  > thinking and do not want to take it apart and re-do it at this stage.
>>
>> Okay, I'm fine with that. The only alteration to the syntax that Chris
>> has proposed that I'd make would be to generalize the quantifiers,
>> connectives, and predicates to take a typing parameter
>
> Can you say why? That seems a slightly crazy idea to me, I have to 
> confess. That set - quantifiers, connectives and relations - doesn't 
> seem like the kind of class I would be tempted to generalize over. It 
> reads like 'sheep, fish and pieces of copper', you know?

I think you're misunderstanding me. I'm not suggesting generalizing
over the three of them as a group, but as three individual classes of
things. So where Chris has:

    <forall>
    <exists>
    <and>
    <or>
    <equal>
    ...

I'd have:

    <quantifier>
      <type>forall</type>
    </quantifier>

    <quantifier>
      <type>exists</type>
    </quantifier>

    <connective>
      <type>and</type>
    </connective>

    <connective>
      <type>or</type>
    </connective>

    <predicate>
      <type>equal</type>
    </predicate>
    ...

These are lexically different but semantically identical. The reason
for the added complexity is so that we can (either in the main SCL
syntax or in a "Level 2") allow as an alternate to <type> that <typeRef>
element I described above, allowing the language to be extended via PSIs.
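
To show that the two really are interchangeable, here is a Python sketch
that rewrites the hardwired form into the generalized one. The `CATEGORY`
table and the function name `generalize` are mine, made up for the
example, and the table covers only the five names listed above:

```python
import xml.etree.ElementTree as ET

# Which generic element each hardwired element name falls under.
CATEGORY = {
    "forall": "quantifier",
    "exists": "quantifier",
    "and": "connective",
    "or": "connective",
    "equal": "predicate",
}


def generalize(elem):
    """Return a copy of elem with hardwired names moved into a <type> child."""
    children = [generalize(child) for child in elem]
    if elem.tag in CATEGORY:
        out = ET.Element(CATEGORY[elem.tag], dict(elem.attrib))
        type_elem = ET.SubElement(out, "type")
        type_elem.text = elem.tag
    else:
        # Any other element is copied through with its text unchanged.
        out = ET.Element(elem.tag, dict(elem.attrib))
        out.text = elem.text
    out.extend(children)
    return out


hardwired = ET.fromstring("<forall><boundvar>x</boundvar></forall>")
print(ET.tostring(generalize(hardwired), encoding="unicode"))
```

Nothing is lost in the rewrite; the name has only moved from the element
name into content, where a <typeRef> can later take its place.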

>> so that rather
>> than hardwire their names/types (?), we'd use PSIs and perhaps "reserved
>> words" that act as shorthands for the PSIs (this latter is still an
>> experimental concept, but would not be hard to implement for developers;
>> we could at some later time even develop a syntax for declaring one's
>> own shorthands, but that's not necessary really).
>>
>>  >> This gets back to that idea I expressed in the telecon about there
>>  >> being simply one fundamental graph relation, which I think Pat, you
>>  >> corrected me in saying it was an n-tuple relation. In either case
>>  >> I think the current XCL syntax is probably overcomplicated in having
>>  >> too many elements for predicates, that there needs be only one.
>>  >
>>  > It is important that the syntax clearly identify symbols used in
>>  > relation position from those used in function positions. It also should
>>  > allow the same symbol to be used in both kinds of position. It also
>>  > should allow an arbitrary term to occur in either position. All of this
>>  > is part of the abstract syntax.
>>
>> I *think* the XCL syntax I've proposed allows this. If not, I need to
>> understand this paragraph better. What would be the difference between
>> "relation position" and "function position"? (an example would be
>> appreciated).
>
> Relation position is the head of an atomic sentence, function position 
> is the head of a term. For example in
> (and (IsParent (father Joe)) (foo baz))
> IsParent occurs in relation position, father in function position. Joe 
> and baz don't occur in either.

Okay, thanks. That was clear.
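
Just to check my own understanding, here is a small Python sketch that
walks your example and reports which symbols sit in which position. It is
my own toy reading, not anything from the SCL drafts: it handles only the
boolean connectives and plain symbols as heads, ignores quantifiers, and
the names `parse` and `positions` are invented:

```python
CONNECTIVES = {"and", "or", "not", "implies", "iff"}


def parse(text):
    """Parse one parenthesized expression into nested lists of symbols."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            items = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree


def positions(sentence):
    """Return (relation symbols, function symbols) found in a sentence."""
    relations, functions = [], []

    def walk_term(term):
        # A compound term has a function symbol at its head.
        if isinstance(term, list):
            functions.append(term[0])
            for arg in term[1:]:
                walk_term(arg)

    def walk_sentence(s):
        head, args = s[0], s[1:]
        if head in CONNECTIVES:
            # A boolean sentence: its arguments are sentences.
            for arg in args:
                walk_sentence(arg)
        else:
            # An atomic sentence: relation symbol at the head, then terms.
            relations.append(head)
            for arg in args:
                walk_term(arg)

    walk_sentence(sentence)
    return relations, functions


example = parse("(and (IsParent (father Joe)) (foo baz))")
print(positions(example))
```

If I have it right, foo is also in relation position here, by the same
reading that puts IsParent there.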

Pat, you know logic. I have great respect for your knowledge of logic.
I've been running into your papers and seeing your name pop up while
I'm doing my Ph.D. research, and pretty much everyone I've talked to
has tremendous respect for your abilities.

Now, I happen to know markup. There are some people who know more about
markup than I do, but I know who most of them are. There's not many of them.
Now, I'm hoping we can work together on marrying logic and markup in a way
that the web community will immediately grok and want to play with, will
want to incorporate in their projects. It needs to work for expressions
of basic logic, but it also needs to form a foundation for any other
XML expressions on top of basic logic. OpenCyc has planned but not yet
delivered their XML syntax. It *could* be based on SCL "Level 2", i.e.,
"SCL-on-the-web", but probably not on "SCL-in-XML" without being
extended by custom syntax. You said earlier

 > You seem to be engaged in an entirely different project, to provide a
 > notational framework for all the world's logical syntaxes. That might be
 > an interesting thing to tackle, but it is not the SCL project.

and maybe that's true. Maybe it is a different project. But what I'm
trying to do then is coordinate *my* project with yours, so that we
don't see a bunch of custom syntaxes when all we need is one basic
syntax. That's not a bad thing, and it's EASILY doable. My XCL Level 2
is basically it, and in its current incarnation it's actually more
than is even needed. I haven't had a chance to update it since Chris
posted his latest SCL draft, but I think there could easily be harmony
between the two projects, enough that they could be one project.

And that's not a bad thing, even if you're not interested in it.

Murray

...........................................................................
Murray Altheim                         http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK                    .

   Jessica Lynch became an icon of the war, and the story of her capture by
   the Iraqis and her rescue by US special forces will go down as one of the
   most stunning pieces of news management yet conceived. It provides a
   remarkable insight into the real influence of Hollywood producers on the
   Pentagon's media managers, and has produced a template from which America
   hopes to present its future wars.  -- The Guardian, 15 May 2003
   http://www.guardian.co.uk/g2/story/0,3604,956127,00.html



