[SCL] Approach to a concrete syntax

Murray Altheim m.altheim at open.ac.uk
Thu May 22 04:03:43 CDT 2003


pat hayes wrote:
>> Murray Altheim wrote:
[...]
>> All I'm asking is that you please note that I'm not
>> unusual, you are. And if I'm unusual, it's in my jumping head first
>> into your world. The TimBLs and the Dan Connollys are likely just like
>> me -- we had some background (perhaps) in logical training, but we
>> didn't study it in depth.
>
> Actually Dan C. knows logic quite well; he was taught it by an old 
> colleague of mine. But you know, I have never studied logic in DEPTH 
> (like Chris Menzel has, for example). I'm not talking about grokking 
> logic like a professional logician, just knowing the stuff I used to 
> teach in logic 101.

Well, maybe I'm making this out to be more complicated than it is. I
don't know how well Dan knows logic, but I certainly have a knowledge
of what might be called Logic 101, from back in probably 1979-1980. I
haven't used it since, so most of it is gone-from-the-brain. I can't
pretend to have spent time writing FOL sentences on the chalk board,
but I probably have enough knowledge to be dangerous. The biggest problem
I'm having here is communication, not ability in logic.

I've also never seen a formal statement of the requirements that SCL
is trying to solve, so perhaps neither knows what the other is about.
I've tried to write my requirements in a form that would enable someone
to understand them, and the motivation and direction of my angle on the
project. That's why I bothered writing the XCL spec.

[...]
>> Concepts exist in our heads. Names are identifiers for concepts. I
>> used "concepts" because "names" is too general to convey what I was
>> trying to say.
>
> Well, concepts is a very fuzzy notion, very hard to pin down. 
> Philosophical; best avoided, if you ask me. (For example, in our heads?? 
> My goodness, there is a world of debate in that assumption.) Names, on 
> the other hand, are just pieces of syntax.

Ah. I guess we'll just have to differ on the philosophy then. I hardly
consider names as just pieces of syntax, nor do I afford them any sense
of being isolated from their interpretation. But from the perspective
of a syntax, okay, they're just identifiers. We'll leave "for what?"
out of it.

[...]
>>> No!  We should not require users to restrict themselves to any 
>>> particular syntax for logical names. If they want to use PSIs, then 
>>> obviously we can allow that; but we must not require it. Similarly 
>>> for URIs.
>>
>> I think we're talking past each other, AFAIK. I'm not trying to restrict
>> anything, I'm trying to open things up.
> 
> What I mean is that SCL should not require any particular format for 
> identifiers (logical names). If someone wants to use 'father' (six 
> ascii characters) as a relation name, then the fact that that string 
> isn't a URI or a PSI should not be a barrier to their using it.  Some SCL 
> languages will be web languages, but others will be used in other ways 
> and may have other reasons for imposing their own syntactic conventions 
> on names.

Topic Maps solves this by considering that whatever that "ur" thing
is we're talking about, one can assign it zero or more names, and these
can be in any form whatsoever (even links to external things that can't
be expressed in characters, like graphical icons, images, QuickTime
movies, etc.). If we're still talking some separation between expressed
syntax (XML) and semantics, at the syntactic level things in XML are
referred to generally by identifier. The identifiers (IDs) can be XML name
tokens (which have a restricted syntax, e.g., no whitespace), URIs as
links, query expressions, etc.

>>>> such as
>>>>
>>>>     http://pathayes.org/psi/myPredicates.html#meet
>>>>
>>>> and publish a short document describing the PSI in human-readable
>>>> language (letting the world know what it means)
>>>
>>>
>>> It is unlikely that any such document can possibly capture the formal 
>>> meaning. Again, English comments are a useful feature which we should 
>>> allow as an option, but not require.
>
>> The document at that URL contains an ID "meet". When you point your
>> web browser at that URL, the page loads and the document scrolls
>> down to something called "meet", where there is a definition of what
>> the predicate meet(i,j) means. A formal definition.
>
> This is going outside our topic to some extent, but this account just 
> doesn't work. First, the ways in which browsers should interact with 
> reasoning is highly controversial and labile, and  it would be very 
> inadvisable for us to assume that it will be globally settled in this 
> kind of a way. For one thing, it's not clear that SW formalisms will have 
> anything to do with browsers at all; SW engines might use HTTP for 
> completely different purposes. Second, most formal names (predicates or 
> otherwise) simply do not have definitions: the whole idea of a 
> 'definition' does not fit well with logical semantics. Third, even if it 
> had one, there is no utility to be had from writing it in English and 
> making it readable by a human being watching a browser screen: the 
> entire point of the semantic web is that meanings of this kind of markup 
> must be accessible by software, without human intervention.

I clearly don't understand your point of view. There is no world in
which humans are not involved in interpretation, and no meaning that is
separated from interpretation. Software is just an intermediary written by humans,
and the software used in processing SCL is not just a telephone connection
used as a carrier for information; its reasoning is based on whatever the
programmer understood when writing it. This reasoning is expressed using
identifiers for concepts. When I put my ATM card in the machine, it uses
an identifier for "Murray's bank account". It's not meaningless.

I don't believe meaning exists in machines. Do you? If there's meaning,
it's only in interpretation.

>> The URL is used
>> as an identifier for the concept "meet(i,j)".
>
> Fourth, there isn't a coherent notion of a "concept" which can be given 
> an identifier.

Then how does anything get expressed as "x"? What is the connection between
"x" and the thing it represents?  Is "x" a name/identifier for something?
Or does "x" represent nothing? So in this example, I used "meet()" because
I know you know what it means -- you defined it. If I point a browser at
a definition of "meet()" we can talk about it. Hey, I haven't even pointed
a browser at it and we both know what it's about. I can refer to it as
John Sowa did in his KR book [KR] on page 114 as "Allen and Hayes (1985)"
and you know what I'm talking about. Well, "[KR]" and "Allen and Hayes
(1985)" are identifiers, names for things. We can assign them URLs and
let people write software to reason about them.

>> If you as the publisher
>> want to say that that URL is stable, i.e., for other people to use
>> as an identifier too, then you call that URL a PSI. Then when two
>> people want to refer to "meet(i,j)", they use
>>
>>     http://pathayes.org/psi/myPredicates.html#meet
> 
> It seems to me that this can be done just by using URI conventions, as 
> in RDF, no?

Yes. PSIs are *just* URIs. The *only* difference is that the person
publishing them (telling the world they exist) says that they are stable
enough to be used as identifiers, that their meaning won't change within
some reasonable period. And there are some recommendations on publishing
metadata with them. But they're just URIs.

[...]
>>>>  > ??When did datatypes come into the discussion??
>>>>
>>>> When we began talking about XML markup.
>
> I didn't think we were talking about markup at all. What we are looking 
> for is an XML syntax for SCL, not an SCL-style markup language. (Have I 
> misunderstood what you mean by 'markup'? )

How do you differentiate "XML" from "markup"? XML is a form of markup. If
you express SCL as XML, you're expressing it in an XML markup language.

>>>> The string "exists" is a string,
>>>> whereas "http://purl.org/xcl/1.0/#exists" should not be interpreted as
>>>> a string, but rather as a URI.
>>>
>>>
>>> I have it on good W3C authority that URIs are strings. They certainly 
>>> seem to be strings to me: RFC 2396 defines URIs as character 
>>> sequences and XML Schema Part 2 defines XSD strings the same way.
>>
>>
>> Well, admittedly URIs are not cacti either. They may be composed of
>> strings, as are numerals, but they're *interpreted* as URIs.
> 
> Oh, sure: but we are talking here about syntax, right? And in the syntax 
> they are strings (of a particular form, but still strings).

That particular form is what I'm calling a "datatype", i.e., what the
string is interpreted as. In XML it's important to differentiate them
so that URIs can be treated as URIs, not strings. They're not the same
thing.
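
To make the string/URI distinction concrete, here is a minimal Python sketch. It
assumes a processor that compares two attribute values either as raw strings or
as URIs. The function names are hypothetical, and the normalization is a
deliberate simplification (only the scheme and host are case-folded, as the URI
syntax allows); it is not a full URI comparison.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_uri(value: str) -> str:
    """Lower-case the scheme and host, which are case-insensitive in URIs.

    Simplification: any userinfo in the authority part is lower-cased too,
    which a real processor should not do.
    """
    parts = urlsplit(value)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(),
         parts.path, parts.query, parts.fragment)
    )


def same_as_string(a: str, b: str) -> bool:
    """Compare two values as plain character sequences."""
    return a == b


def same_as_uri(a: str, b: str) -> bool:
    """Compare two values after interpreting them as URIs."""
    return normalize_uri(a) == normalize_uri(b)


A = "http://purl.org/xcl/1.0/#exists"
B = "HTTP://PURL.ORG/xcl/1.0/#exists"

print(same_as_string(A, B))  # the character sequences differ
print(same_as_uri(A, B))     # the URI interpretation treats them alike
```

The same characters give different answers depending on which interpretation
the processor has been told to apply, which is the sense of "datatype" used
here.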

>> In the
>> specific context of an XML document, one needs to know whether to
>> interpret any specific string as a decimal number, a hexadecimal
>> number, a date, a URI, etc.
>
> But this would only be an issue for us if we planned to use these 
> constructs in the SCL syntax, right? And we do not (currently).

I'm planning on using URIs. But I'm on a different project, it seems.
So is Tanel, since he plans to work in RDF/RDFS, which is completely
littered with URIs.

[...]
>>>> That's all I meant, just that if you use
>>>> XML attributes to contain things, they should only contain one kind of
>>>> thing, one kind of datatype. Requiring processors to figure out what
>>>> kind of thing is in an attribute is very poor practice.
>>>
>>> We seem to be on different wavelengths. Are you using 'datatype' in 
>>> the XML Schema part 2 sense, or some other sense? If the former, I 
>>> cannot make sense of what you are saying here. Right now, SCL has NO 
>>> datatypes in it, anywhere. We are here talking about syntax, not 
>>> datatypes.
>>
>> I am talking about syntax. If I have
>>
>>    <quantifier type="d3">
>>      ...
>>    </quantifier>
>>
>> I need to know the datatype of "d3".
> 
> Why? I repeat, this is syntax. Syntax CONSISTS of character strings, or 
> graphical marks arranged in patterns, or whatever: but it consists of 
> MARKS on a surface.  The syntax of XML is a string of characters 
> arranged in various ways using <,> and / rather heavily, for example: 
> but it is still a string of characters. Hypertext and markup (like TeX) 
> rather blur this issue by using syntax to describe the form of other 
> syntax, but the kind of XML usage we are considering isn't markup of 
> other text: it just IS text, pure and simple.

I suppose before we go on we must mutually understand the difference
in your mind between "XML" and "markup". XML is not just a string of
characters, it's an interpretation of a string of characters as a
combination of markup characters and data characters. There is a distinct
differentiation between markup and content, for example, the "<" and ">"
are interpreted differently than "a" or "b".

If SCL's XML expression is text, pure and simple, then you're claiming
that "<" and ">" are not to be interpreted as markup characters? I
clearly don't follow you.

> I do not express a universal quantifier by *describing* it: I 
> distinguish it by having a syntactic convention for writing universal 
> quantifiers, which might be a special symbol (inverted-uppercase-A) or 
> it might be a special reserved word ('forall') or it might be a 
> graphical convention of some kind, or it might be a particular XML tag 
> surrounding it. Datatypes are only meaningful if we are using strings to 
> describe some other domain (numerals to describe numbers, for example), 
> but that should arise here only if we were designing a metalanguage for 
> describing syntax rather than designing an actual syntax.

Well, I only brought up datatypes to differentiate between strings and
URIs. URIs are not simply strings, they're strings interpreted
according to a specification for URIs, namely the IETF RFC.

>> Is it a string or is a URI? I
>> don't want it to be both.
> 
> 
> I don't see how you can avoid its being both, given the definitions in 
> XML Schema part 2.

Where does XML Schema part 2 stop strings and URIs from being
distinct datatypes? In XSD parlance, strings and URIs have different
value spaces and different lexical spaces. One overlaps the other, but
in XML they are given context and are differentiated.

>> I am designing markup and I need to be
>> very specific. Since on the one hand it might be:
>>
>>    <quantifier type="forall">
>>      ...
>>    </quantifier>
>>
>> and on the other it might be
>>
>>    <quantifier type="http://purl.org/cl/scl/1.0/#forall">
>>      ...
>>    </quantifier>
>
> So is your point that these would be different?

They are identical semantically, to be interpreted absolutely identically.

>> I'm trying (not very well) to say that not specifying the datatype
>> of the 'type' attribute would be very poor practice.
> 
> First, we do not need this attribute. Second, I fail to see why it would 
> be poor practice, since we only need two values of the attribute in any 
> case.
> 
>> It can't be both a string and a URI.
> 
> Why not?

Are we talking XML?

>> SCL processors wouldn't know how to
>> interpret/process it. So what I'm suggesting is that rather than
>> use an attribute, we break it off into an element, well, actually
>> two elements (showing a more appropriate expression of the previous
>> two examples):
>>
>>    <quantifier>
>>      <type>forall</type>
>>      ...
>>    </quantifier>
>>
>>    <quantifier>
>>      <typeRef xlink:href="http://purl.org/cl/scl/1.0/#forall"/>
>>      ...
>>    </quantifier>
>>
>> I can design an XML schema (a DTD, RELAXNG grammar or XML Schema
>> document) to define this easily. There's no ambiguity. In DTD
>> syntax this would be (ignoring the "..." content for now):
>>
>>   <!ELEMENT quantifier  ( type | typeRef ) >
>>   <!ELEMENT type  ( #PCDATA )* >
>>   <!ELEMENT typeRef  EMPTY >
>>   <!ATTLIST typeRef
>>       xlink:href    CDATA    #REQUIRED
>>   >
>>
>> This declares three element types, <quantifier>, <type>, and <typeRef>.
>> The content of <quantifier> is one <type> OR one <typeRef>. The former
>> contains character data (PCDATA), the latter is a linking element
>> that has an attribute containing a URI. If we were using RELAXNG or
>> XML Schema this would look a hell of a lot more complicated, but we
>> could constrain the contents of <type> and <typeRef> so that a validator
>> would flag an error if it found the wrong datatype. Actually, there's
>> a trick in DTDs called "enumerated values" where we could change <type>
>> from having element content to having an attribute, then constrain the
>> attribute to having a fixed set of name tokens, such as "forall" or
>> "exists". Then even XML 1.0 validators using DTDs could flag errors.
>>
>> I'm guessing this is more than you wanted to know, but I'm trying to
>> be clear.
> 
> 
> All of this seems to be solving a problem (in fact an entire series of 
> problems) which should never have been allowed to arise in the first 
> place. They are artifacts of XML, and have no relevance to SCL.

I thought we were trying to design an XML syntax for SCL. I'm trying
to solve the issues that arise in designing such a syntax. That certainly
seems relevant to me. No, they aren't relevant to SCL as an abstract
syntax. They are artifacts of XML. We are discussing an XML syntax for SCL.
Am I in the wrong room?

>>> <snip>
>>>
>>>>  > Why do you need to get into this complexity of naming a
>>>>  > quantifier? I would be quite happy to have
>>>>  > (forall (?x)(P ?x))
>>>>  > rendered into XML as something like
>>>>
>>>> [I've rewritten the whitespace so I can see it better.]
>>>>
>>>>     <forall>
>>>>       <boundvar>?x</boundvar>
>>>>       <body>
>>>>         <atom>
>>>>           <rel>P</rel>
>>>>           <termseq>?x</termseq>
>>>>         </atom>
>>>>       </body>
>>>>     </forall>
>>>>
>>>> That's fine, and you might be happy, but what about those people
>>>> who want more than two quantifiers? There are more than two, right?
>>>
>>>
>>> Wrong. In SCL syntax there are two.
>>
>>
>> In the base SCL syntax there are two, but if you're creating an XML
>> syntax for a language designed to be extended into other forms of
>> expression, other forms of logic
> 
> 
> We aren't doing that. We are providing a single logic which we think is 
> rather general in its application, and we need an XML syntax for that 
> logic.
> 
>> , there's many more than two.
> 
> 
> No, there are two. (Or maybe four, if we allow numericals.) There will 
> never be many more. There have been two quantifiers in logic since 
> Peirce and Frege were writing; arguably since Aristotle. There is 
> absolutely no need to make the set of quantifiers extendable.
> 
>> I'm
>> suggesting a means of allowing constraints that would allow a
>> restriction to the exact set of quantifiers, connectives and
>> predicates in SCL, but also allow it to be extended.
> 
> 
> There is absolutely no need to allow for extending the quantifiers and 
> connectives, any more than English needs to allow for new grammatical 
> forms. Extending the set of predicates is another matter, but there the 
> error is to label them as 'predicate' in the syntax in the first place.
> 
> (It might make sense to allow for extending SCL by entirely other 
> classes of syntactic operators. For example, someone might want to add 
> modalities to the language, or have a special comment attachment syntax 
> for encoding detailed provenance information, or to allow terms with 
> bound variables such as lambda-expressions . But these would be entirely 
> new classes in the syntax, not new types of connective or quantifier.)

The idea with XCL (my proposal for SCL expressed in XML, i.e., an
XML markup language expressing SCL), is that the Level 1 syntax is
an exact expression of SCL with nothing else. XCL Level 2 is the
XML framework upon which modal and other forms of logic can sit.
They're not in conflict, and because Level 1 syntax has a Level 2
isomorphism, there's no reason why XCL Level 2 can't be taken
entirely offline and I'll just take over the known world by myself,
thankyouverymuch. But the expression of SCL-in-XCL shouldn't preclude
being extended. I'm just trying to coordinate what you little people
do with what I'm doing while taking over the world.

>> SCL processors
>> could have two modes (easily), one that flagged an error if it
>> came across a name it didn't know, and one that allowed other names.
>>
>>>> There's #$thereExistsAtMost and perhaps dozens of others.
>>>
>>>
>>> No, none of these are quantifiers in SCL. We may add numerical 
>>> quantifiers, but then there will still only be four at most.
>>
>> I'm designing an XML syntax that supports SCL but can be extended.
> 
> Well, that is a different project. I don't myself think it is a very 
> worthwhile one, by the way, but it is certainly very different in any case.

Then I'm probably in the wrong room, as I said previously.

>> I'm trying to provide a means of extension that isn't ugly and
>> doesn't require people to create their own markup in order to
>> extend the base language.
> 
> I am not particularly interested in allowing users to extend the base 
> language. If people ever plan to extend the language then they will 
> presumably use new syntax categories, and the XML markup of their 
> extended languages will reflect that extra structure. But we cannot 
> pre-guess what that will be or what it will mean.

My point is that with a very minor variation in syntax (not even in
using XLink or anything fancy, just moving some stuff from being
attributes to being child elements), my megalomaniacal fantasies can
be lived out to their fullest.
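
As a rough illustration of the child-element form, the following Python sketch
resolves either a <type> reserved name or a <typeRef> link to the same
identifier, and supports the two processor modes mentioned earlier in this
thread (flag unknown names, or let them through). Everything here is an
assumption for illustration: the function name, the reserved-name table, the
"exists" URI (formed by analogy with the "forall" one), and the xmlns:xlink
declaration, which the examples in this thread leave out.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical table of reserved names; only the "forall" URI appears in
# this thread, the "exists" entry is assumed by analogy.
RESERVED = {
    "forall": "http://purl.org/cl/scl/1.0/#forall",
    "exists": "http://purl.org/cl/scl/1.0/#exists",
}


def quantifier_uri(xml_text: str, strict: bool = True) -> str:
    """Return the identifier of a <quantifier>, whichever form it uses.

    strict=True  raises ValueError for names or URIs outside RESERVED.
    strict=False passes unknown names and URIs through unchanged.
    """
    elem = ET.fromstring(xml_text)
    type_el = elem.find("type")
    ref_el = elem.find("typeRef")
    if type_el is not None:
        # <type> always holds a reserved name, never a URI.
        name = (type_el.text or "").strip()
        if name in RESERVED:
            return RESERVED[name]
        if strict:
            raise ValueError(f"unknown quantifier name: {name!r}")
        return name
    if ref_el is not None:
        # <typeRef> always holds a URI in its xlink:href attribute.
        uri = ref_el.get(f"{{{XLINK_NS}}}href")
        if uri is None:
            raise ValueError("typeRef has no xlink:href attribute")
        if strict and uri not in RESERVED.values():
            raise ValueError(f"unknown quantifier URI: {uri!r}")
        return uri
    raise ValueError("quantifier has neither <type> nor <typeRef>")


BY_NAME = "<quantifier><type>forall</type></quantifier>"
BY_REF = (
    '<quantifier xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<typeRef xlink:href="http://purl.org/cl/scl/1.0/#forall"/>'
    "</quantifier>"
)
UNKNOWN = "<quantifier><type>thereExistsAtMost</type></quantifier>"

print(quantifier_uri(BY_NAME) == quantifier_uri(BY_REF))
print(quantifier_uri(UNKNOWN, strict=False))
```

Because each element holds exactly one kind of thing, the processor never has
to guess whether a value is a name or a URI; it only has to look at which
element it found.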

>>>>  So what
>>>> I'm proposing is not hardwiring the quantifier type as an element
>>>> name, and having it specified either as a "reserved name" or a PSI.
>>>> The idea of using "reserved names" is not some XML thing, it's just
>>>> an idea of a shorthand for SCL since I'm guessing you all would
>>>> rather not be typing long URLs all the time, and there are a small
>>>> set of initial types.
>>>>
>>>> So, these two examples would be identical semantically. You probably
>>>> don't need the "?" in front of the x in <boundvar> since it can be
>>>> implied in the context.
>>>
>>> A distinctive prefix for variables is a widely used convention. I 
>>> would abandon it only very reluctantly and under the pressure of a 
>>> very cogent reason.
>>
>> The reason is that the notation is not classical FOL, it's XML.
> 
> That is irrelevant; the point is to make it easy to transfer syntactic 
> forms between different syntaxes. If the XML syntax is the only one 
> which writes variables differently, what is gained by being 
> idiosyncratic? It's not as though one or two extra characters are likely 
> to be noticed in the deluge of tag names, after all.

No, that's correct. I was just pointing out that the "?" is entirely
unnecessary. You yourself have said this is all to be interpreted by
computers. We can assume the upside down "A" is interpreted as a "forall",
and <boundvar>x</boundvar> is interpreted as "?x" would be in classical
FOL. In the XML syntax you don't need the question mark.
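
A small Python sketch of that reading, under stated assumptions: the element
names follow the <forall> example quoted above, an <exists> element is assumed
by analogy, <termseq> is assumed to hold whitespace-separated terms, and the
function names are hypothetical. The processor restores the "?" prefix from
context, so the XML itself does not need to carry it.

```python
import xml.etree.ElementTree as ET


def to_prefix_form(xml_text: str) -> str:
    """Render the XML form back into the parenthesized text notation."""
    return _render(ET.fromstring(xml_text), frozenset())


def _render(elem, bound):
    if elem.tag in ("forall", "exists"):
        names = [(bv.text or "").strip() for bv in elem.findall("boundvar")]
        scope = bound | set(names)      # variables visible inside the body
        body = elem.find("body")
        inner = _render(body[0], scope)  # first child of <body>
        variables = " ".join("?" + n for n in names)
        return f"({elem.tag} ({variables}){inner})"
    if elem.tag == "atom":
        rel = (elem.findtext("rel") or "").strip()
        terms = (elem.findtext("termseq") or "").split()
        # A term naming a variable bound in an enclosing quantifier gets
        # its "?" back; any other term is left as written.
        rendered = ["?" + t if t in bound else t for t in terms]
        return "(" + " ".join([rel] + rendered) + ")"
    raise ValueError(f"unexpected element: {elem.tag}")


SAMPLE = """
<forall>
  <boundvar>x</boundvar>
  <body>
    <atom>
      <rel>P</rel>
      <termseq>x</termseq>
    </atom>
  </body>
</forall>
"""

print(to_prefix_form(SAMPLE))  # (forall (?x)(P ?x))
```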

[...]
> Well, maybe, but please tell us what your project is, and then we can 
> discuss it. For a start, why do you think that anyone will want to 
> extend the quantifiers and connectives, and what role do you see XSCL 
> having in markup?

I've clearly published my project and referred to it a number of times.
I've tried to describe it repeatedly. XCL Level 1 is an expression of
SCL in XML. XCL Level 2 is an extension of Level 1 that allows for
web things like links, URIs, etc. Both you and Tanel apparently think
you can express SCL in RDF. Well, with RDF you are already in "Level 2"
territory, with links, URIs, etc. All I'm trying to do is have a basic
SCL-in-XML syntax that is in great harmony with its Level 2 syntax,
i.e., that they are semantically *identical*, with *nothing* added by
the move in levels, a smooth transition. You won't get that with RDF,
as you've added the entire mess of RDF/RDFS semantics to it. Whether
you believe it or not.

>> And that's not a bad thing, even if you're not interested in it.
>
> I am primarily interested, here, in getting this project done. If 
> aligning it with yours will slow it down or distract it from its goals, 
> that *is* a bad thing, I'm afraid. But let us not prejudge the issue, by 
> all means.

If by "this project" you mean SCL, then I'm only an encumbrance. If you
mean an XML expression of SCL, that's what I've been proposing.

I'm increasingly thinking I'm in the wrong room.

Murray

...........................................................................
Murray Altheim                         http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK                    .

   "We can now hypothesise with some confidence that those apparently
    happy, calm Buddhist souls one regularly comes across in places
    such as Dharamsala, India, really are happy," said Professor Owen
    Flanagan, of Duke University in North Carolina. -- BBC News
    http://news.bbc.co.uk/1/hi/health/3047291.stm



