[SCL] SCL and XML syntax translation questions
pat hayes
phayes at ai.uwf.edu
Tue Dec 24 19:45:40 CST 2002
>Hi,
>
>Wanted to ask your opinion on how should the SCL-lisp <-> SCL-XML
>transformation look like.
Quick aside comment: we ought to think in terms of mappings to and
from the abstract syntax. Same issues apply, of course.
>I think we should get a basic agreement
>on this before we attempt to fix the specifics of SCL-lisp
>syntax and semantics.
I hope we can do the abstract syntax + MT more or less independently
of the surface syntaxes.
>Here are some naive alternative solutions for discussion:
I am not an XML expert, but I think it is essential to keep in mind
that we need to have at least one syntax which allows SCL-XML to be
used alongside, and integrated with, the XML surface syntax which has
already been chosen for RDF and (hence) for OWL.
RDF (and all the subsequent languages) have essentially distinct
syntaxes for binary predications and others. Unary predications are
subsumed under binary ones by using an explicit 'apply' relation
(called rdf:type) , and higher arities can be handled by using lists,
eg R(a, b, c, d) could be glossed as having the structure R(a, L) &
L=[ b, c, d] . (In fact, the RDF actually translates as
(exists (L)( R(a, L) & L=[ b, c, d] )) with the list described
using a conjunction of existential assertions about the dotted-pair
components of the list, in this case something like
(exists (L, L1, L2)(R(a, L) & first(L,b) & rest(L, L1) & first(L1, c)
& rest(L1, L2) & first(L2,d) & rest (L2, nil) )
An alternative strategy used in RDF (and recommended by the RDF
'primer') would be to resort to a case-grammar style, and use
auxiliary binary relations so that it comes out like this:
(exists (x) (R(a, x) & R1(x, b) & R2(x, c) & R3(x, d) )
I KNOW these are ugly, but for all its faults the use of some
reduction to the binary case is both widely practiced and an existing
standard, and we need to be able to integrate it and work with it.
Pat
>To transform a formula
>
>(implies
> (forall (?x)
> (P (succ ?x)) )
> (P (succ 0)) )
>
>we could have smth like (several different encoding methods):
>
>1) Some untyped nested list translation
> with NO function->tag conversion, but a special
> treatment for varlist (I'd probably prefer this
> format, since it is least restrictive and does
> not convey more information than the lisp syntax):
>
> <term>
> <term>implies</term>
> <term>
> <term>forall</term>
> <varlist>
> <var>?x</var>
> </varlist>
> <term>
> <term>P</term>
> <term>
> <term>succ</term>
> <term>?x</term>
> </term>
> </term>
> </term>
> <term>
> <term>P</term>
> <term>
> <term>succ</term>
> <term>0</term>
> </term>
> </term>
> </term>
>
>To be really faithful to list syntax we should perhaps
>drop the <varlist> tag and simply write:
>
> <term>
> <term>?x</term>
> </term>
>
>Otherwise the parser/converter must KNOW which functions
>are quantifiers and which not. This might be hard to
>achieve for exotic logics and specific quantifiers people may
>want to use.
>
>2) General nested list translation with function->tag conversion and
> exception for varlist. Here the parser/converter again
> needs to know which functions are quantifiers and which not.
>
> <implies>
> <forall>
> <varlist>
> <var>?x</var>
> </varlist>
> <P>
> <succ?>?x</succ>
> </P>
> </forall>
> <P>
> <succ>0</succ>
> </P>
> </implies>
>
>3) Some kind of a rather strongly typed nested list translation
> which explicitly contains information about the formula structure
> apparently NOT present in SCL-lisp syntax (observe that
> it would not be so easy for higher order case where we cannot
> have <literal fun="P"> anymore, since fun may be a complex term).
>
> Here the parser/converter needs to know an awful lot, putting
> serious restrictions on the practical usage of exotic logics:
>
> <prop fun="implies">
> <quantifier fun="forall">
> <varlist>
> <var>?x</var>
> </varlist>
> <literal fun="P">
> <term fun="succ">
> <atomicterm term="?x"/>
> </term>
> </literal>
> </quantifier>
> <literal fun="P">
> <term fun="succ">
> <atomicterm term="0"/>
> </term>
> </literal>
> </prop>
>
>
>4) Something completely different?
>
>
>About the varlist and variable/constant encodings:
>
>Varlist:
>
>It seems to me that the variable list is the only set
>construction in the formula syntax: for other lists the first
>element of a list is a distinguished function/predicate/connective/etc
>term and the order of elements normally matters.
>
>This seems to require a special treatment in the XML syntax as well.
>
>The XML syntax must also allowed typed variables to be
>present in the list, preferably a mix of
>typed/untyped like in (?x (int ?y) ?z).
>
>We could, in principle, remove the problem by requiring
>a quantifier to bind a SINGLE variable, but I guess this
>will be cumbersome to use and is not compatible with KIF
>and CL ideology.
>
>The first example above proposes an alternative dumb solution
>of treating (?x (int ?y) ?z) as just another term in XML.
>
>
>Variable/constant encodings:
>
>BTW, is there a reason to REQUIRE that variables start with ?
>or is it just a cozy convention we do not have to force?
The latter. Lexical markers for variables should be considered an
idiosyncracy of the surface syntax, and we can choose them to suit
ourselves. The chief considerations to bear in mind are consistency
with existing standards, ease of writing parsers, and familiarity, in
that order, I would suggest.
>
>If we prohibit open formulas at top level, then we do not need vars to
>start with ?. In "ordinary" theorem provers which take
>clausal input (hence open formulas) the variable/constant
>distinction is mostly done using different conventions, like
>Prolog-style capital/noncapital or Otter-style first character
>convention of x,y,z,u,v,w.
I don't think either of thse can be usefully extended to a larger
setting, however, since for example we have to allow users to write
urirefs as constant names, and there is no prohibition against a
uriref starting with any ascii symbol. So things like reserved
alphabets and case distinctions cannot be used as a general mechanism.
>If open formulas at top level are
>prohibited, then we do not need to force any conventions
>upon other people in this respect.
That would be ideal.
Strictly speaking there is no absolute need to even make a syntactic
distinction between variables and nonvariables: any name that is
bound by a quantifier is considered to be a 'variable' and otherwise
it isn't as far as the CL model theory is concerned. However, I don't
think that implementers will be very happy with such a language.
(What is your view??)
Pat Hayes
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32501 (850)291 0667 cell
phayes at ai.uwf.edu http://www.coginst.uwf.edu/~phayes
s.pam at ai.uwf.edu for spam
More information about the Scl
mailing list