[SCL] SCL spec

Murray Altheim m.altheim at open.ac.uk
Tue May 20 04:53:34 CDT 2003


[I'm sending two messages to the list because I wrote them yesterday
and my wonderful SMTP server wouldn't accept them. I realize today
that John has replied to one of my earlier messages, and will follow
up on that in a separate message dated today, 20 May. This is 1 of 2.]

pat hayes wrote:
  >> pat hayes wrote:
  >>
  >>>> [small correction to my last message to avoid any confusion.]
  >>>>
  >>>> I wrote:
  >>>> [...]
  >>>>
  >>>>> 4. Finally, as I've done in my XCL proposal, all components of
  >>>>>    the language that may be referred to conceptually either in
  >>>>>    discussions or more particularly by tools should be given a
  >>>>>    unique identifier. Topic Maps introduced a concept called
  >>>>>    Published Subject Indicators (PSIs) that are basically just
  >>>>>    stable URIs used as identifiers. The idea here (which would
  >>>>>    be useful in Topic Map, RDF, or any other web-related system)
  >>>>>    is that if two things identify themselves as having the same
  >>>>>    PSI, they are semantically equivalent.
  >>>
  >>> Oh dear, that is a VERY strong statement (and it seems to me rather
  >>> naive, to be honest.) Semantically equivalent in what language, with
  >>> what model theory? Do you mean they denote the same entity in all
  >>> possible models? (Please don't say yes to that question, at least not
  >>> quickly: see the uri at w3c.org email archive for some heated
  >>> discussions.) This whole matter of there being a global semantics for
  >>> URIs seems to me to be woefully under-analyzed at present, and almost
  >>> everything written about it is rather naive. The REST model (and the
  >>> diluted version assumed in RFC 2396) isn't up to the task without being
  >>> extended, for sure.
  >>
  >> Perhaps I'm being unclear. It's not that machines ever declare two
  >> things as having identity, but rather people.
  >
  > People almost never achieve semantic equivalence in normal
  > communication. Attempts to approximate it rapidly become unwieldy and
  > expensive (eg legal terminology, technical in-house terminology,
  > mathematical notations).

Now, even in my naivete I would not expect a concept to be equivalent
in all possible models, but only within specific interpretations. Now,
if you and I are designing systems and we wish to intercommunicate a
specific concept, then we use a symbol to represent that concept. We
do this all the time with natural language, which of course is full of
ambiguity (which is often a good thing, otherwise we probably couldn't
communicate at all). So if we both agree to use an identifier, we can
do so either "correctly" (for whatever purpose we have in mind) or
"erroneously". Am I right in thinking that in your opinion we can never
do so "correctly"?

  >> There are no semantics
  >> whatsoever in URIs, in my opinion.
  >
  > I disagree, but I think the semantics is subtle and has not yet been
  > adequately formalized.

This is the position that I believe TimBL takes, which I think is wrong.
URIs are strings used as identifiers. They in themselves have no meaning,
just as any identifier absent any context (absent any interpretable
context) has no meaning. "dog" does not mean anything specific absent
a context. The three characters d+o+g are just characters until they
are interpreted. I doubt that the attempt to adequately formalize URIs
will ever come to any fruition. As you may know, the W3C has spent man-years
trying.

Now, the reason "dog" is ambiguous is that there are multiple definitions
and multiple interpretations, depending on context. That's unavoidable in
natural language, and probably to some degree in formal languages too.
A PSI is an attempt (and perhaps only an attempt) to create a formal
definition for a concept.

  >> They're just strings used as
  >> identifiers (some of which may resolve to something, but that for
  >> the purposes of this discussion is not important
  >
  > It is when we have things like imports statements, or ontologies which
  > rely on Qname syntax to resolve meanings.

I don't see how Qname syntax is any different than PSIs in resolving
meanings. There's the same level of ambiguity. PSIs define a concept
within a namespace (i.e., the namespace created by the base URI of
the PSI).

  >> ). This isn't some
  >> instance of nominalism, this is unambiguous identifiers within
  >> specific domain namespaces being used to identify concepts.
  >>
  >>>>>  For example, if OWL
  >>>>>    has a concept "http://www.w3.org/2002/07/owl#" and my Ceryle
  >>>>>    project has a concept
  >>>>>    "http://purl.org/ceryle/psi/authoring/#Thing",
  >>>>
  >>>>
  >>>> [...]
  >>>>
  >>>> Cut and pasting I neglected to add "Thing" to the OWL identifier.
  >>>> This should be "http://www.w3.org/2002/07/owl#Thing" being compared
  >>>> with "http://purl.org/ceryle/psi/authoring/#Thing". Sorry if this
  >>>> caused any confusion.
  >>>>
  >>>> Without this feature it's difficult (at a tool level) to claim
  >>>> identity between systems for even things such as "true" or "false".
  >>>
  >>>
  >>> Why would you expect it to even be meaningful to claim identity
  >>> between *systems* ? Of course it is difficult to claim identity
  >>> across systems which have different underlying semantics: it ought to
  >>> be difficult, because it is meaningless.  That is one of the main
  >>> purposes of the application of SCL to the Web, to try to provide a
  >>> globally coherent semantic framework which can underlie the various
  >>> surface notations.
  >>
  >>
  >> Again, I was unclear. The purpose of PSIs is to identify concepts
  >> unambiguously.
  >
  > That seems like an impossible goal. We know that no finite expression
  > can completely identify the concepts of arithmetic, such as 'zero'.

Then why do we bother with computer-based ontologies at all? I don't
expect we shall ever live in a perfect world; we're all just making
do with the best there is, and trying to improve that "best there is".
So if I can't rely on a definition of "zero" I certainly can't rely
on a definition of "dog", and the basis of the "Semantic Web" is bogus.

Right?

  >> If you and I have two systems, and I'm trying to
  >> map the interpretation of concepts within your system to the
  >> interpretation of concepts within mine (e.g., I can determine that
  >> my "forall" is the same as your "forall" by reading the documentation
  >> of both systems, therefore their two PSIs identifying the concepts
  >> can be mapped as equivalent).
  >
  > But you will almost never find that.  For example, the notion of 'class'
  > changes when one moves from RDFS to OWL.

Yes, and a designer who read the definition of "class" in RDFS and OWL
and equated the two would be making a mistake. Since "class" isn't
defined in both as simply a name token but as a URL, we can distinctly
define them as not equivalent. We can do that unambiguously. By the
same token, if I as a designer *want* to define my "Thing" as having
the same semantics as Cyc's #$Thing, I can state that my PSI is
equivalent to Cyc's PSI. We then have some basis upon which to
communicate, i.e., we've agreed upon some terminology. Yes, there are
some problems with that, but I don't know how else to communicate. We
all must agree upon terms at some point or there's no point in talking.
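
As a minimal sketch of the point above (an illustration only, not code
from Ceryle, Cyc, or any PSI tool): two identifiers count as equivalent
only where a person has declared them so, and identifiers with different
URIs stay distinct by default. The class name EquivalenceRegistry and its
methods are hypothetical, and the Cyc URL for "Thing" is an assumption
that presumes Cyc had published its documentation URLs as PSIs.

```python
# Equivalence is never inferred from the identifier strings themselves;
# it exists only where a human designer has declared it.

RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
CERYLE_THING = "http://purl.org/ceryle/psi/authoring/#Thing"
# Hypothetical: assumes Cyc declared this documentation URL to be a PSI.
CYC_THING = "http://www.cyc.com/cycdoc/vocab/fundamental-vocab.html#Thing"


class EquivalenceRegistry:
    """Union-find over identifier strings; only declared links count."""

    def __init__(self):
        self._parent = {}

    def _find(self, psi):
        # Register unseen identifiers as their own representative.
        self._parent.setdefault(psi, psi)
        while self._parent[psi] != psi:
            # Path halving keeps later lookups short.
            self._parent[psi] = self._parent[self._parent[psi]]
            psi = self._parent[psi]
        return psi

    def declare_equivalent(self, a, b):
        """Record a designer's statement that a and b identify one concept."""
        self._parent[self._find(a)] = self._find(b)

    def equivalent(self, a, b):
        """True only if a chain of declarations links a and b."""
        return self._find(a) == self._find(b)


registry = EquivalenceRegistry()
registry.declare_equivalent(CERYLE_THING, CYC_THING)

same_thing = registry.equivalent(CERYLE_THING, CYC_THING)
same_class = registry.equivalent(RDFS_CLASS, OWL_CLASS)
```

Here same_thing holds because of the explicit declaration, while
same_class does not, since nobody has asserted that the RDFS and OWL
notions of "class" coincide. Whether a given declaration is justified
remains a human judgment that the registry cannot check.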

I'm still learning your language, and it might be naive of me to try,
but if I give up, I go home. Right?

  >>>> [perhaps someone can answer me privately on this question: I can't
  >>>> find "true" or "false" among OWL or RDFS. Is it there and I'm
  >>>> missing it, or not? Thanks.]
  >>>
  >>> Not explicitly. What do you expect those names to mean? There is a
  >>> 'universally true' class, ie the universe class, which is
  >>> rdfs:Resource in RDFS and owl:Thing in OWL; similarly the
  >>> 'universally false' (empty) class in OWL is owl:Nothing. Different
  >>> versions of OWL take different views on the relationship between
  >>> rdfs:Resource and owl:Thing, by the way. In OWL-Full they are the
  >>> same; in OWL-DL, owl:Thing is a subset of rdfs:Resource, because the
  >>> OWL-DL universe doesn't contain classes or properties, for example.
  >>>
  >>> These languages do not have names for what in logic would be called
  >>> the truth-values themselves.
  >>
  >> Fair enough. I was thinking in terms of truth values, not classes.
  >
  > There is a natural mapping from truthvalues in logics to classes in a
  > description logic.

Are they identical semantically? If so, an XML expression of logics and
an XML expression of DL could use PSIs to describe that identity.

  >> In Cyc
  >> (with which I'm most familiar) it defines #$True as
  >>
  >>    An instance of #$TruthValue. #$True is logical truth in Cyc; this is
  >>    the abstract logical notion--not to be confused with Lisp's T, nor
  >>    with the English word `true'.
  >
  > That is not a definition; it is a textual comment.

It's the only definition I have available to me from Cyc, so if I agree
with it, I consider it a definition. That's my prerogative to do in
intercommunication, as it always must be.

  >> So if in my Ceryle system I implement an ontology using a root concept
  >> of "Thing" and I agree with Cyc's #$Thing, shouldn't it be possible to
  >> consider them equivalent
  >
  > Almost certainly no. It would be a lot of work to demonstrate that they
  > were, for sure. Certainly Cyc's usage in which truth-values are
  > themselves items in the universe of discourse is not mirrored in SCL
  > (though it could be in a CYC-oriented SCL ontology, I guess)

Perhaps I'm digressing by using my own project Ceryle as an example,
and maybe using something so inherently complex as #$Thing as an example
is not a good idea either. But if Cyc has #$TruthValue, #$True and #$False,
I can use those concepts in my system and claim identity with my own
concepts. That's all I was trying to establish.

If I begin at the ground level to develop a "knowledge-based"
system that at its core uses well-known concepts from other systems, and
I can identify theirs and my concepts via fixed identifiers, it should
be possible (difficult perhaps) to provide inferencing between expressions
in both systems.

  >> , and create a map equivalency between them
  >> (i.e., I as a human state the relation so that my computer will consider
  >> them equivalent)?
  >
  > You as a human can do whatever you like, but it would be a very bad
  > (dangerous) idea to write software on that basis.

Isn't that exactly what the "Semantic Web" folks are doing?

  >> Would you say that Cyc's set of constants #$implies, #$not, #$and,
  >> #$or, #$forAll, #$thereExists have identity (semantically)
  >> with the SCL list of tokens, "implies", "not", "and", "or", "forall",
  >> "exists"?
  >
  > Definitely not.

Why not? (an explanation here would be very helpful, as I would have thought
that at this sort of fundamental level, they would be identical)

  >> If so (and this I would imagine can be determined by reading
  >> the documentation of each), then (is there an equivalent to iff?) we
  >> can map the PSIs between them:
  >>
  >>          http://purl.org/xcl/1.0/#implies
  >>    maps to
  >>          http://www.cyc.com/cycdoc/vocab/fundamental-vocab.html#implies
  >>
  >>          http://purl.org/xcl/1.0/#exists
  >>    maps to
  >>          http://www.cyc.com/cycdoc/vocab/fundamental-vocab.html#thereExists
  >>
  >>   et cetera.
  >
  >
  > Well, yes, but we can do this with any other URIs or indeed any other
  > identifiers at all.  What functionality does the PSI provide in particular?

It's just an identifier. The difference is that there's a framework
for publishing sets of URLs (as PSIs) that establishes some metadata
about the sets, author, revision info, plus they are by definition
supposed to have some human-readable documentation, whereas URLs are
simply identifiers. PSIs are supposed to be (relatively) stable
too (as declared by their publisher), whereas a URL doesn't necessarily
have any stability. Obviously, a PSI from chevySales.com won't be as
trustworthy as a PSI from the US Library of Congress. Trust is something
we all have to deal with in intercommunication.
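
To make the contrast with a bare URL concrete, here is a rough sketch of
a published PSI set that carries the metadata listed above. The class
name PsiSet, its field names, and the placeholder documentation strings
are all assumptions made for illustration; none of them comes from the
PSI recommendations or from the XCL proposal. Only the base URI and the
names "implies" and "exists" appear in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class PsiSet:
    """A set of identifiers published together with metadata about the set."""

    base_uri: str          # namespace created by the base URI of the PSIs
    publisher: str         # who vouches for the set (a basis for trust)
    revision: str          # revision info for the set
    declared_stable: bool  # stability is declared by the publisher, not enforced
    docs: dict = field(default_factory=dict)  # name -> human-readable text

    def psi(self, name):
        """Return the full identifier, but only for documented subjects."""
        if name not in self.docs:
            raise KeyError(f"no published documentation for {name!r}")
        return self.base_uri + name


xcl = PsiSet(
    base_uri="http://purl.org/xcl/1.0/#",
    publisher="Murray Altheim",
    revision="1.0",
    declared_stable=True,
    docs={
        # Placeholder wording, not the actual published definitions.
        "implies": "logical implication",
        "exists": "existential quantification",
    },
)

implies_psi = xcl.psi("implies")
```

A bare URL offers only the string; the set adds who published it, which
revision it is, whether the publisher promises stability, and something
a person can read before deciding to agree with it.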

  >> assuming the publishers of Cyc decided to use their HTML URLs
  >> as PSIs. This would simply be a matter of them declaring that to
  >> be the case, and to conform to the PSI recommendations/best
  >> practices, to publish some metadata about the set. You can see
  >> this at:
  >>
  >>   http://purl.org/xcl/1.0/#psimeta
  >>
  >> Now, I'm perhaps (okay, likely) naive, but I'm not sure what else
  >> one can do to state the equivalence between two concepts.
  >
  > I would presume that the relationship between them would be stated in an
  > SCL ontology of some kind. The simplest 'mapping' would just be a bunch of
  > equations, though that is unlikely to be adequate in this case.

Well, for the entire set of Cyc that might be difficult, but if you're
trying to map between Cyc and SUMO, you'll need to be able to identify
the concepts you're trying to map, which is where PSIs could come in.

Murray

...........................................................................
Murray Altheim                         http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK                    .

     Jessica Lynch became an icon of the war, and the story of her capture by
     the Iraqis and her rescue by US special forces will go down as one of the
     most stunning pieces of news management yet conceived. It provides a
     remarkable insight into the real influence of Hollywood producers on the
     Pentagon's media managers, and has produced a template from which America
     hopes to present its future wars.  -- The Guardian, 15 May 2003
     http://www.guardian.co.uk/g2/story/0,3604,956127,00.html





