[SCL] XML question

pat hayes phayes at ihmc.us
Fri Jan 23 14:14:54 CST 2004


>I think we've answered most of the questions, including my reply
>to John, but I'll cover a few that weren't, below.

Yep, thanks for the input. You've convinced me about most of this. 
However, I still have one question.

>
>pat hayes wrote:
>[...]
>>>I generally avoid putting anything like document
>>>content within attribute values, so single and double quotes are usually
>>>not an issue. Since you mention specifically the idea of putting SCL
>>>within an HTML document as attribute values, you're taking a fairly big
>>>risk, in that a misplaced quote will do an unending amount of harm.
>>
>>Well, ANY misplaced illegal character will do a lot of harm, right? 
>>A misplaced " can render any piece of XML illegal.  So what's 
>>special here?
>
>If you consider the two places that you can put document content, within
>attribute values, and within element content (i.e., between the start
>and end markup tags), there are fewer practical restrictions on the latter.
>You don't type ampersands that often, but single and double quotes show
>up a lot. Mistakes in attribute content are hard to track, whereas mistakes
>in element content are usually pretty simple to locate. A misplaced quote
>in element content does no harm, since it's not interpreted.

....

>>Well, maybe. I'd still like to have a shot at it, if it can be done 
>>without too much pain. The world badly needs a way to put semantic 
>>markup into Web pages without breaking browsers.
>
>Then do it the right way by embedding actual semantic XML markup,
>rather than something that will require a specialized parser. If
>you embed an XML version of SCL, the existing browser tools that
>don't choke on things they don't understand, and can accept XML,
>will be able to process it. That's the way of the future, we all
>hope.

Well, maybe, but it's certainly not the way of the present. Maybe I 
haven't made this clear. The point of my third goal is to be able to 
insert SCL content into HTML in such a way that the SCL is invisible 
to a browser - is not rendered by the browser - but is available to 
other tools linked to the browser. How can this be done using XML? If 
I simply include some random XML element inside XHTML, browsers 
render the element content. Every browser I've tried this on insists 
on rendering all the element content, even if it is enclosed in 
meaningless (from XHTML's point of view) elements. That is exactly 
what I want it NOT to do. Putting it into an attribute value is the 
only way I've found of doing this.
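
To make the contrast concrete, here is a minimal Python sketch. The 
attribute name scl, the element name <scl> and the KIF-like expression 
are all made up for illustration, and joining the text nodes is only a 
rough stand-in for what a browser would display.

```python
import xml.etree.ElementTree as ET

# The same (invented) SCL text appears twice: once as an attribute value
# on a <span>, once as the content of an element XHTML knows nothing about.
fragment = (
    '<p>The capital is '
    '<span scl="(capital-of Paris France)">Paris</span>'
    '<scl>(capital-of Paris France)</scl>.</p>'
)
root = ET.fromstring(fragment)

# Rough stand-in for the rendered page: every text node, in document order.
visible = "".join(root.itertext())

# What a tool linked to the browser could still read from the attributes.
hidden = [e.get("scl") for e in root.iter() if e.get("scl") is not None]

print(visible)  # the element content leaks into the text, the attribute does not
print(hidden)
```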

If there is some obvious (to you) technique that I am missing, please 
enlighten me.

>[...]
>>>>I'm sure that the XML community has come to an agreement on a 
>>>>suitable best practice to follow in a case like this, and would 
>>>>appreciate any guidance or input.
>>>
>>>I think in order to answer your question better I'd need to know
>>>what the design goal is, really.
>>
>>
>>I have three. The main one is to be able to transmit a variety of 
>>surface syntaxes for SCL inside XML, safely. That is, it ought to 
>>be easy to write small pieces of code which will 'dump' some SCL 
>>surface syntax into a standard XML format which is XML-legal and 
>>can be sent via an XML pipe to an XML parser which will then spit 
>>out the original SCL surface syntax. This ought to be a useful 
>>general-purpose way to use XML to communicate SCL from one place to 
>>another without needing to translate it or do any complicated SCL 
>>parsing of one surface syntax into another. What I have in mind 
>>here is something like a set of attributes which can be used 
>>to specify the syntactic form, then just enclosing the surface syntax 
>>as PCDATA text inside suitable elements. Call this SCL-in-XML.
>
>Apart from escaping '<' and '&' (and while not technically necessary,
>'>' also), you can put anything you like within element content. So
>you'd only have to escape two or three characters if you don't try
>putting things in attribute values. If you're going to put things in
>attribute values, I'd recommend that tools do conversions of that
>sort in both directions (into attribute values, and out of attribute
>values).
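
A minimal sketch of that round trip for the first goal, in Python. The 
element name scl, the attribute name syntax, the value "kif-like" and 
the sample expression are assumptions for illustration only; nothing 
here is a proposed SCL-in-XML format.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def wrap_scl(surface_text, syntax):
    # escape() handles '&', '<' and '>'; quotes need no escaping
    # in element content. Element and attribute names are hypothetical.
    return "<scl syntax=%s>%s</scl>" % (quoteattr(syntax), escape(surface_text))

def unwrap_scl(xml_text):
    # Any XML parser undoes the escaping and returns the original text.
    elem = ET.fromstring(xml_text)
    return elem.get("syntax"), elem.text

original = '(forall (x) (implies (P x "a<b & c") (Q x)))'
wire = wrap_scl(original, "kif-like")
syntax, recovered = unwrap_scl(wire)

print(wire)
print(recovered == original)
```
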
>
>>The second one, related to the first, is to invent something like 
>>your XCL: that is, a 'standard' XML syntax for SCL itself, using 
>>XML elements to exhibit the SCL syntax structure appropriately. 
>>Call this XCL for now. This will be one of the SCL surface 
>>syntaxes, so it ought to be possible to include XCL inside 
>>SCL-in-XML, but it also has a special status in that it can be the 
>>'official' way to describe the abstract syntax, so all other 
>>surface forms are required to be parsable as XCL. So one way, which 
>>may well be the 'official' way, to transmit SCL is to parse your 
>>surface syntax into XCL and then send that: possibly with some 
>>header information saying what surface form it came from originally.
>
>I don't exactly follow why you'd want to put XCL within an XML
>document in escaped form. I'd just put it in directly as XCL markup.
>I'm guessing that's what you really mean here.

Yes.

>But as I responded
>to John, if the SCL syntax is specified in EBNF grammar, you basically
>get a parser for free, since there are EBNF parser generators (for
>at least Java, but I'm guessing for perl, python, and probably other
>languages as well).
>
>>BTW, this might well be related to the XMI model of the SCL core 
>>syntax that was completed recently. I'll get this up on the website 
>>ASAP.
>
>Good.
>
>>And then there is the third one, which is a minor goal to be able 
>>to include the SCL core surface syntax (the KIF-like syntax in the 
>>document) inside XHTML attributes without breaking a web browser. 
>>This is kind of independent of the first two (I think) and is a 
>>private hobbyhorse of mine.
>
>As I said above, this is probably not too difficult, but you'll
>probably just want a blanket policy of just converting all markup
>characters using tools. The possibility of making a mistake in
>either hand-authoring or eyeball-interpreting a raw XML document
>containing that kind of content is a bit like reading URL query
>strings. There's definitely people that can do it (I can), but it's
>ugly.
>
>>>If you want to be able to put SCL
>>>into an XML attribute value, unescaped, then you want to avoid any
>>>markup characters if possible, including single and double quotes.
>>>Since that's really impractical
>>
>>I don't quite see why you think it is impractical. We are defining 
>>SCL core syntax ourselves, and it's not that hard to define it so 
>>that it doesn't use any of the XML markup characters. So, once so 
>>defined, what is impractical? It's awkward and can be a bit of a 
>>bugger to hand-author when you want to encode arbitrary strings, 
>>but lots of SCL won't be using strings in any case.
>
>If you're going to avoid XML markup characters, it's not impractical.
>It's just that defining a syntax that avoids [<>&'"] is going to make
>things difficult. Why avoid single and double quotes in your SCL just
>to do this?

Well, SCL only needs one kind of quote, so it can leave the other for 
XML to use; and it doesn't need <>& at all. The issue only comes up if 
someone wants to make an SCL assertion *about* a string which might 
contain these symbols (for example, about a piece of XML markup).
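
A small Python sketch of both cases, assuming purely for illustration 
that SCL strings use double quotes (the two expressions are invented, 
not real SCL). The standard quoteattr helper delimits the attribute 
with single quotes when the value contains only double quotes, and 
falls back to entities only for the awkward string-about-markup case.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Ordinary case: only double quotes, no XML markup characters.
plain = '(name-of Paris "Paris")'
# Awkward case: an assertion about a string that is itself XML markup.
about_markup = '(well-formed "<a href=\'x\'>R&D</a>")'

for scl in (plain, about_markup):
    attr = quoteattr(scl)  # adds the delimiters and any needed entities
    # The attribute name scl is hypothetical.
    elem = ET.fromstring("<span scl=%s/>" % attr)
    print(attr)
    print(elem.get("scl") == scl)  # the parser recovers the original text
```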

>(maybe I'm not understanding you correctly here)
>
>>>, you (as you say) begin to rely on
>>>XML authoring tools to escape the contents. This can be a real
>>>difficulty for authoring, but it certainly is the solution. Now,
>>>if XCL is an XML markup language in its own right, you'd just put
>>>the XCL into the document as XML markup, using XML Namespaces.
>>
>>Right. The reason for the third goal is to be able to link the SCL 
>>to HTML anchors; but like I say, this is a private hobbyhorse. The 
>>first two are more important.
>
>There's probably better ways to do this, using the XCL syntax
>instead. Do you mean link the HTML-to-SCL, or SCL-to-HTML?

I want to be able to use SCL (in some syntactic form) as markup 
inside HTML in such a way that it is clearly linked to some 
particular part of the HTML but is invisible to any browser, ie does 
not appear visibly on the Web page. Rather in the way that href links 
are invisible, if you take my meaning.

>If
>you want to be able to point from SCL/XCL to HTML anchors (say,
>for purposes of obtaining documentation on an SCL entity),

The point is not to document the SCL entity, but rather to use the 
SCL to provide semantic information (ie of use to a software agent) 
about some entity named in the HTML text. For example, I would like 
to be able to put an SCL 'wrapper' around the word 'Paris' occurring 
in some HTML text, where the associated SCL content might provide the 
information to a SW inference engine that this was the capital of a 
country called "France".  And I want to be able to do this without 
changing the visible appearance of the HTML in any way, so that such 
markup can be added to existing web pages without major changes and 
without interfering with their function as human-readable web pages.
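
As a rough Python sketch of the 'Paris' case: the wrapper is a span 
carrying a hypothetical scl attribute, the expression inside it is 
invented rather than real SCL core syntax, and the joined text nodes 
are only an approximation of what a browser shows.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def wrap(word, scl):
    # Hypothetical wrapper: the SCL travels in an attribute value only.
    return "<span scl=%s>%s</span>" % (quoteattr(scl), escape(word))

def visible_text(xhtml):
    # Approximates the rendered text by joining all text nodes.
    return "".join(ET.fromstring(xhtml).itertext())

def annotations(xhtml):
    # What a software agent would pick up: (marked text, SCL) pairs.
    spans = ET.fromstring(xhtml).iter("span")
    return [(e.text, e.get("scl")) for e in spans if e.get("scl")]

before = "<p>We flew to Paris in May.</p>"
scl = '(and (City Paris) (capitalOf Paris "France"))'
after = before.replace("Paris", wrap("Paris", scl), 1)

print(visible_text(before) == visible_text(after))
print(annotations(after))
```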

>  this
>can be done in SCL proper using its own linking syntax

SCL doesn't have any linking syntax.

>, and in
>the XCL using a proper linking syntax like XLink, or by creating
>a proprietary linking syntax, like <xcl:link href="uri"/>.

Yes, but how is it going to get done *in XHTML* ??

Pat


-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes at ihmc.us       http://www.ihmc.us/users/phayes


