[SCL] XML question
Bill Andersen
andersen at ontologyworks.com
Thu Jan 22 19:29:03 CST 2004
Ok, this all sounds reasonable.
So, what about strings in SCL itself? Would the verboten characters
also need to be excluded from SCL formulas like
(name company101 "Proctor & Gamble")
such that the above could not be sent inside an XML CDATA element?
That seems to be a high price to pay, but I guess one could work around it.
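
To make the question concrete, here is a minimal Python sketch of the round trip under discussion. It assumes the standard library's xml.etree.ElementTree and xml.sax.saxutils are representative of generic XML tools. The element name "scl" and the attribute name "formula" are made up for illustration and are not part of any SCL proposal.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The SCL formula from above, containing both '&' and double quotes.
scl = '(name company101 "Proctor & Gamble")'

# 1. Escaping by hand for use as element text: '&', '<' and '>' are replaced.
escaped = escape(scl)

# 2. Letting the XML library escape on output and unescape on input.
elem = ET.Element("scl")      # hypothetical element name
elem.text = scl               # carried as parsed character data
elem.set("formula", scl)      # hypothetical attribute name
serialized = ET.tostring(elem, encoding="unicode")

parsed = ET.fromstring(serialized)
text_round_trip = parsed.text
attr_round_trip = parsed.get("formula")

# 3. Sending the same formula in a CDATA section. The parser hands back the
#    same text, but re-serializing no longer uses CDATA: that detail is lost.
cdata_doc = "<scl><![CDATA[" + scl + "]]></scl>"
from_cdata = ET.fromstring(cdata_doc)
reserialized = ET.tostring(from_cdata, encoding="unicode")

print(escaped)
print(serialized)
print(text_round_trip == scl, attr_round_trip == scl, from_cdata.text == scl)
print(reserialized)
```

In this sketch the formula survives all three paths unchanged, so the SCL text itself would not need to avoid '&' or '<' as long as every producer goes through a serializer like this rather than pasting raw text into the XML. The CDATA case does show that the CDATA wrapper is not preserved after parsing, and a CDATA section still cannot contain the sequence "]]>".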
On Jan 22, 2004, at 8:05 PM, pat hayes wrote:
>> Showing my ignorance of XML, is there not an escape character inside
>> quoted strings that can be used to have XML reserved characters
>> ignored?
>
> Not really. XML is absolute that XML text cannot contain '<' or '&' AT
> ALL except as part of XML markup, or in a special 'CDATA' block
> (which I don't trust, since it can get eliminated by the first XML
> parser). I can see that this is kind of inevitable in any
> notation that uses characters to encode both text and markup.
>
>> If so, is not that the easier route to go rather than slaving SCL to
>> what can go in an XML string?
>
> Well, I don't like that either, but I think it's better to swallow a
> little slavery in order to get the benefits of cooperation. A bit like
> marriage, now I come to think of it.
>
> Pat
>
>> On Jan 22, 2004, at 4:06 PM, pat hayes wrote:
>>
>>> Does anyone have a good answer to the following question? It's really
>>> about the design principles of XML.
>>>
>>> In writing the SCL core syntax, I had in mind that it ought to be
>>> possible to include chunks of SCL core inside XML documents without
>>> the XML barfing. Since XML reserves the characters '<' and '&' and
>>> uses ' " ' for quoting, my first instinct was to simply ban these
>>> characters from appearing anywhere in the SCL core syntax: then one
>>> can take any piece of SCL core text, stick double-quotes on either
>>> side of it, and plonk it down as for example an attribute value in
>>> XHTML, and nothing breaks (this would be neat for example when
>>> attaching SCL as markup to a web page, since the SCL would be
>>> invisible in the HTML but visible to processors).
>>>
>>> But what about an SCL string which might contain any character?
>>> Well, XML allows one to include any character in *parsed* character
>>> data by escaping the bad characters using entity references. That
>>> handles going from SCL-in-XML back into SCL. But how about going
>>> from SCL into SCL-in-XML? The use of the XML escaping seems to
>>> require that any software which creates XML - for example, something
>>> which wants to transmit some SCL text between SCL engines using
>>> SCL-in-XML - must perform a kind of XML-unparsing step to replace
>>> every occurrence of '<' or '&' by the entity reference.
>>>
>>> My question boils down to this. Do I *need* to keep the surface
>>> syntax of SCL Core "XML-safe" in the sense that it is guaranteed to
>>> simply never contain the characters less-than or ampersand (or
>>> double-quote, in fact) ? This can be done, but is a pain, and
>>> requires SCL to have its own character-escaping conventions
>>> different from XML's conventions. (It can't use the XML conventions,
>>> since then XML itself will alter the SCL string encodings.)
>>>
>>> Or is this being inappropriately fussy, since XML tools are already
>>> capable of handling text which is not "XML-safe" in this way, and
>>> automatically doing the transformations to and from the XML-escaped
>>> forms? In which case I should just ignore XML's character
>>> restrictions when thinking about the SCL syntax itself, and rely on
>>> generic XML tools and conventions to faithfully handle the parsing
>>> and coding in and out of the XML syntax.
>>>
>>> Or, should I use the CDATA feature of XML? This seems to have been
>>> designed for cases like this, but I have the sense that CDATA is
>>> rarely used in XML-based conventions, and wonder if there is any
>>> good reason why not. It rather worries me that an XML processor is
>>> apparently allowed to remove all traces of whether a piece of text
>>> was originally in a CDATA section or not. I would like XML to
>>> transmit any SCL-in-XML faithfully, and if XML parsers may remove
>>> some of the critical encoding information then this seems to
>>> introduce some fragility into the transaction.
>>>
>>> I'm sure that the XML community has come to an agreement on a
>>> suitable best practice to follow in a case like this, and would
>>> appreciate any guidance or input.
>>>
>>> Pat
>>>
>>>
>>>
>>> --
>>> ---------------------------------------------------------------------
>>> IHMC (850)434 8903 or (650)494 3973 home
>>> 40 South Alcaniz St. (850)202 4416 office
>>> Pensacola (850)202 4440 fax
>>> FL 32501 (850)291 0667 cell
>>> phayes at ihmc.us http://www.ihmc.us/users/phayes
>>>
>>>
>>> _______________________________________________
>>> SCL mailing list
>>> SCL at philebus.tamu.edu
>>> http://philebus.tamu.edu/mailman/listinfo/scl
>>>
>> --
>> Bill Andersen (andersen at ontologyworks.com)
>> Chief Scientist
>> Ontology Works, Inc. (www.ontologyworks.com)
>> 1132 Annapolis Road, Suite 104,
>> Odenton, MD 21113
>> Office: 410-674-7600
>> Cell: 443-858-6444
>> Fax: 410-674-6075