[SCL] XML question
pat hayes
phayes at ihmc.us
Thu Jan 22 19:05:08 CST 2004
>Showing my ignorance of XML, is there not an escape character inside
>quoted strings that can be used to have XML reserved characters
>ignored?
Not really. XML is adamant that XML text cannot contain '<' or '&'
AT ALL except as part of XML markup, or in a special 'CDATA' section
(which I don't trust, since it can get eliminated by the first XML
parser). I can see that this is kind of inevitable in any
notation that uses characters to encode both text and markup.
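As a rough illustration of that restriction, here is a minimal sketch
using Python's standard-library parser. The element name "scl" and the
sample text are assumptions made up for the example; the text is an
arbitrary stand-in, not real SCL core syntax.

```python
import xml.etree.ElementTree as ET

def is_well_formed(document: str) -> bool:
    """Return True if the standard-library XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

# Hypothetical payload containing both reserved characters.
raw = '<scl>(R a) & (a < b)</scl>'                      # bare '&' and '<'
escaped = '<scl>(R a) &amp; (a &lt; b)</scl>'           # entity references
cdata = '<scl><![CDATA[(R a) & (a < b)]]></scl>'        # CDATA section

print(is_well_formed(raw))      # the bare characters are rejected
print(is_well_formed(escaped))  # entity references are accepted
print(is_well_formed(cdata))    # so is a CDATA section
```

Both accepted forms hand the same character data back to the
application, which is the sense in which the reserved characters
survive only as markup.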
> If so, is not that the easier route to go rather than slaving SCL
>to what can go in an XML string?
Well, I don't like that either, but I think it's better to swallow a
little slavery in order to get the benefits of cooperation. A bit
like marriage, now I come to think of it.
Pat
>On Jan 22, 2004, at 4:06 PM, pat hayes wrote:
>
>>Does anyone have a good answer to the following question? It's
>>really about the design principles of XML.
>>
>>In writing the SCL core syntax, I had in mind that it ought to be
>>possible to include chunks of SCL core inside XML documents without
>>the XML barfing. Since XML reserves the characters '<' and '&' and
>>uses ' " ' for quoting, my first instinct was to simply ban these
>>characters from appearing anywhere in the SCL core syntax: then one
>>can take any piece of SCL core text, stick double-quotes on either
>>side of it, and plonk it down as for example an attribute value in
>>XHTML, and nothing breaks (this would be neat for example when
>>attaching SCL as markup to a web page, since the SCL would be
>>invisible in the HTML but visible to processors).
>>
>>But what about an SCL string which might contain any character?
>>Well, XML allows one to include any character in *parsed* character
>>data by escaping the bad characters using entity references. That
>>handles going from SCL-in-XML back into SCL. But how about going
>>from SCL into SCL-in-XML? The use of the XML escaping seems to
>>require that any software which creates XML - for example,
>>something which wants to transmit some SCL text between SCL engines
>>using SCL-in-XML - must perform a kind of XML-unparsing step to
>>replace every occurrence of '<' or '&' by the entity reference.
>>
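The "XML-unparsing" step described above can be sketched with the
escaping helpers in Python's standard library. This is only an
illustration under stated assumptions: the function names to_xml_text
and from_xml_text are hypothetical, and the sample string is an
arbitrary stand-in rather than real SCL core syntax.

```python
from xml.sax.saxutils import escape, unescape

def to_xml_text(scl_text: str) -> str:
    """Replace '&', '<', '>' and '"' with entity references."""
    # escape() handles '&', '<' and '>' itself; the extra mapping
    # covers the double-quote so the result is safe in an attribute.
    return escape(scl_text, {'"': "&quot;"})

def from_xml_text(xml_text: str) -> str:
    """Reverse to_xml_text, recovering the original characters."""
    return unescape(xml_text, {"&quot;": '"'})

sample = 'a < b & "c"'
encoded = to_xml_text(sample)
print(encoded)                 # a &lt; b &amp; &quot;c&quot;
print(from_xml_text(encoded))  # a < b & "c"
```

The round trip is lossless even when the text already contains
something that looks like an entity reference, because the ampersand
is escaped on the way in and restored last on the way out.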
>>My question boils down to this. Do I *need* to keep the surface
>>syntax of SCL Core "XML-safe" in the sense that it is guaranteed to
>>simply never contain the characters less-than or ampersand (or
>>double-quote, in fact)? This can be done, but is a pain, and
>>requires SCL to have its own character-escaping conventions
>>different from XML's conventions. (It can't use the XML
>>conventions, since then XML itself will alter the SCL string
>>encodings.)
>>
>>Or is this being inappropriately fussy, since XML tools are already
>>capable of handling text which is not "XML-safe" in this way, and
>>automatically doing the transformations to and from the XML-escaped
>>forms? In which case I should just ignore XML's character
>>restrictions when thinking about the SCL syntax itself, and rely on
>>generic XML tools and conventions to faithfully handle the parsing
>>and coding in and out of the XML syntax.
>>
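For what it is worth, here is a minimal sketch of the second option,
in which a generic XML library does the escaping in both directions.
It uses Python's xml.etree.ElementTree; the element name "statement"
and the attribute name "scl" are assumptions invented for the example,
and the sample string is not real SCL core syntax.

```python
import xml.etree.ElementTree as ET

def wrap(scl_text: str) -> bytes:
    """Embed text in XML both as an attribute value and as element content."""
    root = ET.Element("statement", {"scl": scl_text})
    root.text = scl_text
    # The serializer inserts the entity references; the caller never does.
    return ET.tostring(root)

def unwrap(document: bytes):
    """Parse the document and return (attribute value, element content)."""
    root = ET.fromstring(document)
    return root.get("scl"), root.text

sample = 'a < b & "c"'
document = wrap(sample)
print(document)
print(unwrap(document) == (sample, sample))
```

Under this approach the SCL text itself never needs to avoid '<', '&'
or the double-quote, provided every producer and consumer goes through
an XML library rather than pasting strings together by hand.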
>>Or, should I use the CDATA feature of XML? This seems to have been
>>designed for cases like this, but I have the sense that CDATA is
>>rarely used in XML-based conventions, and wonder if there is any
>>good reason why not. It rather worries me that an XML processor is
>>apparently allowed to remove all traces of whether a piece of text
>>was originally in a CDATA section or not. I would like XML to
>>transmit any SCL-in-XML faithfully, and if XML parsers may remove
>>some of the critical encoding information then this seems to
>>introduce some fragility into the transaction.
>>
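The CDATA concern can be seen concretely in a short sketch with
Python's xml.etree.ElementTree. The element name "statement" and the
sample text are assumptions made up for the example. The character
data comes through intact, but the CDATA markers do not.

```python
import xml.etree.ElementTree as ET

# Hypothetical document carrying its payload in a CDATA section.
source = '<statement><![CDATA[a < b & c]]></statement>'

root = ET.fromstring(source)
text = root.text                                   # the characters themselves
reserialized = ET.tostring(root, encoding="unicode")

print(text)          # a < b & c
print(reserialized)  # <statement>a &lt; b &amp; c</statement>
```

After one parse and re-serialization the text is carried by entity
references instead, so anything that depends on the section having
been CDATA in the first place will not see it.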
>>I'm sure that the XML community has come to an agreement on a
>>suitable best practice to follow in a case like this, and would
>>appreciate any guidance or input.
>>
>>Pat
>>
>>
>>
>>--
>>---------------------------------------------------------------------
>>IHMC (850)434 8903 or (650)494 3973 home
>>40 South Alcaniz St. (850)202 4416 office
>>Pensacola (850)202 4440 fax
>>FL 32501 (850)291 0667 cell
>>phayes at ihmc.us http://www.ihmc.us/users/phayes
>>
>>
>>_______________________________________________
>>SCL mailing list
>>SCL at philebus.tamu.edu
>>http://philebus.tamu.edu/mailman/listinfo/scl
>>
>--
>Bill Andersen (andersen at ontologyworks.com)
>Chief Scientist
>Ontology Works, Inc. (www.ontologyworks.com)
>1132 Annapolis Road, Suite 104,
>Odenton, MD 21113
>Office: 410-674-7600
>Cell: 443-858-6444
>Fax: 410-674-6075