[SCL] XML question

Bill Andersen andersen at ontologyworks.com
Thu Jan 22 19:29:03 CST 2004


Ok, this all sounds reasonable.

So, what about strings in SCL itself?  Would the verboten characters 
also need to be excluded from SCL formulas like

    (name company101 "Proctor & Gamble")

such that the above could not be sent inside an XML CDATA section?

That seems to be a high price to pay, but I guess one could work around it.
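
For what it's worth, here is a minimal sketch of the round trip under discussion, using only Python's standard xml.etree.ElementTree (an assumption of this example, not anything from the SCL drafts). The element name "scl" and the "meta"/"content" attribute wrapper are made up for illustration. It suggests that a generic XML library escapes '&' on the way out and restores it on the way in, and that a CDATA section parses to the same text, with no trace left that CDATA was used.

```python
import xml.etree.ElementTree as ET

# The SCL formula from this message, containing '&' and double quotes.
scl = '(name company101 "Proctor & Gamble")'

# Case 1: SCL as element text. The library escapes '&' when serializing.
elem = ET.Element("scl")  # hypothetical element name
elem.text = scl
wire = ET.tostring(elem, encoding="unicode")
round_trip = ET.fromstring(wire).text

# Case 2: SCL as an attribute value, as in the XHTML scenario.
# Here the library must also escape the double quotes.
attr_elem = ET.Element("meta")  # hypothetical wrapper
attr_elem.set("content", scl)
attr_wire = ET.tostring(attr_elem, encoding="unicode")
from_attr = ET.fromstring(attr_wire).get("content")

# Case 3: SCL inside a hand-written CDATA section. After parsing,
# the text is identical and the CDATA wrapper is no longer visible.
cdata_wire = "<scl><![CDATA[" + scl + "]]></scl>"
from_cdata = ET.fromstring(cdata_wire).text

print(wire)
print(attr_wire)
print(round_trip == scl, from_attr == scl, from_cdata == scl)
```

If that holds for the tools people actually use, the SCL string itself would not need to avoid these characters; only hand-pasted SCL would need manual escaping.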

On Jan 22, 2004, at 8:05 PM, pat hayes wrote:

>> Showing my ignorance of XML, is there not an escape character inside 
>> quoted strings that can be used to have XML reserved characters 
>> ignored?
>
> Not really. XML is absolute that XML text cannot contain '<' or '&' AT 
> ALL except as part of XML markup, or in a special CDATA section (which 
> I don't trust, since it can get eliminated by the first XML parser). I 
> can see that this is kind of inevitable in any notation that uses 
> characters to encode both text and markup.
>
>>  If so, is not that the easier route to go rather than slaving SCL to 
>> what can go in an XML string?
>
> Well, I don't like that either, but I think it's better to swallow a 
> little slavery in order to get the benefits of cooperation. A bit like 
> marriage, now I come to think of it.
>
> Pat
>
>> On Jan 22, 2004, at 4:06 PM, pat hayes wrote:
>>
>>> Does anyone have a good answer to the following question? It's really 
>>> about the design principles of XML.
>>>
>>> In writing the SCL core syntax, I had in mind that it ought to be 
>>> possible to include chunks of SCL core inside XML documents without 
>>> the XML barfing. Since XML reserves the characters '<' and '&' and 
>>> uses ' " ' for quoting, my first instinct was to simply ban these 
>>> characters from appearing anywhere in the SCL core syntax: then one 
>>> can take any piece of SCL core text, stick double-quotes on either 
>>> side of it, and plonk it down as for example an attribute value in 
>>> XHTML, and nothing breaks (this would be neat, for example, when 
>>> attaching SCL as markup to a web page, since the SCL would be 
>>> invisible in the HTML but visible to processors).
>>>
>>> But what about an SCL string which might contain any character? 
>>> Well, XML allows one to include any character in *parsed* character 
>>> data by escaping the bad characters using entity references. That 
>>> handles going from SCL-in-XML back into SCL.  But how about going 
>>> from SCL into SCL-in-XML? The use of the XML escaping seems to 
>>> require that any software which creates XML - for example, something 
>>> which wants to transmit some SCL text between SCL engines using 
>>> SCL-in-XML - must perform a kind of XML-unparsing step to replace 
>>> every occurrence of '<' or '&' by the entity reference.
>>>
>>> My question boils down to this.  Do I *need* to keep the surface 
>>> syntax of SCL Core "XML-safe" in the sense that it is guaranteed to 
>>> simply never contain the characters less-than or ampersand (or 
>>> double-quote, in fact)? This can be done, but is a pain, and 
>>> requires SCL to have its own character-escaping conventions 
>>> different from XML's conventions. (It can't use the XML conventions, 
>>> since then XML itself will alter the SCL string encodings.)
>>>
>>> Or is this being inappropriately fussy, since XML tools are already 
>>> capable of handling text which is not "XML-safe" in this way, and 
>>> automatically doing the transformations to and from the XML-escaped 
>>> forms? In which case I should just ignore XML's character 
>>> restrictions when thinking about the SCL syntax itself, and rely on 
>>> generic XML tools and conventions to faithfully handle the parsing 
>>> and coding in and out of the XML syntax.
>>>
>>> Or, should I use the CDATA feature of XML? This seems to have been 
>>> designed for cases like this, but I have the sense that CDATA is 
>>> rarely used in XML-based conventions, and wonder if there is any 
>>> good reason why not. It rather worries me that an XML processor is 
>>> apparently allowed to remove all traces of whether a piece of text 
>>> was originally in a CDATA section or not. I would like XML to 
>>> transmit any SCL-in-XML faithfully, and if XML parsers may remove 
>>> some of the critical encoding information then this seems to 
>>> introduce some fragility into the transaction.
>>>
>>> I'm sure that the XML community has come to an agreement on a 
>>> suitable best practice to follow in a case like this, and would 
>>> appreciate any guidance or input.
>>>
>>> Pat
>>>
>>>
>>>
>>> --
>>> ---------------------------------------------------------------------
>>> IHMC	(850)434 8903 or (650)494 3973   home
>>> 40 South Alcaniz St.	(850)202 4416   office
>>> Pensacola			(850)202 4440   fax
>>> FL 32501			(850)291 0667    cell
>>> phayes at ihmc.us       http://www.ihmc.us/users/phayes
>>>
>>>
>>> _______________________________________________
>>> SCL mailing list
>>> SCL at philebus.tamu.edu
>>> http://philebus.tamu.edu/mailman/listinfo/scl
>>>
>> --
>> Bill Andersen (andersen at ontologyworks.com)
>> Chief Scientist
>> Ontology Works, Inc. (www.ontologyworks.com)
>> 1132 Annapolis Road, Suite 104,
>> Odenton, MD 21113
>> Office: 410-674-7600
>> Cell: 443-858-6444
>> Fax: 410-674-6075