[SCL] XML question

Murray Altheim m.altheim at open.ac.uk
Fri Jan 23 16:31:25 CST 2004


pat hayes wrote:
>>I think we've answered most of the questions, including my reply
>>to John, but I'll cover a few that weren't, below.
> 
> 
> Yep, thanks for the input. You've convinced me about most of this. 
> However, I still have one question.
[...]
>>Then do it the right way by embedding actual semantic XML markup,
>>rather than something that will require a specialized parser. If
>>you embed an XML version of SCL, the existing browser tools that
>>don't choke on things they don't understand, and can accept XML,
>>will be able to process it. That's the way of the future, we all
>>hope.
> 
> Well, maybe, but it's certainly not the way of the present. Maybe I 
> haven't made this clear. The point of my third goal is to be able to 
> insert SCL content into HTML in such a way that the SCL is invisible 
> to a browser - is not rendered by the browser - but is available to 
> other tools linked to the browser. How can this be done using XML? If 
> I simply include some random XML element inside XHTML, browsers 
> render the element content. Every browser I've tried this on insists 
> on rendering all the attribute content, even if it is enclosed in 
> meaningless (from XHTML's point of view) elements. That is exactly 
> what I want it to NOT do.  Putting it into an attribute value is the 
> only way I've found of doing this.
> 
> If there is some obvious (to you) technique that I am missing, please 
> enlighten me.

It's fairly simple. You use a stylesheet:

   <?xml version="1.0"?>
   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
                         "xhtml1-transitional.dtd">
   <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
       <title>Document Title</title>
       <style type="text/css">
         .hide { display : none }
       </style>
     </head>
     <body>
       <h1>Document Title</h1>
       <p>Do you see this text?</p>
       <p class="hide">You shouldn't see this text.</p>
     </body>
   </html>

You should only see one sentence -- the second one should be
hidden by your browser, provided the browser supports CSS correctly
and the user isn't ignoring author-based style directives. People
who have impaired vision, don't like funky web colours, are using
voice browsers, etc. may still see (or hear) this content.
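
As a rough illustration of why hiding is the easy half: content
hidden with CSS is still in the document tree, so any tool that
parses the XHTML as XML can reach it. The sketch below uses only
Python's standard xml.etree.ElementTree; the function name
hidden_paragraphs is made up for this example, and the markup is a
trimmed copy of the sample above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Trimmed copy of the sample document above (no DOCTYPE, no <head>).
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>Do you see this text?</p>
    <p class="hide">You shouldn't see this text.</p>
  </body>
</html>"""


def hidden_paragraphs(source):
    """Return the text of every XHTML <p> whose class attribute is 'hide'."""
    root = ET.fromstring(source)
    return [
        p.text
        for p in root.iter("{%s}p" % XHTML_NS)
        if p.get("class") == "hide"
    ]


if __name__ == "__main__":
    print(hidden_paragraphs(DOC))
```

A browser that honours the stylesheet skips the second paragraph,
but the parser does not care about CSS at all.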

But this has not really solved the problem. How do all the tools
that are supposed to harvest this content know *where* in the
document it is? That it's not some example of SCL markup about
some other document? How to know what to do with perhaps dozens
of small SCL fragments scattered throughout a document? I'm
hoping we're not trying to reinvent JavaScript...

The solution to this is to use XML: to embed XCL using some
agreed-upon method that tools will know to look for. Just hiding
the content from the browser is not the big problem. The big problem
is telling the world that it's there, how to interpret it, and what
it means in the specific context of use. The Dublin Core solved this
for their own application, and it took several years to iron out the
niggling details. It's not that simple, though not rocket science
either:

    Expressing Dublin Core in HTML/XHTML meta and link elements
    http://dublincore.org/documents/2003/11/30/dcq-html/

That might be the way to do it. But they're embedding specific
content that is not a dynamic language like SCL. It's just some
static elements, i.e., their problems are simple compared to what
you're asking for.
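
For a sense of how static the Dublin Core case is, here is a small
sketch of harvesting it. It assumes the convention from the
document cited above: a <link rel="schema.DC" .../> declares the
prefix, and <meta name="DC.term" content="..."/> carries each
value. The function name dublin_core and the sample values are
invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

# Sample values are placeholders, not taken from any real page.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Document Title</title>
    <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
    <meta name="DC.title" content="Document Title" />
    <meta name="DC.creator" content="A. N. Author" />
    <meta name="keywords" content="not Dublin Core" />
  </head>
  <body><p>Text.</p></body>
</html>"""


def dublin_core(source):
    """Collect <meta> values whose name prefix was declared by a schema link."""
    root = ET.fromstring(source)
    # Prefixes declared as <link rel="schema.PREFIX" href="..."/>.
    prefixes = {
        link.get("rel")[len("schema."):]
        for link in root.iter(XHTML + "link")
        if (link.get("rel") or "").startswith("schema.")
    }
    found = {}
    for meta in root.iter(XHTML + "meta"):
        name = meta.get("name") or ""
        prefix, _, term = name.partition(".")
        if term and prefix in prefixes:
            found[name] = meta.get("content")
    return found


if __name__ == "__main__":
    print(dublin_core(DOC))
```

Everything harvested here is a flat name/value pair. Nothing in it
has to be parsed as a language, which is the part SCL adds.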

>>[...]
>>If you're going to avoid XML markup characters, it's not impractical.
>>It's just that defining a syntax that avoids [<>&'"] is going to make
>>things difficult. Why avoid single and double quotes in your SCL just
>>to do this?
> 
> Well, SCL only needs one kind of quote, so it can leave the other for 
> XML to use; and it doesn't need <>& at all. The issue only comes up if 
> someone wants to make an SCL assertion *about* a string which might 
> contain these symbols (for example, about a piece of XML markup).

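XML accepts either type of quote as an attribute delimiter, so a
value that may contain both has to escape at least one of them (and
& and < always need escaping within attribute values).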
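
To make the escaping cost concrete, here is a minimal sketch using
Python's standard xml.sax.saxutils.quoteattr, which picks a
delimiter and escapes whatever the value needs. The element name x,
the attribute name v and the sample string are placeholders; no
real SCL syntax is implied.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# A string containing both kinds of quote plus markup characters.
SAMPLE = """(says "it's" <b>&</b>)"""


def as_attribute(value):
    """Serialize value as the v attribute of an empty <x/> element."""
    return "<x v=%s/>" % quoteattr(value)


def round_trip(value):
    """Write value into an attribute, parse it back, return what comes out."""
    return ET.fromstring(as_attribute(value)).get("v")


if __name__ == "__main__":
    print(as_attribute(SAMPLE))
    print(round_trip(SAMPLE) == SAMPLE)
```

The value survives the round trip, but the serialized form is far
less readable than the original string.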

[...]
> I want to be able to use SCL (in some syntactic form) as markup 
> inside HTML in such a way that it is clearly linked to some 
> particular part of the HTML but is invisible to any browser, ie does 
> not appear visibly on the Web page. Rather in the way that href links 
> are invisible, if you take my meaning.

As I said above, the issue isn't hiding it. That's easy in conformant
browsers. It's putting it somewhere that is unambiguous and harvestable
by tools. You don't, for example, want all of the SCL examples that
might appear in email messages in list archives to be interpreted
by harvesting tools. You've got to mark the SCL in some way that states
that it is SCL. The way to do that in XML is with XML Namespaces. Now,
instead of embedding SCL, embed XCL and you've solved *most* of those
problems. What remains is working out where a specific SCL fragment is,
why it's there, and what it means in a specific markup context.
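
A minimal sketch of namespace-based harvesting follows. The
namespace URI http://example.org/xcl, the element name xcl:assert
and the parenthesised content are all placeholders invented for
this example (XCL defines none of them here). The point is only
that a tool can pick out marked-up content and ignore look-alike
text elsewhere in the page.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
XCL_NS = "http://example.org/xcl"

DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xcl="http://example.org/xcl">
  <body>
    <p>Paris <xcl:assert>(capital-of Paris France)</xcl:assert></p>
    <pre>(capital-of Rome Italy)</pre>
  </body>
</html>"""


def harvest(source, namespace=XCL_NS):
    """Return the text of every element belonging to the given namespace."""
    root = ET.fromstring(source)
    prefix = "{%s}" % namespace
    return [el.text for el in root.iter() if el.tag.startswith(prefix)]


if __name__ == "__main__":
    print(harvest(DOC))
```

The text inside <pre> looks the same to a human reader but is never
harvested, because nothing declares it to be in the namespace.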

I made a proposal that was completely ignored (so far as I know) by
the W3C, which pretty much solved the problem of embedding metadata
in XHTML for the "Semantic Web":

   Augmented Metadata in XHTML
   http://www.altheim.com/specs/meta/NOTE-xhtml-augmeta.html

My proposal essentially said that the embedded metadata was always
to be interpreted as being metadata on the parent element of the
embedding location. If it's in <head> it's to be interpreted as being
metadata on the entire document.
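
The parent-element rule can be sketched in a few lines. This is not
the syntax of the proposal itself: the namespace URI
http://example.org/meta and the element name m:note are stand-ins
made up for the example, and only the interpretation rule described
above is being shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the embedded metadata elements.
META_NS = "http://example.org/meta"

DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:m="http://example.org/meta">
  <head>
    <title>Cities</title>
    <m:note>about the whole document</m:note>
  </head>
  <body>
    <p id="paris">Paris<m:note>about this paragraph</m:note></p>
  </body>
</html>"""


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts on tag names."""
    return tag.split("}")[-1]


def metadata_subjects(source, namespace=META_NS):
    """Pair each embedded metadata element with the thing it describes.

    Metadata inside <head> describes the whole document; anywhere else
    it describes its parent element (identified by id when it has one).
    """
    root = ET.fromstring(source)
    prefix = "{%s}" % namespace
    pairs = []
    for parent in root.iter():
        for child in parent:
            if not child.tag.startswith(prefix):
                continue
            if local_name(parent.tag) == "head":
                subject = "document"
            else:
                subject = parent.get("id") or local_name(parent.tag)
            pairs.append((subject, child.text))
    return pairs


if __name__ == "__main__":
    for subject, text in metadata_subjects(DOC):
        print(subject, "<-", text)
```

With a rule like this, a harvesting tool never has to guess what a
fragment is about: the embedding location answers the question.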

It was perhaps too simple a solution for the million dollar question...

>[...]
>>If you want to be able to point from SCL/XCL to HTML anchors (say,
>>for purposes of obtaining documentation on an SCL entity),
> 
> The point is not to document the SCL entity, but rather to use the 
> SCL to provide semantic information (ie of use to a software agent) 
> about some entity named in the HTML text. For example, I would like 
> to be able to put an SCL 'wrapper' around the word 'Paris' occurring 
> in some HTML text, where the associated SCL content might provide the 
> information to a SW inference engine that this was the capital of a 
> country called "France".  And I want to be able to do this without 
> changing the visible appearance of the HTML in any way, so that such 
> markup can be added to existing web pages without major changes and 
> without interfering with their function as human-readable web pages.
> 
>> this can be done in SCL proper using its own linking syntax
> 
> SCL doesn't have any linking syntax.

If you refer to URIs or to any constructs in other SCL documents,
you have a linking syntax by definition. I anticipate there will
be such a thing in the XCL syntax at least (otherwise, what good
is it for making assertions about things on the Net?).

>>, and in
>>the XCL using a proper linking syntax like XLink, or by creating
>>a proprietary linking syntax, like <xcl:link href="uri"/>.
>
> Yes, but how is it going to get done *in XHTML* ??

That remains to be seen. Hiding is only one small part of the
problem. I would still recommend embedding XCL, not SCL. There
are still a few other issues, but we take things in steps, eh?

Murray

......................................................................
Murray Altheim                    http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK               .

  "At the Fresno event, even some of the handpicked guests expressed
   skepticism about the state selling $15 billion in bonds to balance
   the budget. A few said the state could look harder for more cuts
   to the government bureaucracy -- but nevertheless said they would
   defer to Schwarzenegger's judgment for now."
   http://www.sfgate.com/cgi-bin/article.cgi?file=/c/a/2004/01/21/MNG7L4E7IT1.DTL

   Defer to Arnold's judgment?!