Re: [ontolog-forum] OWL and lack of identifiers

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Waclaw Kusnierczyk <Waclaw.Marcin.Kusnierczyk@xxxxxxxxxxx>
Date: Wed, 18 Apr 2007 21:40:43 +0200
Message-id: <4626743B.3060202@xxxxxxxxxxx>
Not that I disagree.

vQ

Peter F Brown wrote:
> Wacek:
> The concept of "well-formedness" in XML is now well established even if
> formal logicians poke holes in the terminological inconsistency.
> The problem is a hangover from HTML where many HTML documents, though
> syntactically incorrect, were still parsed without browsers falling over. It
> is precisely the high fault-tolerance of most browsers that keeps the Web
> turning today. If only well-formed documents passed muster, the Web as we know
> it would be a very different place. Some would argue that would be a good
> thing but if the problem was to have been addressed it should have been done
> so 15 years ago, not now. As it is, millions of mere mortal non-programmers
> grokked the basics of writing HTML and getting sites up and running, and that
> was considered more important than logical purity. That's why repeated
> attempts to lock the door after the horse had bolted failed.
> XML (re-)introduced the concept of "well-formed" in order to redress the
> balance but alas there are many tools out there that still generate crap code,
> whether in HTML, XHTML or another XML application.
> "Well-formedness" therefore, although tautological, serves as a useful badge
> of conformance for those who do try to make the effort.
> Peter
> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Waclaw Kusnierczyk
> Sent: 18 April 2007 00:51
> To: edbark@xxxxxxxx
> Cc: Ontolog Forum
> Subject: Re: [ontolog-forum] OWL and lack of identifiers
> One more terminological note:
> It is confusing to say that an XML document is well-formed in the sense 
> of its conformance to the XML syntax, since:
> - this way of speaking suggests that there can be a 
> non-well-formed XML document (where 'well-formed' still refers to the 
> XML syntax), while
> - a document that does not conform to the XML syntax simply is not an 
> XML document.
> It makes sense to speak of a valid or invalid XML document, though, with 
> 'valid' referring to another syntax definition (xslt schema/dtd, say). 
> But, again, to say that an xslt document is valid, with 'valid' 
> referring to xslt syntax, is equally redundant.
> vQ
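
The XML sense of 'well-formed' discussed above (conformance to the XML syntax itself, as opposed to validity against a DTD or schema) can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree parser, which is non-validating: it only reports whether the text parses as XML at all. The function name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, i.e. conforms to the XML syntax.

    This says nothing about validity against a DTD or schema; the
    standard-library parser used here does not validate.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser rejected the text: not well-formed.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<a><b/></a>"))  # every element is properly closed
    print(is_well_formed("<a><b></a>"))   # <b> is never closed
```

Checking validity in the second sense would need a validating parser and a DTD or schema to check against, which this sketch deliberately leaves out.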
> Ed Barkmeyer wrote:
>> Wacek,
>> you wrote:
>>> I was actually curious whether you will refer to the XML example.
>>> The uses of 'well-formed' and 'valid' here are synonymous in that both 
>>> are used to speak of a document's conformance with a grammar;  
>> With that definition, the terms are equivalent, yes.  One can distinguish:
>>   well-formed/valid with respect to the XML grammar
>> from
>>   well-formed/valid with respect to the DTD/schema
>> But XML chooses to use 'well-formed' for the first, and 'valid' for the 
>> second.
>>> equally well it might be said that a document is valid as an XML 
>>> document and well-formed as an RDF document.  This is just a 
>>> terminological convention.
>> Exactly.
>>> The issue with IANA is a bit different, in that a domain name may be 
>>> registered and unregistered, i.e., its status as valid or not may 
>>> change, though the rules of the language do not change, in general.  
>> An interesting point.  This is a big difference between a registry of 
>> values and an enumerated list.  The validity of a symbol is 
>> time-dependent, and related to events out of the context of the usage.  
>> So my earlier analogy with local symbol definitions, which are solely 
>> dependent on other elements of the context of use, is inappropriate.
>>> On the other hand, a document that is XML-well-formed and XXX-valid 
>>> remains well-formed and valid, unless the grammars are redefined.  
>>> (You might say that a change in IANA registry is, analogously, a 
>>> change in the language's rules, if you insist.)
>> I agree that is stretching the point.  The time-dependency is, to me, a 
>> good reason for discarding the idea that "hostname validity", in the 
>> sense of being registered, is "syntactic".
>>> The issue is that in the case of XML the use of 'well-formed' for the 
>>> one and 'valid' for the other is well-defined and documented.  In the 
>>> case of URLs, I am not sure whether there is an officially established 
>>> convention of calling a URL 'well-formed' if it conforms to the RFC 
>>> and calling it 'valid' if it is registered;  maybe there is, but maybe 
>>> this is only wishful thinking.
>> Upon examination, RFC 1123 refers to "syntactically valid" hostnames as 
>> those conforming to the production rules.  Beyond that, the rest of the 
>> terminology is all wrapped up with the DNS protocols and failure modes.  
>> There really isn't a concept of "valid hostname"; the concept is "DNS 
>> lookup succeeds". This is further complicated by the fact that the DNS 
>> folk believe their mechanism can work for anything that supports URI 
>> syntax (as long as the names are short enough), and therefore DNS is not 
>> limited to hostname lookups.
>> So I have to admit that there is nothing "syntactic" about the validity 
>> of a hostname beyond its lexical rules.  After that, there is a process 
>> that succeeds or fails in one of several ways.  The function of the 
>> hostname in a URL is to identify one or more Internet hosts that MAY be 
>> able to provide the service that maps the URL to a resource.  And the 
>> mapping of hostname to host is properly seen as just part of the complex 
>> process of performing the URL-to-resource mapping.
>> I also have to admit that until just now, I hadn't looked at RFC 1123 
>> for years, and I clearly should have, before creating a concept not 
>> actually supported by the standards.  My apologies to all.
>> -Ed
>> P.S. I realize that this is one of those areas in which the "grey hair" 
>> is not useful.  A certain implementation, with which I used to be 
>> intimately familiar (15 years ago), talks about "valid host names", but 
>> the standards don't.  More importantly, that experience predates HTTP as 
>> the dominant Internet protocol;  the concept URL was just coming into 
>> existence.  And in that time, one's tool did the hostname lookup to find 
>> the IP address, and then inaugurated the application-specific protocol.  
>> So "hostname validity" was a concept in its own right.  But the URL/URI 
>> has made the "hostname" issue just a technical element of the "valid 
>> URL" concept.  And in this "identifier" context, separating out the 
>> "hostname" is inappropriate -- it is now a "lower-level concern".  I 
>> just didn't realize until now that I'm still carrying old baggage in 
>> this area.
>
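
The distinction Ed arrives at above, between a hostname being "syntactically valid" in the RFC 1123 sense and a DNS lookup merely succeeding or failing, can be sketched as two separate checks. This is a rough illustration, not a complete implementation of the RFC: the function names are made up, and the length limits (63 characters per label, 253 overall) are assumptions taken from common practice rather than from the messages quoted here.

```python
import re
import socket

# One label: letters, digits and hyphens, starting and ending with a
# letter or digit (RFC 1123 allows a leading digit). The 63-character
# limit per label is an assumption of this sketch.
_LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_syntactically_valid_hostname(name: str) -> bool:
    """Purely lexical check; does not touch the network."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing dot
    if not name or len(name) > 253:  # overall limit: assumption
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


def lookup_succeeds(name: str) -> bool:
    """Ask the resolver; the answer depends on time and on the network."""
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("ontolog.cim3.net", "-bad-.example", "a..b"):
        print(candidate, is_syntactically_valid_hostname(candidate))
```

The first function gives the same answer for a given name as long as the rules stay the same; the second can give different answers on different days, which is the time-dependence discussed in the quoted exchange.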

Wacek Kusnierczyk

Department of Information and Computer Science (IDI)
Norwegian University of Science and Technology (NTNU)
Sem Saelandsv. 7-9
7027 Trondheim
Norway

tel.   0047 73591875
fax    0047 73594466
------------------------------------------------------

Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx