Not that I disagree.
vQ
Peter F Brown wrote:
> Wacek:
>
> The concept of "well-formedness" in XML is now well established even if
>formal logicians poke holes in the terminological inconsistency.
>
> The problem is a hangover from HTML where many HTML documents, though
>syntactically incorrect, were still parsed without browsers falling over. It
>is precisely the high fault-tolerance of most browsers that keeps the Web
>turning today. If only well-formed documents passed muster, the Web as we know
>it would be a very different place. Some would argue that would be a good
>thing, but if the problem was to be addressed, that should have been done
>15 years ago, not now. As it is, millions of mere mortal non-programmers
>grokked the basics of writing HTML and getting sites up and running, and that
>was considered more important than logical purity. That's why repeated
>attempts to lock the door after the horse had bolted have failed.
>
> XML (re-)introduced the concept of "well-formed" in order to redress the
>balance but alas there are many tools out there that still generate crap code,
>whether in HTML, XHTML or another XML application.
>
> "Well-formedness" therefore, although tautological, serves as a useful badge
>of conformance for those who do try to make the effort.
>
> Peter
>
> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
>[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Waclaw Kusnierczyk
> Sent: 18 April 2007 00:51
> To: edbark@xxxxxxxx
> Cc: Ontolog Forum
> Subject: Re: [ontolog-forum] OWL and lack of identifiers
>
> One more terminological note:
>
> It is confusing to say that an XML document is well-formed in the sense
> of its conformance to the XML syntax, since:
>
> - this way of speaking suggests that there can be a
> non-well-formed XML document (where 'well-formed' still refers to the
> XML syntax), while
> - a document that does not conform to the XML syntax simply is not an
> XML document.
>
> It makes sense to speak of a valid or invalid XML document, though, with
> 'valid' referring to another syntax definition (an XSLT schema/DTD, say).
> But, again, to say that an XSLT document is valid, with 'valid'
> referring to XSLT syntax, is equally redundant.
>
> vQ
>
>
> Ed Barkmeyer wrote:
>> Wacek,
>>
>> you wrote:
>>
>>> I was actually curious whether you will refer to the XML example.
>>> The terms 'well-formed' and 'valid' are synonymous here in that both
>>> are used to speak of a document's conformance with a grammar;
>> With that definition, the terms are equivalent, yes. One can distinguish:
>> well-formed/valid with respect to the XML grammar
>> from
>> well-formed/valid with respect to the DTD/schema
>> But XML chooses to use 'well-formed' for the first, and 'valid' for the
>> second.
>>
>>> equally well it might be said that a document is valid as an XML
>>> document and well-formed as an RDF document. This is just a
>>> terminological convention.
>> Exactly.
>>
>>> The issue with IANA is a bit different, in that a domain name may be
>>> registered and unregistered, i.e., its status as valid or not may
>>> change, though the rules of the language do not change, in general.
>> An interesting point. This is a big difference between a registry of
>> values and an enumerated list. The validity of a symbol is
>> time-dependent, and related to events out of the context of the usage.
>> So my earlier analogy with local symbol definitions, which are solely
>> dependent on other elements of the context of use, is inappropriate.
>>
>>> On the other hand, a document that is XML-well-formed and XXX-valid
>>> remains well-formed and valid, unless the grammars are redefined.
>>> (You might say that a change in IANA registry is, analogously, a
>>> change in the language's rules, if you insist.)
>> I agree that is stretching the point. The time-dependency is, to me, a
>> good reason for discarding the idea that "hostname validity", in the
>> sense of being registered, is "syntactic".
>>
>>> The issue is that in the case of XML the use of 'well-formed' for the
>>> one and 'valid' for the other is well-defined and documented. In the
>>> case of URLs, I am not sure whether there is an officially established
>>> convention of calling a URL 'well-formed' if it conforms to the RFC
>>> and calling it 'valid' if it is registered; maybe there is, but maybe
>>> this is only wishful thinking.
>> Upon examination, RFC 1123 refers to "syntactically valid" hostnames as
>> those conforming to the production rules. Beyond that, the rest of the
>> terminology is all wrapped up with the DNS protocols and failure modes.
>> There really isn't a concept of "valid hostname"; the concept is "DNS
>> lookup succeeds". This is further complicated by the fact that the DNS
>> folk believe their mechanism can work for anything that supports URI
>> syntax (as long as the names are short enough), and therefore DNS is not
>> limited to hostname lookups.
>>
>> So I have to admit that there is nothing "syntactic" about the validity
>> of a hostname beyond its lexical rules. After that, there is a process
>> that succeeds or fails in one of several ways. The function of the
>> hostname in a URL is to identify one or more Internet hosts that MAY be
>> able to provide the service that maps the URL to a resource. And the
>> mapping of hostname to host is properly seen as just part of the complex
>> process of performing the URL-to-resource mapping.
>>
>> I also have to admit that until just now, I hadn't looked at RFC 1123
>> for years, and I clearly should have, before creating a concept not
>> actually supported by the standards. My apologies to all.
>>
>> -Ed
>>
>> P.S. I realize that this is one of those areas in which the "grey hair"
>> is not useful. A certain implementation, with which I used to be
>> intimately familiar (15 years ago), talks about "valid host names", but
>> the standards don't. More importantly, that experience predates HTTP as
>> the dominant Internet protocol; the concept of the URL was just coming into
>> existence. And in that time, one's tool did the hostname lookup to find
>> the IP address, and then inaugurated the application-specific protocol.
>> So "hostname validity" was a concept in its own right. But the URL/URI
>> has made the "hostname" issue just a technical element of the "valid
>> URL" concept. And in this "identifier" context, separating out the
>> "hostname" is inappropriate -- it is now a "lower-level concern". I
>> just didn't realize until now that I'm still carrying old baggage in
>> this area.
>>
>
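
Ed's distinction above, between a hostname that is "syntactically valid" under the production rules and one for which "DNS lookup succeeds", can be sketched roughly as follows. This is only an illustration, not anything from the thread: the function names are made up, and the length limits (63 characters per label, 253 overall) are the commonly cited DNS limits rather than figures quoted by Ed.

```python
import re
import socket

# One label: letters, digits and hyphens; starts and ends with a letter or
# digit (RFC 1123 relaxed the older rule that the first character be a letter).
# The 63-character cap per label is an assumption taken from common DNS practice.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_syntactically_valid_hostname(name: str) -> bool:
    """Lexical check only; it says nothing about registration."""
    if name.endswith("."):  # tolerate a fully qualified trailing dot
        name = name[:-1]
    if not name or len(name) > 253:  # assumed overall length limit
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


def lookup_succeeds(name: str) -> bool:
    """Process check: the answer depends on the resolver, the network and the moment."""
    try:
        socket.getaddrinfo(name, None)
    except (OSError, UnicodeError):
        return False
    return True
```

The first function gives the same answer for the same string every time; the second can change from one call to the next, which is the time-dependency Ed gives as a reason not to call registration "syntactic".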
--
Wacek Kusnierczyk
------------------------------------------------------
Department of Information and Computer Science (IDI)
Norwegian University of Science and Technology (NTNU)
Sem Saelandsv. 7-9
7027 Trondheim
Norway
tel. 0047 73591875
fax 0047 73594466
------------------------------------------------------
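
For readers who want the XML convention from the thread in concrete terms ('well-formed' for conformance to the XML grammar, 'valid' for conformance to a DTD/schema), here is a minimal sketch. It assumes a toy content model, `ALLOWED_CHILDREN`, as a stand-in for a real DTD; the element names and helper functions are hypothetical, and a real validator checks far more than child element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model standing in for a DTD:
# element name -> set of child element names it may contain.
ALLOWED_CHILDREN = {"note": {"to", "body"}, "to": set(), "body": set()}


def parse_if_well_formed(text):
    """Return the root element, or None if the text violates the XML grammar."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


def is_valid(root):
    """Check every element against the toy content model."""
    for element in root.iter():
        allowed = ALLOWED_CHILDREN.get(element.tag)
        if allowed is None:  # element not declared in the model
            return False
        if any(child.tag not in allowed for child in element):
            return False
    return True


def classify(text):
    root = parse_if_well_formed(text)
    if root is None:
        return "not well-formed"
    return "valid" if is_valid(root) else "well-formed but not valid"
```

Note that `classify` never reaches the validity check for input that fails to parse, which mirrors the point made in the thread: a document that does not conform to the XML syntax is simply not an XML document, so the question of its validity does not arise.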