One more terminological note:

It is confusing to say that an XML document is well-formed in the sense
of its conformance to the XML syntax, since:
- this way of speaking suggests that there can be a
non-well-formed XML document (where 'well-formed' still refers to the
XML syntax), while
- a document that does not conform to the XML syntax simply is not an
XML document.

It makes sense to speak of a valid or invalid XML document, though, with
'valid' referring to another syntax definition (an XSLT schema/DTD, say).
But, again, to say that an XSLT document is valid, with 'valid'
referring to XSLT syntax, is equally redundant.
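
To make the two notions concrete, here is a minimal Python sketch. It assumes that "well-formed" can be read as "the standard-library parser accepts the text", and it uses a made-up content rule (the root must be `note` and must contain a `to` child) as a stand-in for a real DTD or schema. The function names and the `note`/`to` vocabulary are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML at all."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def is_valid_note(text: str) -> bool:
    """Hypothetical stand-in for DTD/schema validation.

    Only meaningful for text that is already well-formed: the root
    element must be <note> and must contain at least one <to> child.
    """
    root = ET.fromstring(text)
    return root.tag == "note" and root.find("to") is not None


good = "<note><to>Ed</to></note>"
other = "<memo/>"                 # parses, but breaks the toy rule
broken = "<note><to>Ed</note>"    # mismatched tags: not XML at all
```

With these inputs, `good` passes both checks, `other` passes only the first, and `broken` fails the first, so the second question cannot even be asked of it. That is the sense in which "non-well-formed XML document" is an odd phrase.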
vQ
Ed Barkmeyer wrote:
> Wacek,
>
> you wrote:
>
>> I was actually curious whether you will refer to the XML example.
>> The use of 'well-formed' and 'valid' here are synonymous in that both
>> are used to speak of a document's conformance with a grammar;
>
> With that definition, the terms are equivalent, yes. One can distinguish:
> well-formed/valid with respect to the XML grammar
> from
> well-formed/valid with respect to the DTD/schema
> But XML chooses to use 'well-formed' for the first, and 'valid' for the
> second.
>
>> equally well it might be said that a document is valid as an XML
>> document and well-formed as an RDF document. This is just a
>> terminological convention.
>
> Exactly.
>
>> The issue with IANA is a bit different, in that a domain name may be
>> registered and unregistered, i.e., its status as valid or not may
>> change, though the rules of the language do not change, in general.
>
> An interesting point. This is a big difference between a registry of
> values and an enumerated list. The validity of a symbol is
> time-dependent, and related to events out of the context of the usage.
> So my earlier analogy with local symbol definitions, which are solely
> dependent on other elements of the context of use, is inappropriate.
>
>> On the other hand, a document that is XML-well-formed and XXX-valid
>> remains well-formed and valid, unless the grammars are redefined.
>> (You might say that a change in IANA registry is, analogously, a
>> change in the language's rules, if you insist.)
>
> I agree that is stretching the point. The time-dependency is, to me, a
> good reason for discarding the idea that "hostname validity", in the
> sense of being registered, is "syntactic".
>
>> The issue is that in the case of XML the use of 'well-formed' for the
>> one and 'valid' for the other is well-defined and documented. In the
>> case of URLs, I am not sure whether there is an officially established
>> convention of calling a URL 'well-formed' if it conforms to the RFC
>> and calling it 'valid' if it is registered; maybe there is, but maybe
>> this is only wishful thinking.
>
> Upon examination, RFC 1123 refers to "syntactically valid" hostnames as
> those conforming to the production rules. Beyond that, the rest of the
> terminology is all wrapped up with the DNS protocols and failure modes.
> There really isn't a concept of "valid hostname"; the concept is "DNS
> lookup succeeds". This is further complicated by the fact that the DNS
> folk believe their mechanism can work for anything that supports URI
> syntax (as long as the names are short enough), and therefore DNS is not
> limited to hostname lookups.
>
> So I have to admit that there is nothing "syntactic" about the validity
> of a hostname beyond its lexical rules. After that, there is a process
> that succeeds or fails in one of several ways. The function of the
> hostname in a URL is to identify one or more Internet hosts that MAY be
> able to provide the service that maps the URL to a resource. And the
> mapping of hostname to host is properly seen as just part of the complex
> process of performing the URL-to-resource mapping.
>
> I also have to admit that until just now, I hadn't looked at RFC 1123
> for years, and I clearly should have, before creating a concept not
> actually supported by the standards. My apologies to all.
>
> -Ed
>
> P.S. I realize that this is one of those areas in which the "grey hair"
> is not useful. A certain implementation, with which I used to be
> intimately familiar (15 years ago), talks about "valid host names", but
> the standards don't. More importantly, that experience predates HTTP as
> the dominant Internet protocol; the concept URL was just coming into
> existence. And in that time, one's tool did the hostname lookup to find
> the IP address, and then inaugurated the application-specific protocol.
> So "hostname validity" was a concept in its own right. But the URL/URI
> has made the "hostname" issue just a technical element of the "valid
> URL" concept. And in this "identifier" context, separating out the
> "hostname" is inappropriate -- it is now a "lower-level concern". I
> just didn't realize until now that I'm still carrying old baggage in
> this area.
>
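
The separation Ed arrives at above, between a hostname that is "syntactically valid" and a lookup that succeeds or fails, can be sketched in Python as follows. This is only an illustration under stated assumptions: the label rule (letters, digits, hyphens, 1 to 63 characters, no leading or trailing hyphen) is my reading of the RFC 1123 host name rules, the 253-character overall limit is a commonly used bound that I am assuming here, and both function names are hypothetical.

```python
import re
import socket

# One label: 1-63 characters, letters/digits/hyphens,
# with no hyphen at either end (assumed reading of the RFC rules).
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_syntactically_valid_hostname(name: str) -> bool:
    """Purely lexical check; it never touches the network."""
    if name.endswith("."):
        name = name[:-1]          # tolerate one trailing root dot
    if not name or len(name) > 253:   # 253 is an assumed overall limit
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


def lookup_succeeds(name: str) -> bool:
    """A process, not a syntax check: the answer depends on the state
    of DNS at the moment of the call and may change over time."""
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        return False
    return True
```

The first function gives the same answer for a given string as long as the rules stay the same; the second may give different answers today and tomorrow, which is the time-dependency discussed above.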
--
Wacek Kusnierczyk
------------------------------------------------------
Department of Information and Computer Science (IDI)
Norwegian University of Science and Technology (NTNU)
Sem Saelandsv. 7-9
7027 Trondheim
Norway
tel. 0047 73591875
fax 0047 73594466
------------------------------------------------------
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx