ontolog-forum

Re: [ontolog-forum] OWL and lack of identifiers

To: edbark@xxxxxxxx
Cc: Ontolog Forum <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Waclaw Kusnierczyk <Waclaw.Marcin.Kusnierczyk@xxxxxxxxxxx>
Date: Tue, 17 Apr 2007 04:04:05 +0200
Message-id: <46242B15.9090700@xxxxxxxxxxx>


Ed Barkmeyer wrote:
> Waclaw Kusnierczyk wrote:
> 
>> Ed Barkmeyer wrote:
> 
>>> If the URL doesn't mean that, it is not a "valid" HTTP URL, because 
>>> it does not satisfy the requirements of RFC 2616.  So the example URI 
>>> that Waclaw offers:
>>>    'http://www.nonsense.no'
>>> is NOT a "valid" URL.  It may satisfy the grammar requirements, but 
>>> it probably does not satisfy the syntactic requirement for 
>>> www.nonsense.no to be a valid domain name (i.e., a sequence of 
>>> characters registered as a domain name through the Internet cascading 
>>> directory scheme).  And it surely doesn't satisfy the requirement for 
>>> it to refer to an accessible resource.  So it is likely to be 
>>> *syntactically invalid* and it has *no valid interpretation*.
> 
>> Agreed.  I should have written *syntactically valid* instead of 
>> (syntactically) *valid*.  And according to the interpretation of 
>> 'valid'  that you advocate here, one can make sure that a string is a 
>> valid url only by actually connecting to the host and receiving a 
>> response.
> 
> We are splitting some hairs here.  I said, very carefully, that
>  'http://www.nonsense.no'
> is "grammatically valid" per RFC 2616 -- it satisfies the production rules.
> And your use of "syntactically valid" only means "satisfies the 
> production rules".
> 
> The problem is whether 'www.nonsense.no' is a valid instance of 'host', 
> which depends on whether 'nonsense.no' is registered with IANA to 
> correspond to an allocated IP address.  If it is registered, then I can 
> agree that this is syntactically valid.  If it is not registered, one can 
> argue that it is not even "syntactically valid", because valid domain 
> names are determined by IANA registry, not by a production rule.  That 
> is, the requirement is not just that it is well-formed, but rather that 
> it is in a particular list of valid strings.  By way of analogy, 
> consider a language in which symbols must be declared before they are 
> used. If the parser encounters a well-formed lexical object that should 
> be a declared symbol but the "symbol table lookup" fails (the symbol was 
> never declared), is that a "syntax error"?  Is that text "syntactically 
> valid"?  (I don't know that the formal languages community has 
> consistent terminology in this area, but I haven't written a compiler in 
> over 20 years.  Maybe they do now.)    (01)

We may be splitting hairs here.    (02)

A formal language is a set of strings over an alphabet (a set of 
symbols).  You may define the language extensionally, by listing all 
the strings, which would imply which symbols are in the alphabet.  You 
may define the language by means of a grammar, that is, specify the 
alphabet and the rules for forming strings in the language.  You may do 
both, but the definitions must agree.    (03)

When you say that a URL may be a well-formed string and yet not in the 
language (i.e., not registered by IANA as an element of the language), 
then you state a contradiction.  Either the string is not well-formed, 
or it is in the language.    (04)

It may well be that a syntactically valid URL is one that is registered 
by IANA, and only such a one -- it is a matter of definition.  (It would 
be best to look up the precise definition.)  But well-formedness is then 
not a question of following any grammar, but rather of being registered.  
The URL grammar defines a different language, which happens to be a 
superset of the language of syntactically valid URLs (unless IANA 
registers a string which is not well-formed according to the grammar, in 
which case the two languages would merely overlap).    (05)
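
To make the distinction concrete, here is a minimal sketch in Python.  
The host grammar below is a deliberately simplified assumption of mine 
(not the grammar from any RFC), and REGISTERED is a hypothetical 
stand-in for a registry; neither reflects actual IANA data.

```python
import re

# Assumed, simplified grammar for a host name: dot-separated labels made of
# letters, digits and inner hyphens.  Not the full grammar from any RFC.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
HOST_GRAMMAR = re.compile(rf"{LABEL}(?:\.{LABEL})*")

# Hypothetical registry: a language defined by listing its strings.
REGISTERED = {"www.example.org", "ontolog.cim3.net"}


def in_grammar_language(host: str) -> bool:
    """Membership in the language defined by the production rules."""
    return HOST_GRAMMAR.fullmatch(host) is not None


def in_registry_language(host: str) -> bool:
    """Membership in the language defined by enumeration."""
    return host in registry_or_default()


def registry_or_default():
    return REGISTERED


def relation(registry) -> str:
    """How a (non-empty) registry relates to the grammar language."""
    if all(in_grammar_language(s) for s in registry):
        return "subset"        # grammar language is a superset
    return "overlapping"       # some registered string breaks the grammar
```

With these definitions, 'www.nonsense.no' is in the first language but 
not in the second, so "well-formed" has to be tied to one of the two 
languages before the question of validity can be answered.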

> 
> Technically, an invalid domain name will fail before any connection is 
> attempted, because the DNS will find no corresponding entry, and 
> therefore no IP address and no host to connect to ("host not found").  A 
> DNS lookup request is not a connection request.  The connection attempt 
> will only occur if the DNS returns an IP address.    (06)

This is more complicated, and what you say is only partially true.
You do not need to contact a DNS server on every occasion.  Your 
system's DNS resolver may cache IP addresses, and may connect you 
directly to an IP address, given a URL, without consulting any DNS 
server.  If your resolver gives you an IP address previously looked up 
for a domain name that has just become invalid, you may either 
experience a connection failure (not a lookup failure), or even get 
connected to an IP address which is now associated with another domain 
name.  So a connection attempt may occur not only when a DNS server 
returns an IP address; it may occur without any DNS server being asked 
at all.    (07)

(Actually, IP addresses may also be cached at other levels, e.g., 
directly by the browser.)    (08)
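
The caching scenario can be sketched as follows.  CachingResolver and 
try_access are hypothetical names of mine, and the toy cache never 
expires entries, which real resolvers do; the point is only that a 
stale cache entry turns a would-be lookup failure into a connection 
failure.

```python
import socket


class CachingResolver:
    """Toy stand-in for a system resolver that caches name -> IP address."""

    def __init__(self, lookup):
        self._lookup = lookup   # called only on a cache miss (the "DNS server")
        self._cache = {}
        self.queries = 0        # how many times the server was actually asked

    def resolve(self, host):
        if host in self._cache:
            return self._cache[host]   # answered locally, no DNS query
        self.queries += 1
        ip = self._lookup(host)        # may raise OSError ("host not found")
        self._cache[host] = ip
        return ip


def try_access(host, resolver, connect):
    """Report which stage fails: the name lookup or the connection."""
    try:
        ip = resolver.resolve(host)
    except OSError:
        return "lookup failed"
    try:
        connect(ip)
    except OSError:
        return "connection failed"
    return "connected"


# Real-network adaptors, defined but not called here.
def dns_lookup(host):
    return socket.gethostbyname(host)   # raises socket.gaierror, an OSError


def tcp_connect(ip, port=80):
    socket.create_connection((ip, port), timeout=3).close()
```

Passing dns_lookup and tcp_connect gives the behaviour against a real 
network; passing stub functions lets one replay the case where a name 
is withdrawn after its address has been cached.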

> 
> Finally, the "validity of the URL" is separate from being able to 
> connect to the referenced host and access the resource.  The real 
> "validity of the URL" depends only on whether the referenced host 
> actually associates a resource with that URL.  The connection or access 
> could fail for several reasons even when the URL is valid, e.g., the 
> host could be down, or refuse connection to your system, or refuse to 
> let you access the resource.  So, if you can access a resource using the 
> URL, then it is certainly valid, but the inverse does not hold: If you 
> cannot access a resource using the URL, the URL is not necessarily invalid.    (09)

Sure.    (010)

> 
> These are probably subtleties you never wanted to know about, but they 
> have a habit of surfacing in conversations about "validity" of URLs.    (011)

Always nice to learn.    (012)

> 
> -Ed
>     (013)

-- 
Wacek Kusnierczyk    (014)

------------------------------------------------------
Department of Information and Computer Science (IDI)
Norwegian University of Science and Technology (NTNU)
Sem Saelandsv. 7-9
7027 Trondheim
Norway    (015)

tel.   0047 73591875
fax    0047 73594466
------------------------------------------------------    (016)


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx    (017)
