Re: [ontolog-forum] Danger of URIs in mission-critical applications

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Ed Barkmeyer <edbark@xxxxxxxx>
Date: Thu, 09 Jul 2009 11:57:46 -0400
Message-id: <4A56137A.2020106@xxxxxxxx>

John F. Sowa wrote:    (01)

> All modern technology is based on universal identifiers such
> as 'gram' and 'volt', which are unique within the domain of
> measurement.  For such purposes, the methods of resolving the
> identifiers are far more secure than any method based on URIs.    (02)

Yes, but the mechanism is widespread agreement to use that term for that 
measurement.  It is partly enforced by international treaty, but it is 
much more strongly enforced by education in the discipline and by 
industrial practice.    (03)

> A URI for the term 'gram', for example, would be a single point
> of failure that could be attacked by any novice-level hacker.    (04)

Nonsense.  A URI for 'gram' would be useless and meaningless if it were 
one of 27 different URIs for the measurement unit, all of them "commonly 
in use" by some small part of the engineering populace.  And if there is 
only one, and it is in common use, nearly no one will ever need to look 
it up, as Ken Laskey observed.    (05)

The underlying presumption here is that either:
  - the URI is followed to its source and the text at that source is 
presented to a human who can decide what it means and what consequence 
that has for whatever s/he was doing with the corpus that contained it; or
  - the URI is followed to its source and the software can read and 
interpret the text at that source in some useful way.    (06)

These are only important when the URI is a term that is NOT commonly in 
use.    (07)

When the URI is in common use, there is a third possibility:
  - the software tests the URI for equality to a known (and expected) 
URI and behaves according to its internal rules for handling that 
concept, which may only mean that it presents the "common natural 
language term" to the human decision maker.    (08)
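
As a minimal sketch of that third possibility, assuming a hypothetical URI
for 'gram' (the example.org address and the names GRAM_URI, KNOWN_TERMS and
label_for are invented here for illustration and come from no standard),
software can treat the URI purely as an opaque key and never dereference it:

```python
from typing import Optional

# Hypothetical URI, invented for illustration; not a real registry entry.
GRAM_URI = "http://example.org/units#gram"

# Internal rules keyed by the exact URI string; nothing is ever fetched.
KNOWN_TERMS = {GRAM_URI: "gram"}


def label_for(uri: str) -> Optional[str]:
    """Return the common natural-language term for a known URI, else None."""
    # Plain string equality: a URI differing by one character is unknown.
    return KNOWN_TERMS.get(uri)


print(label_for(GRAM_URI))
print(label_for("http://example.org/units#Gram"))
```

The lookup succeeds only on an exact match, so the "meaning" lives in the
software's own table and in the community's agreement, not at the address.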

How many tools do you suppose actually visit the W3C site and fetch the 
Schema for schemas to somehow drive their parse of an XML schema that 
refers to that URI as the definition of its own schema?  My guess would 
be 0.  If they bother at all, the tools test for equality with the
standard URI.    (09)
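
A rough sketch of what such a test can look like, using Python's standard
xml.etree.ElementTree parser. The helper name is_xml_schema is an
assumption made for this example; the namespace string is the standard XML
Schema namespace URI. The parser only compares the URI as a string and does
not visit the W3C site:

```python
import xml.etree.ElementTree as ET

# The standard XML Schema namespace URI, used here only as a string constant.
XSD_NS = "http://www.w3.org/2001/XMLSchema"


def is_xml_schema(document: str) -> bool:
    """Report whether the root element is <schema> in the XSD namespace."""
    root = ET.fromstring(document)
    # ElementTree spells a namespaced tag as "{namespace-uri}localname".
    return root.tag == f"{{{XSD_NS}}}schema"


sample = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>'
other = '<xs:schema xmlns:xs="http://example.org/not-xsd"/>'
print(is_xml_schema(sample), is_xml_schema(other))
```

The second document uses the same prefix and element name but a different
namespace URI, so it fails the equality test.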

MOST of the current uses of URIs in which the software can actually 
interpret the corpus so retrieved are XML schemas, and the vast majority 
of the rest are RDF schemas (some of which are OWL).  And most of the 
tools that can read the latter don't follow most of the URIs at all; 
they just record them, test them for equality in certain internal 
manipulations, and occasionally pursue them when asked through the human 
interface.    (010)

There is no requirement that a URI can actually be used to access a 
defining resource directly.  You may have to know of a server that can 
convert it to such a reference, even when it appears to be an HTTP URI, 
if I understand the current W3C (but not IETF) position correctly.  Or 
the URL part may be intercepted by a server that tells the human user 
how s/he can get access to the defining source, e.g., by interacting 
with our catalogue and paying 150SF for the PDF.  Or it may be 
intercepted by a server that tells the software agent what fee will be 
charged and asks the agent to execute the financial transaction by 
providing a chargeable account and signature.  (And how much of that 
automation do you want to trust?)    (011)

> I agree with both of those points:
> 
> MH> Using [old fashioned paper methods] provide more legal/
>  > administrative control that can be used to maintain the meaning
>  > associated with the symbol. In particular, there is a lot of
>  > "old economy" legal power to enforce compliance etc.
>  >
>  > URIs, in contrast, have the advantage that they drastically reduce
>  > the cost for the community to look up the intended meaning of the
>  > symbol (i.e. the URI), which reduces the familiarization costs and
>  > may support convergence in the usage of the symbol in communication.    (012)

As long as the community that agrees to use that URI is significant. 
And oh yeah, this is such a wonder of modern technology that we also 
have GUIDs and UUIDs and UPNs and URFIDs and DUNSIDs and a dozen other 
unique machine-readable terms, each for someone's favorite community and 
domain.    (013)

The URI itself is a successor to SGML document ids (vintage 1979), which 
in their time competed with the CCITT X.209 cascading naming standard 
for "network resources" ("object ids"), which was used, by treaty 
agreement, by all digital telephone and telegraph systems worldwide. 
Which of those has drastically reduced the cost for your community? 
(Answer: The latter, we can now call from Washington to Beijing without 
benefit of two or three multilingual human operators.)    (014)

In so many words, the problem with URIs is that they are not *yet* 
supported by a technology that is widely adopted and guarantees that 
they have value other than as a "guaranteed to be unique" string of 
characters.  (John's points about security and reliability and 
expectations enter into this.)  And there are many "guaranteed to be 
unique" string "technologies" that are competing to be THE reference 
identifier for lots of different things.  What URIs have right now is 
"promise", not "value".    (015)

-Ed    (016)

P.S. IMO, cheapening HTTP URIs by allowing them to be little more than a 
unique string of characters processed by some not-clearly-identified 
dictionary service, which seems to be the current gospel of W3C, is the 
surest way to make them just one more competing "universal id" 
technology, with no intrinsic advantages.  (But this is the wrong forum 
for that debate.  And in the appropriate fora, you can choose the cross 
or the crescent and be an infidel to half the participants. ;-))    (017)


-- 
Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                FAX: +1 301-975-4694    (018)

"The opinions expressed above do not reflect consensus of NIST,
  and have not been reviewed by any Government authority."    (019)

