
Re: [ontolog-forum] {Disarmed} Re: OWL and lack of identifiers

To: Ken Laskey <klaskey@xxxxxxxxx>
Cc: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Pat Hayes <phayes@xxxxxxx>
Date: Sun, 15 Apr 2007 18:34:46 -0500
Message-id: <p06230900c2484eb4bac7@[10.100.0.26]>
>Just to be clear, from RFC 2396:    (01)

Unfortunately RFC 2396 isn't clear; and RFC 3986, 
which obsoletes 2396, is worse: it appears to be 
internally self-contradictory.    (02)

I have written quite a lot of emails and comments 
and given talks on this general topic. Here are a 
few:    (03)

http://lists.w3.org/Archives/Public/uri/2003Apr/0047.html
http://lists.w3.org/Archives/Public/uri/2003Apr/0156.html
http://lists.w3.org/Archives/Public/uri/2003May/0062.html    (04)

http://lists.w3.org/Archives/Public/www-tag/2003Jul/0129.html 
and the subsequent thread.    (05)

So have others, of course; e.g., see    (06)

http://www.ibiblio.org/hhalpin/homepage/notes/uri.html    (07)

The general issue at the center of the debate has 
come to be called the 'http-range' issue. 
Roughly, if http GET is a function, what is its 
range?    (08)
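
One rough way to see the question is to write GET down as a typed function. The sketch below is only an illustration: the Representation type, the FAKE_WEB table and the example.org URI are invented here, and no network access is involved. Whatever one decides the URI identifies, what comes back is some bytes with a media type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """What a GET hands back: a media type plus some bytes (hypothetical type)."""
    media_type: str
    body: bytes


# Hypothetical stand-in for the Web: a lookup table instead of a network call.
FAKE_WEB = {
    "http://example.org/three.html": Representation(
        "text/html", b"<p>A page about the number three</p>"
    ),
}


def http_get(uri: str) -> Representation:
    """Toy GET: a partial function from URI strings to representations."""
    return FAKE_WEB[uri]


result = http_get("http://example.org/three.html")
print(type(result).__name__, result.media_type)
```

On this reading the range contains only representations; whether anything else belongs in it is exactly what is in dispute.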

>A Uniform Resource Identifier (URI) is a compact string of characters
>    for identifying an abstract or physical resource.    (09)

Sounds reasonable, but what exactly does 
'identify' mean? It would be very easy to read 
that as meaning 'denotes' or 'refers to', but it 
can't possibly mean that in the RFC literature, 
since we are told that having a URI for a 
resource enables one to *perform operations* on 
the resource and to *gain access* to it. (BTW, 
this seems obviously wrong, if resources can be 
abstract or physical things instead of computer 
files of one kind or another. For example,
http://www.ihmc.us/users/phayes/TheNumberThree.html
denotes the number three, but it doesn't enable 
anyone to gain access to 3 or perform any 
operations on it.)    (010)
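
To make the gap concrete, here is a small sketch using Python's standard urllib.parse. The DENOTES table is an assumption introduced purely for illustration; nothing in the RFCs defines such a thing. Splitting the URI yields only the parts needed to fetch something, while what the URI denotes has to be stipulated separately.

```python
from urllib.parse import urlsplit

uri = "http://www.ihmc.us/users/phayes/TheNumberThree.html"

# The syntax of the URI gives a scheme, a host and a path: a recipe for access.
parts = urlsplit(uri)
access_recipe = (parts.scheme, parts.netloc, parts.path)

# Hypothetical denotation table: records what the URI is taken to refer to.
# This mapping is stipulated by hand; it cannot be derived from the URI string.
DENOTES = {uri: 3}

print(access_recipe)
print(DENOTES[uri])
```

Nothing in the parsed parts mentions the number three, and nothing in the table says how to fetch anything.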

>A resource can be anything that has identity.    (011)

Now, what does THAT mean? On the face of it, it 
simply means 'anything', since of course anything 
and everything 'has an identity' in the sense 
that it is identical to itself. Philosophers have 
celebrated this notion, talking of the 'haecceity' 
of a thing, meaning the particular identity that 
it uniquely has; the property of being itself. 
But *everything* has this, so the phrase used in 
RFC 2396 is utterly redundant if this is what it 
is intended to mean. Presumably, therefore, it is 
supposed to mean something else. What?? I have 
made considerable effort to discover what it is 
supposed to mean, including reading everything 
that has been written on the topic from any W3C 
source (nobody else uses this terminology in this 
way) and direct email requests for clarification 
made to the authors of RFC 2396.
http://lists.w3.org/Archives/Public/uri/2003Apr/0027.html
No such clarification has been forthcoming; and 
when pressed, the authors resort to abuse and 
impoliteness.
http://lists.w3.org/Archives/Public/uri/2003Apr/0036.html    (012)


>  Familiar
>          examples include an electronic document, an image, a service
>          (e.g., "today's weather report for Los Angeles"), and a
>          collection of other resources.  Not all resources are network
>          "retrievable"; e.g., human beings, corporations, and bound
>          books in a library can also be considered resources.
>
>
>Thus, anything that can be identified    (013)

Now, you apparently feel that these last four 
words make sense. What do YOU think they mean? 
Can you cite an example of a thing that CANNOT be 
identified in this sense, for example, and 
therefore is not a resource?    (014)

>is a resource (i.e., you can use everything for 
>something) and URIs are one means (and one that 
>has been found very useful) for providing that 
>identity.    (015)

The URI *provides* the identity? Surely not. The 
TAG group are quite clear that a resource does 
not need to actually have a URI in order to be a 
resource. It is sufficient that it *could 
possibly* have a URI. As far as I can tell, that 
rules out nothing.    (016)

Pat    (017)

>
>Ken
>
>
>On Apr 13, 2007, at 3:10 PM, Ed Barkmeyer wrote:
>
>>Ken Laskey wrote:
>>
>>>>When the URI is a reference to a Web page 
>>>>(full stop), the resource is the web page, 
>>>>and by extension, the information content of 
>>>>the web page. 
>>>>
>>>I think of the page and its information content as being separate.
>>>
>>
>>From an ontological point of view, I may also 
>>want to distinguish the content from its 
>>external representation, if that was your 
>>point.  But the Web does not make that 
>>distinction.  Put another way, the Web 
>>consciously manages external representations of 
>>information, and leaves the abstraction of 
>>content to the reader.  The whole idea of the 
>>Semantic Web is to provide standard external 
>>representations for some orderly abstraction of 
>>content, in order to facilitate search.
>>
>>I find it important to distinguish the location 
>>of the information from its content, which was 
>>my point.  So perhaps we are talking past each 
>>other.
>>
>>But the definition of URI (IETF RFC 2396) says it identifies a "resource".
>>
>>>For example, I can make statements about the 
>>>style of the page display, the server where 
>>>the <html> tags reside, the provenance 
>>>information for the page.  These are all 
>>>separate from the information content of the 
>>>page.
>>>
>>
>>We have now identified several distinguishable concepts:
>>  1) the place
>>  2) the presentation structure (web page)
>>  3) the information content
>>  4) a formal description of the content
>>  5) the "provenance metadata" for the content
>>  6) the provenance metadata for the presentation
>>  7) the provenance metadata for the presentation in that place
>>
>>And we could easily make a model (ontology) for 
>>these things and their relationships:
>>  place(1) conveys presentation(2)
>>  presentation(2) conveys content(3)
>>  content(3) has formal description(4)
>>  content(3) has provenance of content(5)
>>  presentation(2) has provenance of presentation(6)
>>  place(1) has provenance of site content(7)
>>
>>Further we note that there are other possibilities.  In particular,
>>  place(1) provides service(8)
>>  service(8) permits access to presentation(2)
>>
>>RFC 2396 is pretty clear that a URL identifies 
>>a place(1) full stop, and indicates a means of 
>>access to whatever is at that place.  From our 
>>would-be ontology above, what is thus addressed 
>>is either a presentation/document or a service.
>>
>>By comparison, RFC 2396 says that a URI 
>>identifies a "resource".  And all of 
>>(2),(4),(5),(6),(7) and the service (8) are 
>>distinct resources that may be found at the 
>>*same site*.  (I think the Web view is that 
>>content(3) is only accessible through its 
>>presentation(2).)  It follows that each of them 
>>should have a distinct URI.  Those URIs may be 
>>distinct URLs in their own right, or they may 
>>all incorporate a common URL and each have a 
>>distinct fragment identifier.
>>
>>Since a URL always identifies a place, if the 
>>distinct resources have distinct URLs, our 
>>model above needs some additions:
>>   place(1) conveys formal description(4)
>>   place(1) conveys provenance of content(5)
>>   place(1) conveys provenance of presentation(6)
>>   place(1) conveys provenance of site content(7)
>>
>>One place can convey some or all of 
>>(2),(4),(5),(6),(7),(8), but when one place 
>>conveys more than one of them, each has a 
>>distinct URI whose "fragment identifier" 
>>distinguishes the "component".  And by 
>>convention, in those cases, the URI with no 
>>fragment identifier (the simple URL) conveys 
>>either (2) or (8).  It is also possible that we 
>>have a (9), which is a web page that is a 
>>container for (2),(4),(5),(6),(7), delivered as 
>>a single resource.
>>
>>Note that our model is starting to get rather messy.
>>This is why Tim Berners-Lee says you need to 
>>impose some discipline on your site.  The 
>>problem is that several different conventions 
>>have emerged (including not imposing any 
>>discipline), and there are no reference 
>>standards.
>>
>>In a somewhat different vein, I wrote:
>>
>>>>I have argued with TBL before that URIs that 
>>>>are URLs confuse WHAT something is with WHERE 
>>>>it is.  And it is only an acceptable idea 
>>>>when that relationship is required to be 
>>>>1-to-1.  The idea of identifiers is that you 
>>>>can test for equal.  When the same thing can 
>>>>be in multiple places, unequal doesn't tell 
>>>>me anything, which is annoying, especially 
>>>>when tools think unequal to the expected 
>>>>value means unusable.  And when the same 
>>>>place can hold different things, equal 
>>>>doesn't tell me anything, which defeats the 
>>>>purpose.
>>>>
>>
>>Ken says:
>>
>>>What you are saying is it doesn't serve the 
>>>purpose you have in mind, not that it doesn't 
>>>serve other purposes quite well.  One could 
>>>say the success of the Web shows a real value.
>>>
>>
>>Whoa!  I fully agree that URLs locate lots of 
>>useful and functionally different things, just 
>>as postal addresses do.  But if today it's a 
>>bank and tomorrow it's a laundry or a residence 
>>or a casino, what "resource" is being 
>>"identified"?
>>
>>What I said was that if the content to which a 
>>URI refers changes radically from day to day, 
>>the URI doesn't identify "an information 
>>resource" in any useful sense.  And thus the 
>>idea that the URI identifies something 
>>different from a location is false.  If the 
>>purpose of a URI is to denote content, 
>>function, behavior, as distinct from location, 
>>some one of those has to be consistent over 
>>time.  A bulletin board and a pulpit are just 
>>locations.
>>
>>>>(I wonder how many XML tools would break if 
>>>>the namespace URL for XML Schema pointed to a 
>>>>local copy of the specification...  Is the 
>>>>W3C URI THE name or A name for the XML Schema 
>>>>specification?)
>>>>
>>>This is where provenance comes in.  It is THE 
>>>URI if you believe W3C to be the authoritative 
>>>source. 
>>>
>>
>>This confuses two ideas:
>>  1.  The location of the document
>>  2.  The identity of the document as the one 
>>issued by the authoritative source.
>>
>>Example:  The authoritative source for the 
>>Oxford Dictionary of English is presumably in 
>>Oxford, England, but I can find the document at 
>>my public library.
>>
>>All of the copies of the ODE have the same 
>>designation, but you can find copies in lots of 
>>places.  So if I point you to a place where you 
>>can find it, that has nothing to do with the 
>>authoritative source.
>>
>>But my example was wrong.  The xmlns reference 
>>is to the "namespace URI", which is the 
>>required *identifier* for the specification. 
>>The tool is free to get a copy from anywhere it 
>>likes.  So if I put another URL there, it may 
>>be a location of a copy of the specification, 
>>but it is NOT the *identifier*, and the tool 
>>should fail.  It is exactly as if I referred to 
>>the "Peoria Public Library's dictionary" 
>>instead of the ODE.
>>
>>>>The webhead idea is that you will always go 
>>>>to the URL, fetch the resource, and use it. 
>>>>The idea that a tool has been pre-programmed 
>>>>to support that *content*, and, in conducting 
>>>>a web-based transaction, this might require 
>>>>the tool to fetch and compare two 10MB files 
>>>>to determine whether they are *versions of* 
>>>>the same specification, is beyond their 
>>>>hobbyist view of the Internet.
>>>>
>>>So what metadata do you need in place to 
>>>support your use?  How do you want to create 
>>>and maintain that metadata?  Will you make it 
>>>available for others to use?
>>>
>>
>>Ah, now we are talking about what "responsible 
>>management" of referenceable resources might 
>>be.  This is the kind of discipline that the 
>>WebDAV folks have worked on, and there is a 
>>"widely accepted" scheme for life cycle 
>>management of documents.  The trouble is that 
>>it is widely accepted among the various 
>>organizations involved in making document and 
>>metadata standards, but those folks operate and 
>>influence less than 1% of websites.  It does 
>>mean that publishers, and standards 
>>organizations, and library websites will 
>>probably use it.
>>
>>>Everything is a resource to someone, as it 
>>>should be.  What we want to be able to do is 
>>>differentiate resources so we use the one(s) 
>>>most suitable for our needs.
>>>
>>
>>Exactly.  But unless there are common 
>>conventions for that differentiation, all we 
>>have is a bunch of disorganized resources 
>>labeled according to hundreds or thousands of 
>>incompatible schemes, most of which are not 
>>very good or very useful.  Google has built a 
>>successful enterprise on the failure of the 
>>Web, and its principal resources, to address 
>>that problem.  And there are many who believe 
>>that that also is as it should be.
>>
>>IMO, the problem is that the Internet is still the 
>>big city of the Middle Ages. We know how to 
>>build all kinds of buildings and we have a lot 
>>of demand for them and a lot of construction of 
>>various kinds and qualities going on.  But no 
>>one is responsible for much of it, we have no 
>>civil engineering discipline, we have no land 
>>use planning, we have random patchworks of 
>>streets, we are carrying the water on foot in 
>>buckets from the most convenient well, we have 
>>no police force and no fire brigade, we have 
>>sewage problems, crime problems and frequent 
>>plagues.  Some communities thrive and some die 
>>out, and we don't really understand why.  And 
>>yet people keep coming here, because there is 
>>education, and jobs, and entertainment, and 
>>money to be made.  Ultimately, technology 
>>enabled us to get control of it, and fires and 
>>plagues forced us to.  But it took 7 centuries. 
>>I hope the Internet experience is shorter.
>>
>>-Ed
>>
>>--
>>Edward J. Barkmeyer 
>>Email: edbark@xxxxxxxx
>>National Institute of Standards & Technology
>>Manufacturing Systems Integration Division
>>100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
>>Gaithersburg, MD 20899-8263                FAX: +1 301-975-4694
>>
>>"The opinions expressed above do not reflect consensus of NIST,
>>  and have not been reviewed by any Government authority."
>>
>
>------------------------------------------------------------------------------------------
>Ken Laskey
>MITRE Corporation, M/S H305     phone:  703-983-7934
>7515 Colshire Drive                        fax:        703-983-1379
>McLean VA 22102-7508
>
>
>
>
>
>
>_________________________________________________________________
>Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/ 
>Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/ 
>Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
>Shared Files: http://ontolog.cim3.net/file/
>Community Wiki: http://ontolog.cim3.net/wiki/
>To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
>    (018)


-- 
---------------------------------------------------------------------
IHMC            (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola                       (850)202 4440   fax
FL 32502                        (850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes    (019)


