Kingsley Idehen wrote:
> Ed and other respondents,
> I prefer to look at the entity relationship semantics on the Web as
> being akin to screen resolution fidelity. You have high and low
> resolution. There are data spaces within the Linked Open Data cloud
> where the fidelity or entity relationship semantics are very high and
> discernible to humans and machines. Of course, there are enclaves where
> the aforementioned semantic fidelity is very low. Such is the nature of
> the Web.
>
The first question is whether and how the Web technologies in use
support the identification of the 'entity relationship semantics' of a
link. The meaning of the link in Pyotr Nowara's email today was not
clear to some of the participants on this exploder, even though it was
embedded in English text that was presumably capable of expressing its
exact relationship to the subject of the email. By comparison, formal
links in HTML documents are nothing more than short strings with
associated URIs. What is the relationship between the hyperlink text
and the information body at the URI? The problem in such links is not
whether the linked information is faithful to the intended E-R
semantics; the problem is in conveying the intended E-R semantics at all.
In a language like RDF, the stated semantics of the use of a URI term is
that the term designates some concept and the URI can be dereferenced to
some resource that facilitates our understanding of that concept. There
one can talk about fidelity, because the intrinsic semantics of the link
is well-defined. That is a major difference from an HTML hyperlink or a
URI value of an XML attribute.
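The contrast can be made concrete with a small sketch. This is illustrative only, and the URIs in it are hypothetical (except the Dublin Core "references" term, which is a real vocabulary term); it is not drawn from any particular toolkit:

```python
# A minimal sketch of the difference between an HTML hyperlink and an
# RDF statement, using plain tuples (no RDF library required).

# An HTML hyperlink pairs some anchor text with a URI.  The relationship
# between the anchor text and the resource at the URI is left entirely
# to the human reader -- there is no slot for it:
html_link = ("further reading", "http://example.org/ontology-primer")

# An RDF triple names the relationship itself as a URI term, so a
# machine can dereference the predicate and learn what the link means:
rdf_triple = (
    "http://example.org/doc42",             # subject
    "http://purl.org/dc/terms/references",  # predicate: the E-R semantics
    "http://example.org/ontology-primer",   # object
)

# The predicate is exactly the part the HTML link has no place for.
subject, predicate, obj = rdf_triple
print(predicate)
```

The point of the sketch is structural: the middle element of the triple is a first-class, dereferenceable term, which is what makes it meaningful to talk about the fidelity of a link's semantics at all.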
> The Web's ultimate advantage is that crowd-sourcing is intrinsic. The
> number of subject matter experts contributing to the Linked Data Cloud
> is growing rapidly too :-)
>
"Crowd sourcing a la Wikipedia" and "Crowd sourcing a la Google" are two
entirely different things. The committed experts contributing to
Wikipedia articles work actively to suppress the committed ignoramuses.
That doesn't prevent Wikipedia articles from being inaccurate or
debatable or one-sided, because there are many fields in which that kind
of information variance is typical of "accepted truth" in the trade. It
does prevent clear misinformation from surviving very long. By
comparison, Google's approach is essentially a political contest -- the
most commonly referenced links are presented first, although Google is
also doing some kind of credibility control. This kind of crowd
sourcing returns what most people accept, rather than what experts
accept. And it is ripe for exploitation, as Doug Foxvog pointed out:
"Never underestimate the power of stupid people in large groups."
My experience in standards development has significant parallels. There
are development groups led by effective organizers with expert
contributors who work together to produce something of value. There are
development groups led by effective organizers deliberately ignorant of
other work who build groups with motivation but little relevant
expertise to produce would-be standards of no real value. And there are
groups consisting of a few real experts and a few uneducated leaders and
a body of lesser folk who are regularly asked to choose between the
right way and the wrong way without having the background to be capable
of distinguishing them. The webby standards bodies themselves fall into
all three categories. So we have every reason to suppose that the
highly linked Web will have all of those characteristics.
It seems to me (and I think this is what TBL had in mind in the treatise
John Sowa cited) that the experts can be expected to find and recognize
other expert work, and thus their links will be reliable and as
well-defined as the technologies permit. So, good information will link
to good information, for the most part. Bad information linked to good
information, and bad information linked to bad information, and good
information badly linked will be much more common, because expertise is
a much rarer commodity than "information" on the Web. The problem will
always be finding the good starting points. And, like the literature of
certain trades, there will unfortunately be brilliant insightful
contributions that are inadvertently suppressed because they are not
mainstream and get few citations, and conversely, there will be
"revolutionary" treatises that get lots of citations by being
provocative rather than accurate. "Even bad publicity is good", if what
you are counting is site visits, not content value.
So I do not see the crowd-sourced linked information space being a
significant improvement over the current situation. The Linked Open
Data technologies just make the linking easier; they don't improve the
capture of the semantics of the links or the quality of the linked
resources. There will be valuable entities clearly related to other
valuable entities; valuable entities somehow related to other valuable
entities; and a large body of work in which one or the other entity is
mostly garbage and the quality of the link is irrelevant. And that is
where we are now. What we need are mechanisms for creating high-quality
resources, mechanisms for recognizing the quality of resources, and
mechanisms for improving the expressiveness of the links.
As near as I can tell, the current web-think is that RDF and its
spinoffs will somehow do this for us (after only 14 years).
The alternative being promulgated by several standards bodies takes the
general form of an "architecture", or a "framework" or a "best practice
guideline", for what gets linked to what and how, and one can better
understand the relationships expressed by the links if one understands
the architecture/practice being used. This approach is great for the
"in-crowd" who are part of the development activity, but the resulting
artifacts are less clear to the web tourist who enters one of these
well-developed structures in the middle. Some link in the resource or a
related resource points to the architecture, and if you accidentally
find it, you may be able to determine how you should have followed the
links.
-Ed
--
Edward J. Barkmeyer Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Systems Integration Division, Engineering Laboratory
100 Bureau Drive, Stop 8263 Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263 Cel: +1 240-672-5800
"The opinions expressed above do not reflect consensus of NIST,
and have not been reviewed by any Government authority."
> Kingsley Idehen wrote:
>
>> On 11/13/12 11:12 AM, John F Sowa wrote:
>>
>>
>>> Bottom line: The current SW tools and notations will survive
>>> for a while, but Schema.org is where mainstream IT is heading.
>>>
>>> John
>>>
>>>
>> But this isn't the fundamental problem.
>>
>> The problem always boils down to webby data object representation,
>> access, and relationship semantics that scales to the Web. Once in
>> place, we have a functional global data space that accommodates
>> intensional and extensional data interaction. Basically, we end up with
>> Data as a new kind of Electricity conducted via hyperlinks.
>>
>> The above isn't specific to any format, as you know. It has everything
>> to do with the following, functioning at Web-scale:
>>
>> 1. Entity Relationship Model
>> 2. Entity Relationship Semantics
>> 3. Instances of Entity Relationship Graphs linked across a variety of
>> boundaries.
>>
>> The Web as it exists is evolving into what I've outlined above at a
>> frenetic pace.
>>
>>
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J