ontolog-forum

Re: [ontolog-forum] Amazon vs. IBM: Big Blue meets match in battle for the cloud

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Patrick Cassidy" <pat@xxxxxxxxx>
Date: Tue, 30 Jul 2013 19:42:59 -0400
Message-id: <109101ce8d7e$8661dfc0$93259f40$@micra.com>
John,
   Re: your comment:
>> [JFS] Studies of inter-annotator agreement among well-trained humans show
>> that 95% agreement is very rarely achieved.  More typically, the best
>> computer systems achieve about 75% accuracy.    (01)

   In what - er -  "context"  is this true?  Do you have a pointer to this?
This kind of number must be task-dependent.    (02)

PatC    (03)


Patrick Cassidy
MICRA Inc.
cassidy@xxxxxxxxx
1-908-561-3416    (04)


-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F Sowa
Sent: Tuesday, July 30, 2013 5:14 PM
To: ontolog-forum@xxxxxxxxxxxxxxxx
Subject: Re: [ontolog-forum] Amazon vs. IBM: Big Blue meets match in battle
for the cloud    (05)

Kingsley,    (06)

Note the qualification:  replacing *words* (i.e., ordinary words in English
or other NLs) with IRIs gives a misleading impression of precision.    (07)

JFS
>> Replacing words with IRIs is worse than useless, because it gives a 
>> misleading impression of precision.    (08)

KI
> Doesn't this depend on the communication medium though? If any of
> these entities are communicating by desktop, notebook, tablet, palmtop,
> phone, etc., circa 2013, there is immense value in having HTTP URIs
> denote the entities, relationships, and relations in the discourse
> domain. To the participants in such communications the HTTP URIs will
> be tucked behind HTML anchor tags.    (09)

I agree that you need a precise identifier to locate something you want to
access on the WWW.    (010)

But the issue that Hans and Ed were discussing is the problem of detecting
the context or other information needed to determine the exact sense of a
word in an ordinary language text.    (011)
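
To make that task concrete, here is a minimal sketch of automated word-sense
disambiguation using NLTK's simplified Lesk algorithm.  The library calls are
real, but the example sentences are my own illustration and have nothing to do
with the studies cited below; it assumes nltk is installed with its 'wordnet'
and 'punkt' data downloaded.

    # Minimal word-sense disambiguation sketch using NLTK's simplified Lesk
    # algorithm.  Assumes nltk.download('wordnet') and nltk.download('punkt')
    # have already been run.
    from nltk import word_tokenize
    from nltk.wsd import lesk

    sentences = [
        "He sat on the bank of the river and watched the current.",
        "She deposited the check at the bank before noon.",
    ]

    for sent in sentences:
        tokens = word_tokenize(sent)
        sense = lesk(tokens, "bank", pos="n")  # returns a WordNet Synset or None
        definition = sense.definition() if sense else "no sense found"
        print(sent)
        print("  guessed sense:", sense, "-", definition)

Simple algorithms like this one frequently guess the wrong sense even for
short sentences, which is exactly the kind of error rate discussed below.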

Studies of inter-annotator agreement among well-trained humans show that 95%
agreement is very rarely achieved.  More typically, the best computer
systems achieve about 75% accuracy.    (012)

If you have a page with 300 words, you have about 15 errors in the best case
(95% accuracy).  With the typical accuracy of 75%, you would get about
75 errors in a 300-word page.    (013)
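
A quick back-of-the-envelope check of those numbers, which is just the
arithmetic above written out in Python:

    # Expected number of wrongly tagged words on a page, given per-word accuracy.
    def expected_errors(words_per_page, accuracy):
        return words_per_page * (1.0 - accuracy)

    print(expected_errors(300, 0.95))  # about 15 errors at 95% accuracy
    print(expected_errors(300, 0.75))  # about 75 errors at 75% accuracy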

What this implies is that if you want to process documents written in
ordinary language, it's better to use the raw text designed for human
consumption than annotated text that some human or computer had marked up
with IRIs for word senses.    (014)

Note that the IBM Watson system for Jeopardy! used a large number of
different algorithms that independently generated candidate answers.  Then it
used a kind of learning method to estimate which of the many possible answers
was best.  And they ran the system on a supercomputer with 2880 cores.    (015)
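
For a rough sense of what "many independent answer generators plus a learned
combination" means, here is a toy sketch.  The scorers, weights, and candidate
answers are invented for illustration only and bear no relation to Watson's
actual components.

    # Toy sketch of ensemble answer ranking: several independent scorers rate
    # each candidate answer, and a weighted combination (standing in for a
    # trained model) picks the best one.  Everything here is hypothetical.
    from typing import Callable, List

    def rank_answers(candidates: List[str],
                     scorers: List[Callable[[str], float]],
                     weights: List[float]) -> str:
        def combined_score(answer: str) -> float:
            return sum(w * s(answer) for w, s in zip(weights, scorers))
        return max(candidates, key=combined_score)

    # Placeholder scorers standing in for real evidence-scoring algorithms.
    scorers = [
        lambda ans: 1.0 if ans.istitle() else 0.0,    # e.g., a capitalization signal
        lambda ans: 1.0 / (1.0 + abs(len(ans) - 6)),  # e.g., some other feature
    ]
    weights = [0.6, 0.4]  # in a real system these would be learned from data

    print(rank_answers(["london", "London", "Paris"], scorers, weights))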

John    (016)

_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J    (018)
