Re: [ontolog-forum] Amazon vs. IBM: Big Blue meets match in battle for t

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: John F Sowa <sowa@xxxxxxxxxxx>
Date: Tue, 30 Jul 2013 17:13:33 -0400
Message-id: <51F82C7D.60304@xxxxxxxxxxx>
Kingsley,    (01)

Note the qualification:  replacing *words* (i.e., ordinary words
in English or other NLs) with IRIs gives a misleading impression
of precision.    (02)

>> Replacing words with IRIs is worse than useless,
>> because it gives a misleading impression of precision.    (03)

> Doesn't this depend on the communication medium though? If any of these
> entities are communicating by desktop, notebook, tablet, palm top, phone
> etc., circa 2013, there is immense value in having HTTP URIs denote the
> entities, relationships, and relations in the discourse domain. To the
> participants in such communications the HTTP URIs will be tucked behind
> HTML anchor tags.    (04)

I agree that you need a precise identifier to locate something
that you need to access on the WWW.    (05)

But the issue that Hans and Ed were discussing is the problem of
detecting the context or other information needed to determine
the exact sense of a word in an ordinary language text.    (06)

Studies of inter-annotator agreement among well-trained humans
show that 95% agreement is very rarely achieved.  The best
computer systems typically achieve only about 75% accuracy.    (07)

If you have a page with 300 words, you have about 15 errors
in the best cases (95% accuracy).  With the typical accuracy
of 75%, you would get about 75 errors in a 300-word page.    (08)
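
As a quick check on that arithmetic, here is a minimal Python sketch.
It assumes that every word on the page receives a sense tag and that
each tag is wrong with the same probability, which is a simplification.
The function name expected_errors is purely illustrative.

```python
def expected_errors(word_count: int, accuracy: float) -> int:
    """Expected number of mis-tagged words on a page.

    Assumes every word is tagged and that each tag is wrong
    with probability (1 - accuracy).
    """
    return round(word_count * (1.0 - accuracy))


PAGE_WORDS = 300  # page length used in the message

# Accuracy figures taken from the message: 95% best case, 75% typical.
for label, accuracy in [("best case", 0.95), ("typical", 0.75)]:
    errors = expected_errors(PAGE_WORDS, accuracy)
    print(f"{label} ({accuracy:.0%} accuracy): about {errors} errors")
```

In practice fewer than 300 words per page would be tagged, since
function words are usually skipped, so these counts are upper-end
estimates under the stated assumption.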

What this implies is that if you want to process documents written
in ordinary language, it's better to use the raw text designed
for human consumption than annotated text that some human or
computer had marked up with IRIs for word senses.    (09)

Note that the IBM Watson system for Jeopardy! used a large number
of different algorithms, each of which independently generated
candidate answers.  Then it used a kind of learning method to
estimate which of the many candidate answers was best.  And they
ran the system on a supercomputer with 2880 cores.    (010)

John    (011)
