
Re: [ontolog-forum] IBM Watson on Jeopardy

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Peter Yim <peter.yim@xxxxxxxx>
Date: Thu, 10 Feb 2011 08:47:37 -0800
Message-id: <AANLkTik8_8p-BBK=56WY4QYO8OdgLvZvGqARisevtOWV@xxxxxxxxxxxxxx>
Thanks, John.

> [JFS]  Compared to Ferrucci's talks, the PBS Nova program was
> a disappointment. ...

[ppy]  I watched the PBS program last night, and I think it was *really*
well done (considering the audience they are airing it for). The only
disappointment (to me, at least) is that the "O" word did not appear
even once.  :-)

Congratulations, DaveFerrucci and everyone on your team! ... and
ChrisWelty, we saw you there too!

Best regards.  =ppy
--


On Thu, Feb 10, 2011 at 8:27 AM, John F. Sowa <sowa@xxxxxxxxxxx> wrote:
> Peter,
>
> Thanks for the reminder:
>
>> Dave Ferrucci gave a talk on UIMA (the Unstructured Information
>> Management Architecture) back in May-2006, entitled: "Putting the
>> Semantics in the Semantic Web: An overview of UIMA and its role in
>> Accelerating the Semantic Revolution"
>
> I recommend that readers compare Ferrucci's talk about UIMA in
> 2006 with his talk about the Watson system and Jeopardy in 2011.
> In less than 5 years, they built Watson on the UIMA foundation,
> which contained a reasonable amount of NLP tools, a modest ontology,
> and some useful tools for knowledge acquisition.  During that time,
> they added quite a bit of machine learning, reasoning, statistics,
> and heuristics.  But most of all, they added terabytes of documents.
>
> For the record, following are Ferrucci's slides from 2006:
>
> http://ontolog.cim3.net/file/resource/presentation/DavidFerrucci_20060511/UIMA-SemanticWeb--DavidFerrucci_20060511.pdf
>
> Following is the talk that explains the slides:
>
> http://ontolog.cim3.net/file/resource/presentation/DavidFerrucci_20060511/UIMA-SemanticWeb--DavidFerrucci_20060511_Recording-2914992-460237.mp3
>
> And following is his recent talk about the DeepQA project for
> building and extending that foundation for Jeopardy:
>
> http://www-943.ibm.com/innovation/us/watson/watson-for-a-smarter-planet/building-a-jeopardy-champion/how-watson-works.html
>
> Compared to Ferrucci's talks, the PBS Nova program was a disappointment.
> It didn't get into any technical detail, but it did have a few cameo
> appearances from AI researchers.  Terry Winograd and Pat Winston,
> for example, said that the problem of language understanding is hard.
>
> But I thought that Marvin Minsky and Doug Lenat said more with their
> tone of voice than with their words.  My interpretation (which could,
> of course, be wrong) is that both of them were seething with jealousy
> that IBM built a system that was competing with Jeopardy champions
> on national TV -- and without their help.
>
> In any case, the Watson project shows that terabytes of documents are
> far more important for commonsense reasoning than the millions of
> formal axioms in Cyc.  That does not mean that the Cyc ontology is
> useless, but it undermines the original assumptions for the Cyc
> project:  commonsense reasoning requires a huge knowledge base
> of hand-coded axioms together with a powerful inference engine.
>
> An important observation by Ferrucci:  The URIs of the Semantic Web
> are *not* useful for processing natural languages -- not for ordinary
> documents, not for scientific documents, and especially not for
> Jeopardy questions:
>
>  1. For scientific documents, words like 'H2O' are excellent URIs.
>     Adding an http address in front of them is pointless.
>
>  2. A word like 'water', which is sometimes a synonym for 'H2O',
>     has an open-ended number of senses and microsenses.
>
>  3. Even if every microsense could be precisely defined and
>     cataloged on the WWW, that wouldn't help determine which
>     one is appropriate for any particular context.
>
>  4. Any attempt to force human beings to specify or select
>     a precise sense cannot succeed unless *every* human
>     understands and consistently selects the correct sense
>     on *every* possible occasion.
>
>  5. Given that point #4 is impossible to enforce and dangerous
>     to assume, any software that uses URIs will have to verify
>     that the selected sense is appropriate to the context.
>
>  6. Therefore, URIs found "in the wild" on the WWW can never
>     be assumed to be correct unless they have been guaranteed
>     to be correct by a trusted source.
>
> These points taken together imply that annotations on documents
> can't be trusted unless (a) they have been generated by your
> own system or (b) they were generated by a system which is at
> least as trustworthy as your own and which has been verified
> to be 100% compatible with yours.
>
> In summary, the underlying assumptions for both Cyc and
> the Semantic Web need to be reconsidered.
>
> John
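
The trust rule in conditions (a) and (b) above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the names
`Annotation`, `OWN_SYSTEM`, `VERIFIED_COMPATIBLE`, `is_trusted`, and
`needs_verification` are hypothetical and do not come from the message, from
UIMA, or from any Semantic Web library. The sketch accepts an annotation as-is
only when its source is the local system or a source already verified as
compatible, and flags everything else so that its URI can be re-checked
against the context.

```python
from dataclasses import dataclass

# Hypothetical identifiers, chosen for illustration only.
OWN_SYSTEM = "local-annotator"
VERIFIED_COMPATIBLE = {"partner-annotator"}  # sources verified as compatible with ours


@dataclass(frozen=True)
class Annotation:
    term: str    # the word as it appears in the document, e.g. "water"
    uri: str     # the sense the annotator claims for that word
    source: str  # which system produced the annotation


def is_trusted(annotation: Annotation) -> bool:
    """Condition (a) or (b): our own system, or a verified compatible one."""
    return (
        annotation.source == OWN_SYSTEM
        or annotation.source in VERIFIED_COMPATIBLE
    )


def needs_verification(annotations):
    """Return the annotations whose sense must be re-checked before use."""
    return [a for a in annotations if not is_trusted(a)]
```

For example, given three annotations of 'water' from `local-annotator`,
`partner-annotator`, and an unknown crawler, only the crawler's annotation is
returned by `needs_verification`. How the re-check against context would be
done is the hard part and is deliberately left out of the sketch.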

