Thanks John, that is very interesting. So would you say that the
semantics is more in terms of taxonomic hierarchies (I'm thinking
of something like Quillian's 1968 semantic network concept here
as well) rather than quite so much formally axiomatized ontology,
or something of both?
On 16/02/2011 16:29, John F. Sowa wrote:
> Mike, Krzysztof, Pavithra, Doug, and Christopher,
> I'd like to make a few comments, starting with Mike's point:
>> From what little I've seen it does rather seem to be a triumph
>> of statistics over semantics.
> Statistics certainly plays a big role, but the relative role
> depends on what you mean by 'semantics'. I would say that Watson
> represents a triumph of a 1980s style of NLP, and I'd summarize
> the influences in one line:
> Michael McCord + Roger Schank + statistics + supercomputer
> For McCord's influence, I'll quote the following passage from the
> article in AI Magazine:
> Page 11, http://www.stanford.edu/class/cs124/AIMagzine-DeepQA.pdf
>> The DeepQA approach encourages a mixture of experts at this
>> stage, and in the Watson system we produce shallow parses,
>> deep parses (McCord 1990), logical forms, semantic role labels,
>> coreference, relations, named entities, and so on, as well as
>> specific kinds of analysis for question answering.
> In the 1980s, McCord had written an excellent parser in Prolog
> with a good grammar of English. In the 1990s, he rewrote the
> parser in C for better performance. Apparently, that's the
> "deep" parser they use for Watson.
> Also in the 1980s, Roger Schank made the claim that axioms in
> logic are irrelevant, but large volumes of background knowledge
> are essential for language understanding. He also said that
> machine learning is essential for NLP, since every text says
> something new that must be added to the background knowledge.
> Unfortunately, the technology at the time was too slow, and
> the available resources of machine-readable texts were woefully
> inadequate to support Schank's claims.
> Watson also uses ideas developed in the 1990s and later, but
> one could say that the Watson-style of semantics is a high-speed
> implementation of a Schankian style of NLP. The crucial technology
> that Schank did not have is a supercomputer plus huge volumes of
> preprocessed material indexed and accessible via a relational DB.
> I certainly won't downplay the importance of statistics, which
> are essential for many aspects of Watson: evaluation of what
> is relevant, estimating the confidence in an answer, and most
> especially for techniques of machine learning. But learning
> is also one of the aspects that Schank emphasized years ago.
> In short, Roger Schank's emphasis on informal methods of
> processing and using large volumes of background knowledge,
> case-based reasoning, and machine learning is much closer
> to what Watson is doing than any logic-based method --
> either Richard Montague's formal logic for NLP or Lenat's
> enormous formal ontology for Cyc.
> Watson does use some logic, however. It's just not the
> main focus. Statistics and heuristics are more important.
>> As Watson turns out to be very successful this leads again to the
>> question which role ontological categories play...
> Watson does use WordNet and other lexical resources. That is
> important for selectional constraints on permissible combinations,
> but WordNet and similar resources have very few axioms.
>> Too bad Watson could not do any Google search..
>> and get better answers..
>> This was not the problem.
>> Watson has Wikipedia in its memory. The answers to the specified
>> question are in Wikipedia: Toronto being a Canadian city, Chicago
>> being a US city, Chicago having O'Hare and Midway as airports,
>> O'Hare being named after a WW II hero (flying ace), Midway being the
>> name of a WW II battle.
> I strongly agree with Doug. Watson has predigested 15 terabytes
> of background knowledge, which is tailored and indexed for its own
> representations. Google's search methods are much less precise,
> and they only return entire documents, which Watson would have
> to spend too much time to read and analyze before it could answer
> a Jeopardy question.
>> Question answering by a machine such as Watson is not very useful unless
>> the system can explain its answers. Thus I am disappointed that IBM's
>> "explanation" of its error in Final Jeopardy does not explain why it
>> chose its answer.
> I agree. But the immediate task for the Jeopardy challenge did not
> require explanations. Much more would have to be added, revised, and
> extended before the Watson technology could be applied to other domains.
> From what I've read about Watson, I think that it could keep a backchain
> of steps that lead to any conclusion. Keeping the entire derivation
> tree of all rejected steps would be too voluminous. But Watson could
> keep the supporting information for each step of the main thread. It
> could also keep a summary of each rejected option leading away from
> the main thread.
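
The bookkeeping described in the paragraph above (full supporting
information for each step of the main thread, only a summary for each
rejected option) can be sketched roughly as follows. This is not how
Watson or VivoMind stores derivations; the names Step, backchain and
explain, and the source labels, are hypothetical and chosen only for
illustration. The example facts are the ones quoted earlier in this
message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One accepted inference step on the main thread (hypothetical type)."""
    conclusion: str
    sources: List[str]                  # supporting information, kept in full
    parent: Optional["Step"] = None     # previous step on the main thread
    rejected: List[str] = field(default_factory=list)  # one-line summaries only


def backchain(step: Optional[Step]) -> List[Step]:
    """Follow parent links from the final answer back to the first step."""
    chain = []
    while step is not None:
        chain.append(step)
        step = step.parent
    return list(reversed(chain))


def explain(answer: Step) -> str:
    """Render the main thread, with rejected options listed under each step."""
    lines = []
    for i, s in enumerate(backchain(answer), start=1):
        lines.append(f"{i}. {s.conclusion} [sources: {', '.join(s.sources)}]")
        for r in s.rejected:
            lines.append(f"   rejected: {r}")
    return "\n".join(lines)


# Source labels below are placeholders, not real document identifiers.
s1 = Step("O'Hare is named after a WW II hero", ["placeholder-doc-1"])
s2 = Step("Midway is the name of a WW II battle", ["placeholder-doc-2"], parent=s1)
s3 = Step("Chicago has O'Hare and Midway as airports", ["placeholder-doc-3"],
          parent=s2, rejected=["Toronto: a Canadian city, not a U.S. city"])

print(explain(s3))
```

Storing only a parent link per step keeps the record linear in the
length of the main thread, which is the point of not retaining the
entire tree of rejected derivations.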
> Our VivoMind system, which also reads untagged English documents
> in order to answer users' queries, maintains the backchain of
> derivations, and it can display the sources from which any answer
> was derived. See slides 32 to 40 of the following presentation:
> Slide 41 discusses how that technology could be adapted to
> diagnosing cancer patients.
>> One exchange [with Ken Jennings] that seems very relevant to
>> this Forum is this one:
>> Q: It was interesting to me that the "Which decade" category seemed
>> especially hard for Watson. Why was that?
>> A: I think it took it a little while to figure out that the answers
>> would all be decades. This seems incredible to a human playing
>> along at home, but basic contextual issues like this are incredibly
>> hard for machine intelligence to master. The cool thing about
>> Watson is it learns from those mistakes. By the end of the
>> category, it had learned that the answers were going to be decades,
>> and adjusted accordingly.
> Ferrucci explained the design decision for Watson to avoid using the
> category names to evaluate answers because many Jeopardy categories
> have misleading puns and wordplay. One example was the category named
> "Church and State" for which the answer was "Christchurch, New Zealand."
> Watson got that right, and the category name could have been confusing.
> However, some category names are very relevant -- "Which decade" and
> "U.S. Cities", for example. The humans Ken and Brad correctly used
> that information, but Watson made serious blunders by ignoring it.
> This is an example for which the Watson developers deliberately
> ignored important information. They should have used machine
> learning to let Watson analyze whether or not a particular kind
> of category name might be more useful or more distracting.
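
As a rough illustration of the suggestion above, the simplest stand-in
for such learning is a smoothed frequency estimate of how often a given
kind of category name agreed with the correct answer. This is a sketch
under assumptions, not anything the Watson team is known to have built:
the class CategoryHintTracker and the labels "literal" and "pun" are
hypothetical, and deciding which kind a category name belongs to is
left open.

```python
from collections import defaultdict


class CategoryHintTracker:
    """Tracks, per kind of category name, how often treating the name
    as an answer-type hint agreed with the correct answer."""

    def __init__(self) -> None:
        self.helped = defaultdict(int)
        self.seen = defaultdict(int)

    def record(self, kind: str, hint_matched_answer: bool) -> None:
        self.seen[kind] += 1
        if hint_matched_answer:
            self.helped[kind] += 1

    def usefulness(self, kind: str) -> float:
        # Laplace-smoothed estimate; a kind never seen before scores 0.5.
        return (self.helped[kind] + 1) / (self.seen[kind] + 2)


tracker = CategoryHintTracker()
for matched in (True, True, True):      # e.g. names like "U.S. Cities"
    tracker.record("literal", matched)
for matched in (False, False):          # e.g. names like "Church and State"
    tracker.record("pun", matched)

print(tracker.usefulness("literal"))    # (3 + 1) / (3 + 2)
print(tracker.usefulness("pun"))        # (0 + 1) / (2 + 2)
```

A system could then weight the category name by this score instead of
ignoring it outright.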
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
> To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
89 Worship Street
London EC2A 2BF
Tel: +44 (0) 20 7917 9522
Mob: +44 (0) 7721 420 730
Registered in England and Wales No. 2461068