
Re: [ontolog-forum] What is "understanding" - was: Building on common ground

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Patrick Cassidy" <pat@xxxxxxxxx>
Date: Mon, 31 Mar 2008 12:36:05 -0400
Message-id: <058501c8934d$50c6cd70$f2546850$@com>

It’s a pleasure to hear from Sergei on this list.  His group is one of the few left in the US that is using an ontology for NLP.

I find no serious disagreement with any of his points, but would like to clarify what may be some misunderstandings.

 

[[[1]]] [SN] > First, our main problem is well beyond choosing the

> "right" knowledge representation schema: it is all about 

> the content of knowledge, not the format. 

Yes, yes.  The question of finding a common foundation ontology that is structured as a Conceptual Defining Vocabulary is precisely to allow representation of content using a common specification of meaning, so that the meanings of information created by separate groups can be automatically translated and interpreted by any system using the common foundation ontology.  This *is* an issue of content.  Whether different formats can be precisely translated depends on levels of expressiveness (assuming logical consistency of the content).
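The interoperability idea here can be made concrete with a minimal sketch. Everything in it is a hypothetical illustration (the concept and term names are invented, not taken from COSMO): two groups keep their own vocabularies but map each local term to a shared foundation concept, and translation pivots through that common concept.

```python
# Minimal sketch: two systems with different local vocabularies
# interoperate by mapping each local term to a shared foundation
# ontology concept. All names here are invented illustrations,
# not actual COSMO content.

FOUNDATION = {"Human", "Vehicle", "PurchaseEvent"}  # common concepts

# Each group maintains its own mapping into the common ontology.
SYSTEM_A = {"person": "Human", "car": "Vehicle", "sale": "PurchaseEvent"}
SYSTEM_B = {"individual": "Human", "automobile": "Vehicle", "buy": "PurchaseEvent"}

def translate(term, source, target):
    """Translate a source-system term into the target system's term
    by pivoting through the shared foundation concept."""
    concept = source[term]
    assert concept in FOUNDATION  # both mappings ground out in common concepts
    for t, c in target.items():
        if c == concept:
            return t
    raise KeyError(concept)

print(translate("car", SYSTEM_A, SYSTEM_B))  # -> automobile
```

The point of the sketch is that neither group needs to know the other's vocabulary; each only commits to the common foundation, so adding an N-th system requires one mapping, not N-1 pairwise ones.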

 

[[[2]]] [SN] > So, alas, we should set our goals in a way that is somehow

> commensurate with realistic expectations.

Yes.  That is why my immediate focus is on one small bite of that ten-foot submarine; to try to determine if the Conceptual Defining Vocabulary (foundation ontology) will expand slowly enough as new domains are added that it will be sufficiently stable to serve as a standard of meaning.

 

[[[3]]] [SN] > This is one of the reasons why I think that no 

> standards could really be enforced in this area and

> that it may be a noble but doomed task to try to

> come up with a single common syntax and semantics for

> the metalanguage for specifying knowledge about

> language and the world (whether these are different

> metalanguages or a single one).

  Probably.  But we do not need universal acceptance to build a useful foundation ontology.  It only has to be used by a sufficiently large community of research groups to provide the common standard of meaning that facilitates reuse of research results among many groups, thereby serving as a common paradigm within which incremental improvements in application components can cumulatively develop into powerful systems.  At present, improvements tend to occur only within individual research teams or small clusters of them, and reuse between them is very inefficient.  There are some parts of the problem (e.g. reasoning methods) where there is a high level of reuse, but much of NLP tends to evolve as local systems.  To be sure, some of these have become impressive, but they still have a long way to go before approaching human performance.  I think that a common foundation ontology would be a powerful tool for NLU research, however parsimoniously it is funded.

 

All suggestions for changes or additions to the COSMO are welcome, including pointers to a full dump of any foundation ontology that anyone thinks is useful.

 

Pat

 

Patrick Cassidy

MICRA, Inc.

908-561-3416

cell: 908-565-4053

cassidy@xxxxxxxxx

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Sergei Nirenburg
Sent: Monday, March 31, 2008 11:09 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] What is "understanding" - was: Building on common ground

 

John -

 

I am very happy to see a public discussion of issues that

I have been studying for many years. It feels, in part,

that we are back in 1982. Of course, we have learned

a lot since then, as a field. 

 

I think that the core of the new understanding can be 

reduced to two high-level points.

 

First, our main problem is well beyond choosing the

"right" knowledge representation schema: it is all about 

the content of knowledge, not the format. 

 

Second, the scope of work is so much broader than even

the most sober among us had expected. 

 

A few years ago I compared the outlays for the

Manhattan project, the Human Genome project

and NLP. The statistics were not complete,  I am

sure, but the trend was clear: our work has been

funded at a small fraction of those projects.

It is societally understandable, of course. But the

bad news is that I think our problem is more

complex than either of those problems...

 

So, alas, we should set our goals in a way that is somehow

commensurate with realistic expectations. The bad news

here is that there will be no instant gratification on a

grand scale. Not for our generation. 

 

As to knowledge acquisition work, it is thankless:

while it can be intellectually quite demanding, 

one can't defend a dissertation in it, so students

naturally prefer either working on formalisms or 

on theorem provers or on statistics-oriented experiments. 

 

This is one of the reasons why I think that no 

standards could really be enforced in this area and

that it may be a noble but doomed task to try to

come up with a single common syntax and semantics for

the metalanguage for specifying knowledge about

language and the world (whether these are different

metalanguages or a single one).

 

I'll also make some comments inline.

 

On Mar 30, 2008, at 11:26 PM, John F. Sowa wrote:



Pat C. and John B.,

 

I'll accept parts of what both of you are saying, but with

many qualifications.  Before getting to the qualifications,

I'll quote an example I've used before.

 

Following are four sentences that use the same verb in

a similar syntactic pattern, but with very different,

highly domain-dependent senses:

 

  1. Tom supported the tomato plant with a stick.

  2. Tom supported his daughter with $20,000 per year.

  3. Tom supported his father with a decisive argument.

  4. Tom supported his partner with a bid of 3 spades.

<...>

 

Making those choices requires quite a bit of background knowledge,

and it's definitely nontrivial with current technology.  But the

next question is what to do with that choice.  It might be useful

in machine translation for picking the correct verb in some target

language.  Perhaps a statistical translator with enough data could

do so.  But could that be called "understanding"?

 

Well, there's no problem with calling this a modicum of understanding,

not complete understanding...

 

Suppose we had a word-expert analyzer with an enormous amount

of information about each verb.  Would that be the best way to

organize the knowledge base?  Would you put some knowledge about

bridge or tomatoes into some rules for each verb, noun, and

adjective that might refer to bridge or to tomatoes?  Or would

it be better to put all the knowledge about bridge in a module

that deals with bridge and all the knowledge about tomatoes in

a module that deals with tomatoes?

 

Any which way. Let it even be inefficient. But we need multiply

cross-indexed descriptions of complex events with their subevents

and participants, pre- and post-conditions and other properties.

 

We are building such entities for a few projects we are working

on, and, of course, it is a slow and painful task with lots of corrections

etc. 

 

With either way of organizing the knowledge -- by words or

by subject matter -- how would you relate the lexical info

about each word to the ontology and to the background

knowledge about how to play bridge or work in the garden?

 

The organization in our approach is by (ontological) elements of world knowledge

but our lexicon expresses lexical meaning in terms of the

ontological metalanguage (there are exceptions, but talking

about them is well beyond the grain size of this message). So, in 

the example above, there will be in the ontology the event 

describing what happens when people play bridge, and there

will be indications in the lexicon of any idiosyncratic word

and phrase senses relating to bridge playing. Many meanings

will still be derived in a compositional way, with the knowledge

of the complex event of playing bridge serving as a (core)

heuristic for making preferences during ambiguity resolution.
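The arrangement described in the last two paragraphs can be sketched in a few lines. This is a toy, not the actual system: the sense labels, domain names, and cue words are all invented for illustration. Lexical senses are stated in terms of ontological concepts, and knowledge of a complex event (e.g., playing bridge, with its participants like bids and partners) supplies the preference heuristic during ambiguity resolution.

```python
# Toy sketch of ontology-driven word-sense disambiguation for "support".
# All sense, domain, and cue names are invented for illustration.

LEXICON = {
    "support": [
        {"sense": "PHYSICAL-SUPPORT", "domains": {"GARDENING"}},
        {"sense": "FINANCIAL-SUPPORT", "domains": {"FAMILY-FINANCE"}},
        {"sense": "ARGUMENT-SUPPORT", "domains": {"DEBATE"}},
        {"sense": "BID-SUPPORT", "domains": {"BRIDGE-GAME"}},
    ]
}

# Ontological complex events index the participants that make them salient.
COMPLEX_EVENTS = {
    "BRIDGE-GAME": {"bid", "spades", "partner"},
    "GARDENING": {"tomato", "plant", "stick"},
    "FAMILY-FINANCE": {"daughter", "income", "year"},
    "DEBATE": {"argument", "father"},
}

def active_domains(context_words):
    """A domain is active when participants of its complex event
    appear in the sentence context."""
    words = set(context_words)
    return {d for d, cues in COMPLEX_EVENTS.items() if cues & words}

def disambiguate(verb, context_words):
    """Prefer the lexical sense whose domain matches an active
    complex event; otherwise fall back to compositional analysis."""
    domains = active_domains(context_words)
    for entry in LEXICON[verb]:
        if entry["domains"] & domains:
            return entry["sense"]
    return None

print(disambiguate("support", ["partner", "bid", "spades"]))  # -> BID-SUPPORT
```

For sentence 4 in John's example ("Tom supported his partner with a bid of 3 spades"), the bridge complex event is activated by "bid", "spades", and "partner", so the bidding sense of "support" is preferred; the same machinery picks the physical sense for the tomato-plant sentence.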

 

BTW, there are many more kinds of ambiguity to deal with

in addition to word sense or PP attachment (to name a couple

that have been in the center of the field's attention): scope

ambiguities, referential ambiguities, semantic dependency 

ambiguity, non-literal language-related ambiguity, etc.

 

If you intend to use logic, how much logic would be needed

for those sentences?  What would a theorem prover do to aid

understanding?  Could proving some theorem about tomatoes be

considered understanding?  

 

A theorem prover can be adapted to drive

the ambiguity resolution process.

 

However, the main issue is that we need to be able to

make successful inferences against a knowledge base that is

neither sound nor complete. That's reality. So, if logic can

come up with methods that support such a task, great. 

Otherwise, we scruffies will have to make do with whatever

we can muster.
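One way to read the "scruffy" alternative is evidence-weighted interpretation selection: instead of demanding a strict proof from a sound and complete KB, score each candidate interpretation by how much of its required evidence the KB actually contains, and accept the best-supported one even when facts are missing. A minimal sketch, with wholly invented facts:

```python
# Hedged sketch of inference over an incomplete KB: candidate
# interpretations are scored by partial evidence rather than proved
# outright. All fact names are invented for illustration.

KB = {"bridge-involves-bidding", "bidding-uses-spades"}  # incomplete KB

CANDIDATES = {
    "card-game-reading": {"bridge-involves-bidding", "bidding-uses-spades"},
    "structure-reading": {"bridge-spans-river", "bridge-carries-traffic"},
}

def score(evidence_needed, kb):
    """Fraction of the required evidence actually present in the KB."""
    return len(evidence_needed & kb) / len(evidence_needed)

def best_interpretation(candidates, kb):
    """Pick the interpretation best supported by available evidence,
    without requiring a complete proof."""
    return max(candidates, key=lambda c: score(candidates[c], kb))

print(best_interpretation(CANDIDATES, KB))  # -> card-game-reading
```

A theorem prover could still drive the search for supporting facts; the difference is that failure to prove everything demotes a candidate instead of eliminating it.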

 

Is it likely that a bunch of people (similar to the Wikipedians)

would be willing and able to enter the kinds of knowledge in the

kinds of formats necessary for a system that understands?

 

I think that hoping that something like this will be done by

enthusiasts is, err, premature. As the saying goes, you get

what you pay for (and this is not - entirely - a cynical view :-) ).

 

Realistically, though, the field doesn't have the funding for this kind of effort.

 

The Cyc project has been paying professional knowledge engineers

to enter such knowledge into their system for the past 22 years.

They had two million axioms in 2004, but Cyc still can't read

a book in order to build up its knowledge base.  How much more

would be needed?  Or is there some better way?  What way?

 

As far as I know, many of the Cyc axioms are actually facts (e.g., knowledge about

Austin, TX), not concepts  (knowledge about cities in general). Also, it

is instructive that they seem to be using the knowledge base

only for statistical NLP (my information may be wrong here, though!).

 

As for reading a book, we have started a project that uses our

current, limited, ontology/lexicon/grammar/preprocessing resources

to extract knowledge of unknown concepts/words from the web.

But it's only scratching the surface, however exciting the project may be...

 

There is much, much more that can be said, of course. Would be

nice to talk about this, not e-mail...

 

Sergei



 

John Sowa

 


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
