Re: [ontolog-forum] What is "understanding" - was: Building on common gr

Date: Sun, 30 Mar 2008 23:11:39 -0400
Gary,    (01)

     The internet is full of pointers to relevant research, and I try to
read as many of them as it makes sense to spend time on.  Is there any that you
think displays a system that shows promise of leading to human-level
language understanding?  A very specific pointer to that would be
appreciated.  Do you know of any research that bears *directly* on the
question of the size of the Conceptual Defining Vocabulary?    I am already
aware of the work of Bruce Porter in Texas, which has some relevance to the
Conceptual Defining Vocabulary.  Katz's START system at MIT
(http://start.csail.mit.edu/) is actually quite impressive, but a few
experiments should convince one that even that superb group's system is
still far from what one might term 'understanding'.  (Ask 'where was Bill
Clinton born?' and get a direct, spot-on response.  Ask 'Why was Bill
Clinton born' and you get an 'I don't know'.  Ask 'does START use an
ontology?'  Guess the answer.  Great fun, posing questions to that system.)    (02)

PatH alludes to the Powerset project to create a parse of the internet and
compete with Google on search.  When I spoke to the managers of that project
last year, they were considering building an ontology to supplement WordNet
as the semantic representation in addition to their parse structures.  But
they decided at that time to focus on the parsing, to concentrate resources
where it would produce usable results most quickly.  I very much sympathize
with the need to focus in commercial projects; they need to survive before
they can evolve.  I don't know the current state of their research
directions.  And so it goes on with most groups - they need to focus on what
seems to be near-term useful results because that is where the funding is.
If longer-term results were high on the list of funders' concerns PatH would
be far too busy with multiple projects to reply to these discussions - he
has numerous good ideas on how to structure and build useful artifacts for
representation and reasoning.    (03)

I expect that we will hear more detail from Prof. Dr. Bateman on the status
of his own research, to which I shall attend with great interest.    (04)

If anyone has specific pointers to other projects that show genuine promise
for human-level language understanding, I will appreciate *specific*
pointers to the page that allows one to get an understanding of that
system's capabilities.    (05)

If you enjoy following dead-end pointers (I don't) try googling "human-level
language understanding" and see if you can come up with a single working
system that actually displays results suggesting that they are getting close
to what you would consider "understanding" of language.  Good luck.    (06)

Pat    (07)

PatC:    (014)

I think that there have been a number of useful suggestions pointing to
relevant research (e.g. John bremen's observations in the message below,
along with John Sowa's and Pat Hayes') that should be built on.  So perhaps
rather than continuing to discuss this as largely another effort one should
first make clear what can be leveraged from prior work and what specifically
are the unanswered issues in this work that you have an approach to.    (015)

PatC:    (020)

OK, before I was sniping, I admit it (and still a smiley face:
at least for the ontology building effort!), but I must
take exception to trivialisations of language and its
treatment (being a linguist), just as other members of this
list (quite rightfully) take exception to misuses and misrepresentations
of logic and logical formalisation.    (021)

Yes, but it is even less than a sketch because I am still trying to
construct an outline of individual components that need to work together.
There is nothing particularly original in the methods I would like to try.
It is basically to revisit the old notion of "word experts" but now with
much more powerful computers than were used back in the early trials, and to
integrate an ontology that has structures that come as close as possible to
at least the most common English-language structures.  Think of it as
"extreme lexicalization" of the grammar.
<<<<<<<<<<<<<<<<<<    (022)

Given that NLP, including NLU, is now a huge area with multimillion
dollar investments across both research and industry I find a
statement of the form "I am still trying to construct an outline of
the individual components that need to work together" curious:
*WHY*? There are N architectures out there already, being developed
and tested in applications from coffee machines to everything
in the not-so-semantic web. All current large-scale analysis
components use lexicalised grammars, many have lexical semantics
of various kinds, there are many mechanisms for running these
kinds of things in half-efficient ways, and all that is just to
get going so that one can start *thinking of* addressing the
interesting problems!    (023)

Whether that will port to other
languages I have no idea.
<<<<<<<<<<<<<<<<<<<<    (024)

Lexicalised grammars are used for all languages.
Multilingual information extraction is an established area.    (025)

The virtue, in my opinion, of the "Word Expert" approach is that it can be
very modular - one verb, one program - and therefore could be built by a
very large collaborative effort, provided that there is a common ontology in
which the meanings are represented.  One big difficulty is in the great
<<<<<<<<<<<<<<<<<<<<    (026)

Which is why one does not just have words lying around but also
has to look at the 'grammar' part of the 'lexicalised grammar'
term. This is where non-linguistic approaches typically start
unravelling to resemble a mixture of patchwork and hacks.
One possible method: adopt a combinatory categorial grammar
(CCG) (like we do :-), make sure you take one that delivers
hybrid dependency logic semantics as an output, and start fitting the
semantic types of the lexicon to the COSMO types. Parsing complexity
is O(n^6) (mildly context sensitive) and CCG is the closest
thing to 'word experts' that is still formally attractive,
gives you a way of getting at language and other resources
that people are developing, and allows you to focus on the
additional benefits and difficulties of the common defining
vocabulary. Given your repeated statement about lack of
time to work on these things, at least *that* degree of
focus would be useful I'd've thought rather than redesigning
wheels. It would also move everything towards evaluable
prototypes much quicker since so many mechanisms
would already be in place. This is *already* a
"very large collaborative effort", so why not join in?    (027)

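To make the CCG suggestion above a little more concrete, here is a minimal
sketch in Python of a lexicalised grammar in which each word carries a
syntactic category plus a semantic type label. Everything in it is an
assumption made for illustration: the toy lexicon, the type labels
("Person", "Beverage", "IngestionEvent") standing in for ontology types, and
the function names. It implements only forward and backward application over
a CKY-style chart, not composition or type-raising, so it is far simpler than
a full CCG parser and says nothing about the O(n^6) bound.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """Atomic category such as S or NP."""
    name: str


@dataclass(frozen=True)
class Slash:
    """Functor category: result/arg looks right, result\\arg looks left."""
    result: "Cat"
    direction: str  # "/" or "\\"
    arg: "Cat"


Cat = Union[Atom, Slash]
S, NP = Atom("S"), Atom("NP")

# Hypothetical lexicon: word -> list of (category, illustrative type label).
LEXICON = {
    "Mary": [(NP, "Person")],
    "coffee": [(NP, "Beverage")],
    "drinks": [(Slash(Slash(S, "\\", NP), "/", NP), "IngestionEvent")],
}


def combine(left, right):
    """Return the categories derivable from two adjacent constituents."""
    results = []
    # Forward application: X/Y  Y  =>  X
    if isinstance(left, Slash) and left.direction == "/" and left.arg == right:
        results.append(left.result)
    # Backward application: Y  X\Y  =>  X
    if isinstance(right, Slash) and right.direction == "\\" and right.arg == left:
        results.append(right.result)
    return results


def parse(words):
    """Fill a CKY-style chart and return the categories spanning all words."""
    n = len(words)
    if n == 0:
        return set()
    chart = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = {cat for cat, _ in LEXICON.get(word, [])}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = set()
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        cell.update(combine(left, right))
            chart[(i, j)] = cell
    return chart[(0, n)]


if __name__ == "__main__":
    print(S in parse("Mary drinks coffee".split()))
```

The point of the sketch is only that the combinatory rules stay fixed while
all language-specific knowledge sits in the lexicon entries, which is where
the semantic types would be aligned with an ontology.
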
Other modules such as image processing, speech
recognition, graphic-oriented reasoning, and robotic functions would be able
to communicate using the same foundation ontology.
<<<<<<<<<<<<<<<<<<<<    (028)

Another multimillion dollar investment R&D area. This has been
set out in SmartKom and the K-Space network of excellence (EU)
and there are substantial ontological investments here. Again:
integrative work on the basis and with those ongoing efforts
is essential.    (029)

There are plenty of problems for which I have no meaningful proposal,
<<<<<<<<<<<<<<<<<<<    (030)

If we all had to have meaningful proposals for every aspect of this
huge area we work in, we'd have problems.    (031)

But, getting back to ontology, we will take a look at COSMO and
give feedback: particularly on its manner of combining info from
other sources and the effects that has on the consistency of the
whole. We will be converting it to CASL to do proper structuring
though.    (032)

John.    (033)

