John,
All your questions are legitimate, given the sentiments that I expressed
in answer to the question of whether I have in mind some outline of an
NLU system that might approach human performance. But as I tried to
emphasize, the notion of an NLU architecture at this point merely
provides *one* image of how a Conceptual Defining Vocabulary could help
in achieving human-level performance, and thereby provides a motivation
to pursue the CDV effort. It is in no way intended to disparage the
excellent work that is being done on Natural Language by many competent
groups, among whom yours is, in my estimation, at or near the top in
likely productivity. The issue I am trying to focus on is how to
structure a foundation ontology that can gain wide enough usage to serve
as a standard of meaning, enabling more efficient reuse of the excellent
NLU work that is going on in many different groups.
I have a varying but generally shallow acquaintance with the different
grammatical formalisms that are being explored (the last time I actually
built a parser was over 20 years ago, just before I began trying to find
a semantic dictionary; after that I spent a lot of time trying to figure
out how to build such a dictionary, and now I am concerned with how to
encourage the research community all to use the *same* semantic
dictionary - WordNet is inadequate). On such shallow acquaintance, CCG
seems to have at least as good a chance of developing deep meaning
structures as any other formalism I am aware of.
But once again, my focus at this point is not on the grammatical
analysis of text, but on how to develop a foundation ontology with a
structure that can accommodate multiple viewpoints and serve as a
standard (one paradigm of meaning), enabling more efficient reuse of
programs that can reason with ontology-based representations. As I
mentioned in my "outline", having a common foundation ontology as a
means to represent meanings should permit one or more grammatical and
semantic analyses to be performed on a sentence. The formal
grammatical/semantic analysis would provide one or more candidate
interpretations, which could then be evaluated for likely correctness by
the "experts"; a rough sketch of the intended pipeline follows below.
"Word Experts" are one kind of expert system that could be used.
"Context experts" might be needed to evaluate the candidate
interpretations and determine which is the most likely, given the
overall context of a text (or conversation), including consideration of
pragmatic issues. But once again, an image of how a human-level NLU
system might work is not a proposal to build such a system at this
time - obviously it is far too sketchy. It does provide a rationale for
why it is worthwhile spending time to discover whether a CDV can be
developed that can serve as the standard of meaning and help integrate
multiple reasoning modules. If such an ontology gains widespread use
among diverse research groups, the specific question of which
grammatical/semantic NL analysis system provides the best
interpretations will likely become my main focus.
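To make that division of labor concrete, here is a minimal sketch (in
Python) of how such a pipeline might be organized. Every name in it -
parse_candidates, WordExpert, ContextExpert, the scoring scheme - is a
hypothetical placeholder of my own, not a description of any existing
system:

# Minimal sketch of the candidate-evaluation pipeline described above.
# All components are hypothetical placeholders, not real systems.
from dataclasses import dataclass, field

@dataclass
class Interpretation:
    # One candidate meaning for a sentence, expressed as concepts
    # drawn from a common foundation ontology.
    sentence: str
    concepts: list = field(default_factory=list)
    score: float = 0.0

def parse_candidates(sentence):
    # Stand-in for a grammatical/semantic analyzer (e.g. a CCG
    # parser) that may return more than one candidate.
    return [Interpretation(sentence, concepts=["..."])]

class WordExpert:
    # "One word, one program": rates how plausible a candidate's
    # treatment of this expert's word is.
    def __init__(self, word):
        self.word = word
    def evaluate(self, interp):
        return 1.0 if self.word in interp.sentence.split() else 0.0

class ContextExpert:
    # Rates a candidate against the overall discourse context,
    # including pragmatic considerations (placeholder score).
    def evaluate(self, interp, context):
        return 0.5

def best_interpretation(sentence, experts, context_expert, context):
    candidates = parse_candidates(sentence)
    for cand in candidates:
        cand.score = sum(e.evaluate(cand) for e in experts)
        cand.score += context_expert.evaluate(cand, context)
    return max(candidates, key=lambda c: c.score)

The point of the sketch is only the modularity: each Word Expert is an
independent program, so the evaluation layer could be built by a large
collaborative effort, provided all the experts share the same foundation
ontology.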
It is my expectation that, regardless of the grammatical/semantic
analysis that is performed, Word Experts or their equivalents will be
needed to achieve a level of understanding approaching the human level.
But perhaps, as you suggest, CCG can provide enough of that
functionality to render individual Word Experts unnecessary. That would
be a wonderful thing. (01)
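For concreteness, here is a toy illustration of the kind of
lexicalization I have in mind, in which each lexical entry pairs a CCG
category with a semantic type from the foundation ontology. The lexicon,
the types, and the combinator are my own invented examples, not anyone's
actual format:

# Toy CCG fragment: each entry pairs a syntactic category with an
# ontology type. Entirely invented for illustration.
LEXICON = {
    "cats":  ("NP",     "Animal"),
    "sleep": (r"S\NP",  "RestingEvent"),
}

def backward_apply(functor_cat, arg_cat):
    # CCG backward application: an X\Y combines with a preceding Y
    # to yield an X; returns None if the categories do not fit.
    result, sep, expected = functor_cat.partition("\\")
    return result if sep and expected == arg_cat else None

subj_cat, subj_type = LEXICON["cats"]
verb_cat, verb_type = LEXICON["sleep"]
print(backward_apply(verb_cat, subj_cat))   # -> S
print(f"{verb_type}(agent: {subj_type})")   # -> RestingEvent(agent: Animal)

If the categories and the ontology types line up well, much of what a
Word Expert would do might indeed fall out of the lexicon itself, which
is the possibility raised in the passage quoted below.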
So, on that specific issue: (02)
> CCG is the closest
> thing to 'word experts' that is still formally attractive,
> gives you a way of getting at language and other resources
> that people are developing, and allows you to focus on the
> additional benefits and difficulties of the common defining
> vocabulary. Given your repeated statement about lack of
> time to work on these things, at least *that* degree of
> focus would be useful I'd've thought rather than redesigning
> wheels. It would also move everything towards evaluable
> prototypes much quicker since so many mechanisms
> would already be in place. This is *already* a
> "very large collaborative effort", so why not join in?
> (03)
As you may recall from some previous conversations, I have a high regard
for the work of your group, and would be pleased to collaborate, but
until now I was not aware that there is a mechanism for doing so at long
range. I am certainly interested in any project that attempts to
coordinate ontology-based NL research among multiple groups. There was
one in the US a couple of years ago, but my proposal to participate in
it was not funded (along with some others in the same consortium). I do
think Europe is ahead of the US in NL research, but the projects I have
paid attention to seem to be restricted to European participants.
Meanwhile, I try not to redesign wheels, but to use the ones that fit: I
take most of the COSMO ontology components from existing sources, thus
far mostly from OpenCyc, and add what is not already there. (04)
[JB] > There are N architectures out there already, being developed
> and tested in applications from coffee machines to everything
> in the not-so-semantic web
Yes, and the results I have seen thus far do not suggest to me that the
approaches being used are likely to lead to human-level NLU. I may well
have missed some promising results, and will be interested to know what
you consider evidence that a group is on a direct line to that distant
goal. (05)
Two questions:
(1) Is there a site where texts can be submitted for analysis by the NL
system you are using, to allow evaluation of its performance on
different texts? I find that trying a number of different sentences and
paragraphs on a system is a much more efficient way to determine its
strengths and weaknesses than a long period of reading specifications.
The issue is motivation - one needs to decide quickly whether the effort
of learning a complex system is likely to be worthwhile with respect to
the ultimate goal of achieving human-level NLU. Alternatively, if it is
not available for on-line testing, is there a downloadable version of
the system that can be installed locally for testing? And if not, could
you provide *direct* pointers to result sets showing that the system
provides a deep analysis of text - deep enough at least to suggest the
likelihood of its eventual evolution toward human-level understanding?
I do try to stay aware of what is being done in NL research, and I
haven't yet seen a description of a system that looks like it is moving
toward human-level performance. Of course, I may have missed one (I only
look at some of the papers and conference reports), and would be
delighted to be pointed to what you consider the most promising of those
that you are not yourself working on. I assume that you believe your own
system to be *the* most promising (and it may well be), and, now that
the subject has come up so forcefully, I will want to learn more about
the details. Suggestions for a crash course? (06)
(2) Thus far I have not seen any NLU work that actually focuses on deep
understanding of the most basic vocabulary - the 2000 to 5000 root
words. Is that a focus of any of the work you are aware of, or of some
component of it? If so, is there a pointer to the results achieved by
such a system or component? (07)
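To illustrate one concrete, checkable property that such a
basic-vocabulary focus makes possible: a defining vocabulary is closed
when every word used in the definitions is itself in the root set. Here
is a minimal sketch, using a two-entry dictionary I invented for the
purpose:

# Toy closure check for a Conceptual Defining Vocabulary (CDV).
# The root set and dictionary are invented for illustration only.
ROOTS = {"cause", "a", "thing", "to", "move", "another"}

DEFINITIONS = {
    "push": "cause a thing to move",
    "give": "cause a thing to move to another",
}

def undefined_words(definitions, roots):
    # Words used in definitions that are not themselves roots;
    # an empty result means the defining vocabulary is closed.
    used = {w for text in definitions.values() for w in text.split()}
    return used - roots

print(undefined_words(DEFINITIONS, ROOTS))   # set() -> closed

A real CDV would of course need morphological normalization and sense
disambiguation before such a test means much; the point is only that the
closure property can be checked mechanically.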
Regards,
Pat (08)
Patrick Cassidy
MICRA, Inc.
908-561-3416
cell: 908-565-4053
cassidy@xxxxxxxxx (09)
> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-
> bounces@xxxxxxxxxxxxxxxx] On Behalf Of bateman@xxxxxxxxxxxxx
> Sent: Sunday, March 30, 2008 3:29 AM
> To: ontolog-forum@xxxxxxxxxxxxxxxx
> Subject: Re: [ontolog-forum] What is "understanding" - was: Building on
> common ground
>
> PatC:
>
> OK, before I was sniping, I admit it (and still a smiley face:
> at least for the ontology building effort!), but I must
> take exception to trivialisations of language and its
> treatment (being a linguist), just as other members of this
> list (quite rightfully) take exception to misuses and
> misrepresentations of logic and logical formalisation.
>
> >>>>>>>>>>>>>
> Yes, but it is even less than a sketch because I am still trying to
> construct an outline of individual components that need to work
> together.
> There is nothing particularly original in the methods I would like to
> try.
> It is basically to revisit the old notion of "word experts" but now
> with
> much more powerful computers than were used back in the early trials,
> and to
> integrate an ontology that has structures that come as close as
> possible to
> at least the most common English-language structures. Think of it as
> "extreme lexicalization" of the grammar.
> <<<<<<<<<<<<<<<<<<
>
> Given that NLP, including NLU, is now a huge area with multimillion
> dollar investments across both research and industry, I find a
> statement of the form "I am still trying to construct an outline of
> the individual components that need to work together" curious:
> *WHY*? There are N architectures out there already, being developed
> and tested in applications from coffee machines to everything
> in the not-so-semantic web. All current large-scale analysis
> components use lexicalised grammars, many have lexical semantics
> of various kinds, there are many mechanisms for running these
> kinds of things in half-efficient ways, and all that is just to
> get going so that one can start *thinking of* addressing the
> interesting problems!
>
> >>>>>>>>>>>>>>>>>>>
> Whether that will port to other
> languages I have no idea.
> <<<<<<<<<<<<<<<<<<<<
>
> Lexicalised grammars are used for all languages.
> Multilingual information extraction is an established area.
>
> >>>>>>>>>>>>>>>>>>>
> The virtue, in my opinion, of the "Word Expert" approach is that it can
> be
> very modular - one verb, one program - and therefore could be built by
> a
> very large collaborative effort, provided that there is a common
> ontology in
> which the meanings are represented. One big difficulty is in the great
> complexity.
> <<<<<<<<<<<<<<<<<<<<
>
> Which is why one does not just have words lying around but also
> has to look at the 'grammar' part of the 'lexicalised grammar'
> term. This is where non-linguistic approaches typically start
> unravelling to resemble a mixture of patchwork and hacks.
> One possible method: adopt a combinatory categorial grammar
> (CCG) (like we do :-), make sure you take one that delivers
> hybrid dependency logic semantics as output, and start fitting the
> semantic types of the lexicon to the COSMO types. Parsing complexity
> is O(n^6) (mildly context sensitive) and CCG is the closest
> thing to 'word experts' that is still formally attractive,
> gives you a way of getting at language and other resources
> that people are developing, and allows you to focus on the
> additional benefits and difficulties of the common defining
> vocabulary. Given your repeated statement about lack of
> time to work on these things, at least *that* degree of
> focus would be useful I'd've thought rather than redesigning
> wheels. It would also move everything towards evaluable
> prototypes much quicker since so many mechanisms
> would already be in place. This is *already* a
> "very large collaborative effort", so why not join in?
>
> >>>>>>>>>>>>>>>>>>>>
> Other modules such as image processing, speech
> recognition, graphic-oriented reasoning, and robotic functions would be
> able
> to communicate using the same foundation ontology.
> <<<<<<<<<<<<<<<<<<<<
>
> Another multimillion-dollar R&D investment area. This has been
> set out in SmartKom and the K-Space network of excellence (EU)
> and there are substantial ontological investments here. Again:
> integrative work on the basis of, and with, those ongoing efforts
> is essential.
>
> >>>>>>>>>>>>>>>>>>>
> There are plenty of problems for which I have no meaningful proposal,
> <<<<<<<<<<<<<<<<<<<
>
> If we all had to have meaningful proposals for every aspect of this
> huge area we work in, we'd have problems.
>
> But, getting back to ontology, we will take a look at COSMO and
> give feedback: particularly on its manner of combining info from
> other sources and the effects that has on the consistency of the
> whole. We will be converting it to CASL to do proper structuring
> though.
>
> Best,
> John.
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx (011)