
Re: [ontolog-forum] Case realtions as Practical Semantic Primitives - wa

To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Obrst, Leo J." <lobrst@xxxxxxxxx>
Date: Thu, 8 Aug 2013 20:27:37 +0000
Message-id: <FDFBC56B2482EE48850DB651ADF7FEB01F25C171@xxxxxxxxxxxxxxxxxx>



Many of us have used, and still use, thematic role (theta role) relations as the main participant roles in events, refining them as the needed ontologies are elaborated. This has been a continuing thread in linguistics for many years. I know John has used them, and I have myself, in the Event ontology we are using (denotations of verbs, etc.). We’ve had discussions about this on the list. Most foundational ontologies also use some representation of these: Cyc, DOLCE, etc. Most recently, schema.org’s Actions and Activities extension uses them, as Peter Yim just reported to the Ontolog Forum: http://lists.w3.org/Archives/Public/public-vocabs/2013Jul/0090.html.


For a quick perspective from linguistics, see Manfred Pinkal’s 2006 slides (among others; search for “Manfred Pinkal event semantics”): http://www.coli.uni-saarland.de/courses/semantics-06/lectures/lect13.pdf. The main recent basis in linguistics starts with Charles Fillmore’s Case Grammar. See also the more recent FrameNet work: see below.
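
As a rough illustration of thematic roles used as participant roles in an event, here is a minimal Python sketch. It is not the Event ontology mentioned above, nor the representation used by Cyc, DOLCE, or schema.org; the role inventory, the `Event` class, and the example sentence are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical role inventory for illustration; real ontologies and
# linguistic theories each define their own (and usually refine it).
THEMATIC_ROLES = {"agent", "patient", "theme", "instrument", "recipient", "location"}


@dataclass
class Event:
    """An event (roughly, the denotation of a verb) with role-labelled participants."""
    event_type: str
    roles: dict = field(default_factory=dict)

    def add_participant(self, role, participant):
        # Reject relations that are not in the assumed role inventory.
        if role not in THEMATIC_ROLES:
            raise ValueError(f"unknown thematic role: {role}")
        self.roles[role] = participant
        return self  # allow chaining


# "Mary opened the door with a key."
opening = (
    Event("Open")
    .add_participant("agent", "Mary")
    .add_participant("patient", "the door")
    .add_participant("instrument", "a key")
)
```

Refining the roles for a particular domain would then amount to extending or specializing the role set, which is the step the ontology work has to supply.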










From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Gary Berg-Cross
Sent: Thursday, August 08, 2013 3:47 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] Case realtions as Practical Semantic Primitives - was Context and Inter-annotator agreement


The Context and Inter-annotator agreement topic seems to have wound down, but along the
path of that discussion there was this idea of semantic primitives.
John Sowa provided a historical list of people who have addressed this seductive, common-sense
idea of selecting a small number of primitives for defining everything. It is, as he noted,
"one of the oldest in the history of philosophy,
logic, linguistics, and AI.  It can be traced back at least to 500 BC
with Pythagoras, Plato, and Aristotle."
He provided highlights of that work, but in that list I didn't see Fillmore's Case Grammar,
which did have an important role in other parts of John's postings, such as the
Verb Semantics Ontology project.  Case relations might not provide ultimate primitives, but they are
perhaps molecules of a deeper chemistry. They may not be the final word, but they provide a
starting point for core meta-relations that can be used to develop canonical propositions.
As John noted, more research is needed, but this is one tool that can be used for now.


Gary Berg-Cross, Ph.D.  


SOCoP Executive Secretary

Knowledge Strategies    

Potomac, MD



On Sun, Aug 4, 2013 at 1:28 PM, John F Sowa <sowa@xxxxxxxxxxx> wrote:



> The point at issue is whether all of the senses of a particular word
> needed for language understanding can be included in a semantic lexicon.
> My experience suggests that they can, even though new senses are being
> developed all the time.  The new senses can also be included in the lexicon,
> if they are important enough to warrant the effort.

That claim is vague enough to cover all bases.  If you want a project
that includes all word senses anyone considers important, I suggest
Wiktionary.  It has "3,476,017 entries with English definitions from
over 500 languages":


Large numbers of people around the world are actively updating and extending
Wiktionary.  When the number of senses is in the millions and growing,
it seems hard to claim that there is any finite upper limit.


> JFS seems to be saying that failure of some groups to achieve a goal means
> that no amount of effort trying a related but different way can succeed

More precisely, the idea of selecting a small number of primitives for
defining everything is one of the oldest in the history of philosophy,
logic, linguistics, and AI.  It can be traced back at least to 500 BC
with Pythagoras, Plato, and Aristotle.  For summaries and references,
see http://www.jfsowa.com/talks/kdptut.pdf .

Slides 13 to 18:  Aristotle's categories, definitions, and the Tree
    of Porphyry for organizing them graphically.

Slides 91 to 93:  Universal language schemes in the 17th and 18th
    centuries.  John Wilkins developed the largest and most impressive
    set of primitives (40 genera subdivided into 2,030 species).  Wilkins
    got help from other members of the Royal Society to define 15,000 words in those terms.
    For more information about these and other schemes, see references
    by Knowlson (1975), Eco (1995), and Okrent (2009).

Slides 94 to 97:  Ramon Llull's Great Art (Ars Magna), which included
    Aristotle's categories, the Tree of Porphyry, rotating circles
    for combining categories, and a methodology for using them to
    answer questions.  Leibniz was inspired by Llull to encode the
    primitive categories in prime numbers and use multiplication
    to combine them and division to analyze them.

Slide 98:  Leibniz's method generated a lattice.  For modern
    lattice methods, see FCA and Ranganathan's facet classification.
    Click on the URLs to see FCA lattices that are automatically
    derived from WordNet and from Roget's Thesaurus.

Slides 99 to 101:  Categories by Kant and Peirce.  A suggested
    updated version of Wilkins' hierarchy that includes more
    modern developments.

Slides 102 to 107:  Issues about the possibility of ever having
    a complete, consistent, and finished ontology of everything.
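
The prime-number encoding attributed to Leibniz above (multiplication to combine categories, division to analyze them) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four primitive categories and the primes assigned to them are invented for the example and are not taken from Leibniz or from the slides.

```python
from math import gcd, prod

# Hypothetical primitives, each assigned a distinct prime (illustrative only).
PRIMES = {"substance": 2, "living": 3, "animal": 5, "rational": 7}


def combine(*primitives):
    """Combine primitive categories by multiplying their primes."""
    return prod(PRIMES[p] for p in primitives)


def analyze(concept):
    """Recover a concept's primitives by testing divisibility by each prime."""
    return sorted(p for p, n in PRIMES.items() if concept % n == 0)


def subsumes(general, specific):
    """A general concept subsumes a specific one if its number divides it."""
    return specific % general == 0


def shared(a, b):
    """Primitives common to two concepts, read off their greatest common divisor."""
    return analyze(gcd(a, b))


plant = combine("substance", "living")                         # 2 * 3 = 6
animal = combine("substance", "living", "animal")              # 2 * 3 * 5 = 30
human = combine("substance", "living", "animal", "rational")   # 2 * 3 * 5 * 7 = 210
```

Ordering the numbers by divisibility gives the kind of lattice mentioned for slide 98: `subsumes(animal, human)` holds because 30 divides 210, and `shared(plant, human)` picks out the primitives the two concepts have in common.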

For modern computational linguistics, the idea of selecting a set
of primitives for defining everything was proposed and implemented
in the late 1950s and early '60s:

1961 International Conf. on Machine Translation.  See the table
    of contents: http://www.mt-archive.info/NPL-1961-TOC.htm .
    At that conference, Margaret Masterman proposed a list of 100
    primitive concepts, which she used as the basis for lattices
    that combine them in all possible ways.  Yorick Wilks worked
    with Masterman and others at CLRU, and he continued to use
    her list of primitives for his later work in NLP.  For the
    list, see http://www.mt-archive.info/NPL-1961-Masterman.pdf

TINLAP (three conferences on Theoretical Issues in Natural Language
    Processing from 1975 to 1987).  The question of primitives was
    the focus of these conferences.  Yorick Wilks was one of the
    organizers.  Roger Schank (who also had a set of primitives for
    defining action verbs) was prominent in them.  For summaries,
    see http://www.aclweb.org/anthology-new/T/T78/T78-1000.pdf
    and http://www.aclweb.org/anthology-new/T/T87/T87-1001.pdf .

Anna Wierzbicka spent many years working on issues of selecting and
    using a proposed set of primitives for defining words in multiple
    languages.  From Wikipedia:  "She is especially known for Natural
    Semantic Metalanguage, particularly the concept of semantic primes.
    This is a research agenda resembling Leibniz's original 'alphabet
    of human thought', which Wierzbicka credits her colleague, linguist
    Andrzej Bogusławski, with reviving in the late 1960s."  Many people
    tried to use her "semantic primes" in computational linguistics,
    but none of those projects were successful.

I never said "No amount of effort trying a related but different way
can succeed."  In fact, I have been proposing and *using* related
methods, but I always insist on keeping all options open.

There is no evidence that a fixed set exists, and an overwhelming
amount of evidence that Zipf's Law holds:  there is an extremely long
tail to the distribution of word senses.  But if you keep your options
open and *if* a fixed set of primitives is sufficient, then you will
discover that set.  That is my recommended strategy.
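
To make the long-tail point concrete, here is a small Python sketch of an idealized Zipf distribution, in which the item of rank r has frequency proportional to 1/r**s. The function name, the exponent s = 1, and the sizes used in the example are assumptions chosen for illustration, not measurements of any real lexicon.

```python
def zipf_coverage(top_k, total_items, s=1.0):
    """Fraction of all usage covered by the top_k most frequent items,
    assuming an idealized Zipf distribution with frequency ~ 1 / rank**s."""
    weights = [1.0 / rank**s for rank in range(1, total_items + 1)]
    return sum(weights[:top_k]) / sum(weights)


# With a million items and s = 1, the 1,000 most frequent items cover
# only about half of all usage; the rest is spread over the long tail.
coverage = zipf_coverage(1_000, 1_000_000)
```

Under these assumptions, each tenfold increase in the size of a fixed set adds only a roughly constant slice of coverage, which is why the tail is so hard to close off.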

> So the statistical approach has become vastly more funded than
> the ontological/analytical.

I certainly agree with you that a deeper analysis with ontologies and
related lexical resources is essential for NL understanding.  I believe
that statistical methods are useful as a *supplement* to the deeper
methods.   At VivoMind Research, we use *both*, but the emphasis is
on a syntactic and semantic analysis by symbolic methods.

> the current strong emphasis on the statistical approach is, I believe
> retarding progress by failing to develop even the most basic resources
> needed for the analytical stage 2 function.

I wholeheartedly agree.  But from a selfish point of view, that gives
us a competitive advantage.  We got a contract with the US Dept. of
Energy based on a competition with a dozen groups that used their
favorite methods of NLP.

For the test, all competitors were asked to extract certain kinds of
data from a set of research reports and present the results in a table.
The scores were determined by the number of correct answers.  Our score
was 96%.  The next best was 73%.  Third best was above 50%, and all the
rest were below 50%.

For analyzing the documents, we used very general lexical resources
and a fairly simple general ontology.  But we supplemented it with
a detailed ontology that was specialized for chemical compounds,
chemical formulas, and the related details of interest.

For an example of a spreadsheet with the results, see slides 49 & 50
of http://www.jfsowa.com/talks/relating.pdf .


