ontolog-forum

Re: [ontolog-forum] Practical Semantic Primitives

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Patrick Cassidy" <pat@xxxxxxxxx>
Date: Sat, 10 Aug 2013 12:14:50 -0400
Message-id: <0ae001ce95e4$be000ec0$3a002c40$@micra.com>

Leo,

   A clarification may be needed about how Wierzbicka’s work relates to my effort with COSMO:

 

     COSMO is an ontology, which is a logical theory; it includes mappings to English words that label (in some context) the logical COSMO structures specified.  But the relation of words to the concepts people use in thinking can vary with individual and context, and, in the absence of detailed neurolinguistic data, a focus on human use of words can generate a great deal of discussion without resolving the issue of logical primitives.  So, while Wierzbicka focused on linguistic primitives, and her work is relevant and fascinating, those 60 primitives fall far short of the thousands of logical primitives that appear to be needed to logically distinguish the entities that people talk about in texts.

    It is important to recognize that the manner in which concepts are represented in COSMO may have little in common with the way they are likely to be represented in a connectionist brain.  COSMO is not an attempt to model brain processes *per se*.  It *is* an attempt to model logical processes that can have the same *effect* as human thinking, even if performed by quite a different mechanism.  So, just as those designing algorithms for mathematical computation know, and don’t care, that the specific logical manipulations have little relation to human brain processes other than the final result, the goal that motivates the COSMO effort is to find a set of logical representations of primitive concepts that can model the *effect* of human language understanding, human thinking, and human language production.  As with mathematics, one may hope that ultimately the results will be even more precise than the human processes, but it will take a lot of work to find the proper algorithms and show that they do the task intended.  COSMO is focused only on the internal computer representation of the necessary concepts.

    Word mappings in COSMO are intended to support convenient AND rapid linguistic input and output, in the interface between people and the machines.  The linguistic phenomena of synonymy and polysemy are reflected in the COSMO mappings, as they are in WordNet.  The internal thinking process that uses the COSMO structures does not have to be related to human thinking except insofar as it reaches the same correct results.   The algorithms to use COSMO in Natural Language programs still need to be developed.  Some of the current NLP effort, particularly that using WordNet as part of the understanding task, may also be useful with a COSMO-grounded lexicon.   To support statistical NLP, a lot of text will have to be labeled with COSMO “senses” rather than WordNet synsets.  There is a lot of useful structure in WordNet that is consistent with COSMO linkages, though much also differs.  It may not be necessary to completely redo the text taggings to get a COSMO-tagged set of texts for experimentation.  That’s one of the later goals of the COSMO effort.
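
    As a rough illustration of the word mappings described above, a lexicon can be modeled as a many-to-many relation between word forms and concept identifiers.  This is only a sketch: the concept names below (e.g. `FinancialInstitution`) and the helper functions are invented for the example and are not taken from COSMO or WordNet.

```python
from collections import defaultdict

# Hypothetical (word, concept) pairs; the concept names are invented
# for illustration and are not actual COSMO identifiers.
MAPPINGS = [
    ("bank", "FinancialInstitution"),
    ("bank", "RiverBank"),
    ("shore", "RiverBank"),
    ("car", "Automobile"),
    ("automobile", "Automobile"),
]

# Build both directions of the many-to-many relation.
word_to_concepts = defaultdict(set)
concept_to_words = defaultdict(set)
for word, concept in MAPPINGS:
    word_to_concepts[word].add(concept)
    concept_to_words[concept].add(word)


def senses(word):
    """Polysemy: all concepts that a word form may label."""
    return sorted(word_to_concepts.get(word, ()))


def synonyms(word):
    """Synonymy: other word forms sharing at least one concept with `word`."""
    found = set()
    for concept in word_to_concepts.get(word, ()):
        found |= concept_to_words[concept]
    found.discard(word)
    return sorted(found)
```

    Here `senses("bank")` returns two concepts (polysemy), while `synonyms("car")` returns the other word form mapped to the same concept (synonymy).  Choosing among the senses in context is the part that still requires the algorithms mentioned above.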

    I have also mentioned often that COSMO should be able to support accurate semantic interoperability among databases.  There is nothing particularly profound here; it is based only on the self-evident notion that to communicate one needs a common language.  The virtue of COSMO for that purpose is that, being primitives-based and therefore of minimal size, it should be easier to learn than any other interlingua with the same functionality.

 

Pat

 

Patrick Cassidy

MICRA Inc.

cassidy@xxxxxxxxx

1-908-561-3416

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Obrst, Leo J.
Sent: Friday, August 09, 2013 3:05 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] Practical Semantic Primitives

 

The linguist Anna Wierzbicka has attempted to define a set of semantic primes or primitives for language (i.e., all languages): http://en.wikipedia.org/wiki/Semantic_primes, perhaps similar in notion to what Pat Cassidy is attempting with COSMO. There is also Swadesh’s list of core words for historical linguistics, with many variations: http://en.wikipedia.org/wiki/Swadesh_list.

 

It’s the dream of many. Personally, I think it is a lost cause when considered as a reduction to semantic primitives, but there may be some merit in looking for a set of common words in many languages.

 

It also strikes me as an effort of lexical decomposition similar to that of the Generative Semanticists of the late 1960s/early 1970s, and some of the semantic-feature-based work of Jackendoff and others.

                                                                                     

Thanks,

Leo

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Bruce Schuman
Sent: Friday, August 09, 2013 11:43 AM
To: '[ontolog-forum] '
Subject: Re: [ontolog-forum] Practical Semantic Primitives

 

Good morning from Santa Barbara.  As a new member of this very interesting forum, thanks to all for being here.

 

On this issue of “primitives” – my instinct is to go to the basic theory of concepts, and ask how any concept is defined or “constructed”.  For me, the answer is more or less found in the Aristotelian approach to definition, as described by John Sowa in slide 17 of http://www.jfsowa.com/talks/kdptut.pdf  -- a process which defines a “distinction within a genus”.

 

When I look at systems defined by primitives – to my eye and understanding, these elements are usually not what I would call primitive – not fundamental – not truly “ontological”.  They are most often composite/holistic objects with a complex but undefined and implicit internal structure, that we are asked to take on faith, on the assumption that these “units” are somehow basic.

 

I want to see an approach to primitives that constructs everything – every possible concept – from a simple fundamental algebraic process of “drawing a distinction”, as per the Aristotelian method.

 

As I see it, the concept of “distinction” or differentiation is related to the fundamental mathematical concept of “cut” – as per the Dedekind Cut at the foundation of mathematics and the definition of continuity and the real number line.   From my point of view, we should be building our fundamental conceptual units from this foundation.

 

As regards the “atoms/molecules” analogy – for me, the right approach is to look for a “fundamental particle”.  Even atoms are composite structures.  If we are going to take a bottom-up approach to constructing every possible cognitive unit, we need to build these units from something truly fundamental.

 

In pursuit of this basic approach, I am developing a model of conceptual structure based on dimensionality and taxonomy that I call “synthetic dimensionality”.  I put a brief intro written for this list online:  http://sharedpurpose.net/groupdocs/introtoontolog.docx

 

Thanks so much for this discussion.

 

Bruce Schuman

(805) 966-9515 Santa Barbara

http://interspirit.net | http://sharedpurpose.net | http://bridgeacrossconsciousness.net

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Patrick Cassidy
Sent: Thursday, August 08, 2013 8:30 PM
To: '[ontolog-forum] '
Subject: Re: [ontolog-forum] Case relations as Practical Semantic Primitives - was Context and Inter-annotator agreement

 

Gary,  

    On two points:

[GB-C]

> He provided a highlight of work, but in that list I didn't see Fillmore's Case grammar,
> which did have an important role in other parts of John's postings such as the
> Verb Semantics Ontology project.  This might not provide ultimate primitives, but they are
> perhaps molecules of a deeper chemistry.

 

   I have been tempted to refer to primitive concepts as “atoms” that build up “molecules” of meaning, but there are important differences that make the analogy misleading.  Many “primitive” concepts that are types within a hierarchy will be distinguished not by necessary and sufficient conditions (a logical “definition”), but only by necessary conditions.  This leaves a lot of potential instances unspecified, and differs from the fixed properties of atoms; I believe that is indeed the way people use the primitives – they are only as specific as necessary for particular communication tasks.  Perhaps even in the ‘atom’ analogy there can be some flexibility, since the isotopes of elements can have differing properties, but even that variability is much less than one sees with many conceptual primitives.
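
   The difference between a type given only necessary conditions and one given a full definition can be sketched as follows.  This is a minimal illustration, not a description of how COSMO is implemented: the `ConceptType` class, the `fully_defined` flag, and the example types are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Entity = Dict[str, object]
Condition = Callable[[Entity], bool]


@dataclass
class ConceptType:
    """Hypothetical concept type described by a list of necessary conditions."""
    name: str
    necessary: List[Condition] = field(default_factory=list)
    # True only when the necessary conditions are also jointly sufficient,
    # i.e. when they amount to a full logical definition.
    fully_defined: bool = False

    def classify(self, entity: Entity) -> str:
        # Failing any necessary condition rules the entity out.
        if not all(cond(entity) for cond in self.necessary):
            return "excluded"
        # Passing them all proves membership only for a fully defined type.
        return "instance" if self.fully_defined else "possible"


# A type with a full definition: every closed three-sided figure is one.
triangle = ConceptType(
    "Triangle",
    [lambda e: e.get("sides") == 3, lambda e: e.get("closed") is True],
    fully_defined=True,
)

# A primitive-like type with a necessary condition only.
tool = ConceptType("Tool", [lambda e: e.get("physical") is True])
```

   With only necessary conditions, `tool.classify(...)` can rule an entity out but can never confirm it as an instance; that open region is the set of "potential instances left unspecified".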

 

[GB-C]

> Case relations may not be the final word, but they provide a
> starting point for core meta-relations that can be used to develop canonical propositions.

 

    Yes, case relations are among the relations I believe are primitive, but they are still only a small part of the total number of primitive relations.

    As my earlier note suggested, these hypotheses (however well motivated) need careful experimental testing to warrant strong assent, but the current trends in funding of NL research suggest that proper testing is still years in the future.

 

Pat

 

Patrick Cassidy

MICRA Inc.

cassidy@xxxxxxxxx

1-908-561-3416

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Gary Berg-Cross
Sent: Thursday, August 08, 2013 3:47 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] Case relations as Practical Semantic Primitives - was Context and Inter-annotator agreement

 

The Context and Inter-annotator agreement topic seems to have wound down, but along the
path of that discussion there was this idea of semantic primitives.
John Sowa provided a historical list of people who have addressed this seductive, common-sense
idea of selecting a small number of primitives for defining everything. It is, as he noted,
"one of the oldest in the history of philosophy,
logic, linguistics, and AI.  It can be traced back at least to 500 BC
with Pythagoras, Plato, and Aristotle."
He provided a highlight of work, but in that list I didn't see Fillmore's Case grammar,
which did have an important role in other parts of John's postings such as the
Verb Semantics Ontology project.  This might not provide ultimate primitives, but they are
perhaps molecules of a deeper chemistry. Case relations may not be the final word, but they provide a
starting point for core meta-relations that can be used to develop canonical propositions.
As John noted, more research is needed but this is one tool that can be used for now.
 
 

Gary Berg-Cross, Ph.D.  

gbergcross@xxxxxxxxx     

http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross

NSF INTEROP Project  

http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0955816

SOCoP Executive Secretary

Knowledge Strategies    

Potomac, MD

240-426-0770

 

On Sun, Aug 4, 2013 at 1:28 PM, John F Sowa <sowa@xxxxxxxxxxx> wrote:

Pat,

PC

> The point at issue is whether all of the senses of a particular word
> needed for language understanding can be included in a semantic lexicon.
> My experience suggests that they can, even though new senses are being
> developed all the time.  The new senses can also be included in the lexicon,
> if they are important enough to warrant the effort.

That claim is vague enough to cover all bases.  If you want a project
that includes all word senses anyone considers important, I suggest
Wiktionary.  It has "3,476,017 entries with English definitions from
over 500 languages":

    http://en.wiktionary.org/wiki/Wiktionary:Main_Page

Large numbers of people are actively updating and extending
Wiktionary.  When the number of senses is in the millions and growing,
it seems hard to claim that there is any finite upper limit.

PC

> JFS seems to be saying that failure of some groups to achieve a goal means
> that no amount of effort trying a related but different way can succeed

More precisely, the idea of selecting a small number of primitives for
defining everything is one of the oldest in the history of philosophy,
logic, linguistics, and AI.  It can be traced back at least to 500 BC
with Pythagoras, Plato, and Aristotle.  For summaries and references,
see http://www.jfsowa.com/talks/kdptut.pdf .

Slides 13 to 18:  Aristotle's categories, definitions, and the Tree
    of Porphyry for organizing them graphically.

Slides 91 to 93:  Universal language schemes in the 17th and 18th
    centuries.  John Wilkins developed the largest and most impressive
    set of primitives (40 genera subdivided in 2030 species).  Wilkins
    got help from other members to define 15,000 words in those terms.
    For more information about these and other schemes, see references
    by Knowlson (1975), Eco (1995), and Okrent (2009).

Slides 94 to 97:  Ramon Llull's Great Art (Ars Magna), which included
    Aristotle's categories, the Tree of Porphyry, rotating circles
    for combining categories, and a methodology for using them to
    answer questions.  Leibniz was inspired by Llull to encode the
    primitive categories in prime numbers and use multiplication
    to combine them and division to analyze them.
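
The prime-number encoding attributed to Leibniz above can be sketched in a
few lines.  The four primitives and their prime assignments are invented
for illustration, and the sketch assumes each primitive occurs at most
once in a concept (so every code is a square-free product):

```python
# Hypothetical primitive categories, each assigned a distinct prime.
PRIMES = {"substance": 2, "living": 3, "animal": 5, "rational": 7}


def combine(*primitives):
    """Encode a composite concept by multiplying the primes of its primitives."""
    code = 1
    for name in primitives:
        code *= PRIMES[name]
    return code


def analyze(code):
    """Recover the primitives of a concept by trial division."""
    return sorted(name for name, prime in PRIMES.items() if code % prime == 0)


def subsumes(general, specific):
    """A concept is more general if its code divides the more specific code."""
    return specific % general == 0


animal = combine("substance", "living", "animal")            # 2 * 3 * 5
human = combine("substance", "living", "animal", "rational")  # 2 * 3 * 5 * 7
```

Here `analyze(human)` lists all four primitives, and `subsumes(animal, human)`
holds because 30 divides 210; the reverse does not.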

Slide 98:  Leibniz's method generated a lattice.  For modern
    lattice methods, see FCA and Ranganathan's facet classification.
    Click on the URLs to see FCA lattices that are automatically
    derived from WordNet and from Roget's Thesaurus.
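
As a minimal sketch of how FCA derives the nodes of such a lattice, the
code below enumerates the formal concepts of a toy context by brute force.
The objects and attributes are invented for illustration, and the
enumeration is exponential in the number of objects; practical FCA tools
use more efficient algorithms.

```python
from itertools import combinations

# Toy formal context: object -> set of attributes (invented for illustration).
CONTEXT = {
    "dog": {"animal", "four_legged"},
    "cat": {"animal", "four_legged"},
    "bird": {"animal", "flies"},
    "plane": {"flies"},
}

ALL_ATTRS = set().union(*CONTEXT.values())


def intent(objects):
    """Attributes shared by every object in `objects`."""
    shared = set(ALL_ATTRS)
    for obj in objects:
        shared &= CONTEXT[obj]
    return frozenset(shared)


def extent(attrs):
    """Objects that have every attribute in `attrs`."""
    return frozenset(obj for obj, have in CONTEXT.items() if attrs <= have)


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    names = list(CONTEXT)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            shared = intent(subset)
            concepts.add((extent(shared), shared))
    return concepts
```

Ordering the resulting concepts by inclusion of their extents gives the
lattice: the concept with intent {animal} sits above both
{animal, four_legged} and {animal, flies}.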

Slides 99 to 101:  Categories by Kant and Peirce.  A suggested
    updated version of Wilkins' hierarchy that includes more
    modern developments.

Slides 102 to 107:  Issues about the possibility of ever having
    a complete, consistent, and finished ontology of everything.

For modern computational linguistics, the idea of selecting a set
of primitives for defining everything was proposed and implemented
in the late 1950s and early '60s:

1961 International Conf. on Machine Translation.  See the table
    of contents: http://www.mt-archive.info/NPL-1961-TOC.htm .
    At that conference, Margaret Masterman proposed a list of 100
    primitive concepts, which she used as the basis for lattices
    that combine them in all possible ways.  Yorick Wilks worked
    with Masterman and others at CLRU, and he continued to use
    her list of primitives for his later work in NLP.  For the
    list, see http://www.mt-archive.info/NPL-1961-Masterman.pdf

TINLAP (three conferences on Theoretical Issues in Natural Language
    Processing from 1975 to 1987).  The question of primitives was
    the focus of these conferences.  Yorick Wilks was one of the
    organizers.  Roger Schank (who also had a set of primitives for
    defining action verbs) was prominent in them.  For summaries,
    see http://www.aclweb.org/anthology-new/T/T78/T78-1000.pdf
    and http://www.aclweb.org/anthology-new/T/T87/T87-1001.pdf .

Anna Wierzbicka spent many years working on issues of selecting and
    using a proposed set of primitives for defining words in multiple
    languages.  From Wikipedia:  "She is especially known for Natural
    Semantic Metalanguage, particularly the concept of semantic primes.
    This is a research agenda resembling Leibniz's original "alphabet
    of human thought", which Wierzbicka credits her colleague, linguist
    Andrzej Bogusławski, with reviving in the late 1960s."  Many people
    tried to use her "semantic primes" in computational linguistics,
    but none of those projects were successful.

I never said "No amount of effort trying a related but different way
can succeed."  In fact, I have been proposing and *using* related
methods, but I always insist on keeping all options open.

There is no evidence that a fixed set exists, and an overwhelming
amount of evidence that Zipf's Law holds:  there is an extremely long
tail to the distribution of word senses.  But if you keep your options
open and *if* a fixed set of primitives is sufficient, then you will
discover that set.  That is my recommended strategy.
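
To make the long-tail point concrete, here is a minimal numerical sketch.
It assumes an idealized Zipf distribution with exponent 1 (frequency
proportional to 1/rank) over a hypothetical inventory of one million
senses; neither assumption comes from measured data, and real sense
distributions will differ.

```python
def harmonic(n):
    """n-th harmonic number, H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / rank for rank in range(1, n + 1))


def coverage(k, n):
    """Fraction of all occurrences covered by the k most frequent of n senses,
    assuming frequency proportional to 1/rank (idealized Zipf, exponent 1)."""
    return harmonic(k) / harmonic(n)


# Hypothetical inventory size, chosen only for illustration.
TOTAL_SENSES = 10**6
top_thousand = coverage(1000, TOTAL_SENSES)
```

Under these assumptions the 1,000 most frequent senses account for only
about half of all occurrences, and coverage grows roughly with the
logarithm of the number of senses included, which is why the tail matters.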


> So the statistical approach has become vastly more funded than
> the ontological/analytical.

I certainly agree with you that a deeper analysis with ontologies and
related lexical resources is essential for NL understanding.  I believe
that statistical methods are useful as a *supplement* to the deeper
methods.   At VivoMind Research, we use *both*, but the emphasis is
on a syntactic and semantic analysis by symbolic methods.


> the current strong emphasis on the statistical approach is, I believe
> retarding progress by failing to develop even the most basic resources
> needed for the analytical stage 2 function.

I wholeheartedly agree.  But from a selfish point of view, that gives
us a competitive advantage.  We got a contract with the US Dept. of
Energy based on a competition with a dozen groups that used their
favorite methods of NLP.

For the test, all competitors were asked to extract certain kinds of
data from a set of research reports and present the results in a table.
The scores were determined by the number of correct answers.  Our score
was 96%.  The next best was 73%.  Third best was above 50%, and all the
rest were below 50%.

For analyzing the documents, we used very general lexical resources
and a fairly simple general ontology.  But we supplemented it with
a detailed ontology that was specialized for chemical compounds,
chemical formulas, and the related details of interest.

For an example of a spreadsheet with the results, see slides 49 & 50
of http://www.jfsowa.com/talks/relating.pdf .

John


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J

 

