[ontolog-forum] OntoNotes and the Omega ontology

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Sat, 25 Sep 2010 08:51:44 -0400
Message-id: <4C9DF060.7040903@xxxxxxxxxxx>
OntoNotes was announced as a large project that is producing
substantial resources and releasing them under the LGPL.
See the first excerpt at the end of this note for the main
URL and a paragraph that describes the project.

Following are the components:

  * Treebank.  Text annotated with syntactic information

  * PropBank.  Verbs tagged with their semantic argument structure

  * Word Sense.  Verbs and nouns tagged with their word senses and
    linked to the Omega ontology

  * Ontology.  The Omega Ontology, a broad-coverage ontology containing
    word senses

  * Coreference.  Marking multiple mentions of the same entity in text

The second excerpt below describes the Omega ontology.

These resources are very useful, especially for NLP.  However, the
intent of the OntoNotes project is to annotate huge volumes of
text for the purpose of training statistical tools.  In my opinion,
large-scale annotation by hand is obsolescent and unnecessary.

In any case, the resources can be used for other projects as well.

John Sowa

______________________________________________________________

 From  http://www.bbn.com/ontonotes/

The OntoNotes project is a collaborative effort between Raytheon BBN 
Technologies, the University of Colorado, the University of 
Pennsylvania, and the University of Southern California's Information 
Sciences Institute to produce such a resource. It aims to annotate a 
large corpus comprising various genres of text (news, conversational 
telephone speech, weblogs, Usenet, broadcast, talk shows) in three
languages (English, Chinese, and Arabic) with structural information 
(syntax and predicate argument structure) and shallow semantics (word 
sense linked to an ontology and coreference). OntoNotes builds on two 
time-tested resources, following the Penn Treebank for syntax and the 
Penn PropBank for predicate-argument structure. Its semantic 
representation will include word sense disambiguation for nouns and 
verbs, with each word sense connected to an ontology, and coreference. 
Over the course of the five-year program, our current goals call for 
annotation of over a million words each of English and Chinese, and half 
a million words of Arabic.

 From  http://www.isi.edu/~philpot/papers/ijcnlp05/ijcnlp-olr05.pdf

Omega is a 120,000-node terminological ontology constructed at USC
ISI as the reorganization and synthesis of WordNet 2.0 (Miller 1990;
Fellbaum 1998), a lexically oriented network constructed on general
cognitive principles, and Mikrokosmos (Mahesh 1996; O’Hara et al.
1998), a conceptual resource originally conceived to support
translation, into a new upper model, created expressly in order
to facilitate the merging of lower models into a functional whole.
Omega, like its close predecessor SENSUS (Knight et al. 1994), can
be characterized as a shallow, lexically oriented, term taxonomy.
By far the majority of its concepts can be stated in English by
a single word. Omega contains no formal concept definitions and
only relatively few interconnections (semantic relations)
between concepts. By making few commitments to any specific
theories of semantics or particular representations, Omega enjoys
a malleability that has allowed it to be used in a variety of
applications, including question answering and information integration.

