Dear Leo, John, some comments on the intended usage of FRED.
FRED is intended as a tool for automatically populating the Semantic Web from text, with its output used either as a direct or as an intermediate representation.
Tools for this task typically perform shallow parsing (NER, sense tagging, topic detection, some limited relation extraction), which produces sparse triples. We therefore try to produce RDF and OWL from Boxer's Neo-Davidsonian, DRT-like representation; Boxer is also a very fast tool compared to the state of the art in deep parsing. The resulting RDF also needs to be linked, i.e. automatically connected to Semantic Web resources, for such automatic population to be useful.
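To make the contrast with sparse triples concrete, here is a minimal sketch in Python of what a Neo-Davidsonian reading looks like once it is flattened into triples. It is not FRED's code and not its actual output: the function name, the `ex:` and `ext:` prefixes, and the role names `agent` and `theme` are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


def neo_davidsonian_triples(
    event_id: str,
    event_type: str,
    roles: Dict[str, str],
    links: Optional[Dict[str, str]] = None,
    ns: str = "ex:",  # hypothetical namespace prefix
) -> List[Triple]:
    """Reify an event as its own node and attach participants via role properties."""
    event = ns + event_id
    # The event is a typed individual, not a binary relation between participants.
    triples: List[Triple] = [(event, "rdf:type", ns + event_type)]
    # Each participant hangs off the event node through one role property.
    for role, filler in roles.items():
        triples.append((event, ns + role, ns + filler))
    # Linking step: connect local entities to external resources (hypothetical IRIs).
    for local, external in (links or {}).items():
        triples.append((ns + local, "owl:sameAs", external))
    return triples


# "John bought a book", with made-up identifiers.
triples = neo_davidsonian_triples(
    "buy_1",
    "Buy",
    roles={"agent": "John", "theme": "book_1"},
    links={"John": "ext:John"},
)
for s, p, o in triples:
    print(s, p, o)
```

Because the event is a node in its own right, further roles (time, place, recipient) can be attached to it without changing the shape of the graph, which is what makes the result denser than a single subject-relation-object triple.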
Producing RDF and OWL that makes sense in typical Semantic Web ontologies proved to be a *huge* effort. We had to test dozens of heuristics to arrive at a minimally acceptable result, which is what you are playing with. The work is far from complete, and many other heuristics are being tested to deal with specific lexical relations, non-standard pseudo-boolean operators, the (limited) universal quantification of DRT, etc. You will keep finding less-than-ideal and some plainly wrong representations, but we wanted to bootstrap this thing anyway, in order to indicate the direction and to get early feedback. Consider that this is just the basics, what Ferdinand de Saussure called the "in praesentia associations" of words.
What John refers to (metonymy, metaphor, and how the basic lexicon is re-described within them) pertains to Saussure's "in absentia associations". It requires deeper heuristics and model training, and something will always be lost: firstly because no logical language can represent all the subtleties and contextual underpinnings of natural language meaning, and secondly because human meaning is bound to our practices, which are not simply reproducible by machines. But this is a vexata quaestio.
Our hypothesis is that deep parsing can be useful to a certain extent, as are many other approaches, whether statistical or logical. We now have the responsibility to stretch the hypothesis in concrete use cases. An example is the Tipalo tool (http://wit.istc.cnr.it/stlab-tools/tipalo), which creates a linked data-rich model for the definitions of Wikipedia entities, based on FRED.
Best,
Aldo
On 14 Jul 2012, at 23:20, John F Sowa wrote:
_____________________________________
Aldo Gangemi
Senior Researcher
Semantic Technology Lab (STLab)
Institute for Cognitive Science and Technology, National Research Council (ISTC-CNR)
Via Nomentana 56, 00161, Roma, Italy
Tel: +390644161535
Fax: +390644161513
skype aldogangemi