Good question, David.
The interpreter must have a wide vocabulary of meanings, so that each
statement is interpreted the way the DE expects it to be. Where the
interpretation is debatable, it should be thrown back to the engineering
team to fix the inappropriate interpretation.
Debatable interpretations like these lead to possible inconsistencies and
lots of ambiguities in real operations, so this is not a good technology for
dangerous, mission-critical applications. Instead, the first few widespread
uses of CNLs will likely be in areas where the DE is simply overworked, and the
CNL will somehow help reduce the volume of work, or process a large fraction of
otherwise correctable descriptions.
Personally, I prefer to analyze English in structured database forms,
such as the PTO database, where each English statement has some kind of
narrowing predication about what kinds of things the statement can describe.
For example, in patents, there are a couple dozen fields of structured
information: patent number, first inventor, PTO classification, filing date,
and so forth. To that, I add the context words (those that are not commonly
used "noise" words with very simple syntactic roles). The context words are
appropriate for identifying the context of each patent, and for measuring
the similarity of one patent to another in very formal ways, i.e., through
claim construal.
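A rough sketch of the context-word idea, under loud assumptions: the noise-word list, the two sample claims, and the function names are all made up for illustration, and plain Jaccard overlap stands in for whatever formal similarity measure one actually uses. It is not claim construal, just the simplest thing that shows the mechanics.

```python
import re

# Hypothetical noise-word list; a real one would be far longer.
NOISE_WORDS = {
    "a", "an", "the", "of", "and", "or", "to", "in", "is",
    "for", "with", "by", "on", "that",
}


def context_words(text):
    """Return the set of lowercased words that are not noise words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in NOISE_WORDS}


def jaccard(a, b):
    """Jaccard overlap of two word sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented sample claims, not taken from any real patent.
claim_a = "A valve for controlling the flow of fluid in a pipe"
claim_b = "A pump for moving fluid in a pipe"

words_a = context_words(claim_a)
words_b = context_words(claim_b)
similarity = jaccard(words_a, words_b)

print(sorted(words_a))
print(sorted(words_b))
print(similarity)
```

Here the two claims share "fluid" and "pipe" out of seven distinct context words, so the overlap is 2/7. In practice the structured fields (classification, filing date, and so on) would narrow which patents get compared in the first place.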
But most text is written in REAL English,
not in CNL, so the problem is to fit the entire database into a vocabulary of
CNL which can be disambiguated by the observable context.
JMHO,
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of David Eddy
Sent: Saturday, March 12, 2011 9:40 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] Fwd: Re: Using controlled natural languages for ontology
Adrian -
On 2011-03-12, at 12:15 PM, Adrian Walker wrote:
We can indeed provide such a tool, and the English can optionally be open, rather than
controlled vocabulary.
Then how do you handle the commonplace situation where there are at
minimum several dozen synonyms?
social security number = {SSN, TIN, EIN, SIN, taxid, TAX-NO,
soc_sec_no, SOC-SEC-NBR, empl_ID,....}
policy number = {M0101, POL-NO, CONTRACT-ID, and 67 more}
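One common starting point for the synonym situation above, sketched under assumptions: a hand-built lookup table that maps each variant label to a canonical term. The table holds only the variants listed in this message, and the names `SYNONYMS`, `normalize`, and `canonical_terms` are hypothetical. The lookup returns a set because the same label can legitimately belong to more than one concept, and that ambiguity still has to be resolved from context.

```python
import re

# Hypothetical synonym table, seeded only with the variants listed above.
SYNONYMS = {
    "social security number": [
        "SSN", "TIN", "EIN", "SIN", "taxid", "TAX-NO",
        "soc_sec_no", "SOC-SEC-NBR", "empl_ID",
    ],
    "policy number": ["M0101", "POL-NO", "CONTRACT-ID"],
}


def normalize(label):
    """Lowercase a label and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", label.lower())


# Build the reverse index: normalized variant -> set of canonical terms.
LOOKUP = {}
for canonical, variants in SYNONYMS.items():
    for variant in [canonical] + variants:
        LOOKUP.setdefault(normalize(variant), set()).add(canonical)


def canonical_terms(label):
    """Return every canonical term this label may stand for (maybe none)."""
    return LOOKUP.get(normalize(label), set())


print(canonical_terms("soc-sec-no"))
print(canonical_terms("pol_no"))
print(canonical_terms("CUST-ID"))
```

Normalizing away case and punctuation catches trivial spelling variants such as `soc-sec-no` versus `soc_sec_no`, but opaque labels like `M0101` only resolve if someone has entered them in the table, which is the maintenance burden the question points at.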