Human-to-human language in patent documents is largely
unrestricted, at times not even syntactic. Yet those documents are
carefully written by the inventor and carefully critiqued by the PTO
examiner through at least two office actions. The claim language is debated,
reviewed, and checked against prior art, and only some small fraction of
patent applications is finally allowed to issue.
My software treats them as unrestricted
text, and I think that is by far the most feasible and effective way to treat
them. It is fairly easy to break up the text into sentences: periods,
question marks and exclamation marks detect at least 80% of the sentence
endings. False endings (especially periods in acronyms) can be handled
effectively in most of the other 20%, with a few mismarks remaining in nearly
every patent.
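
A minimal sketch of that kind of sentence splitting, in Python. It is not the
software described above; the abbreviation list and the function name
split_sentences are assumptions made for illustration. It treats a period,
question mark or exclamation mark followed by whitespace as a sentence ending,
unless the period closes a known abbreviation.

```python
import re

# Hypothetical abbreviation list (assumption); a real patent corpus needs far more.
ABBREVIATIONS = {"fig", "figs", "no", "nos", "e.g", "i.e", "etc", "u.s", "inc", "vs"}

def split_sentences(text):
    """Split on '.', '?' or '!' followed by whitespace, skipping known abbreviations."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.?!]\s+", text):
        # Word immediately before the candidate terminator.
        words = text[start:match.start()].split()
        last_word = words[-1].lower().lstrip("([\"'") if words else ""
        if text[match.start()] == "." and last_word in ABBREVIATIONS:
            continue  # probably a false ending, keep scanning
        sentences.append(text[start:match.start() + 1].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)  # whatever follows the last detected ending
    return sentences

sample = "The valve is shown in FIG. 2 of the drawings. It opens at 5 psi. Does it close? Yes!"
for sentence in split_sentences(sample):
    print(sentence)
```

Abbreviations missing from the list still produce mismarks, which matches the
few that remain in nearly every patent.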
Claim language, though unrestricted, has
some statutory words and punctuation marks which can be detected and used to
break long claim statements (well over 100 words) into elements.
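
As a hedged illustration of breaking a claim into elements, here is a short
Python sketch. The message does not list the actual statutory lexicon, so the
transitional phrases ("comprising", "consisting of", "consisting essentially
of"), the use of semicolons as element separators, and the name split_claim
are all assumptions.

```python
import re

# Assumed transitional phrases; the real lexicon is not given in the message.
TRANSITIONS = r"\b(?:comprising|consisting essentially of|consisting of)\b\s*:?"

def split_claim(claim):
    """Return (preamble, elements) for one claim given as a string."""
    parts = re.split(TRANSITIONS, claim, maxsplit=1)
    preamble = parts[0].strip(" ,")
    body = parts[1] if len(parts) > 1 else ""
    elements = []
    for chunk in body.split(";"):
        # Drop the "and" that conventionally introduces the last element.
        chunk = re.sub(r"^\s*and\b", "", chunk).strip(" .\n\t")
        if chunk:
            elements.append(chunk)
    return preamble, elements

claim = ("1. A valve assembly, comprising: a housing having an inlet; "
         "a spring seated in the housing; and a piston biased by the spring, "
         "wherein the piston seals the inlet.")
preamble, elements = split_claim(claim)
print(preamble)
for element in elements:
    print(" -", element)
```

Real claims nest "wherein" clauses and sub-elements, so this flat split is
only a starting point.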
The elements also contain rarer words that
designate the claimed materials, ideas, systems and methods. Those rarer words
are useful in tracing each claim element back to sentences in the specification
which form the disclosure of the invention.
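
One possible way to trace an element back to the specification is sketched
below in Python. It is an assumption, not the method used in the software
described here: each word shared between the element and a specification
sentence is weighted by the log of its inverse sentence frequency, so rarer
words count for more. The names tokenize and trace_element are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def trace_element(element, spec_sentences, top_n=3):
    """Return indices of the spec sentences that best match a claim element."""
    # Sentence frequency: how many spec sentences contain each word.
    sentence_freq = Counter()
    for sentence in spec_sentences:
        sentence_freq.update(set(tokenize(sentence)))
    total = len(spec_sentences)
    element_words = set(tokenize(element))
    scored = []
    for index, sentence in enumerate(spec_sentences):
        shared = element_words & set(tokenize(sentence))
        # Words found in every sentence contribute log(1) = 0.
        score = sum(math.log(total / sentence_freq[word]) for word in shared)
        if score > 0:
            scored.append((score, index))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:top_n]]

spec = [
    "The assembly includes a housing.",
    "A piston slides within the housing.",
    "The piston is biased by a helical spring.",
    "The assembly is made of steel.",
]
print(trace_element("a piston biased by the spring", spec))
```

The third sentence ranks first because it shares the rare words "biased" and
"spring" with the element.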
So my software works with mostly unrestricted
forms of English, relying only on the statutory lexicon to structure its
interpretation of the nearly unrestricted specifications, claims and abstract,
and of the more highly structured database-like columns.
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Ali H
Sent: Monday, March 18, 2013 11:34 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] Fwd: MOVED: Re: [ontology-summit] Hackathon: BACnet Ontology
Hi John,
On Mon, Mar 18, 2013 at 2:04 AM, <sowa@xxxxxxxxxxx> wrote:
I believe that it is easier to process unrestricted NL as written by
humans who are writing for other humans than it is to correct the errors
in the artificial languages written by humans who are writing for
machines.
Do you mean that it would be easier to process by machines as well? Easier to
process by whom and how?