ontolog-forum

Re: [ontolog-forum] [Corpora-List] What support should a corpus provide

To: "'Ed Lowry'" <eslowry@xxxxxxxxxxxx>, "'[ontolog-forum]'" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Rich Cooper" <rich@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 9 Aug 2014 19:09:57 -0700
Message-id: <!&!AAAAAAAAAAAYAAAAAAAAAAb3x6NyrzVKo6ReWvn+7BjCgAAAEAAAADbJmT4EK/NHkdfmQGA2Ng0BAAAAAA==@xxxxxxxxxxxxxxxxxxxxxx>

Thanks Ed; Good point. 

 

I noticed you have two patents, so I expect you are familiar with the legal concepts of invalidity through prior art, obviousness, or indefiniteness.  Those are the concepts I am trying to make transparent and efficiently comparable across patent corpora used to explore invalidity and infringement. 

 

EfP (ELK for Patents) does that with the sentences retrieved from patents, some syntactically well formed and others less so.  I don’t have the luxury of forcing users to obey a controlled-language syntax.  I have to take what is actually in the patent database at the USPTO. 

 

Each claim sentence, comprising claim elements, is displayed and analyzed in EfP, but the claims are often syntactically flawed because nothing, such as strict typing, enforces well-formedness when the patent database is compiled.  Nevertheless, these rough English claim sentences are interpreted and debated by attorneys and judges in nearly every patent dispute.  I estimate there are about twenty patent litigations per weekday in the US on average, though with wide variance. 

 

Each rough English sentence in the specification sections, taken directly from the Abstract, Description, or Claims, is a logical assertion in modus ponens form.  The nouns and verbs used in the claims are therefore literally the language of comparison and of similarity estimates among patents. 

 

I use a directed acyclic graph of objects and activities in my FOL converser because it provides a fast database, searchable with inference, for mapping situations and contexts.  I used a simple, open-syntax, English-like language for the FOL question answerer, which also functions as a theorem prover.  But this mechanical level isn’t yet incorporated into ELK for Patents, which is only at its first release.  FOL comes later, after sample corpora for invalidity questions can be gathered, stored and organized. 
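
As a rough illustration of a directed acyclic graph of objects that is "searchable with inference", here is a minimal Python sketch.  It is not the ELK implementation; the class name PartOfGraph, its methods, and the example objects are all hypothetical.  It stores part-of edges and infers indirect containment by walking the graph.

```python
from collections import defaultdict


class PartOfGraph:
    """Hypothetical DAG: an edge part -> whole means 'part is part of whole'."""

    def __init__(self):
        self.parents = defaultdict(set)

    def add_part(self, part, whole):
        # Reject edges that would make the graph cyclic.
        if part == whole or self.is_part_of(whole, part):
            raise ValueError(f"edge {part} -> {whole} would create a cycle")
        self.parents[part].add(whole)

    def is_part_of(self, part, whole):
        # Depth-first walk upward: containment is inferred transitively.
        stack, seen = [part], set()
        while stack:
            node = stack.pop()
            for parent in self.parents.get(node, ()):
                if parent == whole:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return False


graph = PartOfGraph()
graph.add_part("blade", "rotor")
graph.add_part("rotor", "turbine")
print(graph.is_part_of("blade", "turbine"))  # inferred, never stated directly
```

The same structure could hold activities and their sub-activities; only the edge label changes.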

 

I mentioned object and activity models in the patent (7,209,923), and I also plan to use them in other software after the ELK for Patents product.  But first, I have to develop large-scale processing of large patent sets to rationally narrow the huge number of initial candidate prior art patents down to a reasonable number.  That is what the market requires, IMHO. 
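
One simple way to picture that narrowing step is to rank candidates by how many terms they share with a claim.  The Python sketch below uses Jaccard overlap of word sets, which is a stand-in technique chosen for illustration and not the method used in ELK for Patents.  The stopword list, function names, and patent IDs are hypothetical.

```python
import re

# Hypothetical stopword list; a real one would be far larger.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "for", "said", "wherein", "comprising"}
)


def terms(text):
    """Lower-cased word set of a text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(claim, candidates, top_k=10):
    """Return up to top_k (patent_id, score) pairs, best match first."""
    claim_terms = terms(claim)
    scored = [(jaccard(claim_terms, terms(text)), pid) for pid, text in candidates.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(pid, score) for score, pid in scored[:top_k]]


candidates = {  # made-up IDs and texts, for illustration only
    "P1": "A hub attached to a rotor blade",
    "P2": "A method for encrypting messages",
}
print(rank_candidates("A rotor comprising a blade and a hub", candidates))
```

A word-set score like this ignores word order and synonyms, so it could only serve as a coarse first filter before closer reading of the surviving candidates.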

 

I plan to provide IDEF0 activity diagrams and object categories for instantiating the various kinds of ICOMAs in future applications, as also described in the patent, but that isn’t there yet.  I also mentioned that I have a FOL language, which will be released at some point but isn’t ready just yet.  First, I want to develop a community of users who can explain what improvements they want.  That takes people with a stake in the patent industry. 

 

-Rich

 

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2

From: Ed Lowry [mailto:eslowry@xxxxxxxxxxxx]
Sent: Saturday, August 09, 2014 6:40 PM
To: [ontolog-forum]; Rich Cooper
Subject: Re: [ontolog-forum] [Corpora-List] What support should a corpus provide to ontologists?

 

Rich
My impression is that you have missed the need for a language that allows for representing
a maximum of richly structured precise information with a minimum of complexity. 
See "Inexcusable Complexity for 40 years" at http://users.rcn.com/eslowry.
 
Ed Lowry


On 8/9/2014 8:44 PM, Rich Cooper wrote:

In developing ontologies to match corpora samples, as in learning algorithms, what kind of analysis of each document would be useful to compare one patent claim against that patent’s description, and against an arbitrary potential prior art candidate?

 

Entity recognition, with and without names or descriptions or anaphora;

Objects and activities mentioned in the claims, as compared to those mentioned in each patent;

Mereological relationships among the identified objects and activities;

Common verb signature database with identified variables and constants,

Modus ponens interpreter of signature phrases wrt the identified objects and activities,

               Logic language of FOL level, Horn clause, lexical scopes, question answering,

               Heuristic search through And/or graphs with FOL parameterization, simple algebra
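
To make the modus ponens and Horn clause items above concrete, here is a minimal Python sketch of forward chaining.  It assumes ground (variable-free) Horn clauses, so it omits the unification a FOL-level language would need, and it is not taken from any ELK code.  The function name and the example facts and rules are hypothetical.

```python
def forward_chain(facts, rules):
    """Apply modus ponens repeatedly until no new fact can be derived.

    facts: iterable of atoms (strings).
    rules: iterable of (premises, conclusion) pairs, i.e. ground Horn
           clauses of the form p1 AND p2 AND ... -> conclusion.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Modus ponens: all premises hold, so the conclusion holds.
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


rules = [  # made-up rules, for illustration only
    (("has_rotor", "has_blade"), "has_turbine_stage"),
    (("has_turbine_stage",), "is_turbomachine"),
]
derived = forward_chain({"has_rotor", "has_blade"}, rules)
print(sorted(derived))
```

Question answering over such a rule base then reduces to checking whether the queried atom is in the derived set.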

 

What have I missed?

 

The idea, or long term goal, is to build an ontology of patent claims as encountered in published patents.  If that turns out to be helpful, other document analysis tasks might benefit from the ontology so developed. 

 

-Rich

 


From: corpora-bounces@xxxxxx [mailto:corpora-bounces@xxxxxx] On Behalf Of Rich Cooper
Sent: Friday, August 08, 2014 11:12 AM
To: 'John F Sowa'; corpora@xxxxxx
Cc: '[ontolog-forum] '
Subject: [Corpora-List] What support should a corpus provide?

 

Dear Corpus Analysts and Ontologists,

 

I have just made a corpus of documents from the US Patent and Trademark Office available to corpus analysts.  The tools available now are sufficient to support attorneys, inventors, scientists, and people in similar legal and technology roles. 

 

What additional support should I provide in the software for supporting corpus analysis of selected patent document subsets?  I have a web site with extensive help and tutorial materials – I suggest starting at:

 

www.EnglishLogicKernel.com/Help/help.htm

 

to see an index of capability descriptions.  I can make available the “frequent words” and the “rare words” lists as text files, along with the patent documents in whole or in sections for data, abstract, description and claims, which are already extracted from the selected document set.  The claim tree is parsed, and the claims are separated into claim elements, all of which can be provided. 
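
For readers who want to picture the claim-element and word-list outputs, here is a small Python sketch.  It assumes that claim elements are delimited by colons and semicolons, which is a common drafting convention but not necessarily how the ELK software parses claims.  The function names and the sample claim are hypothetical.

```python
import re
from collections import Counter


def split_claim_elements(claim):
    """Split a claim sentence into elements at ':' and ';' (assumed delimiters)."""
    body = claim.strip().rstrip(".")
    parts = re.split(r"[;:]\s*(?:and\s+)?", body)
    return [p.strip() for p in parts if p.strip()]


def word_counts(texts):
    """Count lower-cased words across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


claim = "A turbine comprising: a rotor; a blade attached to the rotor; and a housing."
elements = split_claim_elements(claim)
counts = word_counts(elements)
frequent = [word for word, _ in counts.most_common(3)]
rare = sorted(word for word, n in counts.items() if n == 1)
print(elements)
print(frequent, rare)
```

Over a whole document set, the same counter would yield the "frequent words" and "rare words" lists by choosing count thresholds.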

 

Is there anything else that corpus analysts would like to see in the software?

 

Suggestions highly appreciated,

-Rich

 





 
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
 

 

--
Ed Lowry


