What you are looking for is not ontology
per se, but the discovery of patterns in textual data. You may need a
text mining capability. For example, you can read about one which I
described in my patent document at:
That document describes a method for
making sense of textual tokens in sequence, treating each token in relation to a
pattern being matched against text and data. Predicates
identify matches of each sample token with its referents in a database of text
samples intermixed with data fields.
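To make the general idea concrete, here is a minimal sketch of predicate-based token matching. It is not the method from the patent document; it only illustrates testing a sequence of predicates against consecutive tokens of each text sample. The names `tokenize`, `find_matches`, `is_word`, `is_number`, `is_identifier` and the sample statements are all hypothetical, chosen for illustration.

```python
import re
from typing import Callable, List, Tuple

# A predicate decides whether a single token fits one slot of a pattern.
Predicate = Callable[[str], bool]


def tokenize(text: str) -> List[str]:
    """Split text into runs of letters, digits and underscores (an assumed, simplistic tokenizer)."""
    return re.findall(r"[A-Za-z0-9_]+", text)


def find_matches(pattern: List[Predicate], tokens: List[str]) -> List[Tuple[int, List[str]]]:
    """Return (start index, matched tokens) for every window where each predicate accepts its token."""
    if not pattern:
        return []
    width = len(pattern)
    matches = []
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        if all(pred(tok) for pred, tok in zip(pattern, window)):
            matches.append((start, window))
    return matches


def is_word(word: str) -> Predicate:
    """Build a predicate that accepts one specific word, ignoring case."""
    return lambda tok: tok.upper() == word.upper()


def is_number(tok: str) -> bool:
    return tok.isdigit()


def is_identifier(tok: str) -> bool:
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", tok) is not None


# Hypothetical "database" of text samples, here a few COBOL-like statements.
samples = ["MOVE 100 TO CUST_BAL", "ADD 1 TO COUNTER", "MOVE ZERO TO TOTAL"]

# Pattern: the word MOVE, a numeric literal, the word TO, then an identifier.
pattern = [is_word("move"), is_number, is_word("to"), is_identifier]

for sample in samples:
    print(sample, "->", find_matches(pattern, tokenize(sample)))
```

Only the first sample satisfies every predicate in order. A real system would need far richer predicates (for instance, ones that look a token up in data fields), which this sketch does not attempt.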
You can use this kind of method to perform
the classical discovery process (also described in the paper above) on the data
to experiment, theorize, classify and explain the database of texts. These
four processes interface in various ways as shown in the diagram below from
figure 13 of that document:
At present, that technique is being
developed for a specialized vertical market: matching selected patent claims
against documents describing technical details. The goal is to match a
claim against all documents that could possibly be practicing the patent claim.
The approach used for that purpose is detailed in the patent application
If your goal is to analyze randomly chosen
documents (programs, papers, books, …) for textual patterns, you may have
to be more specific about what you need.
Rich AT EnglishLogicKernel DOT com
[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Pat Hayes
Sent: Tuesday, January 26, 2010
To: David Eddy
Subject: Re: [ontolog-forum] Context in a sentence
On Jan 25, 2010, at 8:41 PM, David Eddy wrote:
On Jan 25, 2010, at 11:16 AM, FERENC KOVACS wrote:
what MT people do not seem to understand, because they believe in Frege who
says that the sense of a sentence is derived from the sense of its constituents
I'm not clear on what you're saying here. "Frege"
conveys no information or context for me.
What I'm GUESSING you're saying is what I think is a significant divide
in the MT (machine translation) community.
In 2006, on a lark, I attended an MT conference here in Cambridge simply because I could. Other
than being amused by the Russian-to-English translation fumblings in the early
1960s, I have absolutely nothing to do with MT other than general intellectual
In the opening meeting the out-going president mentioned in his opening
remarks the long-standing dispute/battle/squabble/wrangle over the basic
(1) the entire meaning of a message is self-contained in said message,
(2) the complete meaning of a message could depend on contextual
information OUTSIDE of the message.
I couldn't believe my ears, since I could not then & can not now
believe in #1. It's just not a world I've experienced. Is #1
possible? Yes. How much of the time does #1 happen? I'd say
I double checked to be sure I correctly heard what I thought he'd said
& he confirmed the above two decidedly opposing points of view.
Reading the NY Times is not the problem I'm interested in. Those
are documents written by humans, edited by humans for readability &
intended for more or less widespread human consumption.
I want something--MT? Ontology support?--that can read Fortran, Jovial,
COBOL, Java, PHP, Ruby, C, etc. (oops... that's a computer language) documents
& make (more) sense out of said documents.
More sense for who or what?
These are textual artifacts (therefore "documents"?) which
may or may not be written by humans, they're decidedly NOT edited for
readability, and they are really not intended for human consumption.
Well, most of them were certainly designed with this goal in mind.
Programs in Fortran, COBOL, etc. are written and read by programmers, who are
arguably human beings of a kind.
AND much of the context in software code has been entirely stripped
away. So much to most of the context is external to the message/document
I'm trying to make sense of.
AND... not to belabor the obvious, these documents can be stunningly
devoid of any sort of formal, mathematical logic since they're dealing with
regulations, laws, & business practices which often defy logic. If a
law states that under thus & such circumstances 1 + 1 = 3, then that's
the way it is.
But the computer will still evaluate it to false :-)
Is "ontology" going to help deal with my problem, or am I
peering down the wrong rabbit hole?
I think, unfortunately, the latter.