Hi John,
My comments below are mixed with yours,
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2
-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F. Sowa
Sent: Friday, September 03, 2010 12:58 PM
To: ontolog-forum@xxxxxxxxxxxxxxxx
Subject: Re: [ontolog-forum] Patent application for using a formal ontology
in NLP
In my previous note about the patent by Werner
Ceusters, et al., I
didn't go into detail about the patent description,
which has a very
lengthy description of formal ontologies and how they
are used.
I received an offline note, saying that the
description was very
dangerous, because the so-called inventors submitted
another patent
application as a "continuation" of the
previous patent. Although
the continuation has only one fairly general claim,
the "inventors"
could add any claims they wish up until the time that
the patent
is granted.
He can file any claim he wants, but he will only
get claims patented that withstand the scrutiny of the examination
process. New claims, when added, have to be negotiated with the examining
staff, which includes an examiner (not so well paid; they don't usually stay in
the job for very long) and the examiner's supervisor (a little better paid,
either a long-staying examiner who got promoted or a career PTO employee). Many
of those are bachelor-level physicists, mathematicians and engineers who are in
grad school studying to be patent lawyers.
At the end of this note, I copied the presentation or
"teaching"
from that patent, which describes how they use the
formal ontology.
There is nothing new in it. All of it has been
published and
implemented many times over during the past half
century. But the
patent examiners didn't know that. They granted
the patent, and
it's quite likely that they will approve this
"continuation".
If it has been granted, then the examiner
and her boss decided that the application met the legal specification for not
being "anticipated" by prior art, and for not being
"obvious" due to the combination of at least two concepts not
previously found to be taught in the literature. Usually, the patent
examining process is very thorough, but in the end, it comes down to those two
people and their dedication to finding the right decision on that patent. Or
not.
For a patent application, the US patent law allows the "inventors"
to add any new claims they please to the application
up until the
date the patent is granted. Since this
application has just one
claim, that indicates the "inventors" plan
to stuff the application
with many more claims just before it is granted.
No, your offline partner is fantasizing
the worst case. Every addition of new claim material requires another two
or more office actions by the examiners to review the new claims. Normally,
claims go through several iterations before the applicant and the examiners can
reach agreement. If the examiner believes the patent is either
"anticipated" (prior art exists) or "obvious" (a
combination of at least two new concepts), then she will get kudos from the
patent office for rejecting the claims and/or the entire specification.
Furthermore, the specification has to
teach the material which is stated so simply in any new claims. If the
spec gets changed in any way during the subsequent office actions, the filing
date is updated to the changed date, and any prior art that has been published
prior to the changed date is now anticipatory to the new material.
I suggest that readers of Ontolog Forum look at the
description
quoted below with one thought in mind: "Is
this similar to anything
I have done or plan to do with a formal
ontology." If so, these
"inventors" (or any company they assign the
patent to) could sue you.
If any claim by the Ceusters inventors is
not fully taught by the self-contained specification or prior art publications
referenced therein, then the specification is considered to not be
"enabling", meaning that a "person of ordinary skill in the art"
(PHOSITA or POSITA) isn't considered able to learn the teaching from the patent
materials.
But in the end, the judge and the jury
make the decision during litigation whether the accused product or method
infringes the patent. That usually leaves a much wider credibility gap
than the examining process. Attorneys routinely tell me that they can't
really predict how the judge and/or jury will rule until they do so. So
the great majority of litigation projects end in settlement.
-Rich
John Sowa
________________________________________________________________________
http://www.faqs.org/patents/app/20090259459
The Formal Ontology
[0038] The formal ontology according to the current
invention comprises
a plurality of concepts one part of them being
independent of a specific
language, the other part being those concepts that
explain the
relationships between language-independent concepts
and language as a
medium of communication. By independent of language it
is meant that the
concepts do not depend on a particular language to be
given a definition
within the system. For example, in English, the word
"dog" is a label
applied to the concept of a particular animal. In
other languages the
same concept may be labeled with a different word,
such as "Hund" in
German, "cane" in Italian, or
"perro" in Spanish. In reality, regardless
of the label used in a particular language, the
concept of the animal
remains reasonably constant. The concept is therefore
said to be
independent of a specific language. Similarly, in the
domain ontology
according to the current invention, the concept for
this particular
animal is not dependent on a particular language. By
keeping concepts
independent of a specific language, the system
according to the current
invention can link concepts contained in the formal
ontology to terms in
more than one language.
[0039] However, although the concepts are independent
of any specific
language (such as English, French, . . . ), in the
present invention
they are not represented as being independent of
language as a medium of
communication. The second part of the formal ontology,
the linguistic
ontology, contains concepts about how humans interpret
language. For
example, the linguistic ontology according to the
current invention
contains the concept labeled "dispositive
doing", which as a real world
object relates to instances of an actor doing
something to an actee. The
concept is independent of a specific language because
the notion of
actor and actee in the context of the real world
object, an action, is
common to all languages. However, the concept is not
totally independent
of language in that the concept governs how the relationship
between the
actor and actee is understood by human beings.
For example, in the sentence
[0040] "The doctor treated the patient."
it is understood in language that the action
"treated" has an actor
"doctor" and an actee "patient".
That is, in the real world human beings
understand that doctors treat patients, and patients
don't treat
doctors. The linguistic ontology applies this
understanding to the real
world object "treatment".
[0041] Thus, the concepts that are contained in the
formal ontology are
of two types generally. The first type of concept
relates to real world
objects that are recognized by human beings as
metaphysical instances.
These concepts comprise physical entities, procedures,
ideas, etc and
are contained in the domain ontology. The second type
of concept relates
to how human beings understand language and allows the
identification of
real world instances. That is, how human beings
understand the
interactions of real world objects represented by the
concepts in the
domain ontology.
[0042] The concepts that are contained in the formal
ontology will
depend on the knowledge area that the ontology is to
be applied to, as
well as on the principles according to which human
languages function
independent of the knowledge area. The domain ontology
may contain
concepts comprising general knowledge about the world,
or may be limited
to a specific knowledge area of interest to a user.
Similarly, the
linguistic ontology may define very broad rules about
how language
functions, or it may define very narrow rules to limit
the relationships
that can exist between concepts in the domain
ontology. In a preferred
embodiment of the system according to the current
invention, the
concepts contained in the domain ontology are limited
to the knowledge
area of medical concepts complemented by a linguistic
ontology
containing the concepts required to understand how
natural language
functions, and how humans deal with natural language.
However,
ontologies built with concepts from other knowledge
areas can be created
with equal success.
[0043] By allowing the concepts in the formal ontology
to remain
independent of specific language, the system according
to the current
invention allows documents in a variety of languages
to be indexed and
searched independent of the language(s) known by the
system user.
According to a preferred embodiment of the invention,
the concepts in
the formal ontology are tagged with labels in English
to allow easy
maintenance of the formal ontology by a user. However,
the labels in
English are for ease of use in maintaining the formal
ontology only and
do not contribute to the functioning of the system in
indexing or
retrieval of documents. The concepts in the formal
ontology can be
alternatively labeled in Dutch, German, French,
Italian or any other
language desired by the user. Alternatively, the
concepts may be labeled
using a coding system that is completely independent
of language, such
as ICD-9 or ICD-10.
[0044] The basic architecture of the formal ontology
of the current
invention is a directed graph, i.e. a hierarchical
structure that allows
multiple parents. Referring to FIG. 2, an example of
the hierarchical
structure is shown. In the hierarchy shown in FIG. 2,
a primary node
comprises a single primary concept. In the example
shown, the single
primary concept is the concept "City". The
primary concept has as direct
children, narrower related concepts, such as "European City" and "North
American City". Each of the child concepts further has one or more
child concepts that further narrow the primary concept. For example, the
concept of "European City" may be narrowed to "French City", "German City"
and "Belgian City". The concept of "North American City" may be narrowed
to "Canadian City" and "U.S. City".
[0045] The hierarchical structure of the formal
ontology, creates the
most basic relationships between concepts contained in
the formal
ontology, that of parent and child in a strict formal
subsumption
interpretation, and that of siblings. The formal
subsumption
interpretation guarantees that all characteristics
described of a
parent, apply to all of its children without any
exception. Referring
again to the example, the concept of "City",
which occupies the highest
level of the hierarchy is the parent concept to "European City" and
"North American City". By reciprocal relationship, the concepts of
"European City" and "North American City" are the children of the
concept "City". Further, the concept of "European City" is the parent of
the concept "German City", etc. Further, the concept of "City" is the
grandparent concept to the concept "German City", etc. Still further,
the concepts of "European City" and "North American City" have the
relationship of siblings since they share a common parent.
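The parent, child, grandparent and sibling relationships described above can be illustrated with a minimal Python sketch. The patent text does not prescribe a data structure; the dictionaries and the function names (add_child, siblings, ancestors) are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory representation; not taken from the patent.
parents = defaultdict(set)   # concept -> direct parents
children = defaultdict(set)  # concept -> direct children

def add_child(parent, child):
    """Record a direct parent/child (subsumption) edge."""
    parents[child].add(parent)
    children[parent].add(child)

# Edges from the FIG. 2 example as described in the text.
for parent, child in [
    ("City", "European City"),
    ("City", "North American City"),
    ("European City", "French City"),
    ("European City", "German City"),
    ("European City", "Belgian City"),
    ("North American City", "Canadian City"),
    ("North American City", "U.S. City"),
]:
    add_child(parent, child)

def siblings(concept):
    """Concepts that share at least one direct parent with `concept`."""
    found = set()
    for p in parents[concept]:
        found |= children[p]
    found.discard(concept)
    return found

def ancestors(concept):
    """All concepts reachable by following parent edges upward."""
    seen = set()
    stack = list(parents[concept])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen
```

With these edges, siblings("European City") yields only "North American City", and ancestors("German City") contains both the parent "European City" and the grandparent "City".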
[0046] Regardless of the knowledge area of the
concepts contained in the
formal ontology according to the current invention, a
similar
hierarchical structure with parent/child and sibling
relationships
exists. This is true of both the general world
concepts in the domain
ontology and the linguistic concepts in the linguistic
ontology. In a
preferred embodiment of the invention, the highest
level of the
hierarchy is occupied by a primary concept with a
label such as "Domain
Entity". According to the preferred embodiment of
the invention, the
primary concept of "Domain Entity"
encompasses all real things whether
they be physical entities, states, ideas, etc. The primary
concept may
then preferably be sub-divided into physical entities,
states, ideas,
linguistic concepts, etc. at the next lower level of
the ontology.
[0047] It should be apparent that, because of the
hierarchical structure of
the formal ontology, all concepts in the ontology
can be traced
back to a single related concept at the highest level
of the ontology,
such as "Domain Entity". On the most basic
level therefore, the degree
of relatedness between two concepts can be measured by
how many steps in
the hierarchy must be traversed to find a common
ancestor for the two
concepts. Again referring to the example, the concepts
of "Brussels" and "Antwerp" are siblings since they share a common
parent, and are therefore closely related to each other within the
hierarchy. By contrast, one must traverse the hierarchy back to the
primary concept of "City" to find a common ancestor for the concepts of
"Brussels" and "Chicago". Since the concepts of "Brussels" and "Chicago"
share only a great-grandparent concept in common, they are less closely
related within the context of the hierarchy than are the concepts
"Brussels" and "Antwerp".
[0048] It should further be recognized that a single
concept can have
more than one direct parent. For example, in addition
to the child
concepts shown in FIG. 2, the concept "City" may have a child concept
"Capital City". In this case "Paris", "Berlin" and "Brussels" would be
children of the concept "Capital City" in addition to being children of
"French City", "German City" and "Belgian City" respectively. By
allowing a concept to have multiple parent concepts, the degree of
relatedness between two concepts within the hierarchy may vary based on
the context of the relationship. As can be seen from the examples,
"Paris", "Berlin" and "Brussels" are more closely related in the context
of "Capital City" than in the context of "European City". The only
limitation on the structure of the hierarchy is that a
concept cannot
have itself as an ancestor, which would lead to a
circular reference of
a concept to itself.
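The step-counting measure of relatedness, the effect of multiple parents, and the no-self-ancestor restriction can be sketched together in Python. This is only one possible reading: the text says relatedness is measured by how many steps must be traversed to reach a common ancestor, and summing the upward steps from both concepts is an assumption, as are all function names. Placing "Antwerp" and "Chicago" under "Belgian City" and "U.S. City" is inferred from the example.

```python
from collections import deque

parents = {}  # concept -> set of direct parents (hypothetical structure)

def ancestor_depths(concept):
    """Map the concept and each ancestor to the fewest upward steps needed."""
    depths = {concept: 0}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in depths:
                depths[parent] = depths[node] + 1
                queue.append(parent)
    return depths

def add_parent(child, parent):
    """Add a parent edge, refusing one that would make a concept its own ancestor."""
    if child in ancestor_depths(parent):
        raise ValueError(f"linking {child!r} under {parent!r} would create a cycle")
    parents.setdefault(child, set()).add(parent)

def steps_to_common_ancestor(a, b):
    """Fewest total upward steps from a and b to any shared ancestor, or None."""
    da, db = ancestor_depths(a), ancestor_depths(b)
    shared = da.keys() & db.keys()
    return min((da[c] + db[c] for c in shared), default=None)

for child, parent in [
    ("European City", "City"), ("North American City", "City"),
    ("Capital City", "City"),
    ("French City", "European City"), ("Belgian City", "European City"),
    ("U.S. City", "North American City"),
    ("Paris", "French City"), ("Paris", "Capital City"),
    ("Brussels", "Belgian City"), ("Brussels", "Capital City"),
    ("Antwerp", "Belgian City"), ("Chicago", "U.S. City"),
]:
    add_parent(child, parent)
```

In this sketch "Brussels" and "Antwerp" meet after two steps at "Belgian City", and "Paris" and "Brussels" meet after two steps at "Capital City" rather than four via "European City". An attempt such as add_parent("City", "Paris") raises ValueError because "City" is already an ancestor of "Paris".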
[0049] As stated above, the most basic relationship
between concepts in
the formal ontology according to the current invention
is the link
created by the parent/child relationship. However, the
relationships
that can exist between two concepts in the formal
ontology according to
the present invention are not limited to that of parent
and child. By
allowing other relationships to exist, the richness of
the knowledge
contained in the formal ontology is greatly enhanced,
while limiting the
overall size of the ontology. For example, in reality
the medical
concepts of "brain",
"inflammation" and "meningitis" are quite closely
related. However, the concept "brain" refers
to a body part, whereas
"inflammation" is a symptom and
"meningitis" is a disease. If a formal
ontology were limited to parent/child relationships as
a measure of the
relatedness of concepts it is likely that the degree
of relatedness
between these three concepts within the ontology would
potentially be
very low. This is because a large number of
parent/child relationships
would likely have to be traversed before a common
ancestor was found for
all three concepts. This would of course lead to an
inaccurate
reflection of reality. A potential solution to this
problem would be to
construct a formal ontology with sufficient detail to
narrow the gap
between these concepts in the hierarchy. For example,
the concepts of
the body part "brain" and the symptom
"inflammation" could be made
children of the concept of the disease
"meningitis". However, in order
to provide an accurate reflection of reality it would
be necessary to
construct similar relationships between
"brain" and "inflammation" and
every other concept that they are related to. Since
the concepts of
"brain" and "inflammation" would
most likely be attached to a large
number of concepts, this would result in a large
number of such
parent/child relationships. Further, similar parent
child relationships
would have to be built for every concept in the
ontology. This would
result in an unmanageably large ontology. In addition,
such a solution
would violate the formal subsumption nature of the
parent/child
relationships exploited in this invention.
[0050] The current system solves this problem by
providing a large
number of link types for linking concepts within the
formal ontology.
The link types within the formal ontology according to
the current
invention are used to define relationships between
concepts. For
example, in reality the concept of "brain"
is linked to the concept
"meningitis" in that the brain is the
location for the disease
meningitis. Using the link types available in the
formal ontology, a
user can create a link between the concepts
"brain" and "meningitis" in
the formal ontology so that this conceptual link is
also recognized by
the system. A user may further create a link between
the concept
"inflammation" and the concept
"meningitis" in the formal ontology to
indicate that inflammation is a symptom of meningitis.
Again, this
allows the system to recognize a conceptual link that
exists in reality.
Furthermore, by linking the concepts "brain"
and "inflammation" to the
concept "meningitis", a conceptual link
between the brain and
inflammation is created. That is, the link through the
concept
"meningitis" shortens the distance between
"brain" and "inflammation"
within the ontology. By shortening the distance
between these two
concepts, the conceptual linkage between the two
concepts in the
ontology is increased.
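The claim that typed links shorten the distance between concepts can be checked with a small breadth-first search. This is a sketch under stated assumptions: the intermediate concepts ("body part", "symptom", "disease") sitting directly under "Domain Entity", the "IS-A" label, and the link type name HAS-SYMPTOM are all hypothetical; only HAS-LOCATION appears in the text.

```python
from collections import deque

def shortest_distance(edges, start, goal):
    """Breadth-first search over undirected edges; returns hop count or None."""
    neighbours = {}
    for a, _link_type, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Assumed parent/child hierarchy for the three medical concepts.
hierarchy = [
    ("brain", "IS-A", "body part"),
    ("inflammation", "IS-A", "symptom"),
    ("meningitis", "IS-A", "disease"),
    ("body part", "IS-A", "Domain Entity"),
    ("symptom", "IS-A", "Domain Entity"),
    ("disease", "IS-A", "Domain Entity"),
]
typed_links = [
    ("meningitis", "HAS-LOCATION", "brain"),
    ("meningitis", "HAS-SYMPTOM", "inflammation"),  # hypothetical link type name
]

before = shortest_distance(hierarchy, "brain", "inflammation")
after = shortest_distance(hierarchy + typed_links, "brain", "inflammation")
```

Under these assumptions the hierarchy alone separates "brain" and "inflammation" by four hops, while the two links through "meningitis" reduce that to two.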
[0051] An advantage of this type of linking of
concepts is that it
allows for more accurate indexing of documents because
the deep meaning
of the text can be pulled out. For example, a text
that contains a
discussion of meningitis may contain very few
instances of the exact
term "meningitis". However, the document may
contain a significant
number of references to inflammation in the brain. A
standard indexing
technique that looks only for the specific concept
"meningitis" may rank
such a document of very low relevance, while in
reality it may have a
very high relevance to the subject. In contrast, the
system according to
the current invention will recognize the linkage
between the concepts of
"brain", "inflammation",
"meningitis" and as a result rank the document
with a more accurate relevance to the subject.
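One way to picture the ranking effect is a score that counts mentions of linked concepts as partial evidence for the query concept. The patent text does not give a scoring formula; the half-weight for linked concepts and the names LINKED and relevance are assumptions made purely for illustration.

```python
# Concepts linked to each query concept in the ontology (illustrative data).
LINKED = {"meningitis": {"brain", "inflammation"}}

def relevance(query_concept, doc_concepts, link_weight=0.5):
    """Count direct mentions fully and mentions of linked concepts at link_weight."""
    related = LINKED.get(query_concept, set())
    score = 0.0
    for concept in doc_concepts:
        if concept == query_concept:
            score += 1.0
        elif concept in related:
            score += link_weight
    return score

# A document that names meningitis once but discusses brain inflammation often.
doc = ["brain", "inflammation", "brain", "meningitis"]
linked_score = relevance("meningitis", doc)
exact_only_score = relevance("meningitis", doc, link_weight=0.0)
```

With link_weight set to 0.0 the function behaves like an exact-match index, so the same document scores lower than it does when linked concepts are counted.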
[0052] The number of link types that can be provided
for an ontology is
only limited by the number of such relationships that
can exist in
reality. According to a preferred embodiment of the
invention, a user
can use the available concepts and link types to build
criteria and
concept criteria. A criteria according to this
embodiment is comprised
of a concept with an associated link type. For
example, the link type
HAS-LOCATION can be associated with the concept BRAIN
to produce the
criteria [HAS-LOCATION] [BRAIN]. This criteria can
further be used to
define a property of another concept as part of a
concept criteria. For
example [MENINGITIS] [HAS-LOCATION BRAIN]. The
association of the
criteria [HAS-LOCATION] [BRAIN] to the concept
MENINGITIS provides a
partial definition of the concept meningitis.
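The two building blocks named here can be modelled as small immutable records. This is a hedged sketch: the class names Criterion and ConceptCriterion and the string rendering are assumptions, chosen to mirror the bracket notation used in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Criterion:
    """A link type paired with a target concept, e.g. [HAS-LOCATION] [BRAIN]."""
    link_type: str
    concept: str

@dataclass(frozen=True)
class ConceptCriterion:
    """A concept qualified by one criterion, e.g. [MENINGITIS] [HAS-LOCATION BRAIN]."""
    concept: str
    criterion: Criterion

    def __str__(self):
        return f"[{self.concept}] [{self.criterion.link_type} {self.criterion.concept}]"

has_location_brain = Criterion("HAS-LOCATION", "BRAIN")
meningitis_def = ConceptCriterion("MENINGITIS", has_location_brain)
```

Because both records are frozen, they can be stored in sets or used as dictionary keys, which is convenient if a concept accumulates several criteria as its partial definition.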
[0053] In a preferred embodiment of the invention,
each link type from a
first concept to a second concept has a complementary
reciprocal or
contra link type that can be established from the
second concept to the
first concept. For example in reality, when two
objects "A" and "B" are
close to each other, we say that "A" is
close to "B" and "B" is close to
"A". In such case where a relation operates
bi-directionally, the
ontology is constructed by placing the same link type
twice, from "A" to
"B" and from "B" to "A".
E.g.: A IS-NEAR-OF B, B IS-NEAR-OF A.
[0054] A second case of paired link types according to
this embodiment
is used to describe an inverse relationship. For
example, where concept
"A" performs some action on "B",
"A" is defined as acting on "B" whereas
"B" is defined as being acted on by
"A". E.g.: A HAS-ACTOR B <-> B
IS-ACTOR-OF A; or A IS-SPATIAL-PART-OF B <-> B
HAS-SPATIAL-PART A. The
link types can be declared each other's inverse by use
of either CONTRA
or AUTOCONTRA attributes that can be assigned to them.
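A minimal sketch of paired link types, assuming a simple lookup table of inverses. The table INVERSE and the function add_link are hypothetical names. The sketch stores the reciprocal link unconditionally and deliberately ignores how many instances the reciprocal holds for.

```python
# Inverse pairs taken from the examples in the text; a symmetric type is its own inverse.
INVERSE = {
    "HAS-ACTOR": "IS-ACTOR-OF",
    "IS-SPATIAL-PART-OF": "HAS-SPATIAL-PART",
    "IS-NEAR-OF": "IS-NEAR-OF",
}
# Make the table usable in both directions.
INVERSE.update({v: k for k, v in list(INVERSE.items())})

links = set()  # (source, link_type, target) triples

def add_link(source, link_type, target):
    """Store a link together with its reciprocal link in the opposite direction."""
    links.add((source, link_type, target))
    links.add((target, INVERSE[link_type], source))

add_link("A", "IS-NEAR-OF", "B")
add_link("treatment", "HAS-ACTOR", "physician")
```

After these two calls the set holds four triples, including ("B", "IS-NEAR-OF", "A") and ("physician", "IS-ACTOR-OF", "treatment").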
[0055] The operation of link types and reciprocation
will now be
explained by means of example. Prior to the
explanation, it is necessary
to define what is meant herein by the term
"instance". As used herein,
the term "instance" refers to an individual
manifestation or embodiment
of a concept in the real world (i.e. metaphysical
instances). By
example, for the concept of the disease meningitis, an
individual
diagnosed case of meningitis contracted by a specific
person would be an
occurrence or "instance" of the disease.
[0056] Now if we declare in the formal ontology
"MENINGITIS" IS-CAUSE-OF
"INFLAMMATION IN THE BRAIN", then it means that
all metaphysical
instances of meningitis cause inflammation in the
brain. However, this
does not provide any reciprocal information about
metaphysical instances
of inflammation in the brain.
[0057] By contrast, if we declared "INFLAMMATION
IN THE BRAIN" HAS-CAUSE
"MENINGITIS", then it means that all
metaphysical instances of
inflammation in the brain are caused by meningitis.
Here again however,
we are provided with no information about metaphysical
instances of
meningitis.
[0058] By declaring a CONTRA, such as
"MENINGITIS" IS-CAUSE-OF CONTRA
HAS-CAUSE "INFLAMMATION IN THE BRAIN", the
system according to the
current invention provides information about all
instances of
meningitis: all instances of meningitis cause
inflammation in the brain.
By declaring a CONTRA, the system also provides
information about some
instances of inflammation in the brain: some instances
of inflammation
in the brain are caused by meningitis.
[0059] By declaring an AUTOCONTRA, such as
"MENINGITIS" IS-CAUSE-OF
AUTOCONTRA HAS-CAUSE "INFLAMMATION IN THE
BRAIN", the system according
to the current invention provides information about
all instances of
meningitis and all instances of inflammation in the
brain: all instances
of meningitis cause inflammation in the brain AND all
instances of
inflammation in the brain are caused by meningitis.
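The difference between a plain declaration, a CONTRA and an AUTOCONTRA comes down to how the reverse direction is quantified. The sketch below spells out the readings given in paragraphs [0056] to [0059]; the function name readings and the mode label "PLAIN" are assumptions, since the text gives no name for a declaration without CONTRA or AUTOCONTRA.

```python
def readings(source, link_type, inverse_type, target, mode):
    """Return (forward, reverse) quantified readings for a declared link.

    mode is "PLAIN", "CONTRA", or "AUTOCONTRA"; the reverse reading is None
    when the declaration says nothing about instances of the target.
    """
    forward = f"all {source} {link_type} {target}"
    if mode == "PLAIN":
        return forward, None
    if mode == "CONTRA":
        return forward, f"some {target} {inverse_type} {source}"
    if mode == "AUTOCONTRA":
        return forward, f"all {target} {inverse_type} {source}"
    raise ValueError(f"unknown mode: {mode}")

contra = readings("MENINGITIS", "IS-CAUSE-OF", "HAS-CAUSE",
                  "INFLAMMATION IN THE BRAIN", "CONTRA")
autocontra = readings("MENINGITIS", "IS-CAUSE-OF", "HAS-CAUSE",
                      "INFLAMMATION IN THE BRAIN", "AUTOCONTRA")
```

The forward reading is universal in every mode; only the reverse reading changes from absent, to "some", to "all".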
[0060] By using the various link types, and CONTRA and
AUTOCONTRA
declarations to link concepts within the ontology, a
user can build
definitions of the concepts in the ontology, while
giving it a precise
semantics as to how these declarations are to be
applied by interpreting
events in the world, this however without the
computational burdens
related to full first order logic.
[0061] As stated above, creating a link between two
concepts defines a
relationship between the two concepts. It also defines
something about
at least one of the concepts itself, such as
"brain" is the location of
"meningitis", or "inflammation" is
a symptom of "meningitis". By
creating these two links, a user enriches the
knowledge contained in the
ontology by providing a definition for the concept
"meningitis" based on
its interactions with other concepts in the ontology.
In a preferred
embodiment of the invention, a full definition can be
created for each
concept in the formal ontology. The full definition as
it is used here
means the set of necessary and sufficient links that a
concept has to
identify occurrences in the real world as instances of
the concept. In
other words: the set of all links of a given concept
in the ontology
defines what is true for all occurrences in the real
world that are
instances of the concept. The full definitions
assigned to a concept in
the ontology allow occurrences in the real world to be
recognized as
instances of the particular concept.
[0062] A further feature of the formal ontology
provided according to
the invention is the subsumption of child concepts
within parent
concepts, which results in full inheritability of
links from parent to
child concepts. That is, a child concept will
automatically be linked to
all concepts that its parent is linked to. For
example, the concept
"meningitis" may have the child concepts of
"viral meningitis" and
"bacterial meningitis", both of which are
more specific concepts
subsumed within the concept "meningitis".
Thus the link established
between the concept of "meningitis" and
"brain" will automatically be
established between the concept of "viral
meningitis" and "brain", and
"bacterial meningitis" and
"brain". Therefore, "viral meningitis" and
"bacterial meningitis" will inherit the
definition of the parent concept
"meningitis", but will be further defined
based on the further links
that each has to other concepts. In this way, the
system according to
the current invention can recognize each instance of
either "viral
meningitis" or "bacterial meningitis"
as an instance of "meningitis",
but will not necessarily recognize each instance of
"meningitis" as
"viral meningitis" or "bacterial
meningitis". This feature provides the
advantage of allowing a user to propagate a link to
the progeny of a
concept by establishing a single link.
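Link inheritance through subsumption can be sketched as a walk up the parent chain. The dictionaries and function names are assumptions, and the HAS-CAUSE link on "viral meningitis" is a hypothetical extra link added only to show a child being "further defined".

```python
PARENTS = {
    "viral meningitis": ["meningitis"],
    "bacterial meningitis": ["meningitis"],
    "meningitis": [],
}
# Links declared directly on a concept: concept -> set of (link_type, target)
DECLARED = {
    "meningitis": {("HAS-LOCATION", "brain")},
    "viral meningitis": {("HAS-CAUSE", "virus")},  # hypothetical extra link
}

def effective_links(concept):
    """Links declared on the concept plus everything inherited from its ancestors."""
    result = set(DECLARED.get(concept, ()))
    for parent in PARENTS.get(concept, []):
        result |= effective_links(parent)
    return result

def is_a(concept, ancestor):
    """True if `ancestor` is the concept itself or one of its ancestors."""
    if concept == ancestor:
        return True
    return any(is_a(p, ancestor) for p in PARENTS.get(concept, []))
```

A single link on "meningitis" therefore shows up for both child concepts, while is_a only holds in the child-to-parent direction, matching the statement that not every instance of "meningitis" is recognized as one of its children.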
[0063] As stated above, the link types provided as part
of the formal
ontology can be used by a user to define relationships
between two
concepts. At the same time the link types can provide
full definitions
of the concepts in the formal ontology. However, it is
recognized in
reality that, some relationships between concepts do
not make sense. For
example, it is recognized in reality that the disease
"meningitis"
cannot have "inflammation" as a
location. In computerized systems
however, such nonsensical relationships are not
automatically recognized
unless you make the system work under a "closed
world assumption" (i.e.
what is not known, is not allowed), or if it is
specified explicitly
what is not allowed. It is necessary to teach a
natural language
understanding system what are and are not appropriate
relationships
between concepts.
[0064] The system according to the current invention
solves this problem
by providing the linguistic ontology as part of the
formal ontology. The
linguistic ontology contains the rules about how
language works as well
as the principles that the human mind adheres to when
representing
reality at the conscious level of a human being.
[0065] In the linguistic ontology provided according
to the current
invention, rules are established regarding what
relationships can exist
between concepts on the basis of how these relations
are expressed in
language in general (though independent of a specific
language). For
example, a rule may be established that the concept
"disease" in the
formal ontology cannot be linked to the concept
"symptom" in the formal
ontology as a location. Because "meningitis"
and "inflammation" are
children of "disease" and
"symptom" respectively in the hierarchy, the
rule prohibiting this link would be inherited by them.
As a result, the
definition of inflammation as a location for
meningitis could not exist
in the formal ontology.
[0066] In one embodiment, the linguistic ontology may
be set up so that
there is an absolute prohibition against using certain
link types to
link certain concepts. In the example above, a user
would not be able to
create a link indicating the concept
"inflammation" as a location for
the concept "meningitis". Alternatively, the
linguistic ontology could
be set up such that a verification by the user will be
required when a
prohibited link is proposed. In this embodiment, the
user still has the
option to create the link.
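The inherited prohibition and the two embodiments (absolute refusal versus user verification) can be sketched as follows. All names are hypothetical, and representing a rule as a (source, link type, target) triple is an assumption; the text only says that such rules exist and are inherited by children.

```python
PARENTS = {
    "meningitis": ["disease"],
    "inflammation": ["symptom"],
    "brain": ["body part"],
}
# (source concept, link type, target concept) combinations that are not allowed
PROHIBITED = {("disease", "HAS-LOCATION", "symptom")}

def lineage(concept):
    """The concept followed by all of its ancestors."""
    out = [concept]
    for parent in PARENTS.get(concept, []):
        out.extend(lineage(parent))
    return out

def is_prohibited(source, link_type, target):
    """A rule stated on any ancestor pair is inherited by the descendants."""
    return any(
        (s, link_type, t) in PROHIBITED
        for s in lineage(source)
        for t in lineage(target)
    )

def add_link(links, source, link_type, target, confirm=None):
    """Add a link; with confirm=None prohibited links are rejected outright,
    otherwise confirm() is asked and may override the prohibition."""
    if is_prohibited(source, link_type, target):
        if confirm is None or not confirm():
            return False
    links.add((source, link_type, target))
    return True
```

Calling add_link without a confirm callback models the absolute prohibition; passing a callback models the embodiment in which the user is asked to verify a prohibited link and may still create it.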
[0067] The rules established in the linguistic ontology
may be as broad
or restricting as required for a given application or
knowledge area.
[0068] A second application of the linguistic ontology
is that it
restricts the possible representations of reality to
those that are
closest to the way reality is talked about by means of
language. For
example, in a shooting event, there are a number of
participants such as
the shooter, the deer, the bullet, the gun, etc. There
is only that one
specific event that happened (the shooting) in a
precise way (the deer
hit by the bullet shot from the gun by the shooter),
but there are
different ways to represent it formally: it can be
represented from the
viewpoint of the deer, the bullet, the shooter, etc.
The present
invention exploits the way humans usually talk about
such an event,
giving a central place to those aspects that are put
central by the
story teller.
[0069] A third application follows from the second in
that sometimes
single events are described as distinguishable
entities by means of
natural language. An example is the notion of baby
brought on earth,
wherein the view of "birth" (the baby's
viewpoint) is equally preferred
in medical language usage as that of
"parturition" (the mother's point
of view) or "delivery" (the physician's
point of view).
[0070] The domain and linguistic ontologies have thus
far been spoken of
as being separate entities within the formal ontology.
However, in the
current invention they are connected within the formal
ontology in that
a concept may have both a domain and a linguistic
concept as a direct
parent. For example, the linguistic concept of
"dispositive doing" may
have as a child the concept of a
"treatment", wherein a "treatment" as
an action has a physician as actor and a patient or
disease as actee. At
the same time, "treatment" may descend from
the parent concept
"healthcare procedure" in the domain
ontology. Within the domain
ontology, the concept of a "treatment" is
defined as a real world
object, but this definition cannot be used to relate
the object to other
real world objects. The linguistic ontology defines
how the real world
object actively relates to other concepts and relates
other concepts in
language.
[0071] As indicated above, the formal ontology
according to the current
system is independent of any specific language,
although not independent
of language altogether. However, free text documents
are written in
specific languages. In order to be useful for indexing
free text
documents it is necessary to relate the language
independent concepts to
specific languages.
[0072] The system according to the current invention
accomplishes this
by providing a lexicon of terms that are linked to the
formal ontology.
The terms contained in the lexicon may comprise single
words or
multi-word units that correspond to concepts, criteria
and concept
criteria in the formal ontology. Further, each term in
the lexicon may
be linked to more than one concept, criteria or
concept criteria in the
formal ontology, which allows for the existence of
homonyms. Likewise,
each concept, criteria and concept criteria may be
linked to more than
one term in the lexicon, such as when terms in two or
more languages are
contained in the lexicon.
[0073] When indexing a free text document or
interpreting a query to
retrieve an indexed document, the system according to
the current
invention uses the lexicon of terms to segment the
free text and to
relate the free text to the concepts, criteria and
concept criteria
contained in the formal ontology. Thus, the current
system makes use of
both terms and independent concepts in the analysis of
free text.
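A lexicon of single-word and multi-word terms mapped to language-independent concepts might look like the sketch below. The greedy longest-match segmentation is an assumption (the text does not say how segmentation is performed), the concept identifiers are invented, and "WALKING-STICK" is a hypothetical second sense added to show a homonym. Punctuation handling is omitted.

```python
# term (lower-cased) -> set of concept identifiers; entries are illustrative
LEXICON = {
    "dog": {"DOG"},
    "hund": {"DOG"},
    "cane": {"DOG", "WALKING-STICK"},  # Italian "dog" / English "cane": a homonym
    "perro": {"DOG"},
    "capital city": {"CAPITAL-CITY"},
    "city": {"CITY"},
}
MAX_TERM_WORDS = max(len(term.split()) for term in LEXICON)

def index_text(text):
    """Greedy longest-match segmentation of text into lexicon terms.

    Returns a list of (term, concepts) pairs; words not in the lexicon are skipped.
    """
    words = text.lower().split()
    hits, i = [], 0
    while i < len(words):
        for size in range(min(MAX_TERM_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + size])
            if candidate in LEXICON:
                hits.append((candidate, LEXICON[candidate]))
                i += size
                break
        else:
            i += 1  # no term starts at this word
    return hits
```

Trying the longest candidate first means "capital city" is matched as one multi-word unit rather than as the shorter term "city", and a term such as "cane" returns more than one concept for later disambiguation.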
Managing the System, System Architecture
[0074] An additional feature of the present invention
provides a
management system for managing the formal ontology. As
discussed, the
formal ontology according to the current invention can
be constructed
using any available relational database system, such
as ORACLE®, SYBASE®
and SQLSERVER®. The ontology itself is abstracted
away from the
relational database system by wrapping access to the
database into a
management tool that exposes functionality to the
user. The database
functions as a physical storage medium for the
ontology. According to
the current invention a management tool is provided
for giving a user
access to the ontology for the purpose of adding to or
manipulating the
ontology. The tool allows the user to view the formal
ontology using a
variety of different criteria that together give a
complete picture of
the structure of the formal ontology. In a preferred
embodiment of the
invention a user can view several different views of
the ontology at
once as a layout, allowing the ontology to be viewed
from several
perspectives at once.
[0075] The management system for maintaining the formal
ontology will be
explained with reference to FIG. 3, which shows the
architecture of the
ontology management system according to the current
invention. The
formal ontology and lexicon of terms are stored on a
database 20, which
is in communication with a server 22, which houses the
server based
component of the ontology management tool 26. The
server based component
of the ontology management system comprises a
relational database which
controls access to the formal ontology, and contains
the components for
building the formal ontology, such as the hierarchical
structure, link
types, setting rules in the linguistic ontology,
linking terms to
concepts, etc., along with the tools for creating multiple
views of the
ontology. The ontology management system further
comprises a client
based component(s) 24 that allows a user to access and
maintain the
ontology via the server based component 22. The system
can be
implemented on a number of platforms, including but
not limited to
WINDOWS®, SOLARIS®, UNIX® and LINUX®.
Preferably, the management tool 26
is a set of business objects. A low layer is a thin
wrapper on top of
the database structure that implements the base
functions to access a
particular relational database. A middle layer also
exposes a set of
functions that manage multi-user access to any type of
supported
database, such as a relational database. As such the
middle layer allows
the creation of customized versions of the management tool
within
certain limited parameters. A top layer implements the
high level
interface. This interface surfaces functionality from
a logical point of
view to outside users (e.g. "getConceptTree"
is a high level layer
function that makes use of the underlying middle and
low level layer
functions to populate a tree object with information
about the place of
a concept in the formal ontology). Functionality
implemented by the low
and middle layers includes but is not limited to the
linking of external
databases, database manipulation and navigation, and
text searching.
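To make the three-layer description in [0075] concrete, here is a minimal Python sketch, not the patent's implementation. It assumes SQLite as the relational store, a single hypothetical "concept" table with a parent_id column, and a simple lock as a stand-in for multi-user access management. Only the name getConceptTree comes from the patent text; it is interpreted here as returning the subtree below a concept, which is a guess.

```python
import sqlite3
import threading


class LowLayer:
    """Thin wrapper over one particular relational database (SQLite here)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS concept ("
            "id TEXT PRIMARY KEY, parent_id TEXT)"
        )

    def add_concept(self, concept_id, parent_id=None):
        self.conn.execute(
            "INSERT INTO concept (id, parent_id) VALUES (?, ?)",
            (concept_id, parent_id),
        )

    def child_ids(self, concept_id):
        rows = self.conn.execute(
            "SELECT id FROM concept WHERE parent_id = ? ORDER BY id",
            (concept_id,),
        )
        return [row[0] for row in rows]


class MiddleLayer:
    """Serializes access with a lock (a stand-in for multi-user management)."""

    def __init__(self, low):
        self._low = low
        self._lock = threading.Lock()

    def child_ids(self, concept_id):
        with self._lock:
            return self._low.child_ids(concept_id)


class TopLayer:
    """High-level interface expressed in ontology terms, not SQL."""

    def __init__(self, middle):
        self._middle = middle

    def get_concept_tree(self, concept_id):
        """Build a nested dict for the concept and all of its descendants."""
        return {
            "concept": concept_id,
            "children": [
                self.get_concept_tree(child)
                for child in self._middle.child_ids(concept_id)
            ],
        }


low = LowLayer()
low.add_concept("disease")
low.add_concept("infection", "disease")
low.add_concept("neoplasm", "disease")
low.add_concept("meningitis", "infection")
tree = TopLayer(MiddleLayer(low)).get_concept_tree("disease")
```

The caller of get_concept_tree never sees SQL or table names, which is the abstraction the paragraph describes.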
Linking External Databases
[0076] As described thus far, the formal ontology
according to the
current invention is constructed manually by a user by
creating
hierarchical levels, slots within those hierarchical
levels and further
filling those slots with concepts, thereby creating
the basic hierarchy
with its parent/child relationships between concepts.
The user further
enriches the knowledge base by using the link types
provided to define
relationships between the concepts entered into the
hierarchy. In
addition to being able to manually construct the
formal ontology, an
alternative embodiment of the system according to the
current invention
provides the ability to map data from an independent
database onto the
formal ontology.
[0077] In a number of knowledge areas, large databases
of information
are already in existence. In order to avoid the
laborious work of
manually re-entering this information into the formal
ontology, the
system according to the current invention provides the
capability to
link the formal ontology to an external independent
database. Although
the external data never becomes a physical part of the
ontology, this
feature allows a user to access and use data contained
on an independent
database as if it were part of the formal ontology.
[0078] Data in an external database is linked to the
ontology by
creating a parent/child relationship between at least
one concept in the
formal ontology and at least one item of data in the
database. In the
case of an external database in tabular format, such
as an ACCESS®
database, a user can link an entire column of data in
the external
database to one or more concepts in the formal ontology
by creating a
parent/child link between at least one concept in the
ontology and the
header for the column in the table. Normally, when
data is provided in
tabular format, each column of the table is given a
header with a
descriptive title for the data contained in that
column. In creating the
parent/child relationship between the concept in the
ontology and the
column of data, the system analyzes the title and
associates it with
appropriate concepts in the ontology. Alternatively,
the system may
provide the user with a list of potential concepts
that the data can be
mapped to. The system may make use of the terms
contained in the lexicon
when performing this function. In an alternative
embodiment of the
invention, a user can manually map an item or column
of data to the
desired concept.
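The patent does not specify how a column title is analyzed. As a hedged illustration, the sketch below assumes the simplest possible approach: normalize the header and look for lexicon terms in it, returning candidate concepts that a user could then confirm. The function name, lexicon contents and concept IDs are hypothetical.

```python
import re


def suggest_concepts(column_header, lexicon):
    """Return sorted concept IDs whose lexicon terms appear in the header.

    Assumption: lexicon maps a lowercase term to a single concept ID.
    """
    # Normalize "Patient_ID" or "Primary Diagnosis" to lowercase words.
    tokens = re.findall(r"[a-z0-9]+", column_header.lower())
    normalized = " ".join(tokens)
    suggestions = []
    for term, concept_id in lexicon.items():
        if re.search(r"\b" + re.escape(term) + r"\b", normalized):
            suggestions.append(concept_id)
    return sorted(suggestions)


toy_lexicon = {
    "diagnosis": "C_DIAGNOSIS",
    "patient": "C_PATIENT",
    "drug": "C_DRUG",
}
headers = ["Patient_ID", "Primary Diagnosis", "Dose"]
links = {header: suggest_concepts(header, toy_lexicon) for header in headers}
```

A header with no suggestions, such as "Dose" here, is the case where the manual mapping mentioned in [0078] would be needed.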
[0079] Referring to FIG. 4, an example of how an
external independent
database may be mapped to the formal ontology is
shown. The relational
database 30, server 32 and client based component 34
are as described in
FIG. 3. Databases 36 and 38 are external independent
databases, such as
ACCESS® databases containing data to be mapped
onto the formal ontology.
Database servers 40 and 42 associated with each
database 36 and 38 allow
access to their respective databases so that queries
can be run. A
database directory service 44 assigns keywords to the
separate
databases 36 and 38. According to the current system,
the same keyword
may be assigned to two or more databases containing
similar data that
can be accessed at the same time. The database
directory service
provides the location of all of the available
databases to an ontology
proxy module 46. The ontology proxy module 46 receives
queries from a
user via the client based component 34. The ontology
proxy module then
directs the queries to the server 32 and to a
database-ontology mediator
module 48. The database-ontology mediator module
comprises an
ontology-to-database translator 50 and a database-to-ontology
translator
52. The ontology-to-database translator 50 serves the
function of
translating the ontology concept based queries to
database queries that
can be used to search the databases 36 and 38 for data
that is mapped to
the particular concept or concepts embodied in the
query. The
database-to-ontology translator 52 serves the function
of translating
the information returned from the database to a form
that can be viewed
by the user via the client based component 34.
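The two translators of the mediator module in [0079] can be illustrated with a small Python sketch. It is an assumption-laden toy, not the patent's design: the external database is an in-memory SQLite table, and CONCEPT_TO_COLUMN, the table name "visits" and the concept ID C_DIAGNOSIS are hypothetical.

```python
import sqlite3

# Hypothetical mapping: concept ID -> (table, column) in the external database.
CONCEPT_TO_COLUMN = {"C_DIAGNOSIS": ("visits", "diagnosis")}


def ontology_to_database(concept_id, value):
    """Translate a concept-based query into SQL text plus parameters."""
    table, column = CONCEPT_TO_COLUMN[concept_id]
    # Identifiers come from the trusted mapping above; the value is
    # passed as a bound parameter rather than spliced into the SQL.
    return f"SELECT * FROM {table} WHERE {column} = ?", (value,)


def database_to_ontology(cursor, rows):
    """Relabel mapped columns with their concept IDs for the client."""
    column_to_concept = {
        column: concept_id
        for concept_id, (_, column) in CONCEPT_TO_COLUMN.items()
    }
    names = [description[0] for description in cursor.description]
    return [
        {column_to_concept.get(name, name): value
         for name, value in zip(names, row)}
        for row in rows
    ]


# A stand-in for one external independent database.
external = sqlite3.connect(":memory:")
external.execute("CREATE TABLE visits (patient TEXT, diagnosis TEXT)")
external.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("p1", "meningitis"), ("p2", "fever")],
)

sql, params = ontology_to_database("C_DIAGNOSIS", "meningitis")
cursor = external.execute(sql, params)
results = database_to_ontology(cursor, cursor.fetchall())
```

The external data stays in its own database; only the query and the result labels are translated, which matches the statement in [0077] that the data never becomes a physical part of the ontology.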
Coding Using Independent Coding Systems
[0080] In a preferred embodiment of the system
according to the current
invention, the formal ontology is comprised of a
knowledge base of
medical concepts. A preferred use for the system is in
the indexing of
medical documents. A further preferred application of
the system
according to the current invention is the coding of
medical documents
using a standard medical coding system. Standard
medical coding systems
that can be used in conjunction with the current
invention include, but
are not limited to ICD-9, ICD-10, MedDRA and SNOMED.
[0081] To accomplish this, the medical concepts
contained in the formal
ontology of the system can be mapped to the
appropriate codes contained
in the appropriate independent database (e.g. ICD-9).
Alternatively, the appropriate coding system may be
included in the
formal ontology as a separate and parallel hierarchy
to the hierarchy of
medical concepts. In this alternative embodiment, each
medical concept
is linked to the appropriate code via a "has
code" link type. For
example, the concept "meningitis" would be
linked to the ICD-9 code
322.9 or the MedDRA code 10027252.
[0082] By linking the concepts in the ontology to the
appropriate codes,
the system is able to annotate free text documents
with these codes as
the documents are being indexed.
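A minimal Python sketch of the "has code" link and the annotation step in [0081] and [0082] follows. The two codes for meningitis are the ones quoted in the patent text; everything else (the HAS_CODE table, the annotate function, and treating the concept name as its own lexicon term) is a simplifying assumption for illustration.

```python
import re

# "has code" links: concept -> {coding system: code}.
# The meningitis codes are those given in paragraph [0081].
HAS_CODE = {
    "meningitis": {"ICD-9": "322.9", "MedDRA": "10027252"},
}


def annotate(text, has_code, system):
    """Return (term, system, code) for each known concept found in text."""
    annotations = []
    for word in re.findall(r"[a-z]+", text.lower()):
        code = has_code.get(word, {}).get(system)
        if code is not None:
            annotations.append((word, system, code))
    return annotations


note = "Patient admitted with suspected meningitis."
icd9 = annotate(note, HAS_CODE, "ICD-9")
meddra = annotate(note, HAS_CODE, "MedDRA")
```

Because the code is attached to the concept rather than to the word, any term linked to the same concept (a synonym, or a term in another language) would receive the same code once a real lexicon lookup replaces the word match used here.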