All (01)
I seem to have got this via the "broadening" and am not up on all of
the RFPs. (02)
However, some brief observations: (03)
There are several different kinds of artifacts that are often
confused. I fear the RFPs
may be confusing them. (04)
i) Controlled vocabularies and systems of controlled identifiers,
designed to provide unambiguous references, usually with persistent
"nonsemantic" identifiers plus a lexicon of "terms" or "labels", often
divided into "preferred" and "synonyms", sometimes also "hidden" and
other categories. (05)
ia) Lexicons of the linguistic information that may, or may not, be
associated with controlled vocabularies, but which typically contain
much more linguistic information than is required for logical
inference - alternative forms of speech, inflections, rules about
position, mass vs count markers for nouns, etc., depending on the
language in question. (06)
ii) Thesauri designed to be browsed and navigated by humans but
explicitly using "broader than" / "narrower than" to indicate that the
relations between entities are not logically defined. Most library
cataloguing systems, such as the MeSH headings used in PubMed, fall
into this category. Recently the W3C has tried to produce a standard
for thesauri - SKOS - Simple Knowledge Organization System - which is
a specialised vocabulary which fits neatly with Dublin Core. (07)
iii) Ontologies - representations of the definitions and universal
relations amongst types of entities, often expressed in formal logical
representations, but always beginning "All Xs ...". (08)
iv) Knowledge representation systems that consist of various sorts of
contingent knowledge - statements beginning "Some Xs ...", simple
"facts" (ground statements), default statements with exceptions,
probabilistic statements, epistemic statements, etc. Knowledge
representation systems may use the entities defined in an ontology,
but they express things not expressible in an ontology per se, and not
properly "terminological". (09)
v) General logical theories, which may encompass ontologies, and parts
of knowledge representation systems, but
are not limited to statements of the form "all x. P(x) & ... --> ..." (010)
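To make the difference between (i) and (ii) a little more concrete,
here is a minimal sketch in Python. It is only an illustration under
my own assumptions: the class names Concept and Thesaurus, the
identifiers C0001 to C0003 and the sample labels are all invented for
the example and are not taken from MeSH, SKOS or either RFP.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Kind (i): a controlled-vocabulary entry (hypothetical structure)."""
    identifier: str                                   # opaque, persistent, "nonsemantic"
    preferred: str                                    # the single preferred label
    synonyms: set[str] = field(default_factory=set)   # shown to users as alternatives
    hidden: set[str] = field(default_factory=set)     # matched on search, never displayed

    def matches(self, text: str) -> bool:
        """True if text equals any label of this concept, ignoring case."""
        labels = {self.preferred, *self.synonyms, *self.hidden}
        return text.casefold() in {label.casefold() for label in labels}


@dataclass
class Thesaurus:
    """Kind (ii): concepts plus 'broader than' links with no logical commitment."""
    concepts: dict[str, Concept] = field(default_factory=dict)
    broader: dict[str, set[str]] = field(default_factory=dict)  # id -> ids of broader concepts

    def add(self, concept: Concept, broader=()) -> None:
        self.concepts[concept.identifier] = concept
        self.broader[concept.identifier] = set(broader)

    def lookup(self, text: str) -> list[str]:
        """Resolve a label to identifiers; this is all a pure vocabulary needs."""
        return [c.identifier for c in self.concepts.values() if c.matches(text)]

    def ancestors(self, identifier: str) -> set[str]:
        """Follow 'broader than' links upwards. This is navigation for a
        human browser, not inference: nothing here says that every
        instance of the narrower concept is an instance of the broader one."""
        seen: set[str] = set()
        stack = list(self.broader.get(identifier, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.broader.get(current, ()))
        return seen


# Invented sample data, for illustration only.
thesaurus = Thesaurus()
thesaurus.add(Concept("C0001", "Cardiovascular diseases"))
thesaurus.add(Concept("C0002", "Heart diseases", synonyms={"Cardiac diseases"},
                      hidden={"Hart diseases"}), broader=["C0001"])
thesaurus.add(Concept("C0003", "Myocardial infarction", synonyms={"Heart attack"}),
              broader=["C0002"])

print(thesaurus.lookup("heart attack"))        # ['C0003']
print(sorted(thesaurus.ancestors("C0003")))    # ['C0001', 'C0002']
```

Note what the sketch does not contain: there is no statement of the
form "All Xs ...", so it is a vocabulary with a thesaurus layered on
top and not an ontology in the sense of (iii). Adding such universal
statements, and something able to reason over them, is a different
kind of artifact, which is exactly why the scope needs stating.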
Recently we have experienced what might be described as "ontological
empiricism", in which attempts are made to extend artifacts that begin
as ontologies into knowledge representations or even "logical theories
of everything". (011)
"Terminologies" may indicate either a simple controlled vocabulary,
thesaurus, or ontology in conjunction with some level
of lexicon. Recently there has been a serious tendency for some
parties to attempt to extend the scope of "ontologies" to cover all of
knowledge representation or general logical theories of
everything. (012)
Any RFP for a "terminology server" must take great care in specifying
scope, lest it suffer from terminal scope creep. (013)
Regards (014)
Alan (015)
On 16 Nov 2009, at 14:06, Ed Dodds wrote: (016)
> FYI:
>
>
> ---------- Forwarded message ----------
> From: Rubin, Ken <ken.rubin@xxxxxx>
> Date: Mon, Nov 16, 2009 at 10:26 AM
> Subject: RE: 1st draft of an RFP requesting API for Knowledge Bases
> To: "edbark@xxxxxxxx" <edbark@xxxxxxxx>, "hugues.vincent"
> <hugues.vincent@xxxxxxxxxxxxxxx>
> Cc: "ontology@xxxxxxx" <ontology@xxxxxxx>, Alan Honey
> <aphoneysys@xxxxxxxxx>, "jobst.landgrebe@xxxxxxxxx"
> <jobst.landgrebe@xxxxxxxxx>, "healthcare@xxxxxxx"
> <healthcare@xxxxxxx>, "Solbrig, Harold R." <Solbrig.Harold@xxxxxxxx>
>
>
> Broadening the pool a bit to include the healthcare list.
>
> Ed: I'll start with the end so we don't argue the semantics in the
> middle. I agree with your assertions around aligning the work in
> advance and avoiding silo-building. Frankly, the remainder are
> probably smaller nuance items.
>
> The one point that I think merits a mention is your assertion around
> the role of an information model. In a "classic IT sense" the Info
> model drives toward design of persistence, but at least within the
> Health domain that core use has morphed over the past several years.
> I see an information model as a shared representation of concept
> understanding. If anything, the "lines" between an information model
> and underlying concept/terminology models is ever blurring, driven
> more by the ability to achieve a cross-party consensus on consistency
> (represented in the "info model") versus diversity (in the
> terminology).
>
> Your penultimate assertion, however, is spot on. If we come through
> this opportunistic time without the ability for these different
> standards to align if not directly complement each other we've missed
> the boat.
>
> - Ken
>
> -----Original Message-----
> From: Ed Barkmeyer [mailto:edbark@xxxxxxxx]
> Sent: Friday, November 13, 2009 1:49 PM
> To: hugues.vincent
> Cc: ontology@xxxxxxx; Alan Honey; Rubin, Ken;
> jobst.landgrebe@xxxxxxxxx
> Subject: Re: 1st draft of an RFP requesting API for Knowledge Bases
>
> All,
>
> I can't resist sticking my oar in on this.
>
> First observation: Terminology models, and "dictionary services" are
> becoming a cottage industry. There is a need for a set of cohesive
> standards, but not a need for four diverse standards activities, e.g.,
> in ISO TC37, in JTC1/SC32 (ISO 11179), in TC184/SC4 (ISO 29002), and
> OMG CTS (and SBVR and KDM). Some, perhaps all, of these activities
> are somewhat misguided, because each is projecting its own viewpoint
> on the world.
>
> Second observation: Hugues quotes from the CTS RFP:
>
>> Terminologies (also designated as controlled vocabularies) are
>> concept
>> centric, i.e. they provide a set of concepts, representations of
>> these
>> concepts (designations and codes), definitions of their meanings and
>> binary relationships of the concepts to each-other. They are not
>> primarily used to represent knowledge, but to provide concept
>> representations, the basic elements for computational semantics.
>> Ontologies provide knowledge representation systems used to represent
>> knowledge in a machine storable and interpretable manner which
>> allows machine-based syntactic deduction ("reasoning").
>>
>
> Ontologies are concept-centric, i.e. they provide a set of concepts,
> designations for the concepts, and formal or informal definitions of
> their meanings in terms of other concepts. In ontologies written in
> Description Logic languages, concepts are commonly described as being
> of two kinds: "classes", which characterize individuals, and
> "properties", which characterize relationships between individuals.
> In First-Order Logic languages, concepts are characterized by
> "relations" (which are just the generalization of "class" and
> "property"), and "axioms" which are assertions about individuals in
> terms of the relations they satisfy.
>
> Note the parallels, in particular the parallel to Description Logic.
> While ontologies (in the narrow sense used in OMG) are nominally about
> reasoning, the ability to reason with them is based on the language in
> which they are written, and not on the nature of the knowledge that is
> captured. That is, a terminology whose terms and definitions are
> captured in a formal language is a (formal) ontology. A terminology
> whose definitions are captured only in unstructured natural language
> is a "weak ontology". Many published OWL models and most UML
> "information models", along with IMM E-R models and ORM models, are
> weak ontologies:
> the only relationships they capture that are suitable for reasoning
> are subsumption (subtype) and cardinality constraints (association end
> multiplicities). Per ISO 704, terminological definitions capture
> subsumption; they may or may not capture some cardinality constraints.
>
> So there is a kind of continuum here: terminological dictionary,
> taxonomy, information model, DL ontology, FOL ontology. Terminology
> services can operate effectively on all of them. DL services operate
> on the ontology to deduce unstated relationships among classes and
> properties (not usually individuals). Data services operate on any of
> (taxonomy, information model, DL ontology) and an associated
> data/knowledge base of facts about individuals to provide asserted
> facts and mechanically derived facts about individuals. Reasoning
> services operate on an ontology and an associated knowledge base of
> assertions about individuals to infer facts about individuals.
>
> So, if we are going to draw lines between these things, we all have to
> have an agreed-upon terminology and an agreed-upon model for kinds of
> information and kinds of information schemata. The text of CTS2
> excerpted above makes the erroneous assumption that formal information
> modeling languages are not primarily about concepts. Information
> modeling, which includes all ontologies, is _only_ about concepts and
> designations and relationships among concepts. The distinction is in
> how those are captured and to what end.
>
> Third observation: The services characterize the purpose of the
> information model. The purpose of a terminology is to support human
> comprehension of designations and to support the translation of
> designations from one language (natural or formal) to another. The
> traditional purpose of an information model is typically to design a
> data repository or a message suite. The nominal purpose of a formal
> ontology is to support some kind of inferencing, either about terms,
> or about individuals. But it should be noted that formal information
> models, including ontologies, are used to translate data element
> languages, design or interpret data repositories, to design or
> interpret message suites, and to characterize automated "services" in
> an SOA environment, very little of which involves any "reasoning".
>
> So "Terminology services" should apply to arbitrary information
> models, from dictionaries up to formal ontologies. Service support
> for each of the other categories would presumably include services
> that don't apply to less structured categories of information model.
>
> And it is exactly that distinction that makes having an architecture
> for these things meaningful. The nature of an "ontology" -- an
> information model -- is what it enables as built. An RDF triple store
> is just a poor relational database unless the services can do
> something more interesting than joins.
>
> Finally, therefore, I think there is a common subset of "ontology
> repository services" and CTS services. But like ISO 29002, CTS will
> be concerned about the relationships between designations for the same
> concept in different languages, both formal (codes) and natural
> (jargon), whereas ontology services will not typically call out that
> set of concerns as having any great significance. In ontologies,
> sameAs is just another concept-to-concept relationship, and codes are
> datatype properties of things or classes.
>
> Hugues says:
>> Next, we could indeed highlight the discrepancies between the two
>> RFPs but mainly CTS2 deals with terminologies and API4KB with
>> knowledge representation and reasoning on it. Indeed, reasoning is
>> one of the main points, if not *the* main point, in API4KB.
>>
>
> I agree that it may be *the* main point as written, but I don't
> think it
> *is* the main point of many existing corpora that are written in OWL
> and RDF, and there are many service concepts related to other purposes
> of formal information models that go beyond what one might expect a
> "terminology" to support. The question is more one of the scope of
> CTS2 than the "focus" of API4KB. If the scope of CTS2 is all of ISO
> 11179-3 ed3, then the distinctions are going to be almost entirely
> about reasoning.
>
>> So, I tend to consider that a CTS2 and API4KB are both of interest.
>
> I agree completely.
>
>> They may overlap in the sense that a CTS2 implementation may
>> be used by an API4KB implementation to access terminologies
>> ("basic elements for semantics"), even if that would be a clumsy
>> implementation!
>>
>
> This is the wrong way to think about it. Per the recently proposed
> "future standards architecture for ISO TC184/SC4", the
> "meta-relationship" between an ontology declaration and a
> "terminological entry" in a "pure terminology" is just a URI, or some
> similar form of bibliographic citation. It means: this symbol in the
> formal language represents (or formalizes) the same concept as this
> term in the dictionary. In most cases, the orthography in the
> ontology will be similar, the ontology will provide an "annotation"
> that replicates or paraphrases the natural language definition, and it
> may provide a formal definition that is equivalent. That is why an
> ontology repository is perfectly capable of supporting some
> terminology services directly.
>
> But underlying the verbiage, a primary Healthcare intent for CTS is
> about "data element dictionaries" -- the relationship between _codes_
> and jargon terms and natural language definitions. That is the
> substance of the model in ISO 11179-3:2004 (which is what the NCI
> people use for this). "Codes" are designations for the concept in a
> different kind of formal language, and that is not an ontological
> concept, although it is often an information modeling concept.
>
> So for OMG, the requirement should be that the same service has the
> same service specification in both standards. The trick is to sort
> out the truly terminology services that are common to ontologies,
> dictionaries and data element dictionaries. And all of that should be
> coordinated with the (still draft) information model of that stuff in
> JTC1/SC32.
> Alternatively, we can ask ourselves how many more silos we want to
> produce.
>
> -Ed
>
>
> --
> Edward J. Barkmeyer Email: edbark@xxxxxxxx
> National Institute of Standards & Technology
> Manufacturing Systems Integration Division
> 100 Bureau Drive, Stop 8263 Tel: +1 301-975-3528
> Gaithersburg, MD 20899-8263 FAX: +1 301-975-4694
>
> "The opinions expressed above do not reflect consensus of NIST,
> and have not been reviewed by any Government authority."
> _________________________________________________________________
> Msg Archives: http://ontolog.cim3.net/forum/health-ont/
> Community Files: http://ontolog.cim3.net/file/work/health-ont/NHIN-RFI/
> To Post: mailto:health-ont@xxxxxxxxxxxxxxxx
> Community Wiki: http://ontolog.cim3.net/cgi-bin/wiki.pl?NhinRfi (017)
-----------------------
Alan Rector
Professor of Medical Informatics
School of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL +44 (0) 161 275 6149/6188
FAX +44 (0) 161 275 6204
www.cs.man.ac.uk/~rector
www.co-ode.org
http://clahrc-gm.nihr.ac.uk/ (018)