Also, note that we have included
references to thread chairs/champions in this version, but will remove these in
the final communique.
Thanks for your help!
Leo and Mark
_____________________________________________
Dr. Leo Obrst
The MITRE Corporation, Information Semantics
Information Discovery & Understanding, Command and Control Center
7515 Colshire Drive, M/S H305, McLean, VA 22102-7508, USA
lobrst@xxxxxxxxx
Voice: 703-983-6770
Fax: 703-983-1379
From: Obrst, Leo J.
Sent: Saturday, April 26, 2008 6:06 PM
To: Ontology Summit 2008
Cc: Mark Musen
Subject: [COMMUNIQUE] Draft version
Folks,
This is the current version (our
internal draft version 5). Note that we have reduced some of your summaries,
mostly structurally; we have also rewritten many.
We still do not have a State of the
Art summary.
Please review this draft version
and comment on it. It is NOT yet entered onto the Wiki (we really don’t want to
go through formatting stuff again, at least today). If you don’t like
what you see in your thread here: send us something better. If you can keep
to the quasi-MS Word format, all the better: it will simplify our next version.
If you want a copy of the Word doc, please let us know.
Personally, we think this is much
too long at 6 pages and really would prefer much more succinct summaries. Apologies
for shrinking and revising your stuff beyond recognition. Time is short.
Thanks much,
Leo and Mark
Ontology Summit 2008 Communiqué: Towards an Open Ontology Repository
1. Introduction
Each annual Ontology Summit initiative
intends to make a statement appropriate to each Summit’s theme as part of the
Ontolog Forum’s general advocacy to bring ontology science and engineering into
the mainstream. The theme this year is “Towards an Open Ontology
Repository”. This communiqué represents the joint position of those in
the Ontolog Forum who were engaged in the year's summit discourse on an Open
Ontology Repository (OOR). In this discussion, we have agreed that an “ontology
repository is a facility where ontologies and related information artifacts can
be stored, retrieved and managed.”
We believe in the premise and the
promise of the Semantic Web, i.e., a Web of exposed data and the interpretation
of that data, i.e., its semantics, using common standards, thereby enabling
distinguishable, computable, reusable, and sharable meaning of Web and other
artifacts: data, documents, and services. But we also believe that making that
vision a reality requires additional supporting infrastructure. And we believe
that infrastructure should be open, extensible, and provide common services.
The purpose of an Open Ontology
Repository is to provide an architecture and an infrastructure that supports a)
the creation, sharing, searching, and management of ontologies, and b) linkage
to database and XML Schema structured data and documents. Complementary goals
include fostering the ontology community, the identification and promotion of
best practices, and the provision of services relevant to the ontologies and
instance stores. Automated semantic interpretation of content expressed in
knowledge representation languages, the creation and maintenance of mappings
among disparate ontologies and content, and inference over this content are
examples of anticipated services. Such repositories ultimately will support a
broad range of semantic services and applications of interest to enterprises
and communities.
Achieving these goals will help reduce
semantic ambiguity whenever and wherever information is shared, thereby allowing
information to be located, searched, categorized, and exchanged with a more
precise expression of its content and meaning. The artifacts of the repository
will provide a semantic grounding for diverse formats and domains, ranging from
the conceptual domains and specific disciplines of communities to technical
schema such as WSDL, UDDI, RSS, and XML schema, and of course expressed in
standard ontology languages such as RDF, OWL, Common Logic, and others. Perhaps
most importantly, the repository will enable wide-scale knowledge re-use and
reduce the need to re-invent the wheel to define concepts and relationships
that are already understood.
These goals cannot be achieved at once,
and must track the evolution of best practices as well as technology itself. It
is also good system development practice to bound complexity by defining a
system in terms of a series of short-term, achievable objectives. For this
reason, as for other such initiatives, it’s envisioned that the Open Ontology
Repository will be developed in a series of phases, proceeding from the simple
to the complex, with achievable goals that capitalize on previous experience
and the emergence of technology over time. It is important to note that for any
given phase, planning and prototyping are always in progress for subsequent
phases.
2. Requirements for an Open Ontology Repository
The Ontolog community in the past year
determined that the primary technical areas that needed to be discussed and
illuminated to make the vision of an Open Ontology Repository a reality were
the following: 1) determining the current state of the art in ontology
repositories, 2) determining quality and gatekeeping criteria for registering
and then provisioning ontologies and their instances, 3) developing an ontology
of ontologies that would act as structure and metadata for registering
ontologies and supporting the common repository of their instances, data, and
services, and 4) developing a sound architecture for the envisioned Open
Ontology Repository. Together, elaborations of these four technical areas
help provide both requirements and the ideas and tools that could
realize those requirements. The remainder of this communiqué thus summarizes
the results of the discussions in these areas.
3. State of the Art (Frank Olken): http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_StateOfArt
4. Quality and Gatekeeping (Barry Smith and Fabian Neuhaus): http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_QualityAndGatekeeping
We distinguish between gatekeeping
and quality control. Gatekeeping criteria are a set of minimal requirements
that any ontology within the OOR has to meet. These criteria are intended to
enable the users of the OOR to find ontologies that fit their needs quickly;
they are not intended to ensure the quality of the ontologies.
4.1 Gatekeeping Criteria
The ontologies in the OOR have
to meet the following criteria.
1. The ontology is open (see below).
2. The ontology is expressed in a formal language with a well-defined syntax.
3. The authors of the ontology provide the required metadata.
4. The ontology has a clearly specified and clearly delineated scope.
5. Successive versions of an ontology are clearly identified.
6. The ontology is adequately labeled.
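As a rough illustration only, the six criteria could be applied to a submission record as in the Python sketch below. The record fields, the set of accepted languages, and the required metadata keys are all assumptions made for this example; the communiqué does not define a schema. Criteria such as a clearly delineated scope need human judgment, so the sketch only checks that something was supplied.

```python
from dataclasses import dataclass, field

# Assumed values for illustration only; the communiqué does not fix either set.
FORMAL_LANGUAGES = {"RDF", "OWL", "Common Logic"}
REQUIRED_METADATA = {"authors", "license"}


@dataclass
class Submission:
    """Hypothetical record describing an ontology offered to the repository."""
    name: str
    is_open: bool = False             # criterion 1 (which sense of "open" is undecided)
    language: str = ""                # criterion 2
    metadata: dict = field(default_factory=dict)  # criterion 3
    scope: str = ""                   # criterion 4
    version: str = ""                 # criterion 5
    adequately_labeled: bool = False  # criterion 6


def failed_criteria(s: Submission) -> list:
    """Return the numbers (1-6) of the gatekeeping criteria that s fails."""
    checks = [
        s.is_open,
        s.language in FORMAL_LANGUAGES,
        REQUIRED_METADATA.issubset(s.metadata),
        bool(s.scope.strip()),
        bool(s.version.strip()),
        s.adequately_labeled,
    ]
    return [number for number, passed in enumerate(checks, start=1) if not passed]
```

A submission is admitted when failed_criteria returns an empty list. None of these checks says anything about the quality of the ontology, which matches the distinction drawn above.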
So far the most controversial
suggested criterion has been "openness". We need to distinguish
between different kinds of "openness", in particular between 'open'
development processes and 'open' software licenses. Different members of the
community have different preferences on which kinds of openness and how much
openness should be required. Some would like to drop 'openness' as a gatekeeping
criterion and rather require the developers of ontologies to provide metadata
that allows potential users to understand how 'open' (and in which senses of
the word) an ontology is. This issue needs to be addressed during the meeting
in Gaithersburg.
4.2 Quality Control
The community agrees that it is
not sufficient for the OOR just to store ontologies, but that it needs to
provide the possibility to evaluate the ontologies within it. There is no
agreement on how to evaluate ontologies; the main strategies suggested are: (i)
a market-driven approach, where ontologies are reviewed by users and ranked like
items on Amazon.com; and (ii) an editorial process, where ontologies are
reviewed by experts in much the same way as papers submitted to
scientific journals. The difference in opinion about ontology evaluation
reflects the fact that the members of the community are using ontologies for different
purposes and thus have different perspectives on what ontologies are. However,
there is agreement that the OOR should accept ontologies regardless of whether
their developers see ontologies as pieces of software, as representations of
scientific knowledge, or as standardized vocabularies. Accordingly, the OOR
needs to enable the different styles of evaluation and different standards for
ontologies. We suggest a distributed governance model where the OOR allows
for subcommunities that provide stewardship for their respective fields by
evaluating the available ontologies and by distinguishing high-quality
ontologies according to appropriate standards.
5. Ontology of Ontologies (Michael Gruninger and Pat Hayes): http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_OntologyOfOntologies
The metadata for ontologies should
support the sharing and reuse of ontologies within the repository.
The metadata should allow users to:
1. retrieve ontologies for use in domain applications;
2. retrieve ontologies to be integrated with other user ontologies;
3. retrieve ontologies that will be extended to create new user ontologies;
4. determine whether or not an ontology can be integrated with user ontologies;
5. determine whether a set of ontologies retrieved from the repository can be used together;
6. determine whether an ontology in the repository can be partially shared.
We can consider logical
metadata (logical properties of the ontology independent of any implementation
or engineering artifact) and engineering metadata (properties of the
ontology considered as an engineering artifact). The logical metadata
include the following.
5.1 Logical Metadata
What language is used to specify the
ontology? There is a range of languages. A
formal language has a syntax (logical symbols together with a formally
specified grammar) and a model theory (which specifies the conditions under
which expressions in the language can be given particular truth assignments). The report "Evaluating Reasoning Systems"
contains a classification of formal languages used to specify ontologies (see
references on the thread page). A formalizable language
has a syntax, although it does not have a model theory. Some examples include (with their languages parenthesized):
topic maps (XML), folksonomies (XML), and ISO 15926 (EXPRESS). Some ontologies are only specified in natural language or
specialized syntactic formats: WordNet, most taxonomies, and
thesauri.
Modularity is also important. Is a
particular ontology a monolithic set of axioms, or is it composed of a set of
smaller modules? Is each module considered to be a
separate ontology within the repository? If not, what are the relationships
between the modules? Which modules of an ontology can be
used separately? For example, the
Process Specification Language (PSL, http://www.mel.nist.gov/psl/psl-ontology/)
consists of a set of modules which are extensions of a common core theory
PSL-Core. Metadata for each module specifies which other modules must also be
included when using the module.
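To make the idea of per-module dependency metadata concrete, here is a minimal Python sketch that computes every module that must be loaded along with a chosen one. The module names and the dependency table are invented for illustration (a core plus a few extensions); they are not the actual PSL module list.

```python
# Hypothetical dependency metadata: each module lists the modules it
# directly requires. These names are placeholders, not real PSL modules.
REQUIRES = {
    "core": [],
    "module_a": ["core"],
    "module_b": ["module_a"],
    "module_c": ["core"],
}


def required_modules(module, requires):
    """Return the module together with everything it transitively requires."""
    needed = set()
    stack = [module]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already handled; also guards against cyclic metadata
        needed.add(current)
        stack.extend(requires[current])
    return needed


print(sorted(required_modules("module_b", REQUIRES)))
# prints ['core', 'module_a', 'module_b']
```

Recording only direct dependencies and deriving the closure keeps the metadata for each module small, at the cost of a lookup when a module is retrieved.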
The relationships among ontologies are also
important. These include the notion of
mutual consistency. For example, within the Catalog of Temporal Theories
[REF], a dense linear ordering is inconsistent with a discrete linear ordering.
Another relationship is that of entailment: is one ontology
stronger than another, in the sense that the sentences of the first ontology
entail the sentences of the second? This would be the case when the second ontology
can be considered to be a weaker version of the first within the
repository. For example, in the Catalog of Temporal Theories [REF], the
before relation is a partial ordering (i.e. it is a transitive antisymmetric
reflexive relation). Since this ontology axiomatizes all of these properties,
it entails an ontology that only axiomatizes the transitive property, such as
OWL-Time. In other words, OWL-Time is weaker than the first-order theories in
the catalog.
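The direction of this entailment can be illustrated on small finite relations. The Python sketch below is not a proof and does not use the actual axioms of OWL-Time or the catalog; it only checks the three properties named above on example relations chosen for this illustration. Any relation that satisfies all three properties also satisfies transitivity alone, while the converse can fail.

```python
from itertools import product


def is_reflexive(domain, rel):
    return all((x, x) in rel for x in domain)


def is_antisymmetric(rel):
    return all(x == y for (x, y) in rel if (y, x) in rel)


def is_transitive(rel):
    return all((x, w) in rel
               for (x, y), (z, w) in product(rel, rel) if y == z)


def is_partial_order(domain, rel):
    """The stronger theory: reflexive, antisymmetric, and transitive."""
    return is_reflexive(domain, rel) and is_antisymmetric(rel) and is_transitive(rel)


# Illustrative time points and two candidate "before" relations.
domain = {1, 2, 3}
leq = {(x, y) for x in domain for y in domain if x <= y}
lt = {(x, y) for x in domain for y in domain if x < y}

# A model of the stronger theory is also a model of the weaker one.
assert is_partial_order(domain, leq) and is_transitive(leq)

# A model of the weaker theory need not be a model of the stronger one.
assert is_transitive(lt) and not is_partial_order(domain, lt)
```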
Another relationship is extension. An ontology T2 is an extension of another ontology T1 iff the
set of sentences in T2 contains or entails the sentences in T1. T2 is a conservative extension of T1 whenever every sentence
in the lexicon of T1 is provable from T2 iff it is provable from T1. T2 is a nonconservative extension of T1 whenever there is a
sentence in the lexicon of T1 which is provable from T2 but not from T1.
Another relationship is definable interpretation.
If the ontologies have different sets of primitives and relations, is it
possible to define the primitives and relations of one ontology using the
second ontology?
5.2 Engineering Metadata
Engineering metadata
include: provenance, versioning, existing applications of
the ontology (e.g. interoperability, search, decision support), and
domain-specificity (e.g. biology, supply chain management, manufacturing).
5.3 Candidate Solutions and Recommendation
The Ontology Metadata
Vocabulary (OMV) http://omv.ontoware.org/
is a strong candidate for describing ontologies in the OOR. In addition, we recommend collecting ontologies from Ontolog
Summit participants, and testing out the different proposals for metadata on
these ontologies. Developing use case scenarios will motivate the use of
the metadata with these ontologies.
6. Repository Architecture (Michelle Raymond and Ravi Sharma): http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_RepositoryArchitecture
Over the past four months, several dozen
Ontology Summit 2008 and Open Ontology Repository Forum members have
contributed the following kinds of input and discussion on Repository
Architecture (OOR-A) for the Summit:
· Presentations, Panel discussions with experts on managing repositories
· Architecture candidates, use cases, Requirements of Repositories to host ontologies
· Definitions / Roles of Repository and Registry and integration
· Discussion threads on Open, Distributed, Federated Repositories
· Metadata requirements for Ontology Repositories
· Engines, search and query functions
· Preference for Content (data) out of repositories and inclusions of examples in repositories. Also whether some functioning ontologies will be resident in repositories
· Functional and Physical characteristics of repositories
· Non-functional requirements such as scalability, storage, security, federation, availability, and testbeds.
· Also preliminary discussions included Governance, Standards, and Criteria for including different languages, types of ontologies, etc.
· There were tremendous cross inputs from Ontology of Ontologies, Quality, State of Art Summary Workspaces and Threads, as well as from the various entities such as Content and Organizing Committees and other members.
The overall assessment of the community
is to enable open, distributed, federated repositories, and to provide metadata
for each type of ontology registered, as well as providing logical resources,
inference engines etc., that are required to properly test the services and
functions of ontologies served by the repositories. The general consensus was
that the primary functional responsibility for an ontology lies with the
originating ontology owners and their successors (downstream users) and that a
repository cannot stand alone and thereby be responsible for the content that
is generally stored outside the repository. Community work is expected to
continue in the OOR-Forum and in other standards organizations (e.g.
OMG-Ontology Definition Metamodel, XMDR, NCBO, NSF, W3C, OASIS, Industry and
Others). There is potentially great value in such an open ontology repository,
especially to the government (in critical areas such as Healthcare and
bioinformatics and in acquisition and emergency response), as well as to
industry, for example, by enabling participants to use rich semantic
search/querying over repositories which connect multiple ontologies and instance
bases.
7. Conclusion: Toward the Future
We look forward to establishing an open
ontology repository in the future that adheres to the requirements put forth
above. We endorse an open ontology repository that seeks to honor and implement
the following overarching mission requirements:
1. Establishing an Open Ontology Repository (OOR) Initiative that will promote the global use of ontologies, their instance bases, rules, and services, and mappings among these.
2. Enabling and facilitating open, federated, collaborative ontology repositories.
3. Establishing best practices for expressing interoperable ontology work in open registries/repositories.
4. Enabling and facilitating the development of common services to support the repository and to extend the capabilities available to providers, users, and developers who use the repository.
We believe that creating this kind of
infrastructure will facilitate the emerging Semantic Web.
This Communiqué was reviewed,
collaboratively edited, finalized and adopted by individuals present at the
Ontology Summit 2008.
Endorsed by:
The above Communiqué has
been endorsed by the individuals listed below. Please note that these people
made their endorsements as individuals and not as representatives of the
organizations they are affiliated with.
<Name,
Affiliation>