Folks,
This is the current version (our internal draft version 5).
Note that we have reduced some of your summaries, mostly structurally; we have
also rewritten many.
We still do not have a State of the Art summary.
Please review this draft version and comment on it. It is
NOT yet entered onto the Wiki (we really don’t want to go through
formatting stuff again, at least today). If you don’t like what you see
in your thread here: send us something better. If you can keep to the quasi-MS
Word format, all the better: it will simplify our next version. If you want a
copy of the Word doc, please let us know.
Personally, we think this is much too long at 6 pages and really
would prefer much more succinct summaries. Apologies for shrinking and revising
your stuff beyond recognition. Time is short.
Thanks much,
Leo and Mark
Ontology Summit 2008 Communiqué: Towards an Open Ontology Repository
1. Introduction
Each annual Ontology Summit initiative intends to make a statement appropriate to
each Summit’s theme as part of the Ontolog Forum’s general advocacy
to bring ontology science and engineering into the mainstream. The theme this
year is “Towards an Open Ontology Repository”. This communiqué
represents the joint position of those in the Ontolog Forum who were engaged in
the year's summit discourse on an Open Ontology Repository (OOR). In this
discussion, we have agreed that an “ontology repository is a facility
where ontologies and related information artifacts can be stored, retrieved and
managed.”
We believe in the premise and the promise of the Semantic Web, that is, a Web of
exposed data and of the interpretation of that data (its semantics) using
common standards, thereby enabling distinguishable, computable, reusable, and
sharable meaning of Web and other artifacts: data, documents, and services. But
we also believe that making that vision a reality requires additional
supporting infrastructure, and that this infrastructure should be open and
extensible and should provide common services.
The purpose of an Open Ontology Repository is to provide an architecture and an
infrastructure that supports a) the creation, sharing, searching, and
management of ontologies, and b) linkage to database and XML Schema structured
data and documents. Complementary goals include fostering the ontology
community, the identification and promotion of best practices, and the
provision of services relevant to the ontologies and instance stores. Automated
semantic interpretation of content expressed in knowledge representation
languages, the creation and maintenance of mappings among disparate ontologies
and content, and inference over this content are examples of anticipated
services. Such repositories ultimately will support a broad range of semantic
services and applications of interest to enterprises and communities.
Achieving these goals will help reduce semantic ambiguity whenever and wherever
information is shared, thereby allowing information to be located, searched,
categorized, and exchanged with a more precise expression of its content and
meaning. The artifacts of the repository will provide a semantic grounding for
diverse formats and domains, ranging from the conceptual domains and specific
disciplines of communities to technical schemas such as WSDL, UDDI, RSS, and XML
Schema, and will be expressed in standard ontology languages such as RDF,
OWL, Common Logic, and others. Perhaps most importantly, the repository will
enable wide-scale knowledge re-use and reduce the need to re-invent the wheel
to define concepts and relationships that are already understood.
These goals cannot be achieved at once, and must track the evolution of best
practices as well as technology itself. It is also good system development
practice to bound complexity by defining a system in terms of a series of
short-term, achievable objectives. For this reason, as for other such
initiatives, it is envisioned that the Open Ontology Repository will be
developed in a series of phases, proceeding from the simple to the complex,
with achievable goals that capitalize on previous experience and the emergence
of technology over time. It is important to note that for any given phase,
planning and prototyping are always in progress for subsequent phases.
2. Requirements for an Open Ontology Repository
The Ontolog community in the past year determined that the primary technical areas
that needed to be discussed and illuminated to make the vision of an Open
Ontology Repository a reality were the following: 1) determining the current
state of the art in ontology repositories, 2) determining quality and
gatekeeping criteria for registering and then provisioning ontologies and their
instances, 3) developing an ontology of ontologies that would act as structure
and metadata for registering ontologies and supporting the common repository of
their instances, data, and services, and 4) developing a sound architecture for
the envisioned Open Ontology Repository. Elaborations of these four technical
areas, together, help provide both requirements and the ideas and tools that
could realize those requirements. The remainder of this communiqué thus
summarizes the results of the discussions in these areas.
3. State of the Art (Frank Olken): http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_StateOfArt
4. Quality and Gatekeeping (Barry Smith and Fabian Neuhaus): http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_QualityAndGatekeeping
We distinguish between gatekeeping and quality control.
Gatekeeping criteria are a set of minimal requirements that any ontology within
the OOR has to meet. These criteria are intended to enable the users of the OOR
to find ontologies that fit their needs quickly; they are not supposed to
ensure the quality of the ontologies.
4.1 Gatekeeping Criteria
The ontologies in the OOR have to meet the following criteria.
1. The ontology is open (see below).
2. The ontology is expressed in a formal language with a well-defined syntax.
3. The authors of the ontology provide the required metadata.
4. The ontology has a clearly specified and clearly delineated scope.
5. Successive versions of an ontology are clearly identified.
6. The ontology is adequately labeled.
So far the most controversial suggested criterion has
been "openness". We need to distinguish between different kinds of
"openness", in particular between "open" development processes and
"open" software licenses. Different members of the community have different
preferences on which kinds of openness and how much openness should be
required. Some would like to drop "openness" as a gatekeeping criterion and
instead require the developers of ontologies to provide metadata that allows
potential users to understand how "open" (and in which senses of the word) an
ontology is. This issue needs to be addressed during the meeting in Gaithersburg.
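As a rough illustration only, the six gatekeeping criteria above could be checked mechanically along the following lines. This is a minimal sketch under stated assumptions: the record fields, the required metadata keys, the list of accepted languages, and the reading of "adequately labeled" as "every term has a label" are all hypothetical and are not agreed by the community. Openness is recorded as a free-text statement, since the text above leaves open how it should be handled.

```python
from dataclasses import dataclass, field

# Hypothetical required metadata keys; the communiqué does not fix a schema.
REQUIRED_METADATA = ("title", "authors", "license")

# Assumed set of accepted formal languages, taken from those named in the draft.
FORMAL_LANGUAGES = {"RDF", "OWL", "Common Logic"}


@dataclass
class OntologySubmission:
    """Illustrative record for an ontology offered to the OOR."""
    openness_statement: str                       # criterion 1
    language: str                                 # criterion 2
    metadata: dict = field(default_factory=dict)  # criterion 3
    scope: str = ""                               # criterion 4
    version: str = ""                             # criterion 5
    labeled_terms: int = 0                        # criterion 6
    total_terms: int = 0


def gatekeeping_failures(s):
    """Return a list of the gatekeeping criteria that the submission fails."""
    failures = []
    if not s.openness_statement.strip():
        failures.append("1: no statement of openness")
    if s.language not in FORMAL_LANGUAGES:
        failures.append("2: language is not a recognized formal language")
    missing = [key for key in REQUIRED_METADATA if not s.metadata.get(key)]
    if missing:
        failures.append("3: missing metadata: " + ", ".join(missing))
    if not s.scope.strip():
        failures.append("4: no scope specified")
    if not s.version.strip():
        failures.append("5: no version identifier")
    # Assumption: "adequately labeled" means every term carries a label.
    if s.total_terms == 0 or s.labeled_terms < s.total_terms:
        failures.append("6: not all terms are labeled")
    return failures
```

An empty result means the submission passes gatekeeping; it says nothing about quality, which is treated separately.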
4.2 Quality Control
The community agrees that it is not sufficient for the
OOR just to store ontologies, but that it needs to provide the possibility to
evaluate the ontologies within it. There is no agreement on how to evaluate
ontologies; the main strategies suggested are: (i) a market-driven approach
where ontologies are reviewed by users and ranked like items on Amazon.com; and
(ii) an editorial process where ontologies are reviewed by experts in much the
same way as papers submitted to scientific journals. The difference in
opinion about ontology evaluation reflects the fact that the members of the
community are using ontologies for different purposes and thus have different
perspectives on what ontologies are. However, there is agreement that the OOR should
accept ontologies regardless of whether their developers see ontologies as
pieces of software, as representations of scientific knowledge, or as
standardized vocabularies. Accordingly, the OOR needs to enable the different
styles of evaluation and different standards for ontologies. We suggest a
distributed governance model where the OOR allows for subcommunities that
provide stewardship for their respective fields by evaluating the available
ontologies and by distinguishing high-quality ontologies according to
appropriate standards.
5. Ontology of Ontologies (Michael Gruninger and Pat Hayes): http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_OntologyOfOntologies
The metadata for ontologies should support the sharing and reuse of ontologies
within the repository.
The metadata should allow users to:
1. retrieve ontologies for use in domain applications;
2. retrieve ontologies to be integrated with other user ontologies;
3. retrieve ontologies that will be extended to create new user ontologies;
4. determine whether or not an ontology can be integrated with user ontologies;
5. determine whether a set of ontologies retrieved from the repository can be used together;
6. determine whether an ontology in the repository can be partially shared.
We can distinguish logical metadata
(logical properties of the ontology independent of any implementation or
engineering artifact) from engineering metadata (properties of the
ontology considered as an engineering artifact). The logical metadata
include the following.
5.1 Logical Metadata
What language is used to specify the ontology? There is a range of languages.
A formal language has a syntax (logical symbols together with a formally
specified grammar) and a model theory (which specifies the conditions under
which expressions in the language can be given particular truth assignments).
The report "Evaluating Reasoning Systems" contains a classification of formal
languages used to specify ontologies (see references on the thread page).
A formalizable language has a syntax, although it does not have a model theory.
Some examples include (with their languages parenthesized): topic maps (XML),
folksonomies (XML), and ISO 15926 (EXPRESS).
Some ontologies are only specified in natural language or specialized syntactic
formats: WordNet, most taxonomies, and thesauri.
Modularity is also important. Is a particular ontology a monolithic
set of axioms, or is it composed of a set of smaller modules? Is
each module considered to be a separate ontology within the repository? If not,
what are the relationships between the modules? Which
modules of an ontology can be used separately? For example, the Process Specification Language (PSL, http://www.mel.nist.gov/psl/psl-ontology/)
consists of a set of modules which are extensions of a common core theory
PSL-Core. Metadata for each module specifies which other modules must also be
included when using the module.
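To illustrate how such per-module metadata might be used, the sketch below computes every module that must be loaded along with a chosen one by following the "must also be included" links transitively. The module names other than PSL-Core are hypothetical placeholders, and the dictionary layout is an assumption, not the actual PSL metadata format.

```python
def required_modules(module, requires):
    """Return the module plus every module it transitively requires.

    `requires` maps a module name to the names of the modules that its
    metadata says must also be included.
    """
    needed = set()
    stack = [module]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already handled; also guards against cycles
        needed.add(current)
        stack.extend(requires.get(current, ()))
    return needed


# Hypothetical dependency metadata; only "PSL-Core" is a name from the text.
EXAMPLE_REQUIRES = {
    "PSL-Core": [],
    "module-A": ["PSL-Core"],
    "module-B": ["module-A"],
    "module-C": ["PSL-Core"],
}
```

Under these assumed dependencies, asking for "module-B" would also pull in "module-A" and "PSL-Core", but not "module-C".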
The relationships among ontologies are also important. One such relationship is
mutual consistency. For example, within the Catalog of Temporal Theories [REF], a
dense linear ordering is inconsistent with a discrete linear ordering.
Another relationship is that of entailment: is one ontology stronger than another in
the sense that the sentences in the first ontology entail the sentences in the
second? This would be the case when one ontology can be considered to be a
weaker version of another ontology within the repository. For example, in
the Catalog of Temporal Theories [REF], the before relation is a partial
ordering (i.e., it is a transitive, antisymmetric, reflexive relation). Since this
ontology axiomatizes all of these properties, it entails an ontology that only
axiomatizes the transitive property, such as OWL-Time. In other words, OWL-Time
is weaker than the first-order theories in the catalog.
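The partial-ordering example can be illustrated on a small finite domain. The sketch below enumerates every binary relation on a three-element set and checks that each relation satisfying the partial-order axioms (reflexive, antisymmetric, transitive) also satisfies the weaker transitivity-only theory, while the converse fails. It is only a finite illustration of "stronger" versus "weaker", not a proof of entailment, and it does not model OWL-Time or the catalog itself; the domain and function names are made up for the example.

```python
from itertools import product

DOMAIN = (0, 1, 2)
PAIRS = [(a, b) for a in DOMAIN for b in DOMAIN]


def all_relations():
    """Yield every binary relation on DOMAIN as a frozenset of pairs."""
    for bits in product((False, True), repeat=len(PAIRS)):
        yield frozenset(p for p, keep in zip(PAIRS, bits) if keep)


def is_reflexive(r):
    return all((a, a) in r for a in DOMAIN)


def is_antisymmetric(r):
    return all(a == b or (b, a) not in r for (a, b) in r)


def is_transitive(r):
    return all((a, d) in r for (a, b) in r for (c, d) in r if b == c)


def is_partial_order(r):
    return is_reflexive(r) and is_antisymmetric(r) and is_transitive(r)


partial_orders = [r for r in all_relations() if is_partial_order(r)]
transitive_only = [r for r in all_relations() if is_transitive(r)]

# Every model of the stronger theory is a model of the weaker one ...
entails = all(is_transitive(r) for r in partial_orders)
# ... but some models of the weaker theory are not partial orders.
converse_fails = any(not is_partial_order(r) for r in transitive_only)
```

In repository terms, a metadata record of the form "A entails B" would let a user see that anything built against the weaker ontology B remains valid under the stronger ontology A.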
Another relationship is extension. An ontology T1
is an extension of another ontology T2 iff the sentences in T1 contain
or entail the sentences in T2. T1 is a conservative
extension of T2 whenever every sentence in the lexicon of T2 is provable from
T1 iff it is provable from T2. T1 is a
nonconservative extension of T2 whenever there is a sentence in the lexicon of
T2 which is provable from T1 but not from T2.
Another relationship is definable interpretation. If the ontologies
have different sets of primitives and relations, is it possible to define the
primitives and relations of one ontology using the second ontology?
5.2 Engineering Metadata
Engineering metadata include: provenance,
versioning, existing applications of the ontology (e.g.
interoperability, search, decision support), and domain-specificity (e.g. biology,
supply chain management, manufacturing).
5.3 Candidate Solutions and Recommendation
The Ontology Metadata Vocabulary (OMV), http://omv.ontoware.org/, is a strong
candidate for describing ontologies in the OOR. In
addition, we recommend collecting ontologies from Ontology Summit participants
and testing the different proposals for metadata on these ontologies.
Developing use case scenarios will motivate the use of the metadata with
these ontologies.
6. Repository Architecture (Michelle Raymond and Ravi Sharma): http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_RepositoryArchitecture
Over the past four months, several dozen Ontology Summit 2008 and Open Ontology
Repository Forum members have contributed the following categories of input and
discussion on Repository Architecture (OOR-A) for the Summit.
· Presentations and panel discussions with experts on managing repositories
· Architecture candidates, use cases, and requirements for repositories that host ontologies
· Definitions and roles of repository and registry, and their integration
· Discussion threads on open, distributed, and federated repositories
· Metadata requirements for ontology repositories
· Engines, search, and query functions
· Preference for keeping content (data) outside repositories while including
examples in them, and whether some functioning ontologies will be resident in
repositories
· Functional and physical characteristics of repositories
· Non-functional requirements such as scalability, storage, security, federation,
availability, and testbeds
· Preliminary discussions of governance, standards, and criteria for including
different languages, types of ontologies, etc.
· Extensive cross-input from the Ontology of Ontologies, Quality, and State of
the Art summary workspaces and threads, as well as from entities such as the
Content and Organizing Committees and other members
The overall assessment of the community is that the OOR should enable open,
distributed, federated repositories and provide metadata for each type of
ontology registered, as well as the logical resources, inference engines, etc.,
that are required to properly test the services and functions of ontologies
served by the repositories. The general consensus was that the primary functional
responsibility for an ontology lies with the originating ontology owners and
their successors (downstream users) and that a repository cannot stand alone
and thereby be responsible for the content that is generally stored outside the
repository. Community work is expected to continue in the OOR-Forum and in
other standards organizations (e.g. OMG-Ontology Definition Metamodel, XMDR,
NCBO, NSF, W3C, OASIS, Industry and Others). There is potentially great value
in such an open ontology repository, especially to the government (in critical
areas such as healthcare and bioinformatics and in acquisition and emergency
response), as well as to industry, for example, by enabling participants to
use rich semantic search and querying over repositories which connect multiple
ontologies and instance bases.
7. Conclusion: Toward the Future
We look forward to establishing an open ontology repository in the future that
adheres to the requirements put forth above. We endorse an open ontology
repository that seeks to honor and implement the following overarching mission
requirements:
- Establishing an Open Ontology Repository (OOR) Initiative that will promote
the global use of ontologies, their instance bases, rules, and services, and
mappings among these.
- Enabling and facilitating open, federated, collaborative ontology repositories.
- Establishing best practices for expressing interoperable ontology work in open
registries/repositories.
- Enabling and facilitating the development of common services to support the
repository and to extend the capabilities available to providers, users,
and developers who use the repository.
We believe that creating this kind of infrastructure will facilitate the
emerging Semantic Web.
This Communiqué was reviewed, collaboratively edited,
finalized and adopted by individuals present at the Ontology Summit 2008.
Endorsed by:
The above Communiqué has been endorsed by the individuals listed
below. Please note that these people made their endorsements as individuals and
not as representatives of the organizations they are affiliated with.
<Name, Affiliation>
_____________________________________________
Dr. Leo Obrst
The MITRE Corporation, Information Semantics
Information Discovery & Understanding, Command and Control Center
7515 Colshire Drive, M/S H305
McLean, VA 22102-7508, USA
lobrst@xxxxxxxxx
Voice: 703-983-6770
Fax: 703-983-1379