Alexander Garcia    (207H)

University of Bremen
Germany

email: cagarcia-at-uni-bremen.de    (207I)

see: http://www.alexandergarcia.name    (207J)

I am currently at the University of Bremen. I have previously worked on biomedical ontologies and semantic web technology within the biomedical domain. My areas of interest have more to do with “how to use ontologies” than with how to develop them. So far, however, I have mostly been involved in the actual “how to develop them”. I am also interested in semantic web technology, cognitive support and systems architecture.    (1JQW)

Something for the Ontology repository initiative.    (1LDZ)

As the Semantic Web (SW) envisions a metadata-rich Web in which human-readable content has machine-understandable semantics, an increasing number of OWL ontologies [1] have been developed in response to those knowledge representation requirements. Wang et al. collected 1275 files, both OWL and RDF schemas, in 2005; a more recent count, based on web crawling, yielded over 6000 validated OWL ontologies (Backer et al., unpublished data); in the same vein, Swoogle [2] hosts 2,563,125 Semantic Web Documents (SWD) [3]. These growing numbers, which reflect the intrinsic need of the SW for ontologies, have fostered a number of research projects aimed at supporting reusability, better modularization, and intelligent storage and retrieval of the encoded knowledge. To this end, the design and development of an agreed-upon metadata vocabulary for describing ontologies is critical. Repositories should be able to facilitate not only the discovery of reusable components, whether entire ontologies or just portions of them, but also interoperability across repositories. In a recent effort to unify the description of ontologies, the Ontology Metadata Vocabulary Consortium [4] proposed a set of descriptors that follows the principles of the Dublin Core. This is a step in the right direction, as most ontologies exist without any additional information in the form of metadata. We advocate the use of the Ontology Metadata Vocabulary (OMV) [4] and support further refinements and extensions of this proposal. Although the extensions we propose for OMV are, in principle, domain independent, our main interest lies in supporting repositories with a particular focus on supportive applications for elderly and disabled people. 
As part of the OASIS project (Open architecture for Accessible Services Integration and Standardization) [5], we are currently developing a repository for ontologies that describe spatio-temporal scenarios as well as medical and technological information related to elderly and disabled populations, i.e. users with special needs. The repository is intended to provide structured access to, and easy-to-extend descriptions of, the ontologies it hosts. The following principles are important when using metadata for structuring ontology repositories: i) ontology standards should be kept intact; ii) the metadata core is connected to various meta-descriptions through alignments (mediators); iii) meta-descriptions structure specific parts of knowledge; iv) meta-descriptions need to support query languages and reasoning; v) meta-descriptions may themselves be ontologies.    (1LE0)
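
The principles above can be pictured with a minimal sketch. It is only one possible reading, not the OASIS repository design: the class names, the field names (which are illustrative and not actual OMV terms), the `spatial_view` mediator and the example URIs are all assumptions made for this illustration. The core record is immutable (principle i), mediators derive meta-descriptions from it (principles ii and iii), and a simple attribute match stands in for a real query language (principle iv).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CoreMetadata:
    """Dublin-Core-like core record. Field names are illustrative, not OMV terms."""
    uri: str
    name: str
    ontology_language: str
    creator: str
    keywords: tuple = ()


# A meta-description is a view over the core record; a mediator produces it.
MetaDescription = Dict[str, object]
Mediator = Callable[[CoreMetadata], MetaDescription]


def spatial_view(core: CoreMetadata) -> MetaDescription:
    # Hypothetical mediator: reports whether an ontology covers spatial scenarios.
    return {"uri": core.uri, "covers_space": "spatial" in core.keywords}


@dataclass
class Repository:
    entries: Dict[str, CoreMetadata] = field(default_factory=dict)
    mediators: Dict[str, Mediator] = field(default_factory=dict)

    def register(self, core: CoreMetadata) -> None:
        # The core record is stored unchanged; mediators only read it.
        self.entries[core.uri] = core

    def describe(self, uri: str, view: str) -> MetaDescription:
        return self.mediators[view](self.entries[uri])

    def query(self, view: str, **criteria) -> List[str]:
        # Return the URIs whose meta-description matches every criterion.
        hits = []
        for uri in self.entries:
            description = self.describe(uri, view)
            if all(description.get(k) == v for k, v in criteria.items()):
                hits.append(uri)
        return sorted(hits)


repo = Repository(mediators={"spatial": spatial_view})
repo.register(CoreMetadata("http://example.org/home", "Home layout",
                           "OWL", "A. Author", ("spatial",)))
repo.register(CoreMetadata("http://example.org/drugs", "Medication",
                           "OWL", "B. Author", ("medical",)))
matches = repo.query("spatial", covers_space=True)
```

In this sketch a new meta-description is added by registering another mediator, so the core metadata and the hosted ontologies stay untouched.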

Repositories, within the context of the SW, should offer more than just data storage. The Ontolog community, a virtual community of practice of ontology experts, discussed the matter and agreed that the purpose of an Open Ontology Repository (OOR) is to provide an architecture and an infrastructure that support: a) the creation, sharing, searching, and management of ontologies, and b) linkage to database and XML Schema structured data and documents [5]. There are currently several ontology repositories on the web; however, none of them complies with the requirements agreed upon during the last Ontolog Summit [6]. For instance, Swoogle provides a single entry point to several semantic web documents (ontologies), but it does not offer any validation, as there is no quality control over the exposed material; nor does it facilitate query or editing operations. Swoogle’s approach to finding ontologies is based on (sub)string search and link-based reference counting; once a document has been found, no further operation on it is supported. Swoogle also allows queries to be composed via a REST interface. OntoSelect [7] offers a similar approach; it presents the user with a basic overview of web-accessible ontologies. The collection can be browsed by: ontology name (derived from owl:Ontology/rdfs:comment); format (from the ontology URL); human language (from rdfs:label); and number of labels, classes, properties, or included ontologies (owl:imports). OntoSelect currently hosts 1530 ontologies. The TONES repository, developed as part of the TONES project [8], hosts 185 ontologies. It aims to provide a reasonable number of ontologies for testing purposes, with an emphasis on reasoning techniques. This repository also offers a REST interface for programmatic access. Ontologies can be selected and sorted by means of metrics for expressivity, class and property restrictions and axioms, logics, and individuals. A novel approach is provided by Rubin et al. [9] with Bioportal. 
Not only does it provide access to several ontologies, but it also facilitates online editing operations such as the annotation of ontologies in the form of marginal notes (currently available only for classes). In [10], a lightweight metadata ontology for the ontology repository of a multiagent system is presented. The ontology consists of four classes: Conceptualization, Ontology, Person, and Representation. The Ontology class is described by a title, version number, language, author, and textual description. The Person class defines the author of an ontology, while the Conceptualization class defines the abstract view on which the ontology is based. The Representation class specifies the encoding of Ontology, Person, and Conceptualization. This repository also supports the REST interface. Although existing ontology repositories aim to provide access to semantic web documents by means of similar query facilities, they diverge in the methods and techniques employed for gathering these documents and making them available; each of them interprets and uses metadata in a different manner. For instance, Swoogle defines three categories of metadata: (i) basic metadata, which considers the syntactic and semantic features of an ontology, (ii) relations, which consider the explicit semantics between individual ontologies, and (iii) analytical results such as SWO/SWDB classification, and ontologies [2]. Both TONES and OntoSelect also rely on structural metadata; however, they use only a subset of it. As Bioportal supports the involvement of communities of practice, it makes use not only of structural metadata but also of metadata describing how the community has engaged. For instance, it records who has defined a new relationship by means of a marginal note, which makes it possible to establish rankings of confidence.    (1LE1)
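
To make the notion of structural metadata concrete, the following minimal sketch derives the kind of descriptors listed above for OntoSelect (name from owl:Ontology/rdfs:comment, human languages from rdfs:label, and counts of labels, classes, properties and owl:imports) from a small RDF/XML document. It is not OntoSelect's actual implementation: the function name, the sample ontology and its example.org URIs are invented for illustration, and the sketch assumes RDF/XML in which classes and properties appear as typed elements such as `owl:Class`, so other serializations or `rdf:type` statements are not covered.

```python
import xml.etree.ElementTree as ET

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented toy ontology, used only to exercise the function below.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/care">
    <rdfs:comment>Toy ontology about assisted living</rdfs:comment>
    <owl:imports rdf:resource="http://example.org/time"/>
  </owl:Ontology>
  <owl:Class rdf:about="http://example.org/care#Person">
    <rdfs:label xml:lang="en">Person</rdfs:label>
    <rdfs:label xml:lang="de">Person</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/care#Device">
    <rdfs:label xml:lang="en">Device</rdfs:label>
  </owl:Class>
  <owl:ObjectProperty rdf:about="http://example.org/care#uses"/>
</rdf:RDF>
"""


def structural_metadata(rdf_xml: str) -> dict:
    """Collect simple structural descriptors from an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    # Ontology name, taken from the comment on the owl:Ontology header.
    comment = root.find(f"{{{OWL}}}Ontology/{{{RDFS}}}comment")
    labels = list(root.iter(f"{{{RDFS}}}label"))
    # Only object and datatype properties written as typed elements are counted.
    properties = [
        el
        for tag in ("ObjectProperty", "DatatypeProperty")
        for el in root.iter(f"{{{OWL}}}{tag}")
    ]
    return {
        "name": comment.text.strip() if comment is not None and comment.text else None,
        "languages": sorted({l.get(XML_LANG) for l in labels if l.get(XML_LANG)}),
        "labels": len(labels),
        "classes": len(list(root.iter(f"{{{OWL}}}Class"))),
        "properties": len(properties),
        "imports": len(list(root.iter(f"{{{OWL}}}imports"))),
    }


meta = structural_metadata(SAMPLE)
```

Descriptors of this kind can be computed from the ontology file alone, whereas community metadata such as marginal notes has to be recorded by the repository itself.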