My experience designing shallow ontologies for open data on the Semantic Web
In supporting the scientific publishing task force under the W3C HCLS group, I need to design a shallow ontology that lets people self-publish experiment data as a single unit in a semantic format. Related information such as project, protocol, products used, researcher, and research group should also be covered by the ontology. The main use of the ontology is to make searching and sharing data on the web more effective.
I have surveyed almost all of the related ontologies I could find, because it would be good to reuse as many existing classes and properties as possible. However, I have found that very little from the existing ontologies can be reused to build the ontology needed for the project. Here are my general observations:
(1) Existing classes may have names similar to the class I need, but they usually either have no properties defined yet or have quite different properties.
(2) Existing properties, especially those of type ObjectProperty, may have similar definitions in words, but they don't have the range (i.e. the class or data type of the value) that the properties I need require.
(3) Deep ontologies in several scientific domains are quite comprehensive, but their terms may have a narrower scope (e.g. limited to one scientific domain) or sit so deep in a hierarchy that they can't be used effectively for web search applications.
(4) When describing an object like an experiment, there is actually a spectrum of needs. At one end of the spectrum, a simple description at a coarse level may already satisfy the need (as in a web search application), while at the other end high granularity is required.
To illustrate my points, I provide a few examples below. The class I need and its properties are listed first, followed by the main related classes and properties I have considered reusing. More examples can be added if this becomes helpful to our discussion.
Example 1:
Class spe:Experiment
Definition: A single experiment, usually done by a defined procedure, including but not limited to scientific experiments. An experiment may start from a hypothesis, and if so, the conclusion should state the results in relation to the hypothesis, e.g. rejected, supported, etc. This allows a search engine to discover what has happened to which hypothesis. Main concepts such as proteins or genes can be identified by URI in a list, so that one can unambiguously search for all the work that has been done on any specific concept. The experiment description should uniquely identify the protocols, additional tools, and references, which serve as important links for effective discovery of related information.
Properties:
dc:title (experiment name)
dc:type (type of experiment)
dc:subject (related disciplines like biology, medicine, chemistry, computer, software, etc., using keywords or key phrases; may be chosen from controlled vocabularies)
spe:associatedProject (project that the experiment is part of)
spe:introduction
spe:hypothesis (hypothesis for the experiment if any)
spe:procedure (specific to this experiment, may use protocols for some of the steps)
spe:protocolUsed (protocols used in the procedure, by URI)
spe:productUsed (critical tools that are used in the procedure but not covered in any protocol used, by URI)
spe:data
spe:dataLink (URI to additional data source that can't be represented here)
spe:result (result of the experiment)
spe:conclusion (articulate if the hypothesis is supported or rejected)
spe:discussion
spe:mainConcept (main concepts, each uniquely identified by URI)
spe:conductedBy (who conducts the experiment, by URI)
spe:PI (principal investigator for the experiment, by URI)
boon:startTime (experiment start time)
boon:endTime (experiment end time)
boon:status (an experiment may be published while it's still in progress)
spe:publishedIn (publications using the experiment's data and results, by URI)
dc:references (publications this experiment refers to, by URI)
dc:isReferencedBy (publications that make reference to this experiment, by URI)
boon:fundingSource
dc:description
dc:publisher
dc:license
dc:rights
boon:createdBy (someone who publishes the experiment, not necessarily the experimenter)
boon:createTime
boon:updatedBy
boon:updateTime
boon:altwebpage (related or alternative web page, URL)
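As a rough illustration of how these properties could support concept-based search, the sketch below stores a few experiment descriptions as plain (subject, property, value) triples and looks up experiments by spe:mainConcept. It is only a sketch under stated assumptions: the example.org URIs, the titles, and the helper name experiments_about are all made up for illustration, and a real deployment would more likely use an RDF store and a query language than Python tuples.

```python
# Each statement is a (subject, property, value) triple.
# All example.org URIs and titles are invented for illustration.
triples = [
    ("http://example.org/exp/1", "rdf:type", "spe:Experiment"),
    ("http://example.org/exp/1", "dc:title", "Example experiment one"),
    ("http://example.org/exp/1", "spe:mainConcept", "http://example.org/concept/proteinA"),
    ("http://example.org/exp/1", "spe:conclusion", "hypothesis supported"),
    ("http://example.org/exp/2", "rdf:type", "spe:Experiment"),
    ("http://example.org/exp/2", "dc:title", "Example experiment two"),
    ("http://example.org/exp/2", "spe:mainConcept", "http://example.org/concept/geneB"),
    ("http://example.org/exp/2", "spe:mainConcept", "http://example.org/concept/proteinA"),
    ("http://example.org/exp/3", "rdf:type", "spe:Experiment"),
    ("http://example.org/exp/3", "dc:title", "Example experiment three"),
    ("http://example.org/exp/3", "spe:mainConcept", "http://example.org/concept/geneB"),
]


def experiments_about(triples, concept_uri):
    """Return the sorted URIs of spe:Experiment instances whose
    spe:mainConcept is exactly concept_uri (hypothetical helper)."""
    experiments = {
        s for s, p, o in triples
        if p == "rdf:type" and o == "spe:Experiment"
    }
    matches = {
        s for s, p, o in triples
        if p == "spe:mainConcept" and o == concept_uri and s in experiments
    }
    return sorted(matches)


if __name__ == "__main__":
    for uri in experiments_about(triples, "http://example.org/concept/proteinA"):
        print(uri)
```

Because the concept is matched by URI rather than by keyword, the lookup does not depend on how each publisher happens to spell the name of the protein or gene.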
Examples of related classes in other existing ontologies:
OBI:study – a term similar to experiment, but its definition is limited to biomedical studies, and no property is defined for the class. Not the same as spe:Experiment.
EXPO:ScientificExperiment – limited to the science domain; not clear what properties it has.
Examples of related properties in other existing ontologies:
OBI:hypothesis – same definition as spe:hypothesis, but domain and range are not defined in the current version. Not appropriate for now, but may be reused after domain and range are defined.
OBI:protocol – only the term exists at this point; no property is defined for this class. Not appropriate.
EXPO:ResearchHypothesis – subclass of ExperimentalHypothesis, but no property is defined; not clear in what form the hypothesis is expressed.
EXPO:ExperimentalProtocol – a class without any property defined.
EXPO:ExperimentalResults – a class without any property defined.
Example 2:
Class boon:Person
Description: Any person, particularly a researcher.
Properties:
boon:name (full name)
boon:salutation (Dr., Mr., Ms., etc.)
dc:type (functional category, such as research, development, production, business, etc.)
dc:subject (related disciplines like biology, medicine, chemistry, software, etc.)
boon:jobTitle (job title)
boon:role (job function, etc.)
boon:expertise (skill set, experience, etc.)
boon:interest (professional interest areas)
boon:currentProject (current project the person has)
boon:pastProject (past projects the person did)
spe:hasPublication (publications the person has)
boon:inGroup (group the person is in)
boon:inOrganization (organization the person is in)
boon:workContact (work contact info)
boon:bioLink (URL of biography)
boon:altwebpage
boon:createTime
boon:updateTime
boon:webpage
dc:rights
Examples of related classes and properties:
FOAF:Person – similar, but its properties are quite different. It also lacks some important properties that a researcher (a Person) usually has, including expertise, role, group, organization, full contact info, etc.
FOAF:currentProject – range is owl:Thing, which is too broad. A researcher should have a current project pointing to an existing Project instance.
FOAF:interest – range is foaf:Document. A literal or string should be good enough.
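The range comparisons above can be sketched as a small check. This is a minimal illustration, not a real ontology tool: the declared ranges are the ones reported in this message, while the desired ranges (boon:Project, xsd:string) and the function name reuse_verdict are assumptions chosen only to make the comparison concrete.

```python
# Declared ranges of candidate properties, as described in this message.
# None means that no range is declared.
DECLARED_RANGES = {
    "foaf:currentProject": "owl:Thing",
    "foaf:interest": "foaf:Document",
    "obi:hypothesis": None,
}

# Ranges wanted by the new ontology. These names are assumptions
# made for illustration only.
DESIRED_RANGES = {
    "foaf:currentProject": "boon:Project",
    "foaf:interest": "xsd:string",
    "obi:hypothesis": "xsd:string",
}


def reuse_verdict(declared, desired):
    """Classify a candidate property by comparing its declared range
    with the range the new ontology needs (hypothetical helper)."""
    if declared is None:
        return "range undefined"
    if declared == desired:
        return "reusable"
    if declared == "owl:Thing":
        return "range too broad"
    return "range mismatch"


if __name__ == "__main__":
    for prop, declared in DECLARED_RANGES.items():
        print(prop, "->", reuse_verdict(declared, DESIRED_RANGES[prop]))
```

The check is deliberately strict: it treats any difference in range as a reason not to reuse a property, and it does not consider subclass relationships between ranges.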
Bill Andersen wrote:
On Jan 22, 2007, at 14:58 , AJ Chen wrote:
Unfortunately, I ended up to redefine everything except for a
few common terms from dublin core.
AJ,
Can you say why this was? The answer would be illuminating.
.bill
_________________________________________________________________
Msg Archives: http://ontolog.cim3.net/forum/ontology-summit/
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontology-summit/
Unsubscribe: mailto:ontology-summit-leave@xxxxxxxxxxxxxxxx
Community Files: http://ontolog.cim3.net/file/work/OntologySummit2007/
Community Wiki: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007
Community Portal: http://ontolog.cim3.net/