This page, initially drafted by Gary Berg-Cross in the Fall of 2010, provides some basic information, pointers and links to explanations of Ontologies for beginners in this field.    (2JJI)

More focused discussions are planned for geo-spatial ontologies along with material to help develop such ontologies. As time permits these may be focused on specific efforts of the community including demonstrations, workshops and developmental efforts as part of the NSF Interop work.    (2JCP)

Practical & Modern Definitions    (2T92)

Different people attach different meanings to the field of "ontology" and to what a specific ontology is, but simply stated an ontology holds information about what categories exist in a domain, what properties they have, and how they are related to one another (Chandrasekaran et al. 1999). The term ontology has been exported from philosophy and is increasingly used by knowledge engineers to refer to a computational artifact or resource that makes explicit the elements (entities and relations) in a problem domain. This information science sense of ontology has grown in use since the late 1980s. While the commercial application of ontology remains to be fully exploited, efforts such as the Semantic Web have made it more widely known in the mainstream IT community. In practice "ontology" may refer to many different types of artifacts that are created by and used in different communities. They may represent entities and their relationships for several purposes, including annotating datasets, supporting natural language understanding, integrating information sources, enabling semantic interoperability, and serving as background knowledge in various applications. Some of these uses are more fully described below.    (2T8Z)

When ontologies are used in specific ways, such as enhancing IT systems, they have a focus and in practice cover only the needed portion of a domain. Thus, like other IT products, an actual production ontology is often derived from one or more Use Cases that help scope it. A Use Case makes clear the types of problems that an ontological artifact is designed to address. Within this scope, key concepts and the names they are given often provide a starting point for a new ontology or for enhancing, broadening and refining an existing one. Clearly vocabularies and ontologies are related, and some core aspects of this relationship are discussed below.    (2T3U)

Vocabularies, Specialized Terminologies, Theories and Ontologies    (2T8J)

Ontologies use a lexical vocabulary, so you can't have an ontology without a vocabulary in which to express its assertions. But processable ontologies are more than a simple collection of vocabulary terms, no matter how well the terms are documented in natural language. It is helpful to have vocabulary terms for distinguishing concepts like river, stream, creek, brook, channel, tributary and rivulet. These are related terms and one can generally distinguish them. When we have specialized vocabularies, such as in Science, we often call this a terminology. There, distinctions may be finer and more precise. We may start with a general concept and become more specific, as in a taxonomy or lattice. River is a term for a fairly general concept: a natural watercourse. Canal is a term for a waterway, but not a natural one. A river does have what are called principal features. It is usually freshwater, flowing towards an ocean, a lake, a sea, or another river (see http://miimr.com/1074795-River).    (2T8O)

There seems to be no generally agreed-on rule that defines what can be called a river, although size is involved in distinguishing it from other concepts. Small rivers may be called by several of the other special terms, including stream, creek and brook, as well as rill. Analysis like this refines a vocabulary, but it also reveals some underlying concepts which we might say can be organized as a theory.    (2T8L)

We may try to create an ontology from a vocabulary of terms like this, but to be an ontology it should be organized into an agreed-upon theory that is relevant to the subject matter. By this we mean that it is more than a surface description based on terms with intuitive meaning. To be a quality ontology it should be able to make meaningful statements about what exists in its focused domain. So river and stream ideas are organized along a hydrological theory of what brings collections of water into existence and what natural processes, such as downhill flow, they follow. The hydrological concepts are more basic and underlie the real world phenomena at the river-stream level. Thus water within a river (or other watercourse) is generally collected from precipitation through surface runoff, groundwater recharge, springs, and the seasonal release of stored water. Stored water may include what is held by dams but is usually from natural ice and snow packs (e.g., from glaciers). This theory specifies various classes of real objects (e.g. snow pack), processes (e.g. runoff), and relations (e.g. precipitates) that we assert apply among instances of such classes, as well as relationships among such classes and their instances. Our ontologies rely on a degree of scientific reality and as Burian & Trout put it:    (2T94)

Interestingly, when we apply the role of organizing theory to an IT-oriented ontology it reverses the classic philosophical relation between Science and Ontology. To Aristotle, Ontology was primary and explained phenomena in Science. There is still that activity, as discussed by Burian & Trout. However, for a practical ontology to solve problems, the "best" and most tractable Science theory is prior. Ontology uses theories that it can represent in representational languages to make processable statements about phenomena in domains of interest. Ontological Progress in Science - http://www.phil.vt.edu/Burian/OntProgFi.pdf    (2T96)

There is an additional, technical use of the term theory in ontology that can be confusing. The term 'theory' is used in logic to mean the deductive closure over a collection of propositions or assertions. Those propositions are usually called axioms, as in Euclid's geometry. With Euclid's five postulates and the process of deduction on them we define a particular geometry. It is still more complicated because in the ontological context a theory can also mean a model theory, which provides a basis for the semantics asserted in an ontology. The details of this are beyond the scope of the material covered here and are left to working ontologists. But the point to make is that a good ontology is able to express an organizing theory of concepts using a model-theoretic semantics that is instantiated in an ontological language. The model-theoretic semantics provides a principled way of interpreting the propositions asserted and the inferences derived.    (2T8P)
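
The logical sense of "theory" as a deductive closure can be illustrated with a small sketch. It assumes statements are plain strings and that each rule is a pair of premises and a conclusion; the statements and rules are hypothetical and are not taken from any published ontology. A real reasoner works over logical formulas with variables, so this is only a propositional approximation of the idea.

```python
# A minimal sketch of "theory as deductive closure". Statements are plain
# strings and rules are (premises, conclusion) pairs; all of them are
# hypothetical examples chosen for illustration.

def deductive_closure(axioms, rules):
    """Return every statement derivable from the axioms using the rules."""
    known = set(axioms)
    changed = True
    while changed:  # keep applying rules until nothing new is derived
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


axioms = {"Thames is a river", "every river is a watercourse"}
rules = [
    (("Thames is a river", "every river is a watercourse"),
     "Thames is a watercourse"),
    (("Thames is a watercourse",), "Thames holds flowing water"),
]

theory = deductive_closure(axioms, rules)
```

Here `theory` contains the two axioms plus everything that follows from them, which is the sense in which a set of axioms "defines" a theory.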

We can illustrate some ontological thinking that starts with a distinction between a stream class and a creek class. In a watercourse domain ontology we would need more than these 2 isolated terms or just a series of class words like them. We need some organizing theory that explains why they are different. Here is an example. In England a creek is an inlet or estuary, the part of a river that goes down towards the sea. This provides enough of a theory to specify a difference, since it relates a creek to a river and a sea concept. The English point to the existence of a part of a river that is called a creek based on organized hydrological and locational concepts. We can make this idea explicit in an ontology by making a commitment. This is an agreement to use a vocabulary term ("creek") in a way that is consistent with the theory specified by the ontology. In Australia, the USA and elsewhere, however, the term is used differently. A creek can be just a small river, so it is much the same as a brook or stream, although brooks may flow more slowly than streams. We would have to make these distinctions among the terms clear in our ontology. We would also have to explain why we use the term Gulf Stream, for example. This implies that there is a very general concept of a stream as water flowing. The fact that there are multiple meanings for a word like stream (e.g. land streams vs. non-land streams) is a key reason that ontologies need to be distinguished from superficial terminologies. The concepts and relations in each specific ontology will be tagged with a terminological name, but that associated name does not necessarily have the same meaning as the ordinary word or term in discourse. A good ontology will specify its meaning and choose appropriate terms (like land-stream) to avoid confusion.    (2T8M)
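
The idea of committing a word to a specific concept can be sketched as a lookup from a term plus a context of use to a concept name. The concept names (`TidalInlet`, `SmallRiver`, `LandStream`, `OceanCurrent`) and the context labels are assumptions made up for this illustration; an actual ontology would define each concept through its relations to other concepts rather than through a table.

```python
# Hypothetical sketch: one everyday word can commit to different concepts
# depending on the context of use. Concept names and context labels are
# invented for illustration only.
SENSES = {
    ("creek", "England"): "TidalInlet",
    ("creek", "USA"): "SmallRiver",
    ("creek", "Australia"): "SmallRiver",
    ("stream", "land"): "LandStream",
    ("stream", "ocean"): "OceanCurrent",  # e.g. the Gulf Stream
}


def concept_for(term, context):
    """Return the concept a term is committed to in the given context."""
    key = (term.lower(), context)
    if key not in SENSES:
        raise KeyError(f"no commitment recorded for {term!r} in {context!r}")
    return SENSES[key]
```

For example, `concept_for("creek", "England")` and `concept_for("creek", "USA")` return different concepts even though the word is the same, which is the distinction a superficial terminology would miss.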

Application of Ontologies    (2T3V)

There are many reasons to build an ontology analyzing domain knowledge. An overall goal is that people often want to share a common understanding of the structure of information among people or software agents. In this case an ontology is used to make explicit the semantics and knowledge contained within efforts such as software applications as well as within enterprises and business modeling of particular domains.    (2JDG)

As part of such an effort an ontology can be used to enable reuse of domain knowledge and to make domain assumptions explicit or to separate domain knowledge in a declarative form from the operational knowledge which can be implemented in software.    (2JDH)

The recent [ http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2011_Symposium Ontology Summit] at NIST provided an Application Framework which helps understand how ontologies are used. Four types of applications are proposed with ontologies providing a range of functions within these applications:    (2T3W)

1. Integration (usually via matching)    (2T4C)

Ontologies are typically used at runtime by application developers (whose job it is to write translators among the systems.)    (2T4D)

2. Decision Support and Automated Reasoning    (2T4H)

In this type of application, some automated inference or question answering is the primary functionality. Ontologies are used at runtime, typically by knowledge workers and application users, to provide proper inferences. Deduction from axioms may be provided by general theorem provers or by special, domain-specific reasoning "rules". Usually ontologies that support such reasoning are encoded in some form of logic, e.g., first-order logic.    (2T4I)

3. Semantic Augmentation    (2T4J)

4. Knowledge Management (KM)    (2T4L)

This type of application supports discovery & categorization of information resources. Ontologies are used to discover patterns in data by matching the classes, relations, and axioms of the ontology against the data. Ontologies may also be used to organize unstructured information resources by identifying classes and relations within the ontology with terms that appear in those resources. This makes geospatial ontology useful for discovering, connecting, integrating, aggregating and interpreting information of different kinds, from multiple sources, and at varying scales.    (2T4S)
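
As a rough illustration of organizing unstructured text by ontology classes, the sketch below tags a sentence with the classes whose labels occur in it. The label table and class names are hypothetical, and simple whole-word matching is assumed here in place of the linguistic processing a real system would need.

```python
import re

# Hypothetical mapping from natural-language labels to ontology class names.
LABELS = {
    "river": "River",
    "creek": "Creek",
    "snow pack": "SnowPack",
    "runoff": "Runoff",
}


def annotate(text):
    """Return the set of ontology classes whose label appears in the text."""
    lowered = text.lower()
    found = set()
    for label, cls in LABELS.items():
        # Whole-word match so that "river" does not match inside "driver".
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(cls)
    return found
```

Calling `annotate("Spring runoff from the snow pack feeds the river.")` tags the sentence with the runoff, snow pack and river classes, which could then be used for search, filtering or indexing.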

Across all these applications, functions that an ontology may provide include:    (2T4N)

1. Matching / mapping of concepts supporting such things as Integration Task or direct Matching Tasks    (2T3X)

2. Automated inference support for such tasks as Query Formulation, Enhancement or Rewriting    (2T3Y)

3. Classification which may be involved in Search, Filtering, Indexing, or Annotation tasks    (2T3Z)

4. Specification such as used in Configuration tasks and    (2T40)

5. Terminological Clarification which is required in Mediation tasks, Collaboration or Personalization of knowledge.    (2T41)

As part of the development and use of such applications there are several intended participants including:    (2T42)

  1. Ontology Author    (2T43)
  2. Data Author    (2T44)
  3. Application Developer    (2T45)
  4. Application User    (2T46)
  5. Knowledge Worker    (2T47)

Ontology Engineering    (2T48)

A major challenge for ontology as a science supporting such applications is to make explicit the intended meanings of terms in ways that are at once faithful to people in a domain and processable by computational systems. Formal ontologies represent an attempt to constrain their expression in order to allow for a concrete interpretation of the vocabulary and symbols used.    (2KKV)

Ontology engineering is a relatively new discipline assembling a set of tasks for the development of ontologies. These may be a foundational set or for a particular domain. Ontology engineering, like data and software engineering, involves a lifecycle starting with strategic views, going through the development activities of analysis and design, to building the ontology using a representational language and testing the final product - an ontology.    (2JDF)

The expectation of ontological engineering is that it will help solve such things as data and system interoperability problems that have at their base semantic (and pragmatic) issues. An example of this is the challenge of bridging between human understanding of business and technical terms and how these terms are used in software applications. For an introduction to ontology engineering see: http://en.wikipedia.org/wiki/Ontology_engineering    (2JDD)

While ontological engineering has many steps, the more recently publicized work has emphasized improved representations for ontological products using specially designed languages such as OWL (Web Ontology Language), developed as part of the Semantic Web effort. Ontologies are a core building block of the semantic technology stack of the Semantic Web effort. See http://en.wikipedia.org/wiki/Ontology_language and http://en.wikipedia.org/wiki/Semantic_Web    (2JDE)

Simple Steps to Building an Ontology    (2T8Q)

There are practices for Ontological Engineering as discussed above, but a simple way to start on ontologies by connecting them to vocabularies is outlined here.    (2T8R)

A start on analysis for developing an ontology may be to look at simple lexicons and/or controlled vocabularies. As Noy and McGuinness point out, an ontology defines a common vocabulary for researchers who need to share information in a domain. This and other steps are discussed in Ontology Development 101: A Guide to Creating Your First Ontology. A challenge is to take informal definitions and elevate them to machine-interpretable definitions of the basic domain concepts and the relations among them. If we are starting from scratch we may just begin, as data modelers often do, by listing terms that denote the entities, events, qualities, relationships, etc. in a domain. We saw something like this in the previous discussion of rivers. The same would apply in a "Tree" domain. We vaguely divide tree reality into various types and use terms like: tree, bush, shrub, sapling. Behind these we have a theory to organize these concepts:    (2T8S)

- A tree is a woody perennial plant, typically with a single stem or trunk growing to a considerable height and bearing lateral branches.    (2T8T)

One relation is that of type and subtype. These can be structured by developing taxonomies where terms are related hierarchically and can be given distinguishing properties. Such efforts involve a chosen ontology engineering methodology over the ontology lifecycle. Often abstract concepts (e.g. Role, Situation) are employed as organizing features to define new concepts. An ontology will often identify the relationships among the concepts and distinguish which concepts have instance properties. The design of a formal, complex domain ontology provides an overall conceptual structure of the domain. This typically identifies the domain's principal concrete concepts and their properties, and where concepts have named relationships with other concepts, like "aligned-with" or "near-to".    (2T91)
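
A type and subtype hierarchy of this kind can be sketched as a table from each concept to its parent, with subtype checks answered by walking up the chain. The particular hierarchy below (including treating a sapling as a kind of tree) is an assumption made for illustration, not an agreed botanical taxonomy.

```python
# Hypothetical taxonomy for the "Tree" domain: each concept maps to its parent.
PARENT = {
    "Tree": "WoodyPlant",
    "Shrub": "WoodyPlant",
    "Sapling": "Tree",
    "WoodyPlant": "Plant",
}


def ancestors(concept):
    """List the supertypes of a concept, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept, candidate):
    """True if concept is the candidate type or one of its subtypes."""
    return concept == candidate or candidate in ancestors(concept)
```

With this table, `is_a("Sapling", "Plant")` holds because the subtype relation is followed transitively, while `is_a("Shrub", "Tree")` does not.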

  1. Genus and differentia - one begins with the broadest genus containing the species to be defined, and divides the genus into two sub-genera by means of some differentia.
  2. Causes (Aristotle and others) - origin, material cause, formal cause, intentional cause
  3. Principal features - as previously noted, a river has what are called principal features. It is usually freshwater, flowing towards an ocean, a lake, a sea, or another river
  4. Functions - a river, for example, has a function or role in transportation
  5. Division and component parts - a tree has branches, roots, etc.; countries have cities
  6. Purposes or interests, etc.    (2T8V)

An ontology often references or includes foundational ontologies, and in a sense these provide supporting material. These may be more specific and/or more general, such as a top-level ontology (e.g. DOLCE http://en.wikipedia.org/wiki/Dolce).    (2JD9)

Building Complex Domain Ontologies    (2JCR)

A caution is that as one starts on an ontology there are limits one must recognize. As human conceptualizations about what exists in the world, all of our concepts may have fuzzy, overlapping, inconsistent and dynamic boundaries. This arises in part because our theories of reality have a limited scope and different degrees of formalization. In this sense all ontologies are incomplete simplifications. Some simplifications are so sketchy, with notable gaps and inconsistencies, that they can be called informal ontologies. By analogy, Newton's theory of gravitation may be considered formal but incomplete, since it formalized what was understood. It was a theory that was completely formalized in mathematics, but we know it models reality incompletely. A more complete theory that currently seems well established is quantum electrodynamics (QED). QED produces some of the most accurate physical predictions. However, since its mathematics is complex, Newtonian theory is often used as an adequate approximation for most human-scale phenomena (such as traveling by foot, boat, car or plane) and within the accuracy limits of ordinary measuring instruments.    (2T8N)

What we elevate to a formal ontology may provide detailed axioms and definitions to supplement an organizing theory and make it more understandable to humans. To do this we add comments to "explain" terms using natural language. Another distinction that some make between terminologies and formal ontologies is in terms of the use of logic to represent the previously mentioned model-theoretic semantics as part of definitions. If a formal definition in some form of logic is not required, then we have an informal vocabulary or terminology collection. These may be well understood by humans but are not processable by computer systems and are thus less useful in the IT concept of ontologies. A useful ontology is one that has a theory and vocabulary that are understandable to humans and yet formalized in a language that is capable of supporting automated deduction. In this view even detailed definitions, such as have been captured in data dictionaries, in comments on terms in data models and in enterprise models, are extended vocabulary-terminology collections and not ontologies, because the definitions are only stated in natural language. They may however be useful starting points that can be formalized. See http://ontolog.cim3.net/forum/ontolog-forum/2011-05/msg00000.html for a discussion and debate of this idea. More detail on methods from some approaches is at SOCoP/OntologyMethods    (2JDN)

More recently, a thrust into Linked Data and Linked Open Data has been emphasized by the Semantic Web effort, and recent SOCoP demonstration work has taken this direction by converting small samples of some U.S.-based open source geo data (USGS, Geonames, Census, Linked Geo Data such as Open Street Map, DBPedia, etc.) into RDF (see below for a discussion of RDF) and hosting these as SPARQL endpoint(s). These may be considered very focused use case efforts which resolve a small set of semantic issues.    (2JJJ)

Simple Tools - RDF and RDFS    (2JDJ)

One tool from Semantic Web work to start on the formalization of vocabularies is RDF (Resource Description Framework). RDF is an assertional language whose statements are made up of three terms. It was intended to be used to express propositions using precise formal vocabularies, particularly those specified using RDF Schema (RDFS), for access and use over the World Wide Web. In combination, RDF and RDFS were intended to provide a basic foundation for more advanced assertional languages with a similar purpose. Essentially RDF triples denote relations between pairs of objects. Often the triple is thought of as Subject, Verb (Relation), Object. Thus RDF can informally express Circle isa Shape. While RDF was originally designed as a metadata data model for web information "resources", a semi-formal method has grown around the RDF formalism to capture simple conceptual descriptions of information. This is supported by the use of the simple RDF schema language RDFS (Resource Description Framework Schema).    (2JJN)
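
The subject, relation, object shape of an RDF statement can be sketched with plain tuples. The `http://example.org/shapes#` names are placeholders invented for this example; the `rdf:type` and `rdfs:subClassOf` IRIs are the standard W3C ones. The informal "Circle isa Shape" is rendered here as a subclass statement, which is one possible reading of "isa". The helper writes the triples in N-Triples style, assuming every term is an IRI.

```python
# Triples as (subject, predicate, object) tuples of IRIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/shapes#"  # placeholder namespace for illustration

triples = [
    (EX + "Circle", RDFS_SUBCLASS, EX + "Shape"),  # "Circle isa Shape"
    (EX + "c1", RDF_TYPE, EX + "Circle"),          # c1 is a particular circle
]


def to_ntriples(triples):
    """Serialize IRI-only triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def objects_of(triples, subject, predicate):
    """Answer a simple pattern query: what is subject related to by predicate?"""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

A query such as `objects_of(triples, EX + "c1", RDF_TYPE)` returns the class that `c1` was asserted to belong to.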

RDFS offers a simple vocabulary to model class and property hierarchies and other basic schema primitives that can be referred to from RDF models. Thus RDFS provides a simple ontology that particular RDF documents may be checked against to determine semantic consistency.    (2JJO)
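
What a class hierarchy adds to plain triples can be sketched with two of the inference patterns that RDFS defines: subclass relations are transitive, and an instance of a class is also an instance of its superclasses. The sketch assumes abbreviated names such as `ex:Creek`, which are hypothetical, and it implements only these two patterns rather than the full RDFS semantics.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply subclass transitivity and type propagation until nothing new appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))  # A < B and B < C gives A < C
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))      # x : A and A < B gives x : B
        if new <= inferred:
            return inferred
        inferred |= new


# Hypothetical watercourse schema plus one instance.
graph = {
    ("ex:Creek", SUBCLASS, "ex:Stream"),
    ("ex:Stream", SUBCLASS, "ex:Watercourse"),
    ("ex:mill_creek", TYPE, "ex:Creek"),
}
closure = rdfs_closure(graph)
```

After the closure is computed, `ex:mill_creek` is also typed as a stream and as a watercourse, although only its creek type was asserted.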

See http://www.w3.org/RDF/ and http://en.wikipedia.org/wiki/Resource_Description_Framework    (2JDK)

Using WGS84 as a reference model, RDF has been applied by a W3C Semantic Web Interest Group (SWIG) to build namespaces to represent lat(itude), long(itude) and other information about spatially-located things.    (2JJP)

See http://www.w3.org/2003/01/geo/#status    (2JDM)
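
As a sketch of how the W3C geo namespace mentioned above can be used, the function below writes N-Triples style statements for a located thing using the `Point`, `lat` and `long` terms of the `wgs84_pos` vocabulary. The subject IRI `http://example.org/place/gauge1` and its coordinates are invented for illustration, and coordinate values are written as plain literals in decimal degrees.

```python
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def point_triples(subject_iri, lat, lon):
    """Return N-Triples lines describing a WGS84 point for the subject."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates outside WGS84 degree ranges")
    return [
        f"<{subject_iri}> <{RDF_TYPE}> <{GEO}Point> .",
        f'<{subject_iri}> <{GEO}lat> "{lat}" .',
        f'<{subject_iri}> <{GEO}long> "{lon}" .',
    ]


# Hypothetical stream gauge location used only as sample data.
lines = point_triples("http://example.org/place/gauge1", 38.9, -77.04)
```

Each returned line is one statement, so records from a tabular source could be converted one row at a time and concatenated into a single file.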