Track 4 - Large-scale Applications Track
Co-Champions: SteveRay & TrishWhetzel
Mission Statement:
This track will help to ground the discussions in the other tracks and
bring key challenges to light by describing current large-scale systems
and systems of systems that either use, or could use, ontologies in
their deployment. "Large-scale" can mean either very large data sets,
very complex data sets, federated systems, highly distributed systems,
or real-time, continuous data systems. Examples of large data sets
might include scientific observations and studies; complex data sets
could be technical data packages for manufactured products or
electronic health records; federated systems could include information
sharing to combat terrorism; highly distributed systems include
the smart electrical grid (aka Smart Grid); and real-time
systems include network management systems. Of course, some big systems
might include all five aspects.
Teleconferences
Date | Title | Chairs | Panelists
2012_02_16 | Track-4: "Large-Scale Domain Applications - I: Energy, Government and Geography" | SteveRay & TrishWhetzel | AndrewCrapo, KrzysztofJanowicz, BruceBauman, MillsDavis
2012_03_08 | Track-4: "Large-Scale Domain Applications - II: Biomedical, earth & environmental science & engineering" | TrishWhetzel & SteveRay | DavidPrice, MikeKellen, DamianGessler, BlazejBulka, IlyaZaslavsky, LinePouchard
Track 4 - Large-scale Applications
Synthesis
In implemented systems, ontologies are...
- Strong for:
  - Supporting change and aggregation
  - Enabling community aggregation and annotation
  - Automated data ingestion
  - Data validation
  - Ensuring consistency of terms across many data sets (distributed systems)
  - Supporting reasoning
  - Self-describing systems
  - Systems with many complex constraints, rules, and laws that change frequently (dynamically changing systems)
  - Data mining / semantic signature extraction
  - Rapid system building
- Weak for:
  - Being understandable by software engineers and customers
  - Query performance (compared to relational databases)
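As an illustration of the "data validation" and "consistency of terms across many data sets" points above, the following is a minimal sketch in plain Python, not an ontology toolkit. The vocabulary, the field names, and the helper functions `build_lookup` and `validate_fields` are all hypothetical; a real system would more likely load its terms from a shared ontology or vocabulary service.

```python
# Hypothetical shared vocabulary: canonical term -> accepted synonyms.
# In practice this would come from a published ontology, not a literal dict.
VOCABULARY = {
    "temperature": {"temp", "air_temperature"},
    "pressure": {"barometric_pressure"},
}


def build_lookup(vocabulary):
    """Map every canonical term and synonym (lower-cased) to its canonical term."""
    lookup = {}
    for canonical, synonyms in vocabulary.items():
        lookup[canonical.lower()] = canonical
        for synonym in synonyms:
            lookup[synonym.lower()] = canonical
    return lookup


def validate_fields(fields, lookup):
    """Split one data set's field names into normalized terms and unknown terms."""
    normalized = {}
    unknown = []
    for name in fields:
        canonical = lookup.get(name.strip().lower())
        if canonical is None:
            unknown.append(name)  # not in the shared vocabulary: flag for review
        else:
            normalized[name] = canonical
    return normalized, unknown


lookup = build_lookup(VOCABULARY)
normalized, unknown = validate_fields(["Temp", "pressure", "humidity"], lookup)
print(normalized)
print(unknown)
```

Running the same check against every contributing data set gives a simple, automatable gate during ingestion: fields that map to a canonical term are normalized, and anything else is reported back rather than silently accepted.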
Needs:
- Need better standards for common elements:
  - Datatypes
  - Ontology patterns (e.g. whole/part patterns)
- Collect ontological primitives from observation data
- Need repositories
  - Repositories of ontological patterns could be more useful than repositories of ontologies
- Need industrial-strength semantic services resident in the cloud
- Need better visualization tools and approaches
- Need better tools to help interpret legacy systems and transform them into semantic systems
- Need to establish feedback mechanisms from end users to ontology designers, directly from the point of use
Recommendations:
- Expose users to SKOS semantics; use more complicated constructs only on the back end, if necessary
- Look for the 80-20 rule of semantic development
- Use well-defined and narrow use cases to demonstrate the benefits of semantic approaches
- Having explicit vocabularies (classifiers) is a must in a distributed system
- The community should be included in the development and evolution of vocabularies
- It is critical to capture and evolve domain knowledge in a form that the community is comfortable with
- Transition from implicit domain knowledge to explicit encoding requires community consensus, and an organization to manage that consensus
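To make the first recommendation concrete, here is a rough sketch of what "SKOS-level" semantics can look like from a user's point of view: concepts with a preferred label, alternative labels, and a broader concept, mirroring the SKOS properties `skos:prefLabel`, `skos:altLabel` and `skos:broader`. It uses plain Python rather than an RDF library, and the `Concept` class, the `ex:` identifiers and both helper functions are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A SKOS-like concept (hypothetical in-memory form, not an RDF model)."""
    uri: str
    pref_label: str                                       # cf. skos:prefLabel
    alt_labels: List[str] = field(default_factory=list)   # cf. skos:altLabel
    broader: Optional[str] = None                         # cf. skos:broader (URI)


def find_by_label(concepts: Dict[str, Concept], label: str) -> Optional[Concept]:
    """Resolve a user-supplied label (preferred or alternative) to a concept."""
    wanted = label.strip().lower()
    for concept in concepts.values():
        labels = [concept.pref_label] + concept.alt_labels
        if wanted in (l.lower() for l in labels):
            return concept
    return None


def broader_chain(concepts: Dict[str, Concept], uri: str) -> List[str]:
    """Follow broader links upward, stopping at the top or on a cycle."""
    chain = []
    seen = {uri}
    current = concepts[uri].broader
    while current is not None and current in concepts and current not in seen:
        chain.append(current)
        seen.add(current)
        current = concepts[current].broader
    return chain


# Hypothetical example vocabulary.
concepts = {
    c.uri: c
    for c in [
        Concept("ex:Device", "Device"),
        Concept("ex:Sensor", "Sensor", ["Detector"], broader="ex:Device"),
        Concept("ex:Thermometer", "Thermometer", ["Temp sensor"], broader="ex:Sensor"),
    ]
}

hit = find_by_label(concepts, "temp sensor")
print(hit.uri, broader_chain(concepts, hit.uri))
```

Labels and a broader/narrower hierarchy are usually enough for browsing and tagging; richer axioms, if needed, can stay behind this interface on the back end.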
Other Observations / Lessons learned:
- UML to OWL is a common requirement for legacy systems
  - Starting from scratch is rare
- Ontology patterns are very helpful, and encourage model reuse
- Semantic techniques work best when not compromised by implementation tradeoffs
- Semantic methods are faster to implement and easier to maintain
- Semantic approaches are particularly suited to systems with many complex constraints, rules, and laws that change frequently
- Incremental implementation is possible through federation of datastores
- Ontologies are not always applied to enable reasoners; sometimes they serve just as a more rigorous data modeling approach
- Engineers turned ontologists often don't have the necessary background/skills
- Existing infrastructure supports traditional software development far better than large-scale ontology development
- There are many ontologies of dubious quality
- Service-oriented architectures allow separation of code and ontology updates
- Reasoner and query engine performance is highly dependent upon the exact formulation of rules and queries
- No single technology/tool currently provides the best solution across all large system use cases
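The observation about query formulation can be illustrated with a toy example. The sketch below is a deliberately naive triple-pattern matcher in plain Python, not a real query engine; the data, the `?x` variable convention, and the `match` and `evaluate` helpers are assumptions for illustration. It evaluates the same two patterns in two different orders and counts the intermediate bindings each order produces. Real engines often reorder patterns themselves, so the effect in practice depends on the optimizer.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; None on failure."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check consistency with an earlier binding.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def evaluate(patterns, triples):
    """Naive left-to-right join; returns (solutions, intermediate bindings produced)."""
    bindings = [{}]
    produced = 0
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        produced += len(next_bindings)
        bindings = next_bindings
    return bindings, produced


# Hypothetical data: 200 sensors, only one of which is located in Room1.
triples = [(f"s{i}", "type", "Sensor") for i in range(200)]
triples.append(("s7", "locatedIn", "Room1"))

broad_first = [("?x", "type", "Sensor"), ("?x", "locatedIn", "Room1")]
selective_first = [("?x", "locatedIn", "Room1"), ("?x", "type", "Sensor")]

solutions_a, produced_a = evaluate(broad_first, triples)
solutions_b, produced_b = evaluate(selective_first, triples)
print(solutions_a == solutions_b, produced_a, produced_b)
```

Both orders return the same answer, but putting the selective pattern first keeps the intermediate result small, which is one reason that logically equivalent formulations of a rule or query can perform very differently.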
--
maintained by the Track-4 champions: SteveRay & TrishWhetzel ... please do not edit