Ontology for Big Systems

Track 4 - Large-scale Applications

Track Co-Champions:  SteveRay & TrishWhetzel

Mission Statement: This track will help to ground the discussions in the other tracks and bring key challenges to light by describing current large-scale systems and systems of systems that either use, or could use, ontologies in their deployment. "Large-scale" can mean any of the following: very large data sets, very complex data sets, federated systems, highly distributed systems, or real-time, continuous data systems. Examples of large data sets include scientific observations and studies; complex data sets include technical data packages for manufactured products and electronic health records; federated systems include information sharing to combat terrorism; highly distributed systems include the smart electrical grid (aka Smart Grid); and real-time systems include network management systems. Of course, some big systems might include all five aspects.
 
Teleconferences

Date Title Chairs Panelists
2012_02_16 Track-4: "Large-Scale Domain Applications - I: Energy, Government and Geography" SteveRay & TrishWhetzel AndrewCrapo, KrzysztofJanowicz, BruceBauman, MillsDavis
2012_03_08 Track-4: "Large-Scale Domain Applications - II: Biomedical, earth & environmental science & engineering" TrishWhetzel & SteveRay DavidPrice, MikeKellen, DamianGessler, BlazejBulka, IlyaZaslavsky, LinePouchard
 

Track 4 - Large-scale Application Synthesis


In implemented systems, ontologies are...   

  • Strong for:   
    • Supporting change and aggregation   
    • Enabling community aggregation and annotation   
    • Automated data ingestion   
    • Data validation   
    • Ensuring consistency of terms across many data sets (Distributed systems)   
    • Supporting reasoning   
    • Self-describing systems   
    • Systems with many complex constraints, rules, laws, with frequent changes (Dynamically changing systems)   
    • Data mining / semantic signature extraction   
    • Rapid system building   
  • Weak for:   
    • Being understandable by software engineers and customers   
    • Query performance (compared to relational databases)   
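The "data validation" and "consistency of terms across many data sets" strengths can be illustrated with a minimal sketch. It assumes a tiny shared vocabulary held as a plain dictionary; the terms, synonyms, and function names are hypothetical and stand in for whatever ontology or vocabulary service a real system would use.

```python
# Hypothetical shared vocabulary: preferred term -> accepted synonyms.
VOCABULARY = {
    "temperature": {"temp", "air_temperature"},
    "pressure": {"press", "barometric_pressure"},
}


def build_lookup(vocabulary):
    """Map every preferred term and synonym to its preferred term."""
    lookup = {}
    for preferred, synonyms in vocabulary.items():
        lookup[preferred] = preferred
        for synonym in synonyms:
            lookup[synonym] = preferred
    return lookup


def normalize_record(record, lookup):
    """Return (normalized_record, unknown_keys) for one data record.

    Keys are matched case-insensitively against the vocabulary; keys
    that the vocabulary does not cover are reported, not silently kept.
    """
    normalized, unknown = {}, []
    for key, value in record.items():
        term = lookup.get(key.lower())
        if term is None:
            unknown.append(key)
        else:
            normalized[term] = value
    return normalized, unknown


LOOKUP = build_lookup(VOCABULARY)
```

Records from different data sets that say "Temp" or "air_temperature" would both be normalized to "temperature", while an uncovered key such as "humidity" would be flagged for review.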

Needs:   

  • Need better standards for common elements:   
    • Datatypes   
    • Ontology patterns (e.g. whole/part patterns)   
    • Collect ontological primitives from observation data   
  • Need repositories   
    • Repositories of ontological patterns could be more useful than repositories of ontologies   
  • Need industrial-strength semantic services resident in the cloud   
  • Need better visualization tools and approaches   
  • Need better tools to help interpret legacy systems and transform them into semantic systems   
  • Need to establish feedback mechanisms from end users to ontology designers directly from point of use.   
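As one illustration of the whole/part pattern mentioned under ontology patterns, the sketch below treats part-of as a transitive relation and derives indirect wholes from direct assertions. The example parts and the function name are assumptions made for illustration; a real system would typically state the pattern in an ontology language and leave the inference to a reasoner.

```python
from collections import defaultdict

# Hypothetical direct part-of assertions: (part, whole).
DIRECT_PART_OF = [
    ("piston", "engine"),
    ("engine", "car"),
    ("wheel", "car"),
]


def wholes_of(part, assertions):
    """Return every whole that `part` belongs to, directly or transitively."""
    parents = defaultdict(set)
    for p, w in assertions:
        parents[p].add(w)

    found, stack = set(), [part]
    while stack:
        current = stack.pop()
        for whole in parents.get(current, ()):
            if whole not in found:  # also guards against cyclic assertions
                found.add(whole)
                stack.append(whole)
    return found
```

Because the pattern is reusable, the same function works unchanged for any domain that records direct part-of pairs.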

Recommendations:   

  • Expose users to SKOS semantics; use more complicated constructs only on back end if necessary.   
  • Apply the 80-20 rule to semantic development: look for the small share of modeling effort that delivers most of the benefit   
  • Use well-defined, narrow use cases to demonstrate the benefits of semantic approaches   
  • Having explicit vocabularies (classifiers) is a must in a distributed system   
  • Community should be included in the development and evolution of vocabularies   
  • It is critical to capture and evolve domain knowledge in a form that the community is comfortable with   
  • Transition from implicit domain knowledge to explicit encoding requires community consensus - and an organization to manage the consensus   
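The recommendation to expose users to SKOS-level semantics can be sketched as follows. The field names mirror the SKOS properties skos:prefLabel, skos:altLabel, and skos:broader, but the concept identifiers, labels, and helper functions are hypothetical, and the scheme is held in a plain dictionary instead of an RDF store.

```python
# Hypothetical concept scheme using SKOS-style fields.
CONCEPTS = {
    "ex:energy": {
        "prefLabel": "Energy", "altLabel": [], "broader": None,
    },
    "ex:electricity": {
        "prefLabel": "Electricity", "altLabel": ["Electric power"],
        "broader": "ex:energy",
    },
    "ex:smartGrid": {
        "prefLabel": "Smart grid", "altLabel": ["Smart electrical grid"],
        "broader": "ex:electricity",
    },
}


def breadcrumb(concept_id, concepts):
    """Follow broader links upward and return a user-facing label path."""
    labels, seen = [], set()
    current = concept_id
    while current is not None and current not in seen:
        seen.add(current)  # stop if the broader links form a cycle
        labels.append(concepts[current]["prefLabel"])
        current = concepts[current]["broader"]
    return " > ".join(reversed(labels))


def find_by_label(text, concepts):
    """Return ids of concepts whose preferred or alternate label matches."""
    needle = text.strip().lower()
    return [
        cid for cid, concept in concepts.items()
        if needle in (label.lower() for label in
                      [concept["prefLabel"], *concept["altLabel"]])
    ]
```

End users see only labels and a broader/narrower hierarchy; any richer constructs can stay on the back end.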

Other Observations / Lessons learned:   

  • UML to OWL is a common requirement for legacy systems   
    • Starting from scratch is rare.   
  • Ontology patterns are very helpful, and encourage model reuse   
  • Semantic techniques work best when not compromised by implementation tradeoffs   
  • Semantic methods are faster to implement and easier to maintain   
  • Semantic approaches are particularly suited to systems with many complex constraints, rules, laws, with frequent changes   
  • Incremental implementation is possible through federation of datastores   
  • Ontologies are not always applied to enable reasoners - sometimes just as a more rigorous data modeling approach   
  • Engineers-turned-ontologists often lack the necessary background and skills   
  • Existing infrastructure supports traditional software development far better than large-scale ontology development   
  • There are many ontologies of dubious quality   
  • Service-oriented architectures allow separation of code and ontology updates   
  • Reasoner and query engine performance is highly dependent upon the exact formulation of rules and queries   
  • No single technology/tool currently provides the best solution across all large system use cases   
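The observation that performance depends on the exact formulation of queries can be illustrated with a toy triple-pattern evaluator. This is a deliberately naive sketch with no optimizer and no indexes, and it is not how any particular reasoner or query engine works; the data, the pattern syntax ("?" marks a variable), and the function names are all assumptions.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so `pattern` matches `triple`; None on failure."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against its existing value.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def evaluate(patterns, triples):
    """Evaluate patterns left to right; return (solutions, comparisons)."""
    bindings, comparisons = [{}], 0
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:  # full scan per intermediate binding
                comparisons += 1
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings, comparisons


# Hypothetical data: fifty sensors, only one with a known location.
TRIPLES = [(f"sensor{i}", "type", "Sensor") for i in range(50)]
TRIPLES.append(("sensor7", "locatedIn", "substationA"))

BROAD_FIRST = [("?s", "type", "Sensor"), ("?s", "locatedIn", "substationA")]
SELECTIVE_FIRST = [("?s", "locatedIn", "substationA"), ("?s", "type", "Sensor")]

slow_solutions, slow_cost = evaluate(BROAD_FIRST, TRIPLES)
fast_solutions, fast_cost = evaluate(SELECTIVE_FIRST, TRIPLES)
```

Both orderings return the same single solution, but in this toy evaluator the broad-first ordering performs 2601 comparisons against 102 for the selective-first ordering, since the first pattern determines how many intermediate bindings the second must process.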
 --
maintained by the Track-4 champions: SteveRay & TrishWhetzel ... please do not edit