OntologySummit2012: (Track-4) "Large-scale domain applications" Synthesis (32DJ)
Mission Statement: (32DK)
This track will help to ground the discussions in the other tracks and bring key challenges to light by describing current large-scale systems and systems of systems that either use, or could use, ontologies in their deployment. "Large-scale" can mean very large data sets, very complex data sets, federated systems, highly distributed systems, or real-time, continuous data systems. Examples of large data sets include scientific observations and studies; complex data sets could be technical data packages for manufactured products or electronic health records; federated systems could include information sharing to combat terrorism; highly distributed systems include the smart electrical grid (aka Smart Grid); and real-time systems include network management systems. Of course, some big systems might include all five aspects. (32DZ)
see also: OntologySummit2012_Applications_CommunityInput (32EF)
In implemented systems, ontologies are... (386Y)
- Strong for: (386Z)
  - Supporting change and aggregation (3870)
  - Enabling community aggregation, annotation (3871)
  - Automated data ingestion (3872)
  - Data validation (3873)
  - Ensuring consistency of terms across many data sets (Distributed systems) (3874)
  - Supporting reasoning (3875)
  - Self describing systems (3876)
  - Systems with many complex constraints, rules, laws, with frequent changes (Dynamically changing systems) (3877)
  - Data mining / semantic signature extraction (3878)
  - Rapid system building (3879)
- Weak for: (387A)
Needs: (387D)
- Need better standards for common elements: (387E)
- Need repositories (387I)
- Repositories of ontological patterns could be more useful than repositories of ontologies (387J)
- Need industrial strength semantic services resident in the cloud (387K)
- Need better visualization tools and approaches (387L)
- Need better tools to help interpret legacy systems and transform them into semantic systems (387M)
- Need to establish feedback mechanisms from end users to ontology designers directly from point of use. (387N)
Recommendations: (387O)
- Look for the 80-20 rule of semantic development (387Q)
- Use well defined and narrow use cases to demonstrate benefits of semantic approaches (387R)
- Having explicit vocabularies (classifiers) is a must in a distributed system (387S)
- Community should be included in the development and evolution of vocabularies (387T)
- It is critical to capture and evolve domain knowledge in a form that the community is comfortable with (387U)
- Transition from implicit domain knowledge to explicit encoding requires community consensus - and an organization to manage the consensus (387V)
- Some have recommended exposing users to SKOS semantics and using more complicated constructs only on the back end, if necessary. (387P)
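The recommendations on explicit vocabularies and SKOS-level semantics can be illustrated with a minimal sketch. It assumes a small shared vocabulary in which each concept has at most one broader concept, in the spirit of SKOS `broader`; the concept names, record layout, and helper functions are hypothetical and do not come from any system presented in this track. The sketch flags terms that fall outside the shared vocabulary and selects records from two data sets by concept, including narrower concepts.

```python
# Hypothetical shared vocabulary: concept -> broader concept (None for a top concept).
# All names here are illustrative assumptions, not taken from a real ontology.
BROADER = {
    "Sensor": None,
    "TemperatureSensor": "Sensor",
    "Thermocouple": "TemperatureSensor",
    "Meter": None,
    "SmartMeter": "Meter",
}


def ancestors(concept):
    """Return the chain of broader concepts, nearest first."""
    chain = []
    parent = BROADER[concept]
    while parent is not None:
        chain.append(parent)
        parent = BROADER[parent]
    return chain


def unknown_terms(records):
    """Return, sorted, the terms used in records that the vocabulary does not define."""
    return sorted({r["type"] for r in records if r["type"] not in BROADER})


def select(records, concept):
    """Select records typed with the concept or with any narrower concept."""
    return [
        r for r in records
        if r["type"] in BROADER
        and (r["type"] == concept or concept in ancestors(r["type"]))
    ]


# Two data sets maintained independently (hypothetical content).
site_a = [{"id": 1, "type": "Thermocouple"}, {"id": 2, "type": "SmartMeter"}]
site_b = [{"id": 3, "type": "TemperatureSensor"}, {"id": 4, "type": "thermo-couple"}]

if __name__ == "__main__":
    combined = site_a + site_b
    print("Terms outside the shared vocabulary:", unknown_terms(combined))
    print("Records classified under Sensor:", [r["id"] for r in select(combined, "Sensor")])
```

Users of such a system only see concept labels and broader/narrower links; anything richer (constraints, rules, reasoning) would sit behind this layer and is not shown here.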
Other Observations / Lessons learned: (387W)
- UML to OWL is a common requirement for legacy systems (387X)
- Starting from scratch is rare. (387Y)
- Ontology patterns are very helpful, and encourage model reuse (387Z)
- Semantic techniques work best when not compromised by implementation tradeoffs (3880)
- Semantic methods are faster to implement and easier to maintain (3881)
- Semantic approaches are particularly suited to systems with many complex constraints, rules, laws, with frequent changes (3882)
- Incremental implementation is possible through federation of datastores (3883)
- Ontologies are not always applied to enable reasoners - sometimes just as a more rigorous data modeling approach (3884)
- Engineers turned ontologists often don't have the necessary background/skills (3885)
- Existing infrastructure supports traditional software development far better than large-scale ontology development (3886)
- There are many ontologies of dubious quality (3887)
- Service-oriented architectures allow separation of code and ontology updates (3888)
- Reasoner and query engine performance is highly dependent upon the exact formulation of rules and queries (3889)
- No single technology/tool currently provides the best solution across all large system use cases (388A)
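The observation about query formulation can be made concrete with a toy example. The sketch below is a deliberately naive nested-loop triple-pattern matcher written for illustration; it is an assumption of this example and not how any particular reasoner or query engine works, and production engines typically include optimizers that reorder patterns. The data, variable names, and the `evaluate` helper are hypothetical. It shows that two logically equivalent formulations of the same query can require very different amounts of work.

```python
def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple; return the new binding or None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Variable: bind it, or check it against the value already bound.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            # Constant that does not match.
            return None
    return new


def evaluate(patterns, triples):
    """Evaluate patterns left to right; return (solutions, number of match attempts)."""
    bindings, attempts = [{}], 0
    for pattern in patterns:
        next_bindings = []
        for b in bindings:
            for t in triples:
                attempts += 1
                nb = match(pattern, t, b)
                if nb is not None:
                    next_bindings.append(nb)
        bindings = next_bindings
    return bindings, attempts


# Hypothetical data: 100 meters, only one of which has a recorded location.
triples = [(f"dev{i}", "type", "Meter") for i in range(100)]
triples.append(("dev42", "locatedIn", "Building7"))

broad_first = [("?d", "type", "Meter"), ("?d", "locatedIn", "Building7")]
selective_first = list(reversed(broad_first))

if __name__ == "__main__":
    for name, query in [("broad first", broad_first), ("selective first", selective_first)]:
        solutions, attempts = evaluate(query, triples)
        print(f"{name}: {len(solutions)} solution(s), {attempts} match attempts")
```

With this data both formulations return the same single solution, but the broad-first ordering makes 10201 match attempts while the selective-first ordering makes 202.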
-- maintained by the Track-4 champions: SteveRay & TrishWhetzel ... please do not edit (32DL)