OntologySummit2014: (Track-C) "Overcoming Ontology Engineering Bottlenecks" Synthesis (43DY)
Track Co-champions: KrzysztofJanowicz, PascalHitzler, MatthewWest (43DZ)
Background (43ZC)
Ontology Engineering is the development and use of ontologies, in any form, as all or part of some system. This includes areas such as data integration, data mining, expert systems, data semantics, and reasoning. There are sometimes barriers to the use of ontologies because of the cost of development and deployment, or the timeliness with which solutions can be delivered. This track aims to seek out the bottlenecks that currently act as barriers to the use of ontologies, and to point towards solutions or work towards resolving those bottlenecks. (43ZD)
Mission (43ZE)
To identify bottlenecks that hinder the large-scale development and usage of ontologies and identify ways to overcome them. (43ZF)
Examples (43ZG)
- a) Bottlenecks include: (44EY)
- Ontology engineering processes that are time consuming, (44EZ)
- Social, cultural, and motivational issues (44F0)
- Modeling axioms or knowledge representation language fragments that cause difficulties, either by increasing reasoning complexity or by reducing the reusability of ontologies (44F1)
- The identification of areas and applications that would most directly benefit from ontologies but have not yet considered their use and development. (44F2)
- b) Potential Solutions include: (44F3)
- Tools and techniques, (44F4)
- Research findings and methods, guidelines, documentation, and best practice, (44F5)
- Automation (44F6)
- the combination of inductive and deductive methods to scale the creation of axioms (44F7)
- The development of a set of reusable patterns that can ease ontology development and alignment (44F8)
- The identification of purpose-driven modeling granularities that provide sufficient semantics without over-engineering (43ZS)
- Lessons learned from ontologies that are seeing wide adoption (44F9)
- The development of tutorials and other educational materials (44FA)
- c) Pre-requisites (44FB)
- The track is not only concerned with outlining possible resolutions to the bottlenecks identified, but also with identifying the pre-requisites to addressing the challenges, which might include agreements that need to be reached, or capabilities to be developed. (44FC)
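The combination of inductive and deductive methods mentioned above can be illustrated with a minimal sketch: mine candidate subclass axioms from instance data (inductive), then filter them against hand-authored disjointness axioms (deductive). All class names, instances, and the axiom format below are invented for illustration; a real system would work over RDF data and use a DL reasoner for the deductive step.

```python
# Sketch: combine inductive mining with deductive filtering of candidate axioms.
# All data and class names are invented for illustration.

# Inductive input: observed type assertions (instance -> set of classes).
instances = {
    "lake_1": {"Lake", "WaterBody"},
    "lake_2": {"Lake", "WaterBody"},
    "river_1": {"River", "WaterBody"},
    "building_1": {"Building"},
}

# Deductive input: hand-authored disjointness axioms.
disjoint = {("WaterBody", "Building")}

def mine_subclass_candidates(instances):
    """Propose (A, B), read 'A subClassOf B', whenever every instance of A is also a B."""
    classes = set().union(*instances.values())
    candidates = set()
    for a in classes:
        members = [cs for cs in instances.values() if a in cs]
        for b in classes - {a}:
            if members and all(b in cs for cs in members):
                candidates.add((a, b))
    return candidates

def consistent(candidate, disjoint):
    """Reject A subClassOf B if A and B are declared disjoint (in either order)."""
    a, b = candidate
    return (a, b) not in disjoint and (b, a) not in disjoint

candidates = mine_subclass_candidates(instances)
axioms = {c for c in candidates if consistent(c, disjoint)}
print(sorted(axioms))
```

The inductive step scales because it is driven by data; the deductive step keeps a human-curated backbone in control of what the miner may assert.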
Plan (43ZW)
- 1. Examine the processes used to develop and use ontology-like artefacts in various contexts, and identify where the weight of effort falls. (44FD)
- 2. Look for opportunities to simplify or automate problem processes. (44FE)
- 3. Develop an outline for removing bottlenecks from the process, and identify any pre-requisites. (44FF)
see also: OntologySummit2014_Overcoming_Ontology_Engineering_Bottlenecks_CommunityInput (43E2)
- What is it that takes a lot of time and effort? (4BVB)
- Broadly speaking, education and team buy-in. Most people don't understand ontologies well, and getting the supporting team to buy in and build towards the success of the project can become time consuming and distracting. You really need strong leadership that is committed to the project. I've also found the Semantic Web understanding of ontologies to actually be a hindrance for certain classes of applications. Re-education in terms of what is actually possible is sometimes an additional obstacle. (4BVC)
- There are two tasks that are rather time-consuming: (i) the extraction of knowledge from subject-matter experts (SMEs), and (ii) the explanation of the model to the developers using it. (4BVD)
- Well, for starters the subject is complex, and so far nobody has achieved this lifecycle information integration. Furthermore, the ISO procedures are tedious (but worth it), and the technology is "bleeding edge". Standardization means finding a balance between large egos, commercial politics, short-term thinking, hard-to-make paradigm shifts, and, for the most part, a lack of funding. (4BVE)
- Organizing ontologies for "efficient" processing. Avoiding combinatorial explosion when using machines that were designed to do arithmetic, not categories. (4BVG)
- Accommodating N different perspectives on the meaning of a term and its relation to other terms. Need a way of signifying "-nyms" and webs of -nyms. (4BVH)
- Identifying the scope of knowledge to be represented and the context of use or application. (4BVJ)
- Defining Concepts is somewhat easier than defining useful relationships that structure the ontology model. (4BVK)
- Refining the ontology during development to satisfy logical consistency (4BVL)
- Modifying an ontology to capture expanding knowledge while ensuring logical consistency. (4BVM)
- What is it that is very expensive? (4BVR)
- Access to SMEs. The higher the skill and importance of the SME, the more difficult it is to get their time. The lack of sophisticated ontology tooling also means that significant effort needs to be directed towards building supporting infrastructure. (4BVS)
- The extraction of knowledge is expensive, as most SMEs have an "ideal" vision of what their knowledge is, which tends to differ from the actions and reasoning they actually use. For instance, doctors will answer questions in theoretical terms, while the reasoning they actually use when consulting patients tends to be different. The amount of labour involved is also a cost. (4BVT)
- Cost of IT system for moving the data and resolving the alternatives. (4BVU)
- Refining and reviewing the ontology to satisfy requirements (4BVV)
- Integrating an ontology to an existing enterprise and software architecture (4BVW)
- What is it that is held up because of a lack of scarce resources? (4BVX)
- Generally, ontology development is bottlenecked by access to SMEs and to the software developers needed to provide adequate infrastructure. Moreover, ontology deployment is also hindered when using functionality beyond the SemWeb stack. Formal, computational ontologies in general are not well developed. For example, if you want to deploy an ontology-based application that can reason over natural language questions such as "Who is standing behind you?" or "Who passed through this corridor in the last X hours?", you don't necessarily want to use a symbolic reasoner. Bindings into alternative reasoning algorithms and evaluation frameworks are still quite crude, and require a lot of wheel re-invention. This slows down design-to-deployment time, drives up costs, and increases the overall risks of the project. (4BVY)
- Discovery of new and better ways to discover, express and process ontologies. Most current human 'resources' are too intellectually invested in current rules and tools (which are not adequate). (4BVZ)
- An overarching data architecture with long term evolution and application of consistently defined concepts that can be reused with evolving and new services and applications (4BW0)
- > Access to SME's. The higher the skill and importance of the SME, the more difficult it is to get their time.... (4BW1)
- I agree. But I believe that we need *radically* different tools. The SMEs should do their work in *their* preferred languages and notations. They should *never* be asked to learn anybody else's notations, conventions, or interfaces. (4BW2)
- > Generally, ontology development is bottlenecked because of access to SME's and access to software developers that need to provide adequate infrastructure... (4BW3)
- I agree. But the solution is to get the information from the same sources and tools that the SMEs themselves read, write, and use. (4BW4)
- > Formal, computational ontologies in general are not well developed... Bindings into alternative reasoning algorithms and evaluation frameworks are still quite crude, and require a lot of wheel re-invention... (4BW5)
- Those bindings should be made to the tools and resources the SMEs are already using to do their job. Any necessary ontologies should *help* the SMEs to do their work better and faster. (4BW6)
- Why is it that ontological approaches are not taken when they could/should be? (4BW7)
- There are a number of factors. (4BW8)
- Sometimes, the long pay-off time makes these interventions either riskier, or outside the expected pay-off for the decision-maker, and hence less attractive. (4BW9)
- Secondly, while the Semantic Web understanding of ontologies is useful for certain classes of applications, it is not well suited to many other applications. This can make it difficult to communicate the potential of ontologies, especially so if a culture has been "indoctrinated" with the SemWeb understanding of ontologies - in these cases, it is an uphill battle to get them to realize the value of a broader understanding of what ontologies can do. The recent mini-series on Rules, Reasoning and LP demonstrated this disconnect well, whereby one community differentiates between axioms and rules, while the other community does not restrict itself to considering "pure axioms" vs "rules". (4BWA)
- Thirdly, many interventions I've seen don't fully take into account the sociological factors of the solution. Without a cogent understanding of the culture in which the technology intervention is taking place, there are many opportunities for misaligned expectations, yielding gaps in implementation or improperly used technologies. (4BWB)
- Fourthly, a broad class of potential ontology-based applications can be achieved faster and cheaper with a non-ontological approach. We assert that the long-term value proposition in this instance is lower than that of a "proper" solution, but demonstrating and clearly communicating the opportunity cost can be difficult. (4BWC)
- Lastly, and this complements the previous point, there is a dearth of popular, well-known, successful ontology-based solutions. Whereas the benefit of, say, a CRM or a DB is well known, in many instances those involved in ontology need to reiterate the value proposition nearly from scratch. (4BWD)
- Time constraint on the delivery of the ontological artifacts mean that the model and its implementation are generally not separated. (4BWE)
- We are essentially dealing with migrating content from XML to a world where the true semantics of the underlying content can be expressed. When we started our investigation, standards such as FRBR [1] were not very well defined, and ontologies (such as DoCO [2]) were not yet available. As a result, we created our own (proprietary) model to represent content, and we are still trying to determine the best approach to documenting the ontology. We have tried using a simple graphical notation, UML, and HTML (generated using tools such as LODE [3]). Sadly, there are no tools that can provide the information for different types of users without hours being spent on the documentation (which is a huge bottleneck at present). (4BWF)
- In terms of natural language tools, I investigated the use of controlled natural language to express ontological knowledge, but failed to find an approach that easily expresses ontological axioms. I have also tried to use Fluent Editor [4], but found it rather counter-intuitive, and the import was problematic for highly modular ontologies (such as the one we have developed). [1] http://www.ifla.org/publications/functional-requirements-for-bibliographic-records [2] http://purl.org/spar/doco [3] http://www.essepuntato.it/lode (4BWG)
- Current ontological approaches are too primitive. (4BWH)
- Challenges people to say what they mean and mean what they say. Seen as personal threat rather than fit for purpose. (4BWI)
- A basic reason is that no common language for technical literacy has been developed, or even seriously considered as a possibility. On seven leading edges, currently used languages are deficient compared with a design distributed at IBM in 1973. See "Inexcusable Complexity for 40 years" on my web site. (4BWJ)
- Lack of knowledge about the importance of an overarching data model, and the role that semantics plays in defining and offering a consistent interpretation of data shared among applications and services across the enterprise and across systems. (4BWK)
- I have used natural language tools that can semi-automatically create an ontology from text sources, but refining the resulting tens of thousands of concepts into a consistent model is hard. An ontology by its nature has some logical formalism that enables logical reasoning, but for this to work the ontology has to satisfy the formalism's constraints, which NLP tools are poor at ensuring. The extraction also relies on the provenance of the text sources: garbage in, garbage out. (4BWL)
- > There are a number of factors. Sometimes, the long pay-off time makes these interventions either riskier, or outside the expected pay-off for the decision-maker, and hence less attractive... (4BWM)
- I agree. But those are symptoms of not having the right tools. (4BWN)
- > while the Semantic Web understanding of ontologies is useful for certain classes of applications, it is not well suited to many other applications... many interventions I've seen don't fully take into account the sociological factors of the solution...a broad class of potential ontology based applications can be achieved with a non-ontological approach faster and cheaper. (4BWO)
- More symptoms of inadequate tools. (4BWP)
- Fundamental principle: Ontology tools should *reduce* the expense by enabling SMEs to accomplish more in less time. The ontologies should be a *by-product* of the SMEs' normal work. (4BWQ)
- Recommendation: The ontology summit should devote more attention to cutting-edge research than to incremental improvements on inadequate tools. Some suggestions: (4BWR)
- See the slides and publications by the Aristo Project at AI2: http://www.allenai.org/TemplateGeneric.aspx?contentId=12 (4BWS)
- The IBM Watson project is also doing research on deriving knowledge from the same kinds of resources as AI2. (4BWT)
- Tom Mitchell at Carnegie Mellon developed the Never-Ending Language Learner (NELL): http://rtw.ml.cmu.edu/rtw/index.php . Or see http://wamc.org/post/dr-tom-mitchell-carnegie-mellon-university-language-learning-computer (4BWU)
- For the past few years, I've mentioned Cyc as an important project that is doing important research with the world's largest formal ontology. (4BWV)
- And from time to time, I cite the VivoMind work. For example, http://www.jfsowa.com/talks/goal7.pdf (4BWW)
- I won't claim that these projects will solve all the problems tomorrow. But I believe that tools based on some combination of these methods will solve the problems raised by Matthew's questions. They'll get better results faster than trying to "educate" developers about ontology. (4BWX)
- Ref the questions asked for the chat during the Track C session one: (4BWY)
- How to arrive at reusable patterns? (4BWZ)
- How many patterns are there? (4BX0)
- Are there types of patterns? (4BX1)
- Are all patterns domain-independent? (4BX2)
- Can we mine patterns from data? (4BX3)
- [11:06] MatthewWest: I'll try to answer the 1st question. There are an unlimited number of patterns, because there are an unlimited number of atomic elements. Some patterns at least are domain dependent. Yes we can mine patterns from data, in fact this is one of the best ways to develop patterns. (4BX4)
- [11:06] ToddSchneider: Without trying to be facetious, what is a pattern? How can one be identified? (4BX5)
- [11:19] MikeBennett: (Summarizing @Aldo's verbal remarks) Complete repository of archetypical patterns... (primitives). Questions as to whether this is feasible. My take on this: maybe feasible in a simpler domain like business, more challenging if pursuing notion of archetypes for all human experience (per Leibniz etc.). I'm hearing confidence that the former at least can be done :) See also DOLCE. (4BX6)
- [11:20] AldoGangemi: @Mike good summary (4BX7)
- [11:21] MikeBennett: @Aldo Thanks - glad I captured it OK. This is something I am very motivated about. (4BX8)
- Who will develop and maintain these patterns? (4BX9)
- Are there measures or at least experience reports on the robustness and usefulness of patterns? (4BXA)
- Are there success stories of large-scale pattern usage? (4BXB)
- [11:09] MatthewWest: 2nd Question. For true patterns, they will mostly be discovered, rather than invented. In the end, standards organizations will curate them. Good patterns are always useful, because they save effort and improve quality. The more a pattern is used the better it gets as bugs are eliminated. (4BXC)
- [11:14] KarlHammar: Agree with Matthew's answer to the 2nd question above. ChrisWelty gave a keynote at the Workshop on Ontology Patterns at ISWC 2010, touching on exactly this. He called it "pattern archeology", i.e. the "digging up" of patterns from established systems/practices/models/etc.: a process of discovery as opposed to design. Perhaps the keynote is available in the WOP proceedings. (4BXD)
- [11:10] KrzysztofJanowicz: Kuhn's vision statement 'Modeling vs Encoding for the Semantic Web': http://www.semantic-web-journal.net/sites/default/files/swj35.pdf (4BXE)
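Matthew's point that patterns can be mined from data can be sketched in a few lines: group the predicates used around each subject and keep the "shapes" that recur above a support threshold as candidate content patterns. The triples and threshold below are invented for illustration; serious pattern archeology would work over real datasets and consider graph structure, not just predicate sets.

```python
from collections import Counter

# Sketch: "pattern archeology" over triple data. Frequently recurring
# sets of predicates around a subject are candidate modeling patterns.
# All triples below are invented for illustration.
triples = [
    ("event_1", "hasParticipant", "alice"),
    ("event_1", "hasLocation", "berlin"),
    ("event_1", "hasTime", "2014-03-06"),
    ("event_2", "hasParticipant", "bob"),
    ("event_2", "hasLocation", "leeds"),
    ("event_2", "hasTime", "2014-03-13"),
    ("doc_1", "hasAuthor", "carol"),
]

def mine_shapes(triples, min_support=2):
    """Group predicates by subject and keep shapes seen >= min_support times."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, set()).add(p)
    shapes = Counter(frozenset(ps) for ps in by_subject.values())
    return {shape: n for shape, n in shapes.items() if n >= min_support}

patterns = mine_shapes(triples)
for shape, support in patterns.items():
    print(sorted(shape), "support:", support)
```

Here the {hasParticipant, hasLocation, hasTime} shape recurs, which is exactly the kind of evidence that suggests an underlying Event pattern worth curating.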
- How to abstract from individual ontology designs? (4BXF)
- Do we need higher-level ontology modeling languages on top of knowledge representation languages? (4BXG)
- How to get community buy-in? (4BXH)
- [11:12] MatthewWest: 3rd Question: when you abstract from ontology designs, you are usually moving up the subtype/supertype hierarchy rather than out along the class-instance relation, so you should not normally need another language. Buy-in comes from utility, plus ease of availability and use. (4BXI)
- How important is the selection of specific language constructs for the scalability and reuse of patterns? (4BXJ)
- It is first of all important that the language constructs can support the requirements of the application; otherwise all is lost. However, there is often a way to restate things that is more efficient from a processing perspective, which can have obvious processing benefits but may make the resulting ontology more opaque. Generating more efficient language forms from more understandable forms may be a way forward. (4BXK)
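One hedged illustration of generating an efficient form from an understandable one: an author states only direct subclass links (understandable), while a compiled artifact materializes the transitive closure so that runtime checks need no reasoning (efficient, but more opaque). The class names are invented, and a single-parent hierarchy is assumed for brevity.

```python
# Sketch: generate a processing-efficient form from an understandable form.
# The understandable form is a sparse subclass hierarchy; the efficient form
# materializes the transitive closure so lookups need no reasoning at runtime.
# Class names are invented for illustration; single parent per class assumed.
subclass_of = {
    "Lake": "WaterBody",
    "WaterBody": "PhysicalFeature",
    "PhysicalFeature": "Entity",
}

def materialize(subclass_of):
    """Compute every (class, ancestor) pair up front."""
    closure = set()
    for cls in subclass_of:
        parent = subclass_of.get(cls)
        while parent is not None:
            closure.add((cls, parent))
            parent = subclass_of.get(parent)
    return closure

closure = materialize(subclass_of)

# The runtime check is now a constant-time set lookup instead of a traversal.
print(("Lake", "Entity") in closure)
```

The trade-off is exactly the one noted above: the materialized form is faster to query but harder for a human to read and maintain than the sparse source form it was generated from.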
During the last Track C session, on Bottlenecks in Ontology Engineering, we were asked some questions. These are given below, together with responses from during the session and offline. (4BRH)
1. What are the lessons learned from in-the-wild ontology engineering projects? (4BRI)
Developing an OWL ontology has the same degree of difficulty as any other data modeling exercise (i.e. RDBs, ISO EXPRESS, E/R, UML and OWL all require language and tool expertise to do anything real). The hard problems are the same in most cases: understanding the requirements, and generating clear, accurate but concise definitions that everyone agrees on. (4BRJ)
The big benefits of OWL ontologies are that they reflect "what is" more accurately than other data models, and that the underlying technology is so flexible that it enables very quick proof-of-concept and/or testing; it also allows throwing together almost anything and then fixing it up as you go. (4BRK)
If you've not tested your ontology against real data, it is definitely "wrong". If you have tested your ontology against real data, it is less wrong, but still wrong. Plan improvements over time into your project, even once the apps are operational. (4BRL)
2. How do challenges related to cultural and motivational issues relate to technical issues, e.g., tool support? (4BRM)
A key issue I see is the separation of ontologies and the software apps that use them, and the skills required to cross that chasm. It's hard to get software developers to understand ontologies, and it's hard to get ontologists to understand the needs of software developers (e.g. the "human readable URI" discussion). We are a tool and solution vendor, and have chosen the approach of fitting ontology development tools into a larger IDE for software development. Our view is that building ontologies is nice, but delivers very little business value compared with building complete Semantic Applications. (4BRN)
People have been told that ontology development is very hard and costly (which is not true as a proportion of overall project costs, and timely development delivers benefits in any case). Once they have that in their heads, it's very hard to convince them otherwise. In any engineering or science discipline, what they've learned to do their job is orders of magnitude harder than ontology development. This adds to the challenge of making very robust tools that are simpler to use. The key is to make people understand that *with knowledge transfer* they can pick up enough to be productive quite quickly ... we run into many situations where short-sighted people skip that knowledge transfer, thinking they are saving money, and then pay the price over and over again as the project progresses. (4BRO)
3. How to get community buy-in? (4BRP)
Buy-in of specific organisations is far more important; it is very hard to directly convince a community of anything. However, convince a few individual organisations, and once they are successful others will follow, if only from a fear of being left behind. Some large organizations (e.g. the DoD) have a community that follows them, so organisations like that are great candidates to take the lead. Still, it is important not to dictate models of the world. (4BRQ)
4. What are the tradeoffs between expressiveness vs. pragmatics? (4BRR)
Two kinds of ontology have different requirements. (4BRS)
The first sort I would call a descriptive ontology, where the purpose is to be as accurate as possible about how some domain is, not so much for reasoning as for documentation. In this situation, expressiveness is everything. If you cannot say something that is true, then that is a severe limitation. (4BRT)
The second sort is aimed at solving a specific problem. This is likely a subset of some descriptive ontology (if such exists) where some specific constraints apply, which may enable more efficient reasoning to take place, or indeed make reasoning possible/practical. (4BRU)
There is only a problem, in my view, if we try to insist that there is only one type of ontology for a domain, rather than potentially more than one, with relationships between them. (4BRV)
Expressiveness is primary when you don't know the specific questions to be answered (e.g. in a data integration app). You can always transform to a less-expressive form when you finally do know the questions, but you will have lost too much to easily go the other way if you start with the less-expressive ontology. (4BRW)
5. Who will develop all the ontologies we would ideally need? (4BRX)
Individual organisations will develop the ontologies *they* need, not what others need. If they choose to share those openly that's great, but most commercial enterprises will not do so, as they see them as a business advantage over their competitors. There are, of course, cases where standards need to be created, and I expect most shared ontologies will come from standards bodies, consortia, or national institutes over time. I'd say that there will eventually be foundations upon which you can build what you need, but nobody will ever "develop all the ontologies we would ideally need". Focusing on identifying and supporting those foundations is a good first step. For any domain of publicly available data (e.g. units of measure, company registration data, country codes), there ought to be an identifiable authoritative source. I would hope that those authoritative sources would eventually understand their responsibility to develop these ontologies. In many cases these authoritative sources will be public administration bodies or standardisation bodies. This would at least be better than several bodies developing, say, Unit of Measure ontologies, as is the case at present.
→ Why would this need an identifiable authoritative source? E.g., the whole success story of the Web is the lack of such a source.
I think that is the wrong perspective. The success of the Web is that anyone can be an authoritative source, not just those who control, e.g., the media. Perhaps the confusion is over what it means to be an authoritative source. It really means a first-hand source. So someone who was there, for an account of some event, is more authoritative than someone who reports what someone who was there said. But a lot of data is created intentionally, such as country boundaries. Do we want to look at what Joe Bloggs says the country boundaries are, or what the UN says they are? Why would we even want to see what Joe Bloggs thinks about this? It just adds confusion.
Of course, in a boundary dispute, the governments of the countries disputing the boundary are also authoritative sources for their own territorial claims. (4BRY)
6. What is the role of crowd-sourcing? (4BRZ)
None at all in industries like engineering and the life sciences. In life sciences, for example, even the areas of interest are confidential. I'm sure there are other industries where it has an interesting role, particularly where vocabulary, rather than ontology, is primary. (4BS0)
On the other hand crowd-sourcing created some of the most used ontologies on the Web, e.g., in the geo-domain. (4BS1)
7. What is the state-of-the art with respect to quality control? (4BS2)
Test, test, test ... exactly as with any other software artefact. The state of the art is therefore automated testing, driven by robust requirements and testing tools. We do not manage to do this properly for every app, but we do set up our ontology projects exactly as if they were normal software development projects, and build in test cases, testing processes and validation/QA infrastructure, even if we don't use it everywhere. In our best case, with this set up completely, we can literally push a button and have a dashboard go green when all the automated test cases pass; for our customer that means the "acceptance test" is successful and we're ready to deploy to production. (4BS3)
In every app that will go operational, we obviously have the "production" deployment and a separate "testing" deployment environment. (4BS4)
There are some things that tools can help with, like logical consistency, but overall fitness for purpose is a human endeavour, and likely will be for some time. I hope with increasing computer assistance. (4BS5)
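The push-button testing described above can be sketched as an ordinary unit test: a competency question such as "every Person has exactly one birthDate" becomes an assertion over the data, and the dashboard goes green when all such checks pass. The triples and property names below are invented for illustration; a production setup would run such checks (e.g. as SPARQL queries) in a CI pipeline.

```python
# Sketch: ontology quality control as ordinary automated tests.
# A competency question ("every Person has exactly one birthDate") becomes
# an assertion over the data; a dashboard goes green when all checks pass.
# Data and property names are invented for illustration.
triples = [
    ("alice", "rdf:type", "Person"),
    ("alice", "birthDate", "1980-01-01"),
    ("bob", "rdf:type", "Person"),
    ("bob", "birthDate", "1975-06-30"),
]

def check_exactly_one(triples, cls, prop):
    """Every instance of cls must have exactly one value for prop.
    Returns the list of instances that violate the constraint."""
    instances = [s for s, p, o in triples if p == "rdf:type" and o == cls]
    failures = []
    for inst in instances:
        values = [o for s, p, o in triples if s == inst and p == prop]
        if len(values) != 1:
            failures.append(inst)
    return failures

failures = check_exactly_one(triples, "Person", "birthDate")
print("PASS" if not failures else f"FAIL: {failures}")
```

Logical consistency checks from a reasoner slot into the same harness as just another test case; fitness for purpose remains the human part, as noted above.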
8. How is the industry addressing ontology engineering bottlenecks and what are the technological solutions available on the market today? (4BS6)
I have found that the more you treat ontology development as part of a larger software app project, where requirements management, testing, software change control, etc. are used, the better. In 90+ percent of cases where project problems/delays have occurred, the root cause has been a lack of clear requirements, changed requirements, or the customer's lack of understanding of their own requirements. The "work" is not that hard if requirements are complete. The work is hard if you're given an undocumented XML Schema as the data requirements for your app (and that does happen). (4BS7)
Addressing ontology engineering bottlenecks can be approached in several ways. We think the following are important for RDF/OWL development. (4BS8)
- Everything is triples. That means we can apply Semantic Web tech to everything. XML and XML Schema are triples, spreadsheets are triples, RDBs are triples, SPARQL queries, SPIN rules/constraints and stored SPIN/SPARQL templates are triples ... you get the picture. (4BS9)
- Innovate, but be standards-based. For example, we built a rules engine as an extension to SPARQL, called SPIN, which has been made a W3C Member Submission. Innovating over an existing standard means there is already a large number of people who know 75% of the technical solution, because they know SPARQL. We do a lot with SPIN, and I'd say that SPIN/SPARQL are far more important than OWL itself as far as addressing ontology engineering bottlenecks goes (e.g. quick testing, access to real data). (4BSA)
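The "everything is triples" view can be sketched for the spreadsheet case: each row of a table becomes a set of (subject, predicate, object) triples, with the key column as subject and the remaining column names as predicates. The table below is invented for illustration; a real conversion would mint proper URIs rather than use bare strings.

```python
import csv
import io

# Sketch of the "everything is triples" view: flatten a spreadsheet-like
# table into (subject, predicate, object) triples. Column and row values
# are invented for illustration.
data = io.StringIO("id,name,country\nacme,ACME Corp,US\nglobex,Globex,DE\n")

def rows_to_triples(fh):
    """Turn each CSV row into triples: the 'id' column names the subject,
    every other column becomes a predicate with the cell as object."""
    triples = []
    for row in csv.DictReader(fh):
        subject = row.pop("id")
        for column, value in row.items():
            triples.append((subject, column, value))
    return triples

triples = rows_to_triples(data)
for t in triples:
    print(t)
```

Once spreadsheets, schemas, queries and rules are all triples, the same storage, versioning and query machinery applies uniformly to every artifact in the project.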
Being a software company, I guess it's obvious that I'll add that the TopBraid Suite is a technical solution available on the market today. TopBraid sits on top of Eclipse, so that should be mentioned as well. (4BSB)
Given that I said ontology development is really a component of software development, I will add that Jena, GitHub, JIRA, SpiraTest, SoapUI, Confluence, Google Docs, MySQL, GoToMeeting and Skype are all components of the larger technical solution for distributed semantic app and ontology development. (4BSC)
9. How much (deep) semantics do customers really need? (4BSD)
There is no single answer; it depends on the industry. For example: engineering, quite a lot; life sciences, medium; publishing, very little, as they focus on vocabularies. The priority is identity (the same name (ID) for the same thing across those that need to share information).
→ So what about the type level? Having data about the same car does not mean the data is compatible.
I'm not sure what you mean here. At the type level, you want a common identifier for a type, just as at the instance level. I also do not see how data about the same thing can be incompatible. I can have two pieces of data about a car where one says it is red and the other that it is blue, but either one of these is wrong, or they refer to the car at different times; neither of these makes the data incompatible. So what does it mean for data about one thing to be incompatible? (4BSE)
Some other points that came out during the session are: (4BSF)
Which ontology tool do we use? (4BV3)
Starting from Excel (as per Oscar's slide 5) is often done and can be very useful, BUT things go awry when people forget that semantics are not explicit or enforced on entry. For example, I've seen folks send a group of domain experts off to build an initial concept capture using an Excel template. The results vary widely in how different groups interpret the semantics of the template, and there isn't anything in Excel-as-development-environment to help. Working *with* an ontologist, it's not so bad, as that person can be on the lookout for semantic drift. So Excel is a start, not the end. ROO (available at http://sourceforge.net/projects/confluence/) is a tool specifically designed to work with a non-ontology-savvy audience. (4BSG)
How do we reuse other ontologies? (4BV4)
Start working with experts so that they provide their definitions, and get agreement on those. Decide on reuse when you know what your requirements are. We need to remember that reuse is not an end in itself, but a possible means of delivering a solution quicker and cheaper. However, whilst reuse is not an end in itself, if there are good things to leverage, it would help move towards standardization. In addition, if one finds that something is not reusable, stating the defects helps the field. Reuse can reduce cost because you do not have to redevelop. It can also help increase quality, since reuse tends to get rid of bugs. Finally, if you have integration requirements across applications, then using the same ontology for both will reduce the costs of interfacing. These are all, however, ends, which reuse alone is not. We should not forget that it is not only about reusing other ontologies, but also about allowing the one you create to be reused (e.g., in my examples, across the open data portals community in Spain). "Software engineers tend to have a preference for 'their own' solutions." This generalizes way beyond software engineers, data engineers, or engineers as a whole; it is more or less true of most of us. (4BSH)
The methodology tells me to (4BV5)
Rec: use an agile approach, based on sets of competency questions for each sprint. There is a step between competency questions and user stories: create competency questions from user stories (e.g., "as an instrument designer, I want to be able to represent calibration data"). (4BSI)
Large groups work more slowly (4BV6)
Create a small team of experts (5?), who have the confidence of the larger group. Rec4: avoid non-experts, and use experts all from the same level. See http://scienceblogs.com/effectmeasure/2009/01/15/the-right-or-wrong-size-for-a/ ... fewer than 20, perhaps 8? (4BSJ)
But these ontologies to reuse are in English (4BV7)
Many ontologies intended for reuse are designed in English, and it is assumed all users will use English; this assumption is not valid. It is pragmatic for IDs to be in the language of the developer, since this helps the development and debugging process. IDs should be hidden from end users, who should be able to choose the language of the labels they see.
I want my ontology to do inferences: just work with text patterns, and guide people to write good term definitions. Multiple projects over many years have also found a sweet spot in form-based or diagram-based entry tools that are customized by an ontologist for particular sets of SMEs and elicitation cases, and that generate the formal ontology under the hood without showing it to the SMEs. This can be less lossy. The OWL API fixes a lot of broken stuff behind the curtain; work is under way to make these fixes noisier in version 4. Can this be a Controlled Natural Language (CNL)? ACE? Some find ACE too controlled, and requiring too much obvious information, to be useful for normal people. Using a more ambiguous grammar with semantic disambiguation would be better for most, but editor support made a big difference (for entry; comprehension was good). Editor support could make a huge difference. We also need reverse verbalization support. If only there were a common-sense knowledge base to start from... (4BV1)
https://github.com/Kaljurand/owl-verbalizer (4BV2)
ROO is better than ACE (http://sourceforge.net/projects/confluence/). We developed it in Leeds :-) EdBarkmeyer of NIST and FabianNeuhaus (when he was at NIST) were working on a controlled language on top of Common Logic (and a CL reasoner); I don't know the final state of their effort. The published version of TobiasKuhn's survey of CNLs is finally out: http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00168 The final report on the NIST effort (RECON) is at http://www.nist.gov/manuscript-publication-search.cfm?pub_id=911267 (4BSK)
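The reverse-verbalization idea mentioned above can be sketched very crudely. This is nothing like the coverage of the owl-verbalizer or ACE tooling linked in this thread; the surface-form table and triples are invented for illustration. The point is only that even a trivial mapping lets SMEs review statements without seeing the formal syntax:

```python
# Toy verbalizer: map a property to an English surface form, falling
# back to the raw property name when no mapping is known.
SURFACE = {
    "partOf": "is part of",
    "hasLocation": "is located in",
}

def verbalize(s, p, o):
    """Render one triple as a rough English sentence."""
    if p == "rdfs:subClassOf":
        # Class axioms read better as universally quantified sentences.
        return f"Every {s} is a kind of {o}."
    return f"{s} {SURFACE.get(p, p)} {o}."

print(verbalize("Pump", "rdfs:subClassOf", "Equipment"))
print(verbalize("Pump-101", "partOf", "CoolingSystem"))
```

A real CNL system works in both directions (parsing entered sentences as well as verbalizing axioms), which is where the editor support discussed above earns its keep.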
I want my ontology to be lightweight (4BV8)
Rec: again, text patterns are the best option to follow here. The ontology is done, but is it good? It's OK to run the reasoner, but that won't tell you enough; go for other, non-logical checks as well (e.g., use the OOPS! pitfall scanner). (4BSL)
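As an illustration of what "non-logical checks" means, here are toy versions of two checks in the spirit of the OOPS! scanner (these are not its actual catalogue entries, and the ontology structure is an invented stand-in): terms with no human-readable definition, and properties never used in any axiom. Neither would ever be flagged by a reasoner, since both ontologies are logically consistent:

```python
# Invented in-memory ontology summary for the sketch.
ontology = {
    "classes": {"Pump": "A device that moves fluids.", "Valve": ""},
    "properties": {"partOf", "hasStatus"},
    "axioms": [("Pump", "partOf", "CoolingSystem")],
}

def missing_definitions(onto):
    """Classes whose definition text is empty: a documentation pitfall."""
    return [c for c, d in onto["classes"].items() if not d.strip()]

def unused_properties(onto):
    """Declared properties that appear in no axiom: likely dead weight."""
    used = {p for (_, p, _) in onto["axioms"]}
    return sorted(onto["properties"] - used)

print("No definition:", missing_definitions(ontology))
print("Never used:", unused_properties(ontology))
```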
How do I tell others how to use the ontology? Simple documentation (in HTML, in Word), with simple examples, a link to the revised competency questions, and a simple diagram!! (4BSM)
Automation (4BV9)
Two of our presentations gave examples of automation overcoming what would otherwise be bottlenecks; support for mundane tasks is a key approach: Dhaval Thakker, "Modelling Cultural Variations in Interpersonal Communication for Augmented User Generated Content", and JohannesTrame, presenting on behalf of PeterHaase, "Developing Semantic Applications with the Information Workbench: Aspects of Ontology Engineering". (4BSN)
Domain and range (4BVA)
Regarding the domain & range disuse view: I have run into this occasionally, and think it is bad practice based on a misdiagnosis. The underlying problem is that the domain and range are set more restrictively than is really the case. Not specifying domain and range is recommended as a supposed fix for the frequent occurrence of properties that are not represented at a correct and consistent level of generality. It is sometimes as simple as the name of the property being too general (e.g. "controls" instead of "controlsFinancially"); sometimes it is more complicated. An appropriate correction, at minimum, is to apply a bit of discipline in identifying the specificity of the property intended, naming and labelling it in a way that reflects that, and setting a domain and range appropriate to it. It is also good practice to evaluate whether you can define the narrow property that you need immediately as a subPropertyOf a more general property that already exists or that you can also create. This helps to define your specific property more clearly, as well as creating or connecting to reusable content. Given the intended meaning of a concept, that meaning should surely determine the domain (and maybe range): e.g. a property that is explicitly about contracts should have a domain of Contract. But this requires imagination, so that when you think about the meaning of a property you think about all the things it can be a property of and all the kinds of thing it can be framed in terms of, creating a sub-property or a restriction as appropriate for the concept you were originally thinking of. Why are domain/range constraints so problematic? They arise quite naturally from any UML class diagram. These mistakes of over-constraining domain and range are routinely made in UML diagrams too, with relationships being stated at a lower level of abstraction than is really true.
For example, an ontology for equipment may say that one type of equipment must have another type of equipment as a part, but there are things other than equipment for which this is true. The problem is worst in OWL because people frequently misunderstand the effect of domain and range there; I have only seen this disuse recommendation there, perhaps because it is harder in OWL (than in more expressive languages) to say what you mean to say about domain and range. This is because in OWL they have an inferential semantics, and most non-DL conceptual modellers do not know that and think of them as constraints. This makes their usage difficult and often problematic. The constraint vs. type-inference distinction is a big source of confusion, exacerbated by the difficulty of expressing constraint-like domain/range in OWL compared with other languages. In some languages there are simply alternative properties to use depending on which type of assertion you mean to make (see schema:domainIncludes or CycL arg constraints, for example). There are also N-BOXes, which are attempts to add negation-as-failure (NAF) to OWL; see http://trowl.eu/ FIBO started out with what was on the corresponding UML class diagrams, and created a deep subsumption hierarchy of properties. This wasn't ideal for OWL usage, since in many cases the multiple properties represented the same meaning with some changes to range. The balance we are trying to aim for is to have a separate property only when there is an identifiably new meaning in play; however, if I'm honest, we haven't achieved that in the current version (someone decided to promote loads of properties to have no domain or range!!). You can use Events and States as classes, both for NLP and other uses, and so will have a stative like Possess, which is generic but has local property restrictions for generic thematic participants (doing the job of domains/ranges), then more specific events/states under these with more specialized property restrictions.
The point is that the discussion of domain/range is part of the ontological analysis phase of ontology design, but it is not some new concept that is foreign to someone who knows UML class diagrams. See this G+ post from BernardVatant this morning, and the related comments (on domain/range specification in LOV vocabularies): https://plus.google.com/114406186864069390644/posts/D3kkqNCoQZ9 You can conclude what the range/domain is from a restriction, but without at least saying what the domain/range of a property is, how can you relate concepts with one another? So I'd say that domain/range is a minimum to impose some structure on an ontology. (4BSO)
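The inference-not-constraint behaviour of domain and range discussed above can be shown with a toy simulation of the RDFS entailment rule (the property and class names are invented, and this is a deliberately minimal model of the semantics, not a real reasoner). Note what does *not* happen: asserting an over-restrictive domain never raises an error on "bad" data; instead a type is silently inferred:

```python
# Over-restrictive domain declaration: parthood applies to buildings
# as well as equipment, but the modeller wrote it as if it didn't.
domains = {"hasPart": "Equipment"}

triples = [("Building-7", "hasPart", "Room-12")]

inferred = []
for s, p, o in triples:
    if p in domains:
        # RDFS/OWL semantics: the subject is *inferred* to belong to the
        # domain class; no constraint is checked and no error is raised.
        inferred.append((s, "rdf:type", domains[p]))

# Building-7 is now quietly classified as Equipment, which is exactly
# the surprise that leads some modellers to abandon domain/range.
print(inferred)
```

A constraint-style reading (reject the triple because Building-7 is not Equipment) is what most UML-trained modellers expect, which is why languages offering a non-entailing alternative such as schema:domainIncludes come up in this discussion.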
-- maintained by the Track co-champions ... please do not edit (43E3)