OntologySummit2014_Hackathon - Project: (48H4)
Post-event updates (4BM1)
Short report, also provided to Summit community on 2014-04-01 by discussion list (4BM2)
Ontological Catalog became VOCREF: Vocabulary and Ontology Characteristics Related to Evaluation of Fitness. (4BM3)
On Saturday, 29 March, the core team, consisting of AmandaVizedom, AndreaWesterinen, and SimonSpero (all U.S., EDT), worked a very full day, in excess of 12 hours. MiezaHamka (Johor, Malaysia) and LamarHenderson (Maryland, USA) each observed and chatted for periods of time. TrishWhetzel (CA, USA) had intended to participate but was unable to make it due to a family emergency. Both AnatolyLevunchuk and DanBrickley joined for periods of time; Dan weighed in substantially on some issues. All three core team members returned for substantial periods of work on Sunday. (4BM4)
Highlights of activities and accomplishments: (4BM5)
- Established GitHub repository. Moved it to a user-independent GitHub space so that all participants could work in forks, away from master. The repository is at https://github.com/vocref/vocref and is expected to stay there as the project continues. (4BM6)
- Established OWL Functional Syntax as the stored serialization form, due to its vastly better behavior with version control systems (compared, for example, to OWL RDF). (4BM7)
- Established modularity goals. Worked out a plan for enabling both (a) small ontology modules with the lowest feasible import burden and (b) additional modules (importing both the content ontology modules and external vocabularies and ontologies) containing mappings between VOCREF and those external resources. The essential structure is spindle-style. The primary, content ontology levels include: a top-level framework ontology (vocref-top) containing the essential concepts for representing vocabularies and ontologies, evaluation-related characteristics, and relationships between the two, as well as certain core classes of characteristics; small content ontologies inheriting from vocref-top, each modeling a family of characteristics; and an all-vocref ontology inheriting all vocref-internal content. For mapping, ontology modules will exist as needed, each importing from particular content modules and from the external ontologies or vocabularies to which they should be aligned, and containing the mapping assertions. This overall structure should enable users to use the pieces of vocref appropriate to their application, with as little weight beyond the need as feasible, while maintaining semantic integrity. This initial pass is necessarily rough. Refactoring is a possibility if indicated by test application and evaluation down the road. (4BM8)
- We focused most hackathon-day work on the top-level module, vocref-top, where the core design and concepts for describing Vocabulary and Ontology characteristics are established. (4BM9)
An enormous amount of work got done populating and refining vocref-top. Many representations and concepts related to ontologies and evaluation, from a variety of sources, were considered and weeded out (as, for example, out-of-scope, specific to the assumptions and focus of their original context, or of dubious quality in one way or another). As a result, we arrived at a satisfyingly solid vocref-top, where work will continue. (4BMA)
A good initial dent was also made in the review of candidate ontologies, vocabularies, and concepts from evaluation tools and methods for whole or partial reuse. Some were filtered out, some were partially integrated. We gathered a good stack of such resources to evaluate; and they are stored in the repository accordingly. (4BMB)
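The spindle-style module plan described in the highlights above can be sketched as a small import graph. This is only an illustration under stated assumptions: "vocref-top" and "all-vocref" are named in this report, but the family modules, the mapping module, and the external vocabulary below are hypothetical placeholders, and "inheriting" is modeled here as a plain import relation. The sketch computes each module's transitive imports, which is one rough way to think about "import burden".

```python
# Hypothetical import graph for the spindle-style layout.
# "vocref-top" and "all-vocref" are named in the report; every other
# module name here is an invented placeholder for illustration only.
IMPORTS = {
    "vocref-top": set(),
    "family-a": {"vocref-top"},
    "family-b": {"vocref-top"},
    "all-vocref": {"family-a", "family-b"},
    "external-vocab": set(),
    "map-family-a-external": {"family-a", "external-vocab"},
}


def import_closure(module, imports=IMPORTS):
    """Return every module reachable from `module` through imports."""
    seen = set()
    stack = list(imports[module])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(imports[current])
    return seen


if __name__ == "__main__":
    for name in sorted(IMPORTS):
        print(name, "->", sorted(import_closure(name)))
```

In this sketch a user of a single family module pulls in only vocref-top, while external vocabularies enter the picture only for someone who chooses a mapping module. That is the separation the plan aims for, though the actual module names and boundaries may well differ.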
- We built a good issue store. We left a few documented issues behind in moving the repository. Most had already been closed, but a few, especially related to documentation for newcomers, need to be copied over and addressed in the near term. In the current repository, 40 issues (enhancement tasks, bug reports, and questions) have been opened. 28 were resolved and closed during the hackathon; 12 remain open. (4BMC)
One issue that was tagged for completion during the hackathon remains incomplete. This consists of adding sufficient annotations to concepts that were reused from the 2013 "ontology of ontology evaluation" output. This task is expected to be completed within the next week or so. An additional batch of three issues was tagged for completion by the end of the Ontology Summit. Eight issues have not been tied to any schedule, though several of these are still undergoing work and may be closable soon. A number of additional points and to-do items were noted in discussion and will be added. (4BMD)
Some issues could not be worked on simultaneously simply because of the lack of sufficient automated merge capabilities when working on the same module (vocref-top). We did need to take turns with substantial changes to this ontology. (4BME)
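Both the serialization choice noted in the highlights and the merge friction described above relate to how line-oriented version control sees ontology edits. The following is a minimal sketch, not a description of our actual tooling. It assumes a serialization that writes one axiom per line, as OWL Functional Syntax files commonly do, and uses invented class names that are not taken from vocref-top. Under that assumption, adding one axiom shows up as a single added line in a diff.

```python
import difflib

# Hypothetical axioms, one per line, in the style of OWL Functional Syntax.
# The class names are illustrative and are not taken from vocref-top.
before = [
    "Declaration(Class(:Vocabulary))",
    "Declaration(Class(:Ontology))",
    "SubClassOf(:Ontology :Vocabulary)",
]
after = before + ["Declaration(Class(:Characteristic))"]


def changed_lines(old, new):
    """Return the added or removed lines of a unified diff, without headers."""
    diff = difflib.unified_diff(old, new, lineterm="", n=0)
    return [
        line
        for line in diff
        if line.startswith(("+", "-"))
        and not line.startswith(("+++", "---"))
    ]


if __name__ == "__main__":
    for line in changed_lines(before, after):
        print(line)
```

Small, local diffs of this kind are easy to review. Two people making substantial changes to the same file can still collide, which is consistent with our having to take turns on vocref-top.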
Overall, the hangout was incredibly productive. The collaboration time enabled the team to sort through considerable material effectively, consulting one another and bouncing ideas around regularly. Pending a bit more documentation for newcomers (one of the near-term issues), the base now established is good and can support ongoing, asynchronous development of vocref as an open source resource. (4BMF)
Best Regards, AmandaVizedom on behalf of the "Ontological Catalog" / VOCREF team. (4BMG)
Post-event commentary on relation of VOCREF hackathon to OntologySummit2014 themes and developing synthesis (4BMH)
...coming (4BMI)
An ontological catalogue of ontology and metadata vocabulary characteristics relevant to suitability for semantic web and big data applications (48H5)
Team (49QQ)
- AmandaVizedom (EST, UTC-5) (Lead) (49QR)
- email: amanda(dot)vizedom(at)gmail(dot)com (4AUF)
- skype: ajvizedom (4AUG)
- google plus: google.com/+AmandaVizedom (4AUH)
- twitter: ajvizedom (4AUI)
- AndreaWesterinen (49QS)
- TrishWhetzel (Pacific Time) (4AXJ)
- SimonSpero (4B0X)
- LamarHenderson (4B7X)
- Please add yourself to the list if you are willing and able to participate! (4B78)
- Include your time zone after your name, for use in planning & coordination. (49QU)
- Please either include an email or other contact info, or, if you don't want to post that here, send it privately to AmandaVizedom. This info will only be used to communicate about the project, for example if there is a problem with our real-time communications plans during the hackathon or if some preparatory or follow-up action or info is needed. (4B79)
- Google Hangout: https://plus.google.com/hangouts/_/72cpiepm246a313g66jqm7pb7s started at 10am 29 Mar. If you have trouble joining the hangout on Saturday, contact Amanda or another team member so they can invite or re-invite you. (4B85)
- Please also create an account on GitHub.com, so that you can participate fully as a collaborator using the project repository at https://github.com/vocref/vocref. (4B8A)
- ^^Note updated repository location^^ (4B8B)
Additional team members are welcome and appreciated. Direct ontology development from gathered resources will be a primary activity. We would also welcome team members willing and able to use those resources (and/or their own experience) to identify and document competency questions for this ontology. What questions and answers should it support about particular ontologies and vocabularies? Another valuable contribution would be the creation of examples in which the ontology developed so far is applied to one or more specific ontologies or vocabularies, to illustrate the use of the ontology and find coverage gaps, reporting these as issues. If you would like to contribute in another way not mentioned, please contact us! (4AWX)
Goals (49QV)
The following are the goals of this hackathon: (49QW)
- Creation of a catalogue, represented in an ontology, of ontology and metadata vocabulary characteristics potentially relevant to an artifact's suitability for some semantic web or big data application. (49QX)
- Create this catalogue as an open-source, collaboratively developed and publicly available resource. (49QY)
- Create this catalogue in a form usable by both humans and machine applications, suitable for use in such contexts as: manually creating characteristics metadata for an ontology or metadata vocabulary; representing the outputs of evaluations of such artifacts; listing use-relevant metadata for such artifacts in repositories, registries, and other places and forms in which ontologies and metadata vocabularies may be presented for use or consideration. (49QZ)
Activities (49R0)
During this hackathon, we will: (49R1)
- Go directly to work on developing this catalog as a formal ontology. (49R2)
- Current plan is to create this ontology in the OWL 2 language. We will work in parallel, each using our OWL development tool of choice. The language selection is based on existing familiarity and level of use among semantic web and big data application designers, as well as ease of integration into existing repository and other environments. (4A77)
- We will maintain version control of the ontology using GitHub. The project repository is at https://github.com/vocref/vocref (49R3)
- We will communicate throughout via a Google Hangout. We'll coordinate work batches (maintaining as much modularity and parallelism as we can) verbally and via chat in the hangout. (4A78)
- Use our own experiences, discussions and presentations from this ontology summit, and reuse the material gathered and generated during last year's "Ontology of Ontology Evaluation" hackathon. (49R4)
- Applying lessons learned from last year's hackathon, and applying the focusing scope of this year's summit, we will not dedicate initial time to informal representation or discussion. Rather, we will go directly to formal representation, and use the process of such work to bring out whatever issues we need to discuss and resolve informally. (49R5)
Motivation (49R6)
In selecting, creating, or effectively using an ontology or metadata vocabulary, it is important to understand not only the requirements of the intended use, but the characteristics of any candidate ontology or vocabulary. (49R7)
However, understanding the characteristics of a given ontology or vocabulary is made more difficult by the following: (49R8)
- There is no consistent, meaningful, and widely adopted framework for describing such artifacts. (49R9)
- Much of the community-wide time spent on overall analysis of such artifacts focuses on attempting to classify them by overall type, in their entirety (e.g., what makes something an ontology vs. a subject classification system, taxonomy, thesaurus, or other artifact). These attempts tend to generate much heat without resolution. Moreover, even were such resolution to occur, it would be of limited use in addressing the need for operational understanding. The granularity of such classifications is too high either to fit the diversity of ontologies and metadata vocabularies already in use or to help match such artifacts to specific use requirements. (49RA)
- Evaluation methodologies and tools vary widely in which aspects of ontologies or metadata vocabularies they assess, in how those aspects are modeled, and in whether relationships between those aspects and suitability for varied uses are addressed in any depth. (49RB)
- Reflecting the above, repositories, registries, and information about existing ontologies and vocabularies provide limited characterizations of those artifacts themselves. It is not feasible, at this stage, to usefully list, or even ascertain, a set of characteristics that, if provided as metadata themselves, would be helpful to those looking for ontologies or metadata vocabularies suitable for some particular semantic web or big data application. (49RC)
- The above points also apply to ontologies and metadata vocabularies for use in other sorts of applications. In keeping with the Ontology Summit 2014 theme, however, the focus for this hackathon will be on characteristics relevant to use in semantic web and big data applications. (49RD)
Resources (49RE)
In the GitHub repository for this project, in the directory References and Resources > Papers and Specs, is a growing sample of documents providing or describing systems for ontology evaluation. Each of these provides a more or less explicit listing of ontology characteristics for evaluation. (4AWY)
There is less overlap than one might expect; while some expressions recur ("usability" or "accuracy," for example), discussion generally reveals that the underlying concepts are different, at least at the measurable / evaluable level. (4AWZ)
It is a goal of this project to develop a vocabulary sufficient for expressing the ontology characteristics listed and described in these documents. (4AX0)
In particular, some of these documents describe evaluation methods that have some implemented tools and/or user base. The ontology characteristics described in such documents should be given higher priority than those which may be more speculative. Insofar as there is more likely to be evaluation data based on these in-use characteristics, their coverage in VOCREF is most likely to be useful, or at least to have its usefulness put to the test. (4AX1)
We've also collected some ontologies and vocabularies that cover some portion of the relevant metadata. We'll evaluate them for reuse, likely incorporating some and aligning to others. (4AX2)
We will be reusing a portion of the draft work done during last year's "Ontology of Ontology Evaluation" Hackathon (49RO)