
Re: [ontology-summit] [ReusableContent] Partitioning the problem

To: Ontology Summit 2014 discussion <ontology-summit@xxxxxxxxxxxxxxxx>
From: Amanda Vizedom <amanda.vizedom@xxxxxxxxx>
Date: Sat, 1 Feb 2014 18:59:26 -0500
Message-id: <CAEmngXtwukiuF3Sea26urs=Vm4fU4GkO8cT0vMMLZLTZCCke-w@xxxxxxxxxxxxxx>



On Sat, Feb 1, 2014 at 6:01 PM, Ali SH <asaegyn+out@xxxxxxxxx> wrote:
Dear Amanda, Kingsley and David,

On Sat, Feb 1, 2014 at 3:04 PM, Amanda Vizedom <amanda.vizedom@xxxxxxxxx> wrote:

Your proposed solution, as best I can tell, is to choose one target set of humans and make the (meant for machine consumption) URIs (or even names!) understandable to them, while ignoring the polysemy-tolerant, built-for-natural-language labeling features of the ontology language. That is inherently antithetical to reuse (including use over time).

I don't believe David is saying this. I sympathize with his conundrum. He isn't saying that the human-readable URIs are intended to exactly denote the semantics of what is represented in the ontology. Rather, he is saying that people who use these URIs to build applications, in the form of code or queries riding on top of the ontologies, have more difficulty if they are anchored in a completely opaque naming system.

In my experience, that just isn't true. 

Set aside the really long and confusing identifiers that have been thrown around as examples here. Much shorter and simpler character strings can be used for IDs within an ontology. Sure, use namespaces or some other mechanism to localize to the particular ontology (or microtheory, or ...?); that's great. Six-character hexadecimal strings, for example, are well within the capability of most coders to compare. I am relatively poor at number and non-word recall, and I found one such system, quite large, to be easy to work with. Did I memorize which concept each of these strings corresponded to? No. Whether I was working on the ontology directly, browsing it, looking stuff up in it, querying, developing pattern-matching code that used the ontology, debugging weird test results from an indexing run, or what have you, the hex code ID could be (and usually was, by default) shown *with* a pref label for my language.

Folks working on extending the French lexicalization or doing QA testing for a francophone localization could have the default show the pref label in (fr) or some localization thereof, for example. So I might see 4G61XS (dog) and Claude might see 4G61XS (chien) in the indexer results or while browsing the ontology. If we were developing rules or tests and couldn't remember what the name of the concept we wanted was, I could search on "dog" and compare the returns (multiple, since labels aren't unique) to find the right one, and Claude could do the same searching on "chien." Both of us would be reminded and motivated to check the other "dog"/"chien" matches. In my experience, that apparent burden in fact results in greater efficiency and accuracy; without that check, and with a suggestive name or label-only view, the rate at which people guess or assume and use the wrong one is high enough to cause a lot of extra work.
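To make the workflow above concrete, here is a minimal sketch of showing an opaque ID together with a pref label for the viewer's language, and of searching by a label that is not unique. It is only an illustration, not the system I used: the in-memory table, the function names, and every ID other than 4G61XS are hypothetical.

```python
# Hypothetical in-memory label store; a real tool would read pref labels
# from the ontology itself. Only 4G61XS comes from the example above.
PREF_LABELS = {
    "4G61XS": {"en": "dog", "fr": "chien"},
    "7B20QK": {"en": "dog", "fr": "chien"},  # a second, distinct concept sharing the label
    "9C33TR": {"en": "cat", "fr": "chat"},
}


def display(concept_id, lang):
    """Render an ID with the pref label for the viewer's language, if one exists."""
    label = PREF_LABELS.get(concept_id, {}).get(lang)
    return f"{concept_id} ({label})" if label else concept_id


def search(label, lang):
    """Return every concept ID whose pref label in `lang` matches; labels are not unique."""
    wanted = label.casefold()
    return sorted(
        concept_id
        for concept_id, labels in PREF_LABELS.items()
        if labels.get(lang, "").casefold() == wanted
    )


print(display("4G61XS", "en"))
print(display("4G61XS", "fr"))
print(search("dog", "en"))
```

The search deliberately returns a list rather than a single hit, so the person choosing a concept has to look at the other matches before picking one.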
 
His example with the SPARQL queries is spot on, and something I've run into as well. When queries are written using completely opaque URIs, the task of maintaining, debugging and updating them is significantly complicated, leading to more opportunities for errors.

I understand, but I think it is mostly a tooling problem. The tools do not use the appropriate formal language features. Humans shouldn't be writing or debugging SPARQL queries with only the concept ID visible, whether it is opaque or suggestive. Either way, there is extra lookup (out of the cognitive task space) and a greater likelihood of error than is really tenable. Unfortunately, that is mostly the state of the art in open/COTS tools, but the way to fix it isn't to make the IDs more suggestive (and conducive to error); it's to make the tools use the human-oriented features of the language when interfacing with humans. BTW, I specified state of the art in *COTS* tools, because I've seen a number of proprietary tools, developed for use within a single company only, that don't make this same error. I'm perpetually frustrated that we don't have the same level of tooling in the open-source or COTS worlds. But it is not a coincidence that the companies in question have done well in developing semantic enterprise or web systems with those ontologies as components. They take their ontologies, and the processes concerning them, rather seriously.
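As a rough illustration of what I mean by tools using the human-oriented features, here is a small sketch that annotates each line of a SPARQL query with the pref label of any opaque ID it mentions, in the reader's language. It assumes IDs appear as prefixed names under a hypothetical ex: prefix and that labels are already available in a lookup table; the table, the prefix, and the function name are all made up for the example. It relies only on the fact that # starts a comment in SPARQL.

```python
import re

# Hypothetical label table and prefix; a real tool would pull labels from the ontology.
LABELS = {"4G61XS": {"en": "dog", "fr": "chien"}}
ID_PATTERN = re.compile(r"\bex:([A-Z0-9]+)\b")


def annotate(query, lang):
    """Append a '#' comment naming each known concept ID found on a line."""
    annotated = []
    for line in query.splitlines():
        notes = []
        for concept_id in ID_PATTERN.findall(line):
            label = LABELS.get(concept_id, {}).get(lang)
            note = f"{concept_id} = {label}"
            if label and note not in notes:
                notes.append(note)
        if notes:
            line = f"{line}  # {', '.join(notes)}"
        annotated.append(line)
    return "\n".join(annotated)


QUERY = "SELECT ?pet WHERE {\n  ?pet a ex:4G61XS .\n}"
print(annotate(QUERY, "fr"))
```

The query itself is unchanged and still uses the unambiguous ID; only the view presented to the person changes with their language setting.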
 
If I've understood David's point correctly -- the same way that software developers employ useful NL analogues for variable / class names to make the code more readable, ontologists should consider using similarly accessible labels. As someone who has had to debug SPARQL queries written using esoteric naming systems, the fact that those terms had "pref-labels" in a multitude of languages did not help one iota. I had to constantly look up what each term referred to, and it increased the debugging time by perhaps an order of magnitude.

As I suggested in a previous email, there's a balance to be struck, since a pure linguistic ID can indeed lead to unintended or hopeful semantics. But something like:

human.n.05

is readable to a human, and also clearly not intended to be interpreted naively. One can still use labels (a la SKOS) to display different terms (e.g. homme) when presenting such concepts to SMEs or other targeted audiences, but when one is building applications using the ontology identifiers, having something like human.n.05 rather than RD54383 makes it much easier to follow the logic and debug.

That's in between, I think. You will still have to look up which human-related concept that is, and to Claude or someone else the two may be equally opaque. I still don't see the advantage over having an IDE, or parts thereof, that shows you a prefLabel along with the ID, according to your settings.
 
As Simon and Ed alluded to, our brains have developed ways for holding various referents in our heads. We detect and utilize name patterns based on the shape and length of words. When the naming system follows an esoteric style, we don't have the ability to use these facilities, leading to potential errors and slower work.

True, but it is still the case that what is intelligible to some is opaque to others, and that suggestive often means giving rise to misuse. OWL has the built-in capabilities to give us precision and developer-appropriate language suggestions together. It's perfectly feasible to build efficient development tools that do so; one can even get a little fancier and connect tools to the ontology to allow, for example, using the language part to look at alternatives without leaving the cognitive space. I know this is possible because I've used such tools. But the majority of tools, and all the open or COTS ones I know of, just haven't had this kind of human/cognitive interface attention given to them.

I just don't think the solution is to treat the ontology language as more impoverished than it really is. We know there is far to go in improving tools, anyway. I'd say that one of the improvements should be to make tools that use the existing support for co-existing human-readability and machine-uniqueness.

Amanda


 

_________________________________________________________________
Msg Archives: http://ontolog.cim3.net/forum/ontology-summit/
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontology-summit/
Unsubscribe: mailto:ontology-summit-leave@xxxxxxxxxxxxxxxx
Community Files: http://ontolog.cim3.net/file/work/OntologySummit2014/
Community Wiki: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2014
Community Portal: http://ontolog.cim3.net/wiki/

