
Re: [ontolog-forum] Meaningful labels [was: Fixed labels in software?]

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Rob Freeman <lists@xxxxxxxxxxxxxxxxxxx>
Date: Thu, 18 Feb 2010 20:55:20 +1300
Message-id: <7616afbc1002172355i61296959h99581e0fdc3ffeca@xxxxxxxxxxxxxx>
Hi David,    (01)

A fairly loose reply. Hope it hits some buttons for you.    (02)

One big point, in summary, is the contrast between the way you are
looking at the problem, which is to make people conform (with "good
labels"), and the way I am looking at it, which is to make
computers adapt.    (03)

Under it all is this idea that "meaning" is multiple and interpretive.    (04)

On Wed, Feb 17, 2010 at 12:00 PM, David Eddy <deddy@xxxxxxxxxxxxx> wrote:
> ...
> IF we accept that one of the major expected  benefits of ontologies
> is to make it easier for information systems (e.g. software) to interoperate
> (e.g. exchange accurate, meaningful data), then isn't it necessary to
> have "meaningful" labels in & around software?    (05)

Computers just deal with bits. The "meaningful label" problem is
purely a human one.    (06)

As I understand your problem, a programmer directs the computer to
interact with some data based on how that programmer has interpreted
the label on that data, but that interpretation might not match the
way someone else, say a user, interprets it.    (07)

That's a problem where two humans have labeled different data the same way.    (08)

Or there might be a problem where you have two labels, and you would
like the computer to deal with the data in the same way, but there is
no way for it to know they are the same.    (09)

That's a problem where two humans have labeled the same data different ways.    (010)

Labels are essentially a human problem.    (011)

Ideally we could find some way for the computer to solve this human
problem by interpreting labels in ways similar to humans.    (012)

> Personally I have experienced that "good labels" are extremely useful.
> In systems when the labels also reliably indicate what the data is,
> that's bordering on miraculous.  The combination of the two is an extremely 
> happenstance.    (013)

I'm guessing "good labels" means that the program designer was
particularly good at communicating with you.    (014)

The computer won't care, so long as the data is what it expects.    (015)

> I am reading your statements to mean that good labels are not necessary.
> What magic associated with ontologies will make good labels unneeded?    (016)

There is no magic which will make labels unneeded for humans. Humans
will always need them.    (017)

What would be useful is if there were a way to program a computer to
interpret labels in ways similar to humans. That would prevent
problems where two humans have interpreted labels in different ways.    (018)

> How about if there were a mechanism (clearly yet to be defined) whereby
> humans (maybe even programs?) could verify what a label means?    (019)

You mean interpret it in a way similar to the way a human would. A
label doesn't "mean" anything.    (020)

> Example: (Not necessarily universal, but within the context of a company
> & its operating subsidiaries)
> [A]  PostalZip Code   (means the string of numbers or letters used in
> a legal mailing address)
> [B]  Postal Code (the 7 character string representing .....)
> [C]  Zip Code (the 5, 9, 11, or 13 set of digits in a mailing
> address...)
> The problem today, of course, is that in a program or database a
> field labeled Zip Code, depending on how it's technically implemented (you would
> not know unless you're deep inside the code of the system) could actually also
> contain Postal (or Post) Codes.
> Such "minutia" plays havoc with interoperability efforts.
> I would posit that if I were a clueless programmer (explaining WHY a
> programmer is doing something is typically not in the cards) faced with an 
> defined data interoperability task, if I had a mechanism that I could easily
> (<--- key idea) look up what a label means in the relevant context of the
> moment, that would be a huge inch-pebble leap forward.
> Currently what labels (both good & bad) mean is walking around in
> people's heads & not accessible to automation.    (021)

I agree. That is the big problem: automating the interpretation process.    (022)

I think the way to do that is to look at some set defining the context
of the label you are using, and pull up other uses of that label with
similar context sets. As Kuhn says of Wittgenstein, "a discussion of
_some_ of the attributes shared ... often helps us learn how to employ
the corresponding term".    (023)

You would need to work with raw attributes (e.g. examples) because
there is no fixed set you can find which will work in all cases. But
usually you will be able to find different subsets which will provide
some overlap and help make a distinction.    (024)
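
A minimal sketch of the context-set lookup described above, in Python.
Everything named here is hypothetical and chosen only for illustration:
the recorded uses, the attribute strings, and the choice of Jaccard
overlap as the similarity measure are assumptions, not part of any
existing tool.

```python
def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_uses(label, context, recorded_uses, top_n=3):
    """Rank earlier uses of `label` by how much their context overlaps `context`."""
    scored = [
        (jaccard(context, use_context), note)
        for use_label, use_context, note in recorded_uses
        if use_label == label
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hypothetical recorded uses: (label, context set, note on how it was used).
RECORDED_USES = [
    ("Zip Code", {"US", "mailing address", "state", "5 digits"},
     "US ZIP code"),
    ("Zip Code", {"Canada", "mailing address", "province", "letters"},
     "Canadian postal code stored in a ZIP field"),
    ("Postal Code", {"Canada", "mailing address", "province"},
     "Canadian postal code"),
]

# Context of the use we are trying to interpret (also hypothetical).
ranked = similar_uses("Zip Code", {"Canada", "province", "invoice"}, RECORDED_USES)
best_score, best_note = ranked[0]
print(best_note, best_score)
```

No single attribute decides the match here; the ranking comes from
whichever attributes the query context happens to share with each
recorded use.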

>>> Or is there some means to avoid opaque, difficult-to-understand,
>>> ambiguous labels in software?
>> I think you could avoid the opaqueness, but then you would have to
>> have some way to make the labels more ambiguous. Perverse, I know.
> MORE ambiguous!?
> Clearly you have something in mind... please to explain.    (025)

I just meant make the way a computer uses a label more subject to
interpretation. You need to make it interpret them in a human-like
way, of course.    (026)

>> Actually, because the terms used in actual code are usually quite
>> rigorously defined, software is one of the few places where you will
>> find unambiguous labels.
> It's been my experience that terms in code are NOT rigorously defined.
> They may start that way, but over time the tendency is to decay or
> wander.    (027)

I was thinking of sub-routine labels in higher-level programming
languages. They always do the same thing.    (028)

> I am thinking of the "nouns" in code... not the "verbs" of the actual
> software language.  The nouns are the business stuff that gets
> collected, stored & moved around.    (029)

Yes, exactly. The "verbs" are fixed by their procedural
interpretation. The "nouns" vary because people will put stuff in them
according to their own interpretation. If the computer code interprets
them inflexibly in some other way, you have a problem.    (030)

As I say, as a very rough idea, what you might do is attach sets of
examples to your labels to match different typical use cases. Then you
could select a use case for a label according to the best matching set
of examples.    (031)

Sets of actual contents of the data objects in one or other use case
might suffice as sets of distinguishing "attributes".    (032)

So, if you have a process COMPUTE A = B + C, and two programmers
interpret A, B, and C differently, attach sets of examples to the
different use cases of A, B, and C, and use an actual use, or set of
uses, to select an interpretation.    (033)
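
As a rough sketch of selecting an interpretation from actual contents,
here is one way it might look in Python for a single field. The use-case
names, the example values, and the "shape" abstraction (digits become 9,
letters become A) are all assumptions made for illustration; comparing
shapes rather than raw values is just one possible choice of attribute.

```python
def shape(value):
    """Reduce a value to a coarse pattern: digits -> 9, letters -> A, rest kept."""
    return "".join(
        "9" if ch.isdigit() else "A" if ch.isalpha() else ch
        for ch in value
    )


def select_use_case(observed, use_cases):
    """Return (name, score) for the use case whose examples best match `observed`."""
    observed_shapes = {shape(v) for v in observed}
    best_name, best_score = None, -1.0
    for name, examples in use_cases.items():
        example_shapes = {shape(v) for v in examples}
        union = observed_shapes | example_shapes
        # Jaccard overlap between the two sets of shapes.
        score = len(observed_shapes & example_shapes) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical example sets attached to two use cases of one label.
USE_CASES = {
    "US ZIP": {"02139", "90210", "10001-0001"},
    "Canadian postal code": {"K1A 0B1", "M5V 2T6"},
}

# Values actually found in a field labeled "Zip Code" (hypothetical).
observed = {"H3Z 2Y7", "V6B 4Y8"}
name, score = select_use_case(observed, USE_CASES)
print(name, score)
```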

The advantage of using sets is that you could resolve ambiguities
where different use cases happen to have some overlap, and that sets
generalize better. There's no single set of attributes you could use,
so you would need to find overlaps between sets, ad-hoc.    (034)

I would call that constructing meaning for a label in a human-like way.    (035)

> What's important for good labels (names) is consistency (e.g. this means
> assisting humans in a task that humans are not very good at) and some
> sort of mechanism that attaches definitions/meanings & context to the labels.    (036)

Consistency is what computers demand, but forcing people to do what
computers want is not ideal. Even if you "assist" them with lots of
training and big penalties.    (037)

Much better to find a way to assist the computer to interpret data the
same way as humans. If you try to assist the human to reach the right
decision by attaching lots of context, why not find a way to assist a
computer to use that context instead?    (038)

> Perhaps querying the mystical ontology repository could be such a
> mechanism?    (039)

I like John's idea because it incorporates at base the idea that no
single complete theory is possible. So it starts from the premise that
labels will be associated with multiple meanings. That's a good start,
and light-years ahead of what we have had up to now, as a theoretical
perspective, anyway.    (040)

But I don't think it is perfect.    (041)

The first crunch will be mapping between those different "meanings", I'm sure.    (042)

I think mapping using sets of attributes (observations about the
world, by preference) is the way to go. It best fits the way humans do
it. To do that, an OOR might have sets of examples attached to its
labels, with the examples identifying particular theories, and use
ad-hoc sets of those examples to make particular choices.    (043)
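
To make the mapping idea a little more concrete, here is a small Python
sketch that maps labels from one theory onto labels from another by
counting shared examples. The two "theories", their labels, their
example values, and the `min_shared` threshold are all hypothetical; a
real repository would need something far richer than a raw count.

```python
def map_labels(theory_a, theory_b, min_shared=1):
    """Map each label in theory_a to the theory_b label sharing the most examples.

    A label maps to None when no theory_b label shares at least
    `min_shared` examples with it.
    """
    mapping = {}
    for label_a, examples_a in theory_a.items():
        best_label, best_shared = None, 0
        for label_b, examples_b in theory_b.items():
            shared = len(examples_a & examples_b)
            if shared > best_shared:
                best_label, best_shared = label_b, shared
        mapping[label_a] = best_label if best_shared >= min_shared else None
    return mapping


# Two hypothetical theories, each attaching example sets to its own labels.
BILLING = {
    "Zip Code": {"02139", "90210", "K1A 0B1"},
    "Region": {"MA", "CA", "ON"},
    "Customer Tier": {"gold", "silver"},
}
SHIPPING = {
    "Postal Code": {"K1A 0B1", "M5V 2T6"},
    "ZIP": {"02139", "90210", "10001"},
    "State": {"MA", "CA", "NY"},
}

mapping = map_labels(BILLING, SHIPPING)
print(mapping)
```

Note that "Zip Code" also shares an example with "Postal Code" here, so
the choice between the two depends on which examples happen to be in
play, which is the ad-hoc character I have in mind.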

Another crunch will be enumerating enough theories. I would like to
see that done automatically from sets of examples too. Some kind of
ad-hoc machine learning.    (044)

But I am happy as a start to see widespread adoption of the premise
that there is no single complete theory.    (045)

-Rob    (046)

