ontolog-forum

Re: [ontolog-forum] doing standards [was - Re: Webby objects]

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "doug foxvog" <doug@xxxxxxxxxx>
Date: Wed, 28 Nov 2012 14:01:07 -0500
Message-id: <4a4433b8de7238f63fc57b64f222be86.squirrel@xxxxxxxxxxxxxxxxx>
On Wed, November 28, 2012 08:07, Andries van Renssen wrote:
> Addressing this issue requires the formalization of the language and
> homonym management within it.    (01)

Yes.    (02)

> When a system uses a formalized form of a natural language, then the
> system is able to determine whether an entered term has homonyms
> in the formal language.    (03)

A system can deal with homonyms whenever the system has mappings
between terms in the ontology and terms in the language.  It does not
matter whether the system uses a formalized form of NL, uses templates
for generating NL output, has a set of NL rules for interpreting NL
statements, or gives NL menu options to a user.    (04)
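
As a rough illustration of such a mapping, the following Python sketch
keeps a table from NL terms to concept IDs and treats a term as a
homonym when it maps to more than one ID.  The IDs, the sample terms,
the community names, and the function names are all invented for this
example; they are not taken from any particular ontology or system.

```python
from collections import defaultdict

# Hypothetical lexicon: (concept ID, NL term, language community).
LEXICON = [
    ("C001", "bank", "finance"),
    ("C002", "bank", "geography"),
    ("C003", "enthalpy of fusion", "thermodynamics"),
]

def build_index(lexicon):
    """Map each lowercased NL term to the concepts it can denote."""
    index = defaultdict(list)
    for concept_id, term, community in lexicon:
        index[term.lower()].append((concept_id, community))
    return index

def lookup(index, term):
    """Return every (concept ID, community) candidate for a term."""
    return index.get(term.lower(), [])

def is_homonym(index, term):
    """A term is a homonym here if it denotes more than one concept."""
    return len(lookup(index, term)) > 1

index = build_index(LEXICON)
```

The same index would serve any of the front ends listed above, since
none of them needs more than the term-to-ID mapping.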

> The various concepts denoted by the homonyms will have different unique
> ID's in the background.    (05)

Right.    (06)

Here, you are discussing an interactive system that uses an ontology.
To such a system, the local IDs are irrelevant.  Much of the discussion
up to this point has dealt with how the terms should be named.  So
long as there is a UI such as you are discussing, such naming issues
are irrelevant.  They are relevant only to those who look at the code
and raw data files.    (07)

Normal users don't look at raw data files in financial systems, personnel
systems, navigation systems, seismic analysis systems, etc. -- why
should they look at raw data files if they happen to be encoded using
an ontology?    (08)

> In case of homonyms the system can ask the data entry user to select
> the proper concept from the list of homonyms, each presented
> with a part of its definition model. In my experience the presentation of
> the 'language community' in which the term is based
> provides sufficient context to distinguish the homonyms.    (09)

Sure.  I note that you are still discussing the UI, not the identity of
terms in the ontology.    (010)

> Eventually the user should be able to issue a new homonym or term-UID
> combination with its definition for review and addition.    (011)

This would be a nice feature -- which doesn't exist in normal software.
It would be an advantage that ontologies could add to a system.    (012)

I note that such a feature would not require that the user know anything
about the term's ID.  The user would be adding a new phrase for some
existing concept, for example by selecting the term "enthalpy of fusion"
and providing "heat of fusion" as an alternate term for the concept -- and
specifying a preference for using that term.    (013)
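
A minimal sketch of such a feature, under simplifying assumptions: each
term denotes exactly one concept (homonyms are ignored here), and the
class name, method names, and the ID "C003" are hypothetical.  The user
only ever supplies and sees terms; the ID stays internal.

```python
class ConceptLabels:
    """Toy label store in which users work with terms, never with IDs."""

    def __init__(self):
        self._concept_of = {}  # term -> concept ID
        self._terms_of = {}    # concept ID -> terms, default term first
        self._preferred = {}   # (user, concept ID) -> preferred term

    def add_concept(self, concept_id, term):
        """Register a concept under its default term."""
        self._concept_of[term] = concept_id
        self._terms_of[concept_id] = [term]

    def add_alternate(self, user, existing_term, new_term, prefer=False):
        """Attach new_term to the concept that existing_term denotes."""
        concept_id = self._concept_of[existing_term]
        if new_term not in self._terms_of[concept_id]:
            self._terms_of[concept_id].append(new_term)
        self._concept_of[new_term] = concept_id
        if prefer:
            self._preferred[(user, concept_id)] = new_term

    def display(self, user, term):
        """Return the term this user should see for the same concept."""
        concept_id = self._concept_of[term]
        default = self._terms_of[concept_id][0]
        return self._preferred.get((user, concept_id), default)
```

For example, after add_concept("C003", "enthalpy of fusion"), a user
could call add_alternate("alice", "enthalpy of fusion",
"heat of fusion", prefer=True) without ever seeing "C003".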

> This will reduce the ambiguity at the data entry side.    (014)

It might, but I'm not sure of that.    (015)

> A receiving system that uses the same formal language will know which
> concepts are meant among the homonyms by inspecting the UID's.    (016)

Huh?  If, by "receiving system", you mean a computer process which
receives data from another computer process, it would receive the UIDs,
not any NL.  If it were to output received data for a human to read, then
it would select among the various possible terms that could be used
according to a set of rules, including preferences based on the user's
field(s) of expertise, group(s) the user may be part of, and the user's
stated preferences.    (017)
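
One way such rules could look is sketched below.  The precedence used
(stated preference, then field of expertise, then group, then a
default) is an assumption made for the example, as are the ID, the
community names, and the function name.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    stated: dict = field(default_factory=dict)  # concept ID -> chosen term
    fields: set = field(default_factory=set)    # fields of expertise
    groups: set = field(default_factory=set)    # groups the user is part of

# Hypothetical data: concept ID -> [(term, community)], default term first.
TERMS = {
    "C003": [
        ("enthalpy of fusion", "thermodynamics"),
        ("heat of fusion", "chemistry"),
        ("latent heat of fusion", "hvac-engineers"),
    ],
}

def choose_term(concept_id, user, terms=TERMS):
    """Pick the output term for a received concept ID for this user."""
    candidates = terms[concept_id]
    # Rule 1: an explicitly stated preference wins.
    if concept_id in user.stated:
        return user.stated[concept_id]
    # Rules 2 and 3: match the user's fields, then the user's groups.
    for communities in (user.fields, user.groups):
        for term, community in candidates:
            if community in communities:
                return term
    # Fallback: the default term for the concept.
    return candidates[0][0]
```

The receiving process only handles "C003"; the choice of wording is
made at output time, per user.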

> During a search the searching user can also be presented the list of
> homonyms with their language communities.    (018)

Perhaps you mean that the user would be presented with a list
of definitions for the term s/he is searching for, and asked to select
among them.  Identifying the communities that use each term could
help the user to narrow the search.    (019)

Such a system could show a few answers for each of the meanings
when it asks for the disambiguation -- both to help the user to decide
on the desired meaning and to possibly provide a needed result without
the user having to take an additional step.    (020)
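
A small sketch of that search behaviour, assuming a hypothetical sense
table and result store (the IDs, definitions, and document names are
made up): each option carries its definition, its community, and a few
sample answers.

```python
# Hypothetical senses for a search term, with the communities using them.
SENSES = {
    "bank": [
        {"id": "C001", "definition": "institution that holds deposits",
         "community": "finance"},
        {"id": "C002", "definition": "land alongside a river",
         "community": "geography"},
    ],
}

# Hypothetical search results already indexed by concept ID.
RESULTS = {
    "C001": ["doc-17", "doc-42", "doc-58"],
    "C002": ["doc-03"],
}

def disambiguation_options(term, per_sense=2):
    """List each meaning of a term with a few sample answers attached."""
    options = []
    for sense in SENSES.get(term.lower(), []):
        options.append({
            "definition": sense["definition"],
            "community": sense["community"],
            # A few answers per meaning may already satisfy the user.
            "samples": RESULTS.get(sense["id"], [])[:per_sense],
        })
    return options
```

Note that the options shown to the user contain no concept IDs; the
definition and community are what the user selects on.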

> Btw.: Why are ontologists creating non-natural language terms by
> concatenating words and adding capitals?    (021)

* Because most computer systems take a space as a term separator.
* Because it distinguishes them from natural language terms.
* Because "camel case" is easier to read than running together lower
   case words.
* Because some systems disallow one or more of the characters: _ - .
   from being part of a term.    (022)
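
To make the first and third points concrete, here is a short sketch
with hypothetical helper names.  A whitespace tokenizer keeps a
camel-case term in one piece, and the NL-style label can still be
recovered from the identifier when it is needed for display.

```python
import re

def tokens(statement):
    """Split a statement on whitespace, as many simple parsers do."""
    return statement.split()

def to_label(identifier):
    """Turn a camel-case identifier into a lowercase NL-style label."""
    # Insert a space wherever a lowercase letter or digit meets a capital.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", identifier)
    return spaced.lower()

# The camel-case term survives as one token; the NL phrase does not.
one_term = tokens("isa NewsArticle")
two_words = tokens("isa news article")
```

Here one_term has two tokens and two_words has three, while
to_label("NewsArticle") gives "news article" back for a UI.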

> For example the concept denoted by the artificial term NewsArticle    (023)

All terms are artificial.    (024)

> in Schema.org. Let's not invent
> new words and call such a thing just a news article.    (025)

These are not words.  No new words are being invented.  If you think you
know what "news article" means, but are unsure of what "NewsArticle"
means, good!  That might entice you to look up its meaning -- and
depending upon the definition you might find out if the referent is to
a computer file, a conceptual work, a text string, a portion of a video
broadcast, a physical clipping from a physical newspaper, a posting
on UseNet, some combination of two or more of these, some variation
of one or more of these, or something else altogether.    (026)

-- doug foxvog    (027)

> Andries
>
>> -----Original Message-----
>> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
>> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of
>> John F Sowa
>> Sent: Wednesday, 28 November 2012 6:21
>> To: ontolog-forum@xxxxxxxxxxxxxxxx
>> Subject: Re: [ontolog-forum] doing standards [was - Re: Webby objects]
>>
>> Leo, Amanda, Ed, Doug, and Pat C,
>>
>> I agree with all your hopes, fears, and observations.
>>
>> Unfortunately, every system that allows humans to enter data and read
>> the results must face the fact that nobody will read, understand,
>> remember, and use the definitions correctly and consistently.  Even
>> the people who write the definitions don't always use their own
>> definitions consistently.
>>
>> Leo
>> > ... people will fight to the death to include their "words", mistaking
>> > these for the concepts behind them.
>>
>> That's true, but understandable.  Since most people never read the
>> definitions, the words will have more influence.
>>
>> AV
>> >  I have also experienced the significant gains in usability and
>> > efficiency that can come from using concept IDs that are not easily
>> > interpreted by humans (e.g., hexadecimals at Convera). IMHO, that is
>> > the way to go -- it's amazing how much confusion is avoided...
>>
>> EB
>> > Yes, if you can get the community to maintain the discipline...
>>
>> I agree with Ed's caveat.  That kind of discipline can be maintained
>> with a small group of highly motivated developers.  But it is
>> extremely hard to continue it as the group expands and they have to
>> train new hires and customers.
>>
>> EB
>> > They pronounce the codes, which mean nothing to the accountant or the
>> > man on the dock.
>>
>> Yes.  How many people who have a 401K plan have read, understood, and
>> remembered the definition?  Anybody who learns those codes just uses
>> them in the same way that they use any other words or phrases.
>>
>> DF
>> > You can find lots of terms in OpenCyc whose names could be interpreted
>> > in multiple ways, yet whose #$comments do little more than restate the
>> > name.
>>
>> That is also true of 99% of the published OWL ontologies.  In fact, most
>> of the OWL ontologies are grossly underspecified.  The only so-called
>> "definitions" are English comments that the computer ignores.
>>
>> DF
>> > Many ontologies (e.g., the OBO ontologies) get around this by using
>> > numeric strings as IDs of the ontology terms, forcing people to look
>> > at various comments and alternate names to understand what is
>> > intended.
>>
>> That practice began long before OBO.  SNOMED used 4-digit ids with each
>> digit as a key to some branch of their ontology.  As they grew, they
>> added more digits.  But encoding meaning in the digits of an identifier
>> is a bad practice that computer scientists have been warning against
>> for years.
>>
>> In practice, most of the readable terms in OBO are univocal: drugs,
>> biological species, diseases, medical instruments and procedures.
>> The string 22298006 is no more precise than 'myocardial infarction'.
>>
>> But note Ed's observation.  Unreadable codes don't force anybody to read
>> the definitions.  People still start with the glosses, and rarely study
>> the formal definitions.
>>
>> AV
>> > I have also experienced the significant gains in usability and
>> > efficiency that can come from using concept IDs that are not easily
>> > interpreted by humans (e.g., hexadecimals at Convera).
>>
>> Specialists in every branch of science, engineering, medicine, and the
>> arts have developed precisely defined terminologies that are unknown
>> to the unwashed.  It's possible (but not easy) to select univocal
>> phrases.
>>
>> PC
>> > So, instead of, e.g. a term "Process" (never defined the same way in
>> > any two upper ontologies I have seen), we might have
>> > "ContinuousProcess" for phenomena describable by differential
>> > equations, or "DiscontinuousProcess" for an Event represented as a
>> > series of steps (which may be considered instantaneous, or take some
>> > finite time, which distinction may lead to more refinement or
>> > expansion of the labels). When more than one ontology is being
>> > considered, the formal way of doing this is just to use namespace
>> > prefixes so everyone can use the same label, but specify the
>> > namespace when creating the logical specification.
>>
>> I agree.  Martin Hepp put a lot of effort into choosing meaningful and
>> readable names for his GoodRelations ontology.  That is one reason why
>> it became popular.  And that's also why Google, Microsoft, Yahoo, and
>> Yandex adopted it for Schema.org.
>>
>> Note that Schema.org uses readable English terms (mostly multi-word) for
>> their ontology.  And note that the primary Google spokesperson for
>> Schema.org is Guha -- who had been the associate director of Cyc, the
>> chief designer of RDF, and the co-author (with Pat Hayes) of the logic
>> base (LBase) for RDF, which uses a version of the same semantics as
>> Common Logic.  So he's familiar with the issues.
>>
>> In summary, I agree that there is a problem with people reading just the
>> names instead of looking at the definitions.  But making the names
>> unreadable is not a solution.  I don't agree with Guha about
>> everything, but I think that he (and the other Schemers) made some good
>> choices for Schema.org.
>>
>> John
>>
>> _________________________________________________________________
>> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
>> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
>> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
>> Shared Files: http://ontolog.cim3.net/file/
>> Community Wiki: http://ontolog.cim3.net/wiki/
>> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
>>



