Addressing this issue requires formalizing the language and managing homonyms
within it.
When a system uses a formalized form of a natural language, it can determine
whether an entered term has homonyms in the formal language.
The concepts denoted by the homonyms have different unique IDs in the
background.

In case of homonyms, the system can ask the data entry user to select the proper
concept from the list of homonyms, each presented with a part of its definition
model. In my experience, presenting the 'language community' in which the term
is based provides sufficient context to distinguish the homonyms.
Eventually the user should be able to submit a new homonym or term-UID
combination, with its definition, for review and addition.

This will reduce the ambiguity at the data entry side.
A receiving system that uses the same formal language will know which concepts
are meant among the homonyms by inspecting the UIDs.
During a search, the searching user can also be presented with the list of
homonyms and their language communities.

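The workflow described above can be sketched in Python as follows. This is only
a minimal illustration, not an existing implementation: the names Concept,
Dictionary and choices, as well as the example UIDs, language communities and
definitions, are hypothetical and chosen purely for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    uid: int                 # unique ID used in the background (hypothetical values)
    term: str                # natural-language term that denotes the concept
    language_community: str  # community in which the term is based
    definition: str          # part of the definition shown to the user


class Dictionary:
    """Formal-language dictionary in which one term may denote several concepts."""

    def __init__(self):
        self._by_term = defaultdict(list)
        self._by_uid = {}

    def add(self, concept):
        """Register a term-UID combination; each UID may be used only once."""
        if concept.uid in self._by_uid:
            raise ValueError(f"UID {concept.uid} is already in use")
        self._by_uid[concept.uid] = concept
        self._by_term[concept.term.lower()].append(concept)

    def homonyms(self, term):
        """Return every concept that the entered term can denote."""
        return list(self._by_term.get(term.lower(), []))

    def resolve(self, uid):
        """What a receiving system does: find the intended concept by its UID."""
        return self._by_uid[uid]


def choices(dictionary, term):
    """Lines to present to a data entry user or a searching user."""
    return [
        f"{c.term} [{c.language_community}] (UID {c.uid})"
        for c in dictionary.homonyms(term)
    ]


# Hypothetical example data: two homonyms of the term "bank".
d = Dictionary()
d.add(Concept(1001, "bank", "finance", "institution that accepts deposits"))
d.add(Concept(1002, "bank", "geography", "land alongside a river"))

for line in choices(d, "bank"):
    print(line)
```

In this sketch the user picks one of the presented lines, and only the UID of
the chosen concept is stored or exchanged, so the receiving side never has to
guess which homonym was meant.
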
By the way: why are ontologists creating non-natural-language terms by
concatenating words and adding capitals? An example is the concept denoted by
the artificial term NewsArticle in Schema.org. Let's not invent new words, and
simply call such a thing a news article.

Andries

> -----Original message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On behalf of John F Sowa
> Sent: Wednesday, 28 November 2012 6:21
> To: ontolog-forum@xxxxxxxxxxxxxxxx
> Subject: Re: [ontolog-forum] doing standards [was - Re: Webby objects]
>
> Leo, Amanda, Ed, Doug, and Pat C,
>
> I agree with all your hopes, fears, and observations.
>
> Unfortunately, every system that allows humans to enter data and read the
> results must face the fact that nobody will read, understand, remember, and
> use the definitions correctly and consistently. Even the people who write
> the definitions don't always use their own definitions consistently.
>
> Leo
> > ... people will fight to the death to include their "words", mistaking
> > these for the concepts behind them.
>
> That's true, but understandable. Since most people never read the
> definitions, the words will have more influence.
>
> AV
> > I have also experienced the significant gains in usability and
> > efficiency that can come from using concept IDs that are not easily
> > interpreted by humans (e.g., hexadecimals at Convera). IMHO, that is
> > the way to go -- it's amazing how much confusion is avoided...
>
> EB
> > Yes, if you can get the community to maintain the discipline...
>
> I agree with Ed's caveat. That kind of discipline can be maintained with a
> small group of highly motivated developers. But it is extremely hard to
> continue it as the group expands and they have to train new hires and
> customers.
>
> EB
> > They pronounce the codes, which mean nothing to the accountant or the
> > man on the dock.
>
> Yes. How many people who have a 401K plan have read, understood, and
> remembered the definition? Anybody who learns those codes just uses them in
> the same way that they use any other words or phrases.
>
> DF
> > You can find lots of terms in OpenCyc whose names could be interpreted
> > in multiple ways, yet whose #$comments do little more than restate the name.
>
> That is also true of 99% of the published OWL ontologies. In fact, most of
> the OWL ontologies are grossly underspecified. The only so-called
> "definitions" are English comments that the computer ignores.
>
> DF
> > Many ontologies (e.g., the OBO ontologies) get around this by using
> > numeric strings as IDs of the ontology terms, forcing people to look
> > at various comments and alternate names to understand what is
> > intended.
>
> That practice began long before OBO. SNOMED used 4-digit ids with each digit
> as a key to some branch of their ontology. As they grew, they added more
> digits. But encoding meaning in the digits of an identifier is a bad
> practice that computer scientists have been warning against for years.
>
> In practice, most of the readable terms in OBO are univocal: drugs,
> biological species, diseases, medical instruments and procedures.
> The string 22298006 is no more precise than 'myocardial infarction'.
>
> But note Ed's observation. Unreadable codes don't force anybody to read the
> definitions. People still start with the glosses, and rarely study the
> formal definitions.
>
> AV
> > I have also experienced the significant gains in usability and
> > efficiency that can come from using concept IDs that are not easily
> > interpreted by humans (e.g., hexadecimals at Convera).
>
> Specialists in every branch of science, engineering, medicine, and the arts
> have developed precisely defined terminologies that are unknown to the
> unwashed. It's possible (but not easy) to select univocal phrases.
>
> PC
> > So, instead of, e.g. a term "Process" (never defined the same way in
> > any two upper ontologies I have seen), we might have "ContinuousProcess"
> > for phenomena describable by differential equations, or
> > "DiscontinuousProcess" for an Event represented as a series of steps
> > (which may be considered instantaneous, or take some finite time, which
> > distinction may lead to
> > more refinement or expansion of the labels). When more than one
> > ontology is being considered, the formal way of doing this is just to
> > use namespace prefixes so everyone can use the same label, but specify
> > the namespace when creating the logical specification.
>
> I agree. Martin Hepp put a lot of effort into choosing meaningful and
> readable names for his GoodRelations ontology. That is one reason why it
> became popular. And that's also why Google, Microsoft, Yahoo, and Yandex
> adopted it for Schema.org.
>
> Note that Schema.org uses readable English terms (mostly multi-word) for
> their ontology. And note that the primary Google spokesperson for Schema.org
> is Guha -- who had been the associate director of Cyc, the chief designer of
> RDF, and the co-author (with Pat Hayes) of the logic base (LBase) for RDF,
> which uses a version of the same semantics as Common Logic. So he's familiar
> with the issues.
>
> In summary, I agree that there is a problem with people reading just the
> names instead of looking at the definitions. But making the names unreadable
> is not a solution. I don't agree with Guha about everything, but I think
> that he (and the other Schemers) made some good choices for Schema.org.
>
> John
>
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J