Doug,
> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of doug foxvog
> Sent: Wednesday, 28 November 2012 20:01
> To: [ontolog-forum]
> Subject: Re: [ontolog-forum] doing standards [was - Re: Webby objects]
>
> On Wed, November 28, 2012 08:07, Andries van Renssen wrote:
> > Addressing this issue requires the formalization of the language and
> > homonym management within it.
>
> Yes.
>
> > When a system uses a formalized form of a natural language, then the
> > system is able to determine whether an entered term has homonyms in
> > the formal language.
>
> A system can deal with homonyms whenever the system has mappings between
> terms in the ontology and terms in the language. It does not matter if the
> system uses a formalized form of NL, templates for generating NL output,
> has a set of NL rules for interpreting NL statements, or gives NL menu
> options to a user.
>
[AvR] By creating 'ontology terms' in addition to 'NL terms' for the same
concepts, we only add synonyms that denote those concepts.
I think this should be avoided.
For a long time, dictionaries such as the Chambers English 'Dictionary of
Science and Technology' have provided, for every term, a 'language community'
within which the term is a unique denotation for a concept. Thus, by combining
a language, a language community and a term, we have a unique key for a
concept. If we want to allow multiple unique combinations to denote the same
concept in a formal language, just as in NL, and if we want to avoid
concatenated codes, then we should also represent the concept in the formal
language by an arbitrary key (which I call a UID).
This supports the use of homonyms without separating NL terms from ontology
terms and then mapping them.
Formalizing a language makes it possible to adopt a single formal language
dictionary for expressing ontologies as well as other formal language
expressions.
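
As a minimal sketch of the keying scheme described above, assuming a small
in-memory dictionary: each (language, language community, term) combination
is a unique key that maps to an arbitrary UID. The UID values, the community
name 'thermodynamics' and the function names are hypothetical and chosen only
for illustration.

```python
from typing import NamedTuple


class Denotation(NamedTuple):
    """A unique key for a concept: language + language community + term."""
    language: str
    community: str
    term: str


# Hypothetical UIDs: arbitrary numbers that carry no meaning themselves.
DICTIONARY = {
    Denotation("English", "architecture", "building"): 100001,
    Denotation("English", "constructing", "building"): 100002,
    # Two terms that denote the same concept share one UID (synonyms).
    Denotation("English", "thermodynamics", "enthalpy of fusion"): 100003,
    Denotation("English", "thermodynamics", "heat of fusion"): 100003,
}


def homonyms(language, term):
    """Return {community: uid} for every concept the term can denote."""
    return {
        d.community: uid
        for d, uid in DICTIONARY.items()
        if d.language == language and d.term == term
    }


def synonyms(uid):
    """Return every denotation recorded for the concept with this UID."""
    return sorted(d for d, u in DICTIONARY.items() if u == uid)
```

Here homonyms("English", "building") yields two UIDs, one per language
community, while synonyms(100003) yields both thermodynamics terms. No
separate 'ontology term' is needed in between.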
> > The various concepts denoted by the homonyms will have different
> > unique ID's in the background.
>
> Right.
>
> Here, you are discussing an interactive system that uses an ontology.
> To such a system, the local IDs are irrelevant. Much of the discussion up
> to this point has dealt with how the terms should be named. So long as
> there is a UI such as you are discussing, such naming issues are
> irrelevant. It is only relevant to those who look at the code and raw
> data files.
>
[AvR] Local IDs are indeed irrelevant, but I mean formal-language-wide UIDs,
accompanied by multiple (synonym) terms and language communities.
> Normal users don't look at raw data files in financial systems, personnel
> systems, navigation systems, seismic analysis systems, etc. -- why should
> they look at raw data files if they happen to be encoded using an ontology?
>
[AvR] I agree. Users deal with the terms, and sometimes with the language
communities within which the terms uniquely denote a concept.
> > In case of homonyms the system can ask the data entry user to select
> > the proper concept from the list of homonyms, each presented with a
> > part of its definition model. In my experience the presentation of the
> > 'language community' in which the term is based provides sufficient
> > context to distinguish the homonyms.
>
> Sure. I note that you are still discussing the UI, not the identity of
> terms in the ontology.
>
[AvR] The UIDs formally represent the concepts within the formal language. The
terms represent the concepts in a user interface.
> > Eventually the user should be able to issue a new homonym or term-UID
> > combination with its definition for review and addition.
>
> This would be a nice feature -- which doesn't exist in normal software.
> It would be an advantage that ontologies could add to a system.
>
[AvR] I consider this an important part of any practical implementation of an
ontology in application systems.
> I note that such a feature would not require that the user know anything
> about the term's ID. The user would be adding a new phrase for some
> existing concept, for example by selecting the term "enthalpy of fusion"
> and providing "heat of fusion" as an alternate term for the concept -- and
> specifying a preference for using that term.
>
[AvR] Indeed, the UIDs should not be allocated by the users, but by the
terminology manager of the formal language, or by a system under that
manager's responsibility.
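
The division of responsibility discussed here could be sketched roughly as
follows. The assumptions are that proposals wait in a review queue and that
only the manager's system hands out UIDs. The class name, method names, UID
range and the community name 'thermodynamics' are all hypothetical.

```python
import itertools


class TerminologyManager:
    """Hypothetical registry: users propose terms, only the manager sets UIDs."""

    def __init__(self, first_uid=100001):
        self._next_uid = itertools.count(first_uid)
        self.entries = {}   # (language, community, term) -> uid
        self.pending = []   # proposals awaiting review

    def add_concept(self, language, community, term):
        """Manager-side: register a new concept and allocate its UID."""
        uid = next(self._next_uid)
        self.entries[(language, community, term)] = uid
        return uid

    def propose_synonym(self, existing_key, new_term):
        """User-side: select an existing term and offer an alternate term.

        The user never sees or supplies a UID; it is looked up internally.
        """
        language, community, _ = existing_key
        proposal = ((language, community, new_term), self.entries[existing_key])
        self.pending.append(proposal)
        return proposal

    def approve(self, proposal):
        """Manager-side: accept a reviewed proposal into the dictionary."""
        key, uid = proposal
        self.pending.remove(proposal)
        self.entries[key] = uid
```

With this sketch, a user who selects "enthalpy of fusion" and offers "heat of
fusion" creates a pending proposal; after approval both terms resolve to the
same UID.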
> > This will reduce the ambiguity at the data entry side.
>
> It might, but I'm not sure of that.
>
[AvR] At data entry, at least a choice is made between the concepts that are
denoted by homonyms.
> > A receiving system that uses the same formal language will know which
> > concepts are meant among the homonyms by inspecting the UID's.
>
> Huh? If, by "receiving system", you mean a computer process which receives
> data from another computer process, it would receive the UIDs, not any NL.
> If it was to output received data for a human to read, then it would select
> among the various possible terms that could be used according to a set of
> rules, including preferences based on the user's field(s) of expertise,
> group(s) the user may be part of, and the user's stated preferences.
>
[AvR] Such a receiving 'computer process' receives the UIDs and has access to
the formal language dictionary (possibly including synonyms in the user's own
language and local language community). Thus the user interface can present
the dictionary terms, if required together with their language communities.
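
A rough sketch of the receiving side, assuming the receiver holds a copy of
the dictionary indexed by UID and knows the user's language and preferred
language communities. The UID, the community names and the selection rule
(first preferred community wins, otherwise the first listed term) are
assumptions made for illustration only.

```python
# Hypothetical dictionary extract: uid -> [(language, community, term), ...]
TERMS_BY_UID = {
    100003: [
        ("English", "thermodynamics", "enthalpy of fusion"),
        ("English", "physics teaching", "heat of fusion"),
        ("Dutch", "thermodynamica", "smeltwarmte"),
    ],
}


def render(uid, language, preferred_communities=(), show_community=False):
    """Pick a display term for a received UID according to user preferences."""
    candidates = [t for t in TERMS_BY_UID[uid] if t[0] == language]
    if not candidates:
        # No term in the user's language: fall back to any recorded term.
        candidates = TERMS_BY_UID[uid]

    def rank(entry):
        community = entry[1]
        if community in preferred_communities:
            return preferred_communities.index(community)
        return len(preferred_communities)  # non-preferred terms rank last

    # min() keeps the first candidate on ties, so dictionary order decides.
    _, community, term = min(candidates, key=rank)
    return f"{term} ({community})" if show_community else term
```

The sender transmits only 100003. What the reader sees depends on the
receiver's settings, for example render(100003, "English", ("physics
teaching",)) versus render(100003, "Dutch", show_community=True).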
> > During a search the searching user can also be presented the list of
> > homonyms with their language communities.
>
> Perhaps, you mean that the user would be presented with a list of
> definitions for the term s/he is searching for, and asked to select among
> them. Identifying the communities that use each term could help the user
> to narrow the search.
>
[AvR] I mean a list of terms, in which each term is accompanied by a language
community. The combination uniquely denotes a concept.
The user needs to choose the combination (the concept) from the list.
Such a mechanism, which helps a user to select a concept instead of an
(ambiguous) term, can also be used to disambiguate a query.
For example, the term building is a homonym that may denote an activity or a
physical object, which are based in, e.g., the language community
'constructing' and the language community 'architecture' respectively.
The query in a formal language:
what <is classified as a> building
can trigger a disambiguation: the formal language processor will ask the user
which of the two concepts is meant: building (constructing) or building
(architecture).
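
The disambiguation step for the example query could look roughly like this.
The sketch assumes a single supported query form and a caller-supplied
`choose` callback that stands in for the user interface. The UIDs, the regular
expression and the function names are hypothetical.

```python
import re

# Hypothetical dictionary extract: term -> {language community: uid}
HOMONYMS = {
    "building": {"constructing": 100002, "architecture": 100001},
    "enthalpy of fusion": {"thermodynamics": 100003},
}

# Only the one query form used in the example above is recognised here.
QUERY = re.compile(r"^what <is classified as a> (?P<term>.+)$")


def resolve(query, choose):
    """Return the UID meant by the query, asking the user only when needed.

    `choose` receives a list of 'term (community)' options and returns the
    index of the option the user picked.
    """
    match = QUERY.match(query.strip())
    if match is None:
        raise ValueError("unsupported query form")
    term = match.group("term")
    senses = HOMONYMS.get(term)
    if not senses:
        raise KeyError(term)
    if len(senses) == 1:
        # Unambiguous term: no question for the user.
        return next(iter(senses.values()))
    options = [f"{term} ({community})" for community in senses]
    return list(senses.values())[choose(options)]
```

For "what <is classified as a> building" the callback is offered "building
(constructing)" and "building (architecture)"; for a term with a single sense
it is never called.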
> Such a system could provide a few answers with each of the meanings when
> the disambiguation is asked for -- both to help the user to decide on the
> desired meaning and to possibly provide a needed result without the user
> having to take an additional step.
>
> > Btw.: Why are ontologists creating non-natural language terms by
> > concatenating words and adding capitals?
>
> * Because most computer systems take a space as a term separator.
> * Because it distinguishes them from natural language terms.
> * Because "camel case" is easier to read than running together lower
> case words.
> * Because some systems disallow one or more of the characters: _ - .
> from being part of a term.
>
[AvR] These are implementation reasons. However, names of concepts should be
independent of computer systems and implementations.
Thus they are not valid reasons.
> > For example the concept denoted by the artificial term NewsArticle
>
> All terms are artificial.
>
[AvR] Only if you call a natural language also an artificial language.
> > in Schema.org. Let's not invent
> > new words and call such a thing just a news article.
>
> These are not words. No new words are being invented. If you think you
> know what "news article" means, but are unsure of what "NewsArticle"
> means, good! That might entice you to look up its meaning -- and depending
> upon the definition you might find out if the referent is to a computer
> file, a conceptual work, a text string, a portion of a video broadcast, a
> physical clipping from a physical newspaper, a posting on UseNet, some
> combination of two or more of these, some variation of one or more of
> these, or something else altogether.
>
[AvR] The question is whether a computer system also automatically knows that
NewsArticle denotes the same concept as news article.
And whether a computer system can distinguish between a Mm and a Mm, when the
first denotes a millimeter and the second a megameter.
And why would we need to create synonyms such as Article <is a synonym of>
article or Article <is mapped to> article, where Article denotes an ontology
concept and article denotes the same concept in NL?
Andries
> -- doug foxvog
>
> > Andries
> >
> >> -----Original Message-----
> >> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
> >> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F Sowa
> >> Sent: Wednesday, 28 November 2012 6:21
> >> To: ontolog-forum@xxxxxxxxxxxxxxxx
> >> Subject: Re: [ontolog-forum] doing standards [was - Re: Webby
> >> objects]
> >>
> >> Leo, Amanda, Ed, Doug, and Pat C,
> >>
> >> I agree with all your hopes, fears, and observations.
> >>
> >> Unfortunately, every system that allows humans to enter data and read
> >> the results must face the fact that nobody will read, understand,
> >> remember, and use the definitions correctly and consistently. Even
> >> the people who write the definitions don't always use their own
> >> definitions consistently.
> >>
> >> Leo
> >> > ... people will fight to the death to include their "words",
> >> > mistaking these for the concepts behind them.
> >>
> >> That's true, but understandable. Since most people never read the
> >> definitions, the words will have more influence.
> >>
> >> AV
> >> > I have also experienced the significant gains in usability and
> >> > efficiency that can come from using concept IDs that are not easily
> >> > interpreted by humans (e.g., hexadecimals at Convera). IMHO, that
> >> > is the way to go -- it's amazing how much confusion is avoided...
> >>
> >> EB
> >> > Yes, if you can get the community to maintain the discipline...
> >>
> >> I agree with Ed's caveat. That kind of discipline can be maintained
> >> with a small group of highly motivated developers. But it is
> >> extremely hard to continue it as the group expands and they have to
> >> train new hires and customers.
> >>
> >> EB
> >> > They pronounce the codes, which mean nothing to the accountant or
> >> > the man on the dock.
> >>
> >> Yes. How many people who have a 401K plan have read, understood, and
> >> remembered the definition?
> >> Anybody who learns those codes just uses them in the same way that
> >> they use any other words or phrases.
> >>
> >> DF
> >> > You can find lots of terms in OpenCyc whose names could be
> >> > interpreted in multiple ways, yet whose #$comments do little more
> >> > than restate the name.
> >>
> >> That is also true of 99% of the published OWL ontologies. In fact,
> >> most of the OWL ontologies are grossly underspecified. The only
> >> so-called "definitions" are English comments that the computer
> >> ignores.
> >>
> >> DF
> >> > Many ontologies (e.g., the OBO ontologies) get around this by using
> >> > numeric strings as IDs of the ontology terms, forcing people to
> >> > look at various comments and alternate names to understand what is
> >> > intended.
> >>
> >> That practice began long before OBO. SNOMED used 4-digit ids with
> >> each digit as a key to some branch of their ontology. As they grew,
> >> they added more digits. But encoding meaning in the digits of an
> >> identifier is a bad practice that computer scientists have been
> >> warning against for years.
> >>
> >> In practice, most of the readable terms in OBO are univocal: drugs,
> >> biological species, diseases, medical instruments and procedures.
> >> The string 22298006 is no more precise than 'myocardial infarction'.
> >>
> >> But note Ed's observation. Unreadable codes don't force anybody to
> >> read the definitions. People still start with the glosses, and
> >> rarely study the formal definitions.
> >>
> >> AV
> >> > I have also experienced the significant gains in usability and
> >> > efficiency that can come from using concept IDs that are not easily
> >> > interpreted by humans (e.g., hexadecimals at Convera).
> >>
> >> Specialists in every branch of science, engineering, medicine, and
> >> the arts have developed precisely defined terminologies that are
> >> unknown to the unwashed. It's possible (but not easy) to select
> >> univocal phrases.
> >>
> >> PC
> >> > So, instead of, e.g. a term "Process" (never defined the same way
> >> > in any two upper ontologies I have seen), we might have
> >> > "ContinuousProcess" for phenomena describable by differential
> >> > equations, or "DiscontinuousProcess" for an Event represented as a
> >> > series of steps (which may be considered instantaneous, or take some
> >> > finite time, which distinction may lead to more refinement or
> >> > expansion of the labels). When more than one ontology is being
> >> > considered, the formal way of doing this is just to use namespace
> >> > prefixes so everyone can use the same label, but specify the
> >> > namespace when creating the logical specification.
> >>
> >> I agree. Martin Hepp put a lot of effort into choosing meaningful
> >> and readable names for his GoodRelations ontology. That is one
> >> reason why it became popular. And that's also why Google, Microsoft,
> >> Yahoo, and Yandex adopted it for Schema.org.
> >>
> >> Note that Schema.org uses readable English terms (mostly multi-word)
> >> for their ontology. And note that the primary Google spokesperson
> >> for Schema.org is Guha -- who had been the associate director of Cyc,
> >> the chief designer of RDF, and the co-author (with Pat Hayes) of the
> >> logic base (LBase) for RDF, which uses a version of the same
> >> semantics as Common Logic. So he's familiar with the issues.
> >>
> >> In summary, I agree that there is a problem with people reading just
> >> the names instead of looking at the definitions. But making the
> >> names unreadable is not a solution. I don't agree with Guha about
> >> everything, but I think that he (and the other Schemers) made some
> >> good choices for Schema.org.
> >>
> >> John
> >>
> >> _________________________________________________________________
> >> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> >> Config Subscr:
> >> http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> >> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> >> Shared Files: http://ontolog.cim3.net/file/ Community Wiki:
> >> http://ontolog.cim3.net/wiki/ To join:
> >> http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
> >>