ontolog-forum

Re: [ontolog-forum] Re Foundation ontology, CYC, and Mapping

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "doug foxvog" <doug@xxxxxxxxxx>
Date: Mon, 29 Mar 2010 02:29:29 +0100 (IST)
Message-id: <1350.89.101.4.18.1269826169.squirrel@xxxxxxxxxxxxxx>
gary.h.merrill@xxxxxxxxxxxxxx wrote:
> On Sun, 2010-03-28 at 19:51 +0100, doug foxvog wrote:
>> Pavithra wrote:    (01)

>> > ...
>> > I agree that in Classes, one has to define the attribute as "Name" and
>> > specify it as a character string which can have a value "Kermit"!    (02)

>> I prefer John Sowa's distinction, in which the name is not a character
>> string, but something that has a character string as a spelling.    (03)

If desired, a nameString property could be used to compose these two
attributes:
  nameString (A, S) <- name(A, N) & spelling(N, S).    (04)
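
As a rough illustration of that rule, here is a minimal Python sketch
that treats name and spelling as sets of pairs and derives nameString by
joining them on the name object.  The identifiers kermit_the_frog and
name_kermit are hypothetical sample data, not terms from any ontology.

```python
# Illustrative sketch only; all facts below are hypothetical sample data.

# name(A, N): entity A has name N, where N is a name object, not a string.
name_facts = {("kermit_the_frog", "name_kermit")}

# spelling(N, S): name object N has the character string S as a spelling.
spelling_facts = {("name_kermit", "Kermit")}


def name_string(names, spellings):
    """Derive nameString(A, S) <- name(A, N) & spelling(N, S)."""
    return {
        (entity, string)
        for (entity, name_obj) in names
        for (spelled_name, string) in spellings
        if name_obj == spelled_name  # join on the shared name object N
    }


derived = name_string(name_facts, spelling_facts)
print(derived)
```

Adding a second spelling fact for name_kermit would yield a second
nameString pair without changing the name itself.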

> I am not at all sure what the point is here, but I sense some possibly
> dangerous confusions.  One involves use of the phrase "the name" (rather
> than, say, "a name of").    (05)

This was not my distinction.  I meant the name attribute which would refer
to "a name of".  I was not suggesting a unique name.    (06)

> Almost always a thing will have multiple names,    (07)

Named things often will.    (08)

> each of which is traditionally taken to be a character string    (09)

This is where we disagree.  Names (and words) are part of language, and
language has traditionally consisted of sounds or gestures.  Character sets
have been devised to represent words in standardized ways.  A word can be
represented by a single character or by a pattern of characters from a given
set.  There may be multiple patterns from the same set, and the same word can
have multiple representations in different sets (e.g. Japanese, Chinese,
Serbo-Croatian).    (010)

For someone who considers a name to be a "character string", is a name
represented in ASCII the same character string as it is when represented in
Unicode?  How about EBCDIC? ... Braille?  ... Grade 2 Braille?  Can one
have a name in Ameslan (American Sign Language)?    (011)
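
To make the encoding question concrete, the following Python sketch
encodes one spelling under three character encodings and compares the
results.  The choice of cp037 as the EBCDIC code page is an assumption
for illustration; EBCDIC has many code pages.

```python
# Illustrative sketch: one spelling, several encodings.
name = "Kermit"

ascii_bytes = name.encode("ascii")
utf16_bytes = name.encode("utf-16-be")  # one Unicode encoding form
ebcdic_bytes = name.encode("cp037")     # one EBCDIC code page (assumed choice)

# The stored byte sequences differ from encoding to encoding ...
all_distinct = len({ascii_bytes, utf16_bytes, ebcdic_bytes}) == 3

# ... yet each decodes back to the same abstract sequence of characters.
round_trips = (
    ascii_bytes.decode("ascii")
    == utf16_bytes.decode("utf-16-be")
    == ebcdic_bytes.decode("cp037")
    == name
)

print(all_distinct, round_trips)
```

If a name were identified with the stored bytes, these would be three
different names; identifying it with the decoded characters already
treats the string as something more abstract than any one encoding.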

> and
> to refer to the same thing (the referent, the denotatum, the thing
> named).  For various purposes one might then select a "standard name" or
> a "canonical name" or a "print name" or a "display name", etc.    (012)

These last three are features of computer programs, not of the names.    (013)

> If on the other hand you take a name to be something that has a
> character string as a spelling,    (014)

A character string does not necessitate a spelling -- an alphabetic string
does.    (015)

> then you take a name to be an abstract
> object of some sort:  an equivalence class, related to one or more
> spellings.   (I assume we adopt the view that each such name must have
> at least one spelling, although in the case of names for entities in an
> uncountable universe and relative to languages with countably many
> expressions, there are some interesting questions concerning this -- but
> that is a digression.)    (016)

A name may have no spelling (in non-literate cultures) or a set of
non-letter characters (as in Chinese before Latin transliterations were
devised).  Note that many distinct Chinese characters would have the same
spelling (whether in Pinyin, Wade-Giles, or whatever other system you
choose).    (017)
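
A small Python sketch of this many-to-one relation follows.  It uses two
distinct Chinese surnames that are both romanized as "Zhang" in Pinyin
(tone marks omitted); the dictionary layout is just an assumption made
for illustration.

```python
from collections import defaultdict

# Written name (Chinese character) -> Pinyin spelling, tone marks omitted.
pinyin_spelling = {
    "张": "Zhang",
    "章": "Zhang",
    "李": "Li",
}

# Invert the mapping: one spelling may correspond to several written names.
names_by_spelling = defaultdict(set)
for written_name, spelling in pinyin_spelling.items():
    names_by_spelling[spelling].add(written_name)

# Spellings that fail to identify a unique name.
ambiguous = {s for s, names in names_by_spelling.items() if len(names) > 1}
print(ambiguous)
```

Going from name to spelling is a function here; going from spelling back
to name is not.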

Other names may have one spelling (in many modern cultures), or multiple
spellings (e.g., in cultures that haven't standardized spellings or which
use multiple alphabets).    (018)

A personal name may reference another person (e.g., one with Jr. or a
patronymic), a place (von X, de Y, Scandinavian farm names), or a
family.    (019)

> Further, this allows (at least in principle) for
> two distinct names to have associated with them the same spelling.    (020)

Agreed.    (021)

> (This in fact happens with some frequency in natural language and in
> mixing domains in scientific languages.  Consider 'mole', for example.)    (022)

The discussion just shifted from "names" to "words".  There are different
models that can be used for words.  The standard definition of "homonym"
implies that multiple words can have the same spelling.  Linguistics also
recognizes multiple forms of a single word (e.g., plural, past tense).    (023)

> At the very least, you then need a principle of individuation for names
> -- since the principle of individuation for character strings no longer
> suffices once you have broken the strict relation between a name and a
> (unique) character string.  When are two names the same?  Well, an
> obvious answer would be "Two names are the same just in case their
> equivalence classes of spellings are identical."  But it's really not
> clear what advantage such added complexity has.  Why not say "Well,
> 'mole' is a name but its referent and meaning differ with
> context." (the traditional approach that allows for ambiguity of names)
> as opposed to "Well, there are two distinct names, and 'mole' is a
> spelling of each."?  (Note that the latter MAY have the interesting
> consequence that names cannot be ambiguous, but that depends upon how
> the surrounding theories of reference and meaning are developed.)    (024)

The problem of identity exists for words and names -- and for other
individuals.  The issues discussed here seem more to be NL issues than
ontology issues.    (025)

> This seems an odd and unnecessarily cumbersome direction in which to go.
> It is not necessarily incorrect, but the motivation and advantage to it
> are unclear.  Why adopt a notion of name according to which one name
> denotes a furry creature and another name denotes a skin lesion, and
> each of those names has the same spelling -- rather than an approach in
> which there is one name ('mole') which has different referents (and
> different meanings) depending upon the context in which it is used?    (026)

It could be useful to allow for a single word to have multiple denotations.
However, from an NL standpoint, in a language in which a single word can
have multiple forms (conjugations, declensions, plurals, objective/subjective
forms, etc.) it is far more compact and (imho) easier to understand
if the denotation of the word is specified once instead of six or a dozen or
more times.  This is especially useful if the alternate forms are not
totally regular.    (027)
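
One way to picture this model is the Python sketch below, in which a
word is an individual object that carries its set of forms and a single
denotation.  The class, the lexicon, and the denotation labels
(BurrowingMammal, SkinLesion, MovingEvent) are hypothetical names chosen
for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    lemma: str
    forms: frozenset   # every spelled form of this one word
    denotation: str    # stated once, however many forms there are


# Two distinct words that happen to share all of their spellings.
MOLE_ANIMAL = Word("mole", frozenset({"mole", "moles"}), "BurrowingMammal")
MOLE_LESION = Word("mole", frozenset({"mole", "moles"}), "SkinLesion")

# One word with irregular forms; its denotation is still given only once.
GO = Word("go", frozenset({"go", "goes", "went", "gone", "going"}),
          "MovingEvent")

LEXICON = [MOLE_ANIMAL, MOLE_LESION, GO]


def denotations_for(form):
    """Return the denotations of every word that has `form` as a form."""
    return {word.denotation for word in LEXICON if form in word.forms}


print(denotations_for("went"))
print(denotations_for("moles"))
```

Under the alternative model, where the string itself is the word, the
denotation would have to be attached separately to each of the five
forms of "go".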

> Again, certainly one may proceed along such a path, but there is a price
> to pay in complexity at various levels.  Why not treat names as
> character strings (which are very well behaved and whose theory we know
> very well), and then when necessary (usually for computational purposes)
> introduce a "canonical name" for each entity -- one that may either be
> among the names in the intended equivalence classes or (generally better
> in formal and computational contexts) one that is created so as to be
> different from any such "normal" name?    (028)

This seems to be a discussion of coding efficiency, not of ontologies.
The task that this coding is for seems to be to represent names for
terms in the ontology.  A different encoding could be more efficient for
NL purposes.    (029)

> An example of the complexity of such issues may be found in the Unified
> Medical Language System where a 'term' is not what most of us would
> think of as a term.  A UMLS term is in fact an equivalence class of
> "atoms", where an atom is (roughly) an occurrence of a string within a
> context.   Within the UMLS there are some clear motivations for treating
> terms as abstract objects, but there is also a substantial amount of
> resulting confusion pertaining to what conditions must be met in order
> for an atom to be included in the equivalence class that constitutes a
> particular term and what the consequences of this are for the
> characterizations of meanings, represented in the UMLS as "concepts" --
> each of which has its own unique identifier (can we say "canonical
> name"?).  And this boils over into a somewhat confused approach to
> synonymy as well.  There are powerful ripple effects to decisions one
> makes concerning what a name may be.    (030)

UMLS has taken a step towards ontology in creating semantic "concepts",
and mapping them to terms in numerous biomedical term sets.  The terms
in each of the term sets have precisely defined meanings.  As long as the
term set used is known, the meaning of a defined term should be clear
since it is uniquely defined in its term set.  UMLS allows for translation
between encodings in multiple term sets.    (031)

Some term sets have both unique IDs for their terms & character strings
to represent the terms.  When multiple terms in the same term set use the
same character string, then the conversion from a string to a term is not
unique.    (032)

UMLS does assign a unique ID for each of its concepts.  This may be a
"canonical name" in the UMLS system.  To express this seems to require
a ternary predicate -- relating the concept, the system of nomenclature,
and the character string.    (033)
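
A minimal Python sketch of such a ternary relation is given below.  The
concept IDs and term-set names are hypothetical placeholders, not actual
UMLS identifiers or vocabularies, and this is not the UMLS data model.

```python
from typing import NamedTuple


class TermString(NamedTuple):
    """One fact of a ternary relation: concept, nomenclature, string."""
    concept_id: str
    nomenclature: str
    string: str


# Hypothetical facts for illustration only.
FACTS = {
    TermString("CONCEPT-0001", "TermSetA", "mole"),
    TermString("CONCEPT-0002", "TermSetA", "mole"),
    TermString("CONCEPT-0001", "TermSetB", "nevus"),
}


def concepts_for(nomenclature, string):
    """Concepts that a string can denote within one system of nomenclature."""
    return {
        fact.concept_id
        for fact in FACTS
        if fact.nomenclature == nomenclature and fact.string == string
    }


def strings_for(concept_id, nomenclature):
    """Strings that represent a concept within one system of nomenclature."""
    return {
        fact.string
        for fact in FACTS
        if fact.concept_id == concept_id and fact.nomenclature == nomenclature
    }


print(concepts_for("TermSetA", "mole"))
print(strings_for("CONCEPT-0001", "TermSetB"))
```

The lookup from string to concept is not unique within TermSetA, which
is the situation described above when several terms in one term set use
the same character string.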

> Talk of "the name" has the flavor of a metalinguistic context in which a
> unique name is needed for certain purposes.    (034)

As mentioned above, that was not the meaning.  "The" was chosen to refer
to a previously mentioned name attribute.    (035)

> ...  And beware of building semantics and pragmatics, in an
> unsuspecting way, into the approach being taken.    (036)

That was the problem that I found with treating a linguistic component such
as a name (or a word) as a character string.    (037)

> The decision to treat a name as a class of expressions    (038)

I was treating a name as an individual object, not as a class.    (039)

> is in fact a SEMANTIC decision    (040)

True.    (041)

> to relate a set of character strings to one another    (042)

I beg to disagree.  Relations are needed to relate character strings to
names (and words).    (043)

> ...    (044)

> In short (too late for  that, I'm afraid), while you can decide to adopt
> a view of names that makes them distinct from character strings, there
> are significant consequences to this.  Beware.  Here there be dragons.    (045)

I find the dragons in the conflation in the general case.  But for specific
purposes, it could be quite useful.  A generic ontology would allow one to
add a theory that imposes a conflation of words and character strings.
Just use
  wordStringFor (A, S) <- wordFor(A, W) & spelling(W, S).    (046)

-- doug foxvog    (047)

> ________________________________________________________________________
>
> Gary H. Merrill
> Ontolytics, LLC
>
> +1 919.271.7259
>    (048)


=============================================================
doug foxvog    doug@xxxxxxxxxx   http://ProgressiveAustin.org    (049)

"I speak as an American to the leaders of my own nation. The great
initiative in this war is ours. The initiative to stop it must be ours."
    - Dr. Martin Luther King Jr.
=============================================================    (050)


