Re: [ontolog-forum] OntoNotes and the Omega ontology

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Sun, 26 Sep 2010 23:43:40 -0400
Message-id: <4CA012EC.2040103@xxxxxxxxxxx>
David,    (01)

There are redundancies in everything.  But please note that Cyc has
formal definitions for over 600,000 concepts.  When they started working
on Cyc in 1984, they had no idea that they would require so many.  If
they had, nobody would have funded that project.  If the number of
necessary concepts were as small as you claimed, Cyc would have solved
all the problems of AI and formal ontologies 20 years ago.    (02)

JFS>>  If you add up all the concepts in all the official IBM publications,
>>  internal reports, emails (just the business related ones, not the
>>  personal messages), research reports and articles, patents, etc.,
>>  the total number of distinct business-related concepts would be
>>  in the millions -- and any one of them could be important for
>>  some software application.    (03)

DE> But there were HUGE redundancies in that environment.  IBM was
> a monopoly, with a license to print money & there was
> negative motivation to be efficient with language.    (04)

I find that objection so irrelevant that I can't believe that you
read my previous note.  I'm repeating it below.    (05)

> Few concepts, many words.    (06)

On the contrary.   Many of the concepts defined in Cyc and other
large ontologies and terminologies are expressed by multi-word
phrases.    (07)

John    (08)

-------- Original Message --------
Subject: Re: [ontolog-forum] OntoNotes and the Omega ontology
Date: Sat, 25 Sep 2010 22:42:40 -0400
From: John F. Sowa <sowa@xxxxxxxxxxx>
To: ontolog-forum@xxxxxxxxxxxxxxxx    (09)

David and Cecil,    (010)

On 9/25/2010 4:29 PM, David Eddy wrote:
 > It is my thesis that the concepts needed for an organization to
 > function is in the 1500 to 6000 range.    (011)

On 9/25/2010 5:14 PM, Cecil O Lynch wrote:
 > I have a colleague who has looked at more than 500,000 medical records
 > from a series of hospitals and has distilled that vocabulary to less than
 > 5000 concepts as well in that usage scenario, though SNOMED has over
 > 300,000 unique concepts (so it seems that general usage limits this to
 > some extent).    (012)

I won't deny that you (or your colleagues) derived those numbers.
Nor will I deny that you can develop many useful application suites
that have numbers of special-purpose concepts in that range.    (013)

But I don't believe that you have counted all the concepts in all
the supporting hardware and software that the programmers routinely
use.  Nor do I believe that you counted all the subject-matter
concepts that the domain experts, knowledge engineers, and
systems analysts used in work that led to that software.    (014)

Furthermore, if you look at any medium-sized business (say over a
few hundred employees), you will have people and departments doing
a very large range of activities:  product development, manufacturing,
sales, finance, accounting, personnel, compensation, legal, logistics,
shipping, services, building and grounds, ...    (015)

In medicine, different physicians may use different subsets of the
SNOMED vocabulary, but in a large hospital, there is probably somebody
who sometimes will need to use nearly every one of them.  You never
know when some patient had visited some remote part of the globe
and acquired some disease that nobody on the staff was familiar with.    (016)

But even the SNOMED vocabulary is insufficient to cover all the
aspects of what happens in a hospital -- for example, the lighting
and air conditioning requirements in an operating room, the legal
policies, the health insurance, the scheduling of facilities, rooms,
equipment, services, etc.    (017)

If you get to a large corporation, the numbers grow even faster.
Somebody has to worry about international legal, currency, taxation,
import/export, subsidiaries, competitors, suppliers, clients, ...    (018)

Not too long ago, people talked about the amount of documentation
for the Boeing 747.  If all of it were printed out in one batch,
it would be too voluminous to fit in a 747.  And even if it could
fit, its weight would prevent the 747 from taking off.    (019)

The amount of detail that has been documented has grown much, much
larger for the 757, 767, 777, and 787.  The numbers quoted above
would be woefully inadequate.    (020)

 > Most organizations focus on a relatively narrow niche of "reality"...
 > HP & IBM do hardware, software & consulting... but stocks & bonds are
 > not a core part of their business.    (021)

I spent 30 years working at IBM, and the range of topics that various
groups addressed was enormous.  For any business that you can think of,
somebody in IBM was an expert in selling hardware and software to that
business, and other people in IBM were working on projects to support
them with some hardware, software, or consulting related to that
business.    (022)

Remember Harry Markowitz, who won the Nobel Prize in economics for
portfolio theory?  That was his PhD dissertation at the U. of Chicago
in 1955.  He went to work at RAND and later at IBM Research, where he
was working on simulation languages.  He left IBM to teach at a
university a few years before he was awarded the Nobel Prize.    (023)

Remember the chess computer (Deep Blue) which beat Kasparov?  The people
who designed the special-purpose hardware and software for it were IBM
employees at Yorktown Research.    (024)

During the 1980s, IBM was the second largest publisher in the world,
in terms of both number of titles and number of pages printed.  (The
largest, of course, was and still is the US government.)  Today, IBM
still produces huge numbers of documents, but they're all distributed
electronically.    (025)

If you add up all the concepts in all the official IBM publications,
internal reports, emails (just the business related ones, not the
personal messages), research reports and articles, patents, etc.,
the total number of distinct business-related concepts would be
in the millions -- and any one of them could be important for
some software application.    (026)

John    (027)

