Re: [ontolog-forum] the data mining craze

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Mon, 28 Feb 2011 22:42:11 -0500
Message-id: <4D6C6B13.2030702@xxxxxxxxxxx>
Steve, Krzysztof, and Azamat,    (01)

> Azamat, I've seen enough upper-ontology-related discussions
> on this list to know that you'll never get agreement on
> a "full reclassification".  And neither should you -
> the tight structures you've no doubt got in mind just won't
> be general enough for universal use.    (02)

I have some agreements, disagreements, and qualifications
about that point, but I'd like to relate it to the following:    (03)

> ... more than half of DBPedia's triples were dumped because
> of their quality.    (04)

The DBpedia triples are of very low quality, partly because what
they call an "ontology" has nothing resembling an ontology other
than the use of OWL.  It's the farthest you could possibly get
from a "tight structure".  Following are the classes:    (05)

http://mappings.dbpedia.org/index.php?title=Special%3AAllPages&from=&to=&namespace=200    (06)

Click on any of the words in blue to see how they're specified.
For example, click 'Event':    (07)

    What you discover is that 'Event' is a subclass of owl:Thing,
    which says absolutely nothing.  The only other information
    is that Event is disjoint with Person.  That's true, but it tells
    you next to nothing.    (08)

Just before 'Event' is 'EuroVisionSongContestEntry':    (09)

    EuroVisionSongContestEntry < Song < MusicalWork < Work < owl:Thing
    In that whole chain up the hierarchy, Work is the only class that
    has further information: it's disjoint from wgs84_pos:SpatialThing.    (010)

Try the class 'Continent':    (011)

    Continent < PopulatedPlace < Place < owl:Thing.  The only additional
    information about Place is an rdfs:comment -- "a location".  It
    doesn't even say that Place is related to wgs84_pos:SpatialThing.    (012)
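
To make the point concrete, here is a minimal sketch that walks a class
chain and collects whatever axioms it finds along the way.  The two
tables are a hypothetical hand-typed copy of the facts quoted above,
not DBpedia's API or a dump of its data, and the function names are
made up for illustration.  rdfs:comment is left out because it is only
an annotation.

```python
# Hypothetical hand-typed copy of the subclass links quoted in this message.
# Nothing here is fetched from DBpedia.
SUBCLASS_OF = {
    "Event": "owl:Thing",
    "Work": "owl:Thing",
    "Continent": "PopulatedPlace",
    "PopulatedPlace": "Place",
    "Place": "owl:Thing",
}

# The only axioms mentioned above besides rdfs:subClassOf.
DISJOINT_WITH = {
    "Event": ["Person"],
    "Work": ["wgs84_pos:SpatialThing"],
}


def chain(cls):
    """Return the classes from cls up to the root of the hierarchy."""
    path = [cls]
    while path[-1] in SUBCLASS_OF:
        path.append(SUBCLASS_OF[path[-1]])
    return path


def inherited_axioms(cls):
    """Collect every disjointness axiom stated anywhere on the chain."""
    return [
        (c, "disjointWith", d)
        for c in chain(cls)
        for d in DISJOINT_WITH.get(c, [])
    ]


for name in ("Event", "Work", "Continent"):
    print(" < ".join(chain(name)), "->", inherited_axioms(name))
```

For Continent the list of axioms comes back empty: a reasoner that is
given only these definitions learns nothing about a continent beyond
its name and its position in the tree.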

From browsing that so-called "ontology", I can see why "more than half
of DBPedia's triples were dumped because of their quality."    (013)

> There is also an interesting short paper by Jain, Hitzler, and others
> called 'Linked Data is Merely More Data' (available at
> knoesis.wright.edu/library/publications/linkedai2010_submission_13.pdf )
> It highlights the need for ontologies to make the data more useful.    (014)

I would agree.  WordNet is a huge step up from the DBpedia so-called
ontology, and it is very good for what it does.  But a lot more
information at varying levels of detail is necessary.    (015)

From what I've seen in the documentation for Watson, they use a number
of different resources, including WordNet.  The UIMA documentation
mentions some ontologies, which the Watson developers probably elaborated.    (016)

But as we've seen from the performance on Jeopardy, Watson still makes
some elementary category errors.    (017)

There is a lot more that could be said, and I said a lot of it
in the following slides:    (018)

    Integrating Semantic Systems    (019)

John    (020)
