Re: [ontolog-forum] Is there something I missed?

To:	"[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From:	Александр Шкотин <alex.shkotin@xxxxxxxxx>
Date:	Sat, 31 Jan 2009 17:12:53 +0300
Message-id:	<b24945a10901310612o65248984w40b50692bec70b50@xxxxxxxxxxxxxx>

Rich,

My first task is to convert the RDB into logically equivalent sentences.

And auxiliary names for articles, samples, places... work well.
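As a rough illustration of this step, here is a minimal sketch of turning one table row into simple sentences that use an auxiliary name. The sentence pattern follows the 'An authorial-number of S32994 is "A".' example quoted later in this thread; the function names, the `id` and `depth` columns, and the prefix-plus-key naming scheme are assumptions made for the example, not the actual generator.

```python
def auxiliary_name(prefix: str, key: int) -> str:
    """Build an auxiliary name such as S32994 from a table prefix and a key."""
    return f"{prefix}{key}"


def article(word: str) -> str:
    """Pick 'a' or 'an' with a simple first-letter rule (not phonetically exact)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def row_to_sentences(prefix: str, key_column: str, row: dict) -> list[str]:
    """Emit one simple sentence per non-key, non-null column of a row."""
    name = auxiliary_name(prefix, row[key_column])
    sentences = []
    for column, value in row.items():
        if column == key_column or value is None:
            continue
        noun = column.replace("_", "-")
        # Strings are quoted, other values are printed as they are.
        shown = f'"{value}"' if isinstance(value, str) else str(value)
        sentences.append(f"{article(noun).capitalize()} {noun} of {name} is {shown}.")
    return sentences


# Hypothetical row; only the authorial number comes from the thread.
sample_row = {"id": 32994, "authorial_number": "A", "depth": 120}
for sentence in row_to_sentences("S", "id", sample_row):
    print(sentence)
```

Every sentence keeps the same simple structure, which is what makes the later automatic translation step plausible.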

Then we automatically translate these sentences to OWL (using the Attempto project's APE program), hopefully to OWL-DL.

And we shall use some kind of reasoner for the query/answer user interface.

If an answer had to be structured as natural English, we would definitely face the problem of keeping auxiliary names out of the answer.

But today the usual structure of an answer is a table ;)

It is possible to rewrite sentences without variables, like this:

There is a sample. An authorial-number of the sample is "A". The sample is a rhyolite...

There is a sample. An authorial-number of the sample is "B". The sample is an andesite...

But we show our sentences to the customer mostly to get the terminology approved :)
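The variable-free rewriting shown above can be sketched as follows. This is only an illustration under assumptions: the function name, the split between a "kind" and other attributes, and the first-letter article rule are all hypothetical. The first sentence introduces the entity, and the following sentences refer back to it with a definite noun phrase instead of an auxiliary name.

```python
def article(word: str) -> str:
    """Pick 'a' or 'an' with a simple first-letter rule (not phonetically exact)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def record_to_paragraph(class_noun: str, kind: str, attributes: dict) -> str:
    """Describe one record without an auxiliary name.

    The entity is introduced once ("There is a sample.") and then
    referred to as "the sample" in every later sentence.
    """
    sentences = [f"There is {article(class_noun)} {class_noun}."]
    for noun, value in attributes.items():
        sentences.append(
            f'{article(noun).capitalize()} {noun} of the {class_noun} is "{value}".'
        )
    sentences.append(f"The {class_noun} is {article(kind)} {kind}.")
    return " ".join(sentences)


print(record_to_paragraph("sample", "rhyolite", {"authorial-number": "A"}))
print(record_to_paragraph("sample", "andesite", {"authorial-number": "B"}))
```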

Alex

PS Right now we have no plans to take part in a Turing test ;)

2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>

Alex,

Thanks for sharing your work in this area. But my concern is in generating more natural sentences. For example, one of the examples on that page is:

An authorial-number of S32994 is "A".

While this sentence was generated from actual data in the table, it isn't intuitively appealing to a reader. So you must be planning further work that will elaborate the meaning of S32994 and other identifiers that make perfect sense to SQL queries, but not to people.

That's the challenge. S32994 is probably an automatically generated identifier that ties together one or more concepts from the database. So changing that to some kind of English-like, humanly meaningful phrase is the major difficulty in that kind of translation.

Have you pursued that kind of translation? The original database for your work was probably produced by conforming to visual requirements for displays which operators used to enter multi-table data. But turning that into linguistically meaningful statements is a task that requires chasing down all the identifiers that make sense in database contexts, but not in human reference.

Any more information you can share about how you turn these identifiers into meaningful English would be very appreciated.

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Александр Шкотин
Sent: Saturday, January 31, 2009 3:38 AM

To: [ontolog-forum]
Subject: Re: [ontolog-forum] Is there something I missed?

Rich,

You are absolutely right about anaphoric references and the huge number of sentences, but the structure of each sentence is simple, and that is the point.

Have a look at my attempt: http://docs.google.com/Doc?id=dcj6m55s_607cfxccwc4

Well, ~10 tables were used to generate this :)

I hope my customer accepts these simple sentences as understandable or even native ;)

Well, maybe boring :)

But I can vouch that all the facts from the RDB are there, in this case for just one article.

Alex

2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>

Thanks Alex for your suggestions,

Yes, each row in most tables can be translated into sentences. The issue I'm trying to address is the way the classes pop out of large tables depending on the perceptions of the translator.

For example, I'm presently looking at a table with tens of millions of rows and 160 some-odd columns. Each row of that huge table translates into a very, very long sentence! It would be better to translate each row into a full paragraph, with anaphoric references to earlier concepts expressed in sentences within the paragraphs.

Also, the issue of which concepts in each paragraph can be most appropriately represented is complicated. Each row has references to key columns in other tables, and other tables reference the keys of this huge table. So the development of appropriate concepts and reasonably compact sentences can be challenging. Relationships cross tables in many designs, including this one. So locating where the relationships can be expressed most appropriately, and non redundantly, is also a challenge.
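One possible convention for expressing cross-table relationships without redundancy is to state each foreign key exactly once, on the referencing side. The sketch below assumes that convention; the table prefixes, column names, and verb phrases are hypothetical and would have to come from the domain expert in practice.

```python
def relationship_sentences(rows, prefix, key_column, foreign_keys):
    """Emit one sentence per non-null foreign-key value of each row.

    foreign_keys maps a column name to a pair
    (verb phrase, name prefix of the referenced table).
    Each relationship is stated only from the referencing row.
    """
    sentences = []
    for row in rows:
        subject = f"{prefix}{row[key_column]}"
        for column, (verb, target_prefix) in foreign_keys.items():
            if row.get(column) is not None:
                sentences.append(f"{subject} {verb} {target_prefix}{row[column]}.")
    return sentences


# Hypothetical rows of a sample table that references articles and places.
samples = [
    {"id": 32994, "article_id": 7, "place_id": 12},
    {"id": 32995, "article_id": 7, "place_id": None},
]
sample_foreign_keys = {
    "article_id": ("is described in", "A"),
    "place_id": ("is collected at", "P"),
}
for sentence in relationship_sentences(samples, "S", "id", sample_foreign_keys):
    print(sentence)
```

The referenced tables then never need to repeat the relationship, which keeps their own sentences limited to their own columns.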

Do you have any thoughts on this formulation of the problem?

-Rich

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Александр Шкотин
Sent: Saturday, January 31, 2009 2:33 AM
To: [ontolog-forum]

Subject: Re: [ontolog-forum] Is there something I missed?

Hi Rich,

We can partially extract classes and relationships from the database's user interface: you know, the labels (words) on forms, pages, and reports.

On the other hand, any database can be converted to a set of sentences (facts), and these sentences should be accepted by the user (domain expert) as native.
That should be quite enough to begin modeling ;)

Alex

2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>

Thanks Matthew,

You've written a large number of papers on modeling - my congratulations for
being so prolific. Your papers seem to begin with the concept of data
modeling and then go into the proper principles for modeling.

But I'm looking for ways to use existing database tables and to discover the
classes and relationships that, by chance, went into the original table
design. In my experience, commercial databases are developed in a haphazard
way for nearly all commercial applications without the luxury of careful
modeling. That's especially true of the applications developed as
requirements are being discovered and changes are requested by users that
may not meet the best practices of modelers. Database designers are usually
upset to see the way the databases turn out when created in this de facto
way instead of in a well designed, carefully orchestrated way.

So if you must begin with an existing database's table designs, how can a
reasonable class model be developed for that legacy database? Are there
automatic methods for generating the As-Is class models?
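As a starting point for an As-Is model, the declared schema can at least be read mechanically: one candidate class per table, plain columns as attributes, and foreign keys as relationships. The sketch below assumes an SQLite database purely as a stand-in (a real legacy system would need its own catalog queries), and the `article` and `sample` tables are hypothetical. It recovers only what the schema declares, not the hidden classes discussed in this thread.

```python
import sqlite3


def as_is_class_model(conn: sqlite3.Connection) -> dict:
    """Return {table: {"attributes": [...], "relationships": [(column, target)]}}."""
    model = {}
    tables = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = [
            (row[3], row[2])
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        fk_columns = {column for column, _ in fks}
        model[table] = {
            "attributes": [c for c in columns if c not in fk_columns],
            "relationships": fks,
        }
    return model


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE sample (
        id INTEGER PRIMARY KEY,
        authorial_number TEXT,
        article_id INTEGER REFERENCES article(id)
    );
    """
)
model = as_is_class_model(conn)
for table, description in model.items():
    print(table, description)
```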

Suggestions, URLs, replies appreciated.

Thanks,

-Rich

Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com

-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Matthew West
Sent: Saturday, January 31, 2009 1:34 AM
To: edbark@xxxxxxxx; '[ontolog-forum] '
Subject: Re: [ontolog-forum] Is there something I missed?

Dear Rich

>
> > I have always considered the definition of an RDB-class correspondence
> > to be partially in the mind of the modeler, not in the RDB structure
> > itself. I know of no formal definition of what a RDB-class consists of,
> > which would provide a rigorous foundation for translating RDBs to
> > classes with one exception; there is at least one class per table.
> > Modelers may often add definitions of subclasses within a table, but
> > that's usually some form of correspondence between groups of columns
> > from a table that fit within the human "chunk" size of 7+/-2 concepts.
> > So it seems every bit as subjective as any other method of constructing
> > classes from data. Data mining (and text mining) consists of
> > discovering those subclasses, along with classes that relate one table
> > to other(s).
>
> I would not dispute any of this.
>
> > Do you have reference(s) (URLs especially) to other documented points
> > of view which might more rigorously define how the RDB translates into
> > classes?

[MW] I did some work that effectively does this about 15 years ago, which is
captured in a document called "Developing High Quality Data Models". Part of
it is about finding the hidden classes, and part of it is about not hiding
them in the first place. You can find it here:
http://www.matthew-west.org.uk/Publications.html

or just type "High Quality Data Models" into Google.

Regards

Matthew West
Information Junction
Tel: +44 560 302 3685
Mobile: +44 750 3385279
matthew.west@xxxxxxxxxxxxxxxxxxxxxxxxx
http://www.matthew-west.org.uk/

This email originates from Information Junction Ltd. Registered in England
and Wales No. 6632177.
Registered office: 2 Brookside, Meadow Way, Letchworth Garden City,
Hertfordshire, SG6 3JE.

_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
