Rich,
my first task is to convert the RDB into logically equivalent sentences.
Auxiliary names for articles, samples, places... work well for that.
Then we automatically translate these sentences to OWL (using the Attempto project's APE program), hopefully to OWL-DL.
And we shall use some kind of reasoner for the query/answer user interface.
If the answer were structured as natural English, we would definitely face the problem of keeping auxiliary names out of the answer.
But today the structure of an answer is usually a table;)
It is possible to rewrite sentences without variables, like this:
There is a sample. An authorial-number of the sample is "A". The sample is a rhyolite...
There is a sample. An authorial-number of the sample is "B". The sample is an andesite...
but we show our sentences to the customer mostly to get the terminology approved:)
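
A minimal sketch of how the variable-free sentences above could be generated from rows, assuming each sample row is available as a dictionary. The column names `authorial_number` and `rock_type` and the helper names are hypothetical, chosen only for illustration; the output is not checked against the Attempto grammar.

```python
def indefinite_article(noun):
    """Pick 'a' or 'an' with a simple first-letter heuristic."""
    return "an" if noun.lower().startswith(("a", "e", "i", "o", "u")) else "a"


def sample_to_sentences(row):
    """Render one sample row as sentences that use no auxiliary name."""
    rock = row["rock_type"]  # hypothetical column name
    return [
        "There is a sample.",
        f'An authorial-number of the sample is "{row["authorial_number"]}".',
        f"The sample is {indefinite_article(rock)} {rock}.",
    ]


# Two illustrative rows matching the examples in the message above.
rows = [
    {"authorial_number": "A", "rock_type": "rhyolite"},
    {"authorial_number": "B", "rock_type": "andesite"},
]
for row in rows:
    print(" ".join(sample_to_sentences(row)))
```

Each row starts with its own "There is a sample." so that "the sample" in the following sentences refers back to that row only.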
Alex
PS Right now we have no plans to take part in a Turing test;)
2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
Alex,
Thanks for sharing your work in this area. But my concern is in generating more natural sentences. For example, one of the examples on that page is:
An authorial-number of S32994 is "A".
While this sentence was generated from actual data in the table, it isn't intuitively appealing to a reader. So you must be planning further work that will elaborate the meaning of S32994 and other identifiers that make perfect sense to SQL queries, but not to people.
That's the challenge. S32994 is probably an automatically generated identifier that ties together one or more concepts from the database. So changing that to some kind of English-like, humanly meaningful phrase is the major difficulty in that kind of translation.
Have you pursued that kind of translation? The original database for your work was probably produced by conforming to visual requirements for displays which operators used to enter multi-table data. But turning that into linguistically meaningful statements is a task that requires chasing down all the identifiers that make sense in database contexts, but not in human reference.
Any more information you can share about how you turn these identifiers into meaningful English would be very appreciated.
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
You are absolutely right about anaphoric references and the huge number of sentences, but the structure of each sentence is simple; that is the point.
Well, ~10 tables were used to generate this:)
I hope my customer accepts these simple sentences as understandable or even native;)
But I am responsible for saying that all the factual content of the RDB is there. In this case, for just one article.
2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
Thanks Alex for your suggestions,
Yes, each row in most tables can be translated into sentences. The issue I'm trying to address is the way the classes pop out of large tables depending on the perceptions of the translator.
For example, I'm presently looking at a table with tens of millions of rows and 160 some-odd columns. Each row of that huge table translates into a very, very long sentence! It would be better to translate each row into a full paragraph, with anaphoric references to earlier concepts expressed in sentences within the paragraphs.
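
One rough way to picture the row-to-paragraph idea: introduce the row's entity once, then emit one short sentence per group of related columns, each referring back to the entity with a definite noun phrase. This is only a sketch; the class name, the column names and the grouping of columns are hypothetical and would have to come from the modeler.

```python
def article(noun):
    """Pick 'a' or 'an' with a simple first-letter heuristic."""
    return "an" if noun.lower().startswith(("a", "e", "i", "o", "u")) else "a"


def row_to_paragraph(class_name, row, column_groups):
    """Render one wide row as a paragraph of short sentences.

    The first sentence introduces the entity; every later sentence refers
    back to it as "the <class_name>". NULL (None) values are skipped.
    """
    sentences = [f"There is {article(class_name)} {class_name}."]
    for group in column_groups:
        clauses = []
        for column in group:
            value = row.get(column)
            if value is None:
                continue
            label = column.replace("_", "-")
            clauses.append(f'{article(label)} {label} of the {class_name} is "{value}"')
        if clauses:
            sentence = " and ".join(clauses) + "."
            sentences.append(sentence[0].upper() + sentence[1:])
    return " ".join(sentences)


# Hypothetical row and grouping, for illustration only.
row = {"authorial_number": "A", "rock_type": "rhyolite", "place": None}
groups = [["authorial_number", "rock_type"], ["place"]]
print(row_to_paragraph("sample", row, groups))
```

Choosing the column groups is the subjective part: the sketch takes them as input and does nothing to discover them.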
Also, the issue of which concepts in each paragraph can be most appropriately represented is complicated. Each row has references to key columns in other tables, and other tables reference the keys of this huge table. So the development of appropriate concepts and reasonably compact sentences can be challenging. Relationships cross tables in many designs, including this one. So locating where the relationships can be expressed most appropriately, and non redundantly, is also a challenge.
Do you have any thoughts on this formulation of the problem?
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
We may partially extract classes and relationships from the user interface to the database. You know, those labels (words) on forms, pages, reports.
On the other hand, any database may be converted to a set of sentences (facts). And these sentences should be accepted by the user (domain expert) as native.
That should be quite enough to begin modeling;)
2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
Thanks Matthew,
You've written a large number of papers on modeling - my congratulations for being so prolific. Your papers seem to begin with the concept of data modeling and then go into the proper principles for modeling.
But I'm looking for ways to use existing database tables and to discover the classes and relationships that, by chance, went into the original table design. In my experience, commercial databases are developed in a haphazard way for nearly all commercial applications without the luxury of careful modeling. That's especially true of the applications developed as requirements are being discovered and changes are requested by users that may not meet the best practices of modelers. Database designers are usually upset to see the way the databases turn out when created in this de facto way instead of in a well designed, carefully orchestrated way.
So if you must begin with an existing database's table designs, how can a reasonable class model be developed for that legacy database? Are there automatic methods for generating the As-Is class models?
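
As a naive starting point for such an As-Is model, the schema itself can be read mechanically: one class per table, non-key columns as attributes, and one relationship per declared foreign key. The sketch below assumes an SQLite database and uses a small hypothetical schema; it does not find hidden subclasses, and relationships that were never declared as foreign keys are missed.

```python
import sqlite3


def as_is_class_model(conn):
    """Derive a naive model: one class per table, one relationship per foreign key."""
    model = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        relationships = [(fk[3], fk[2]) for fk in fks]  # (local column, target table)
        fk_columns = {column for column, _ in relationships}
        model[table] = {
            "attributes": [c for c in columns if c not in fk_columns],
            "relationships": relationships,
        }
    return model


# Hypothetical two-table schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE sample (
    id INTEGER PRIMARY KEY,
    authorial_number TEXT,
    rock_type TEXT,
    article_id INTEGER REFERENCES article(id)
);
""")
model = as_is_class_model(conn)
for name, spec in model.items():
    print(name, spec)
```

The result is only the "at least one class per table" baseline; deciding which column groups deserve their own classes remains a modeling judgment.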
Suggestions, URLs, replies appreciated.
Thanks,
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
Dear Rich
> > I have always considered the definition of an RDB-class correspondence to be partially in the mind of the modeler, not in the RDB structure itself. I know of no formal definition of what a RDB-class consists of, which would provide a rigorous foundation for translating RDBs to classes with one exception; there is at least one class per table. Modelers may often add definitions of subclasses within a table, but that's usually some form of correspondence between groups of columns from a table that fit within the human "chunk" size of 7+/-2 concepts. So it seems every bit as subjective as any other method of constructing classes from data. Data mining (and text mining) consists of discovering those subclasses, along with classes that relate one table to other(s).
>
> I would not dispute any of this.
>
> > Do you have reference(s) (URLs especially) to other documented points of view which might more rigorously define how the RDB translates into classes?
[MW] I did some work that effectively does this about 15 years ago, which is captured in a document called "Developing high Quality Data Models". Part of it is about finding the hidden classes, and part of it is about not hiding them in the first place. You can find it here: http://www.matthew-west.org.uk/Publications.html
or just type "High Quality Data Models" into Google.
Regards
Matthew West
Information Junction
Tel: +44 560 302 3685
Mobile: +44 750 3385279
matthew.west@xxxxxxxxxxxxxxxxxxxxxxxxx
http://www.matthew-west.org.uk/
This email originates from Information Junction Ltd. Registered in England and Wales No. 6632177.
Registered office: 2 Brookside, Meadow Way, Letchworth Garden City, Hertfordshire, SG6 3JE.
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx