| Rich,

My first task is to convert the RDB into logically equivalent sentences, and auxiliary names for articles, samples, places and so on work well for that.

We then automatically translate these sentences to OWL (using the APE program from the Attempto project), hopefully to OWL-DL, and we shall use some kind of reasoner for the query/answer user interface. If the answers were structured as natural English, we would definitely face the problem of keeping auxiliary names out of them. But today the structure of an answer is usually a table ;)

It is possible to rewrite the sentences without variables, like this:

There is a sample. An authorial-number of the sample is "A". The sample is a rhyolite...
There is a sample. An authorial-number of the sample is "B". The sample is an andesite...

but we show our sentences to the customer mostly to get the terminology approved :)

Alex

PS Right now we have no plans to participate in the Turing test ;)
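The variable-free rewriting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names `authorial_number` and `rock_type`, the noun `sample`, and the helper names are hypothetical, and the output has not been checked against the APE parser.

```python
def article(word):
    """Pick 'a' or 'an' by the first letter (a rough heuristic, not a full rule)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def row_to_sentences(noun, row, class_column=None):
    """Render one table row as simple sentences without auxiliary identifiers.

    noun         -- the word used for the entity the row describes (assumed)
    row          -- mapping of column name to value for one row
    class_column -- optional column whose value names the class of the entity
    """
    # Introduce the entity once, then refer back to it with "the <noun>".
    sentences = [f"There is a {noun}."]
    for column, value in row.items():
        if column == class_column:
            sentences.append(f"The {noun} is {article(value)} {value}.")
        else:
            # Turn a column name such as authorial_number into authorial-number.
            prop = column.replace("_", "-")
            sentences.append(
                f'{article(prop).capitalize()} {prop} of the {noun} is "{value}".'
            )
    return sentences


# Hypothetical rows standing in for the sample table.
rows = [
    {"authorial_number": "A", "rock_type": "rhyolite"},
    {"authorial_number": "B", "rock_type": "andesite"},
]
for row in rows:
    print(" ".join(row_to_sentences("sample", row, class_column="rock_type")))
```

Each row gets its own "There is a sample." opener, so no generated key such as S32994 has to appear in the text.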
 2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx> 
Alex,

Thanks for sharing your work in this area.  But my concern is in generating more natural sentences.  For example, one of the examples on that page is:

An authorial-number of S32994 is "A".

While this sentence was generated from actual data in the table, it isn't intuitively appealing to a reader.  So you must be planning further work that will elaborate the meaning of S32994 and other identifiers that make perfect sense to SQL queries, but not to people.

That's the challenge.  S32994 is probably an automatically generated identifier that ties together one or more concepts from the database.  So changing that to some kind of English-like, humanly meaningful phrase is the major difficulty in that kind of translation.

Have you pursued that kind of translation?  The original database for your work was probably produced by conforming to visual requirements for displays which operators used to enter multi-table data.  But turning that into linguistically meaningful statements is a task that requires chasing down all the identifiers that make sense in database contexts, but not in human reference.

Any more information you can share about how you turn these identifiers into meaningful English would be very appreciated.
    Sincerely, Rich Cooper EnglishLogicKernel.com Rich AT EnglishLogicKernel DOT com     
  
You are absolutely right about anaphoric references and the huge number of sentences, but the structure of each sentence is simple, and that is the point.
Well, about 10 tables were used to generate this :)
I hope my customer accepts these simple sentences as understandable, or even as native ;)
But it is my responsibility to confirm that all of the RDB facts are there, in this case for just one article.
2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx> 
Thanks Alex for your suggestions,

Yes, each row in most tables can be translated into sentences.  The issue I'm trying to address is the way the classes pop out of large tables depending on the perceptions of the translator.

For example, I'm presently looking at a table with tens of millions of rows and 160 some-odd columns.  Each row of that huge table translates into a very, very long sentence!  It would be better to translate each row into a full paragraph, with anaphoric references to earlier concepts expressed in sentences within the paragraphs.

Also, the issue of which concepts in each paragraph can be most appropriately represented is complicated.  Each row has references to key columns in other tables, and other tables reference the keys of this huge table.  So the development of appropriate concepts and reasonably compact sentences can be challenging.  Relationships cross tables in many designs, including this one.  So locating where the relationships can be expressed most appropriately, and non redundantly, is also a challenge.

Do you have any thoughts on this formulation of the problem?
  -Rich   
    Sincerely, Rich Cooper EnglishLogicKernel.com Rich AT EnglishLogicKernel DOT com     
  
We may partially extract classes and relationships from the user interface to the database: the labels (words) on its forms, pages and reports.
On the other hand, any database may be converted to a set of sentences (facts), and these sentences should be accepted by the user (a domain expert) as native. That should be quite enough to begin modeling ;)
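As a rough starting point for the class extraction discussed in this thread, the sketch below reads the schema of an SQLite database and proposes one candidate class per table, with foreign keys as candidate relationships. It is a minimal illustration, not an established method: the `article` and `sample` tables are hypothetical, SQLite is assumed only for convenience, and hidden subclasses inside a table would still need a human modeler.

```python
import sqlite3


def demo_connection():
    """Build a small in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE sample (
            id INTEGER PRIMARY KEY,
            authorial_number TEXT,
            rock_type TEXT,
            article_id INTEGER REFERENCES article(id)
        );
        """
    )
    return conn


def as_is_model(conn):
    """Return {table: {"attributes": [...], "relationships": [(column, target)]}}."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    model = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = [
            (row[3], row[2])
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        fk_columns = {column for column, _ in fks}
        model[table] = {
            # Plain columns become candidate attributes of the class.
            "attributes": [c for c in columns if c not in fk_columns],
            # Foreign keys become candidate relationships to other classes.
            "relationships": fks,
        }
    return model


if __name__ == "__main__":
    for name, parts in as_is_model(demo_connection()).items():
        print(name, parts)
```

The result is only an As-Is skeleton; the labels from forms and reports mentioned above would be one way to rename its tables and columns into terms a domain expert accepts.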
 
2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
 Thanks Matthew,
 You've written a large number of papers on modeling - my congratulations for
 being so prolific.  Your papers seem to begin with the concept of data
 modeling and then go into the proper principles for modeling.
 
 But I'm looking for ways to use existing database tables and to discover the
 classes and relationships that, by chance, went into the original table
 design.  In my experience, commercial databases are developed in a haphazard
 way for nearly all commercial applications without the luxury of careful
 modeling.  That's especially true of the applications developed as
 requirements are being discovered and changes are requested by users that
 may not meet the best practices of modelers.  Database designers are usually
 upset to see the way the databases turn out when created in this de facto
 way instead of in a well designed, carefully orchestrated way.
 
 So if you must begin with an existing database's table designs, how can a
 reasonable class model be developed for that legacy database?  Are there
 automatic methods for generating the As-Is class models?
 
 Suggestions, URLs, replies appreciated.
 
 Thanks,
 
-Rich
 
 
 Sincerely,
 Rich Cooper
 EnglishLogicKernel.com
 Rich AT EnglishLogicKernel DOT com
 
 
 
Dear Rich
 >
 > > I have always considered the definition of an RDB-class correspondence
 > > to be partially in the mind of the modeler, not in the RDB structure
 > > itself.  I know of no formal definition of what a RDB-class consists
 > > of, which would provide a rigorous foundation for translating RDBs to
 > > classes with one exception; there is at least one class per table.
 > > Modelers may often add definitions of subclasses within a table, but
 > > that's usually some form of correspondence between groups of columns
 > > from a table that fit within the human "chunk" size of 7+/-2 concepts.
 > > So it seems every bit as subjective as any other method of
 > > constructing classes from data.  Data mining (and text mining)
 > > consists of discovering those subclasses, along with classes that
 > > relate one table to other(s).
 >
 > I would not dispute any of this.
 >
 > > Do you have reference(s) (URLs especially) to other documented points
 > > of view which might more rigorously define how the RDB translates into
 > > classes?
 
 [MW] I did some work that effectively does this about 15 years ago, which is
 captured in a document called "Developing High Quality Data Models". Part of
 it is about finding the hidden classes, and part of it is about not hiding
 them in the first place. You can find it here:
 http://www.matthew-west.org.uk/Publications.html
 
 or just type "High Quality Data Models" into Google.
 
 Regards
 
 Matthew West
 Information  Junction
 Tel: +44 560 302 3685
 Mobile: +44 750 3385279
 matthew.west@xxxxxxxxxxxxxxxxxxxxxxxxx
 http://www.matthew-west.org.uk/
 
 This email originates from Information Junction Ltd. Registered in England
 and Wales No. 6632177.
 Registered office: 2 Brookside, Meadow Way, Letchworth Garden City,
 Hertfordshire, SG6 3JE.
 
 
 
 
 
 _________________________________________________________________
 Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
 Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
 Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
 Shared Files: http://ontolog.cim3.net/file/
 Community Wiki: http://ontolog.cim3.net/wiki/
 To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
 To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
 
 
 
 |