Alex,
Thanks for sharing your work in this
area. But my concern is in generating more natural sentences. For example,
one of the examples on that page is:
An authorial-number of S32994 is "A".
While this sentence was generated from actual data in the table, it isn’t
intuitively appealing to a reader. So you must be planning further work that will
elaborate the meaning of S32994 and other identifiers that make perfect sense to
SQL queries, but not to people.
That’s the challenge. S32994 is probably an automatically generated
identifier that ties together one or more concepts from the database. So
changing that to some kind of English-like, humanly meaningful phrase is the
major difficulty in that kind of translation.
Have you pursued that kind of translation? The original database for
your work was probably produced by conforming to visual requirements for
displays which operators used to enter multi-table data. But turning that into
linguistically meaningful statements is a task that requires chasing down all
the identifiers that make sense in database contexts, but not in human reference.
Any more information you can share about how you turn these identifiers
into meaningful English would be much appreciated.
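To make the identifier problem concrete, here is a minimal sketch of one way such a key could be expanded into a readable phrase by following it back into the tables it ties together. Everything about the schema is an assumption made for illustration: the `article` and `journal` tables, their columns, the sample rows, and the guess that S32994 keys an article are all hypothetical, since the thread does not describe the real database.

```python
import sqlite3

def describe_identifier(conn, ident):
    """Return an English phrase for an opaque key, or the key itself if unknown."""
    row = conn.execute(
        "SELECT a.title, j.name FROM article a "
        "JOIN journal j ON j.id = a.journal_id WHERE a.id = ?",
        (ident,),
    ).fetchone()
    if row is None:
        return ident  # nothing to resolve against, so leave the key unchanged
    title, journal = row
    return f'the article "{title}" in {journal}'

def authorial_number_sentence(conn, ident, value):
    """Build the sentence with the identifier replaced by its description."""
    return f'The authorial-number of {describe_identifier(conn, ident)} is "{value}".'

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE journal (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE article (id TEXT PRIMARY KEY, title TEXT,
                          journal_id TEXT REFERENCES journal(id));
    INSERT INTO journal VALUES ('J7', 'Example Journal');
    INSERT INTO article VALUES ('S32994', 'An Example Title', 'J7');
""")
print(authorial_number_sentence(conn, "S32994", "A"))
```

The hard part is not the join but deciding which columns of which related tables make a phrase a reader would recognize; that choice is hard-coded here and would have to come from someone who knows the domain.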
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of ????????? ??????
Sent: Saturday, January 31, 2009 3:38 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] Is there something I missed?
You are absolutely right about anaphoric references and the huge number of
sentences, but the structure of each sentence is simple, and that is the point.
Well, about 10 tables were used to generate this :)
I hope my customer accepts these simple sentences as understandable or
even native ;)
But I can vouch that all the facts from the RDB are there, in this
case for just one article.
2009/1/31 Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
Thanks Alex for your suggestions,
Yes, each row in most tables can be translated into
sentences. The issue I'm trying to address is the way the classes pop out
of large tables depending on the perceptions of the translator.
For example, I'm presently looking at a table with tens of
millions of rows and 160 some-odd columns. Each row of that huge table
translates into a very, very long sentence! It would be better to
translate each row into a full paragraph, with anaphoric references to earlier
concepts expressed in sentences within the paragraphs.
Also, the issue of which concepts in each paragraph can be
most appropriately represented is complicated. Each row has references to
key columns in other tables, and other tables reference the keys of this huge
table. So the development of appropriate concepts and reasonably compact
sentences can be challenging. Relationships cross tables in many designs,
including this one. So locating where the relationships can be expressed
most appropriately, and non-redundantly, is also a challenge.
Do you have any thoughts on this formulation of the problem?
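As a rough illustration of the row-to-paragraph idea, the sketch below renders one row as several short sentences, naming the subject once and referring back to it with "It" afterwards. The column names, the sample row, and above all the grouping of columns into sentences are hypothetical; the grouping is supplied by hand because choosing it is exactly the subjective step described above.

```python
def row_to_paragraph(subject, row, groups):
    """Render one wide row as a paragraph, one sentence per column group.

    The first sentence names the subject; later ones refer back with "It".
    """
    sentences = []
    for columns in groups:
        facts = []
        for col in columns:
            value = row.get(col)
            if value is None:
                continue  # skip empty cells rather than state a non-fact
            label = col.replace("_", " ")
            facts.append(f"{label} {value}")
        if not facts:
            continue
        if len(facts) > 1:
            joined = ", ".join(facts[:-1]) + " and " + facts[-1]
        else:
            joined = facts[0]
        head = subject if not sentences else "It"
        sentences.append(f"{head} has {joined}.")
    return " ".join(sentences)

# Hypothetical row and column grouping, for illustration only.
row = {"customer_name": "Acme", "ship_city": "Oslo",
       "ship_country": "Norway", "total": 250, "currency": "EUR"}
groups = [["customer_name"], ["ship_city", "ship_country"], ["total", "currency"]]
print(row_to_paragraph("Order 1001", row, groups))
```

A real 160-column row would also need the foreign-key columns resolved into their own phrases before the sentences read naturally; this sketch leaves that out.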
-Rich
We may partially extract classes and relationships from the user interface
to the database: you know, the labels (words) on forms, pages, and reports.
On the other hand, any database may be converted to a set of sentences
(facts), and these sentences should be accepted by the user (domain expert)
as native.
That should be quite enough to begin modeling ;)
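A minimal sketch of that conversion, assuming the form labels have already been collected into a column-to-label mapping: each labeled cell of a row becomes one simple fact sentence in the same pattern as the example quoted at the top of this thread. The column names (`auth_no`, `lang`, `internal_flag`) and the sample values are hypothetical.

```python
def article_for(word):
    """Pick "A" or "An" from the first letter (a rough heuristic)."""
    return "An" if word[0].lower() in "aeiou" else "A"

def row_to_facts(key, row, labels):
    """Turn each labeled, non-empty cell of a row into one fact sentence."""
    facts = []
    for column, value in row.items():
        if value is None or column not in labels:
            continue  # columns without a user-facing label are skipped
        label = labels[column]
        facts.append(f'{article_for(label)} {label} of {key} is "{value}".')
    return facts

# Hypothetical UI labels and row, for illustration only.
labels = {"auth_no": "authorial-number", "lang": "language"}
row = {"auth_no": "A", "lang": "en", "internal_flag": 1}
for sentence in row_to_facts("S32994", row, labels):
    print(sentence)
```

Columns that never appear on a form, such as the hypothetical `internal_flag`, produce no sentence, which is one way the user interface helps separate domain concepts from bookkeeping.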
2009/1/31
Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
Thanks Matthew,
You've written a large number of papers on modeling - my congratulations for
being so prolific. Your papers seem to begin with the concept of data
modeling and then go into the proper principles for modeling.
But I'm looking for ways to use existing database tables and to discover the
classes and relationships that, by chance, went into the original table
design. In my experience, commercial databases are developed in a haphazard
way for nearly all commercial applications without the luxury of careful
modeling. That's especially true of the applications developed as
requirements are being discovered and changes are requested by users that
may not meet the best practices of modelers. Database designers are usually
upset to see the way the databases turn out when created in this de facto
way instead of in a well-designed, carefully orchestrated way.
So if you must begin with an existing database's table designs, how can a
reasonable class model be developed for that legacy database? Are there
automatic methods for generating the As-Is class models?
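One mechanical starting point, sketched here under the assumption that the legacy schema can be loaded into or inspected through SQLite, is to read the catalog and emit one candidate class per table, with its columns as attributes and its declared foreign keys as relationships. The `journal` and `article` tables are hypothetical; the `PRAGMA table_info` and `PRAGMA foreign_key_list` calls are standard SQLite.

```python
import sqlite3

def as_is_class_model(conn):
    """Return {table: {"attributes": [...], "references": [(column, table), ...]}}."""
    model = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        refs = [(r[3], r[2]) for r in
                conn.execute(f'PRAGMA foreign_key_list("{table}")')]
        model[table] = {"attributes": columns, "references": refs}
    return model

# Hypothetical legacy schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE journal (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE article (id TEXT PRIMARY KEY, title TEXT,
                          journal_id TEXT REFERENCES journal(id));
""")
for name, cls in as_is_class_model(conn).items():
    print(name, cls)
```

This only recovers what the schema declares. Relationships that were never declared as foreign keys, and subclasses hidden inside groups of columns, would still have to be found by inspection or by mining the data.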
Suggestions, URLs, replies appreciated.
Thanks,
-Rich
Dear Rich
>
> > I have always considered the definition of an RDB-class correspondence to be
> > partially in the mind of the modeler, not in the RDB structure itself. I
> > know of no formal definition of what a RDB-class consists of, which would
> > provide a rigorous foundation for translating RDBs to classes with one
> > exception; there is at least one class per table. Modelers may often add
> > definitions of subclasses within a table, but that's usually some form of
> > correspondence between groups of columns from a table that fit within the
> > human "chunk" size of 7+/-2 concepts. So it seems every bit as subjective
> > as any other method of constructing classes from data. Data mining (and
> > text mining) consists of discovering those subclasses, along with classes
> > that relate one table to other(s).
>
> I would not dispute any of this.
>
> > Do you have reference(s) (URLs especially) to other documented points of
> > view which might more rigorously define how the RDB translates into classes?
[MW] I did some work that effectively does this about 15 years ago, which is
captured in a document called "Developing High Quality Data Models". Part of
it is about finding the hidden classes, and part of it is about not hiding
them in the first place. You can find it here:
http://www.matthew-west.org.uk/Publications.html
or just type "High Quality Data Models" into Google.
Regards
Matthew West
Information Junction
Tel: +44 560 302 3685
Mobile: +44 750 3385279
matthew.west@xxxxxxxxxxxxxxxxxxxxxxxxx
http://www.matthew-west.org.uk/
This email originates from Information Junction Ltd. Registered in England
and Wales No. 6632177.
Registered office: 2 Brookside, Meadow Way,
Letchworth Garden City,
Hertfordshire, SG6 3JE.
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx