
Re: [ontolog-forum] Is there something I missed?

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Rich Cooper" <rich@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 31 Jan 2009 12:19:58 -0800
Message-id: <20090131202004.61E6C138CEE@xxxxxxxxxxxxxxxxx>
Hi John,    (01)

My old comments are marked RC, and new comments are marked with [RC] flags
below.  I have added [JS] tags as needed to support readability.     (02)

-Rich    (03)

Rich Cooper
Rich AT EnglishLogicKernel DOT com    (04)

============ ================== ==================== =============     (05)

RC> Actually, SQL enforces foreign key constraints that let the
 > modeler...    (06)

Note your verb 'let'.  I agree that SQL *lets* the modeler write
constraints that can enforce anything that can be stated in first
order logic.      (07)

[RC] SQL is just a programming language, so everything required to write a
query 'lets' the SQL programmer do just that.  Nothing in SQL is
automatically generated - the programmer writes everything in queries,
views, procedures, tables, constraints etc.  By the same reasoning, FOL is a
language not bound to reality.  Therefore FOL also 'lets' some user
(programmer?) write statements in FOL - nothing is done for the programmer
by the FOL (or by the SQL) language implementation.      (08)
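
To make the point about constraints concrete, here is a minimal sketch using
Python's built-in sqlite3 module.  The customer and purchase tables and the
try_insert helper are hypothetical names invented for illustration; nothing
here comes from a real schema.  The programmer has to declare the foreign key
(and, in SQLite, switch enforcement on) before the engine rejects anything.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless the programmer enables it.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: every purchase must point at an existing customer.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE purchase ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")


def try_insert(purchase_id, customer_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO purchase (id, customer_id) VALUES (?, ?)",
            (purchase_id, customer_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling try_insert(11, 1) succeeds because customer 1 exists, while
try_insert(12, 99) is rejected because no customer 99 was ever written.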

[JS] But as you said yourself,    (09)

RC> In my experience, commercial databases are developed in a
 > haphazard way for nearly all commercial applications without
 > the luxury of careful modeling.    (010)

Implementing type hierarchies in an SQL database is possible, but
not the path of least resistance.  Furthermore, any N modelers who
adopt types are likely to find N incompatible ways to implement them.    (011)

[RC] I don't see type hierarchies as any harder to write in SQL than in
FOL.  Both require the same effort from the programmer.  An IsA table with a
row [This, That], for example, is equivalent to the FOL statement IsA(This,
That).  Why would some other form of FOL be any better than SQL for
developing type hierarchies?  An explanation would be appreciated.      (012)
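
As a rough sketch of the IsA table idea, again in Python with sqlite3: each
row [This, That] plays the role of the fact IsA(This, That), and a recursive
query stands in for the transitivity rule IsA(x, y) & IsA(y, z) -> IsA(x, z).
The column names subtype and supertype, the sample types, and the supertypes
helper are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per asserted fact IsA(subtype, supertype).
conn.execute(
    "CREATE TABLE IsA ("
    " subtype TEXT NOT NULL,"
    " supertype TEXT NOT NULL,"
    " PRIMARY KEY (subtype, supertype))"
)
conn.executemany(
    "INSERT INTO IsA (subtype, supertype) VALUES (?, ?)",
    [("Dog", "Mammal"), ("Cat", "Mammal"), ("Mammal", "Animal")],
)


def supertypes(name):
    """Return every direct or inherited supertype of `name`, sorted by name."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestors(name) AS (
            SELECT supertype FROM IsA WHERE subtype = ?
            UNION
            SELECT IsA.supertype
            FROM IsA JOIN ancestors ON IsA.subtype = ancestors.name
        )
        SELECT name FROM ancestors ORDER BY name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Here supertypes("Dog") returns both Mammal and Animal, even though only
IsA(Dog, Mammal) and IsA(Mammal, Animal) were stored.  The recursive query
still has to be written by the programmer, just as the transitivity axiom has
to be written in FOL.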

RC> But sometimes, denormalized tables are preferable for performance
 > reasons - not for conceptual modeling reasons.    (013)

[JS] I agree.  But the underlying table structure is a performance issue
that should be handled by an automatic (or at least semi-automatic)
optimizer.    (014)

SQL became successful because the queries were optimized by the
compiler.  But optimization techniques have progressed quite far
in the past 40 years, and much more can be done to help knowledge
engineers focus on the knowledge, not the implementation details.    (015)

[RC] Optimization is only useful if the required throughput and response
times are known in advance and can be predicted from the queries and the
tables (or indexes) they run against.  But most database applications have
operational constraints that the optimizer is not privy to.  For example,
operator A may have to meet tight schedule constraints because she is on the
phone with a customer as she enters data, while operator B may not even be
human - think of a large report being printed by operator B.  So optimization
is only good for restructurable queries without real time constraints.      (016)

[RC] Database replication (which is not performed by the optimizer) is often
necessary to meet high throughput and response time constraints that the
optimizer is unaware of.  Google, for example, has some thousands of
database copies to support large numbers of queries.  Since the database(s)
are updated only occasionally, and searchers expect quick results, database
replication is a commonly applied method that is completely beyond the
abilities of the optimizers.  This kind of architectural design has nothing
to do with the optimization of individual queries.  So the optimizer is a
great first stab at using a single database to serve simple queries from a
few users, but optimization is doomed to failure when the temporal
constraints are at all complicated and the variety of tasks is high.      (017)

ASh> We may partially extract classes and relationships from user
 > interface with database. You know these labels (words) on forms,
 > pages, reports.  From other side any database may be converted
 > to set of sentences (facts). And these sentences should be accepted
 > by user (domain expert) as native.  It should be quite enough to
 > begin modeling    (018)

[JS] I agree, but good tools are necessary to support that approach.    (019)
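
A very small sketch of what "database to sentences" could look like, in
Python.  The sentence template, the row_to_sentences helper, and the sample
Employee row are all hypothetical, chosen only to illustrate the idea.  Note
that the code can only echo the column labels it is given; it adds no meaning
of its own.

```python
def row_to_sentences(table, key_column, row):
    """Turn one row (a dict of column -> value) into simple fact sentences.

    Each non-key, non-null column yields one sentence of the form
    "<table> <key> has <column label> <value>."
    """
    key = row[key_column]
    sentences = []
    for column, value in row.items():
        if column == key_column or value is None:
            continue  # skip the key itself and columns with no value
        label = column.replace("_", " ")
        sentences.append(f"{table} {key} has {label} {value}.")
    return sentences


# Hypothetical row, as it might come back from a query.
example = {"id": 7, "last_name": "Smith", "department": None, "hire_year": 2004}
facts = row_to_sentences("Employee", "id", example)
```

For the sample row this yields "Employee 7 has last name Smith." and
"Employee 7 has hire year 2004."  Whether a domain expert accepts such
sentences as native depends entirely on how readable the column labels are.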

[RC] Software tools are seldom actually practical in 'interpreting' ad hoc
columns such as the "S32994" example Alex posted.  Tools derive their value
mainly from the syntactic structure of a database, but the meaning of tables,
columns and rows escapes the tools.  People have to interpret them.  That is
why the available data mining and text mining tools depend on human users to
provide useful interpretations.      (020)

RC> I'm presently looking at a table with tens of millions of rows
 > and 160 some-odd columns.  Each row of that huge table translates
 > into a very, very long sentence!  It would be better to translate
 > each row into a full paragraph, with anaphoric references to
 > earlier concepts expressed in sentences within the paragraphs.    (021)

I wouldn't attempt to generate English from the database.  Instead,
I'd extract English definitions from the documents that describe
the database.  Those would include the technical manuals and reports
as well as the documentation designed for the end users and data
entry clerks.    (022)

[RC] Generating English from the database seems to be about as hard as
translating English statements into database updates.  Technical manuals are
usually way behind the latest changes, and provide relatively unreliable
information for most projects.  I'm sure there are academic or research
projects that can show something more human LOOKING, but commercial
database projects almost never have the funding and long schedules needed to
make automatically generated English pretty.      (023)

-Rich    (024)

