ontolog-forum

Re: [ontolog-forum] AJAX vs. the Giant Global Graph

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Ed Barkmeyer <edbark@xxxxxxxx>
Date: Tue, 30 Mar 2010 13:45:09 -0400
Message-id: <4BB238A5.8000603@xxxxxxxx>
I don't really disagree with John, but we definitely have different 
views of the topic.

John F. Sowa wrote:
> In the subject line, I put "vs." between AJAX and the GGG, but
> they should be considered complementary methods whose greatest
> strength comes from a dynamic combination.  What we really need
> is *not* a webified view of all data, but an AJAX-ified way of
> reorganizing the Semantic Web and combining it with other kinds
> of information.

I am going to put on my Haim Kilov hat and say I can't agree with this 
until you define your terms.
What do you mean by "webified"?  Marked up in (X)HTML?  Marked up with 
OWL or RDF?  I assume you don't mean "web-accessible", because AJAX 
depends on web-accessibility, just not on HTML.

The whole idea of AJAX is that you preprocess the data using an adapter 
to produce a marked-up form that is used by the "combining algorithm".  
This is not a new idea. Approximately 80% of the software written 
specifically for individual businesses is this same idea on a smaller 
scale -- organizing specific combinations of business information in a 
specific view for a specific business function. And it seems to me that 
Google Maps is just another of these.
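
A minimal sketch of the adapter-plus-combiner pattern described above, and not anyone's actual implementation: two hypothetical adapters normalize a CSV source and a JSON source into a shared record form, and a simple combining function merges the records by name. The sample data, field names, and function names are all assumptions invented for illustration.

```python
import csv
import io
import json

# Hypothetical raw sources held in two different repository formats.
CSV_SOURCE = "name,lat,lon\nCafe Uno,38.99,-77.03\nBook Nook,39.14,-77.21\n"
JSON_SOURCE = (
    '[{"title": "Cafe Uno", "rating": 4.5},'
    ' {"title": "Book Nook", "rating": 3.9}]'
)


def csv_adapter(text):
    """Adapter: turn CSV rows into records keyed by a common 'name' field."""
    return [
        {"name": row["name"], "lat": float(row["lat"]), "lon": float(row["lon"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def json_adapter(text):
    """Adapter: rename the source's 'title' field to the common 'name' field."""
    return [
        {"name": item["title"], "rating": item["rating"]}
        for item in json.loads(text)
    ]


def combine(*record_sets):
    """Combining algorithm: merge records from all sources that share a name."""
    merged = {}
    for records in record_sets:
        for record in records:
            merged.setdefault(record["name"], {}).update(record)
    return merged


# One combined view built for one specific purpose.
view = combine(csv_adapter(CSV_SOURCE), json_adapter(JSON_SOURCE))
```

The adapters here need to know only the form of each source, not its meaning. The one piece of referential knowledge in the sketch is that "title" in one source and "name" in the other denote the same thing.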

The real measure of the AJAX approach is the degree to which it is 
possible to produce adequate web page displays with basic support for 
the form of the content in its original repository and minimal 
understanding of the content itself.  The more understanding needed for 
the content, the more referential knowledge needed in or available to 
the adapter or the combining algorithm.  Google's power is in the 
proprietary combining algorithm and its derived store of referential 
knowledge, where that store has been derived by a preponderance of 
association and little injected knowledge.  The Semantic Web approach is 
to capture the referential knowledge formally and derive it in a trusted 
way.  And the Wikipedia approach is somewhere in between.

I fully agree with the idea that we need to "combine [RDF-annotated 
information] with other kinds of information", but there are two ways to 
do that -- derive the semantic markup for the other kinds, or link them 
by statistical association.
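
To make the second option concrete, here is a rough sketch of linking by statistical association: terms that co-occur in the same documents are scored by Jaccard overlap, with no knowledge of what the terms mean. The document sets and the choice of Jaccard as the measure are assumptions for illustration; real systems use far larger corpora and more refined statistics.

```python
from collections import Counter
from itertools import combinations

# Hypothetical documents, each reduced to the set of terms it mentions.
DOCS = [
    {"paris", "france", "louvre"},
    {"paris", "france", "seine"},
    {"berlin", "germany"},
    {"paris", "louvre"},
]


def association_scores(docs):
    """Score each co-occurring term pair by Jaccard overlap of their documents."""
    term_counts = Counter()
    pair_counts = Counter()
    for terms in docs:
        term_counts.update(terms)
        pair_counts.update(frozenset(p) for p in combinations(sorted(terms), 2))
    scores = {}
    for pair, both in pair_counts.items():
        a, b = sorted(pair)
        # Documents mentioning either term = count(a) + count(b) - count(both).
        either = term_counts[a] + term_counts[b] - both
        scores[(a, b)] = both / either
    return scores


scores = association_scores(DOCS)
```

Pairs that never co-occur, such as "berlin" and "paris", simply get no link. The association is derived from preponderance alone, which is the contrast with the formally captured knowledge of the first option.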

> Many people have noted that there is vastly more data in the
> world than anything represented on web pages.  Much of that
> data is stored in files and databases that are used by AJAX
> methods for generating web pages dynamically.  But they are
> not represented as web pages until they are explicitly created
> in response to somebody's request.

The term "web pages" in the first sentence means (X)HTML objects 
accessed via HTTP.  All the data that is used by AJAX methods is 
provided by specific HTTP-accessible services on the servers.  The data 
is web-accessible; the only question is how much of the adapter is 
resident on the source server, how much on the search server, and how 
much on the client.  Ultimately, what is generated dynamically is the 
interactive display -- what the technology on the client side does with 
the form sent by the search server.  The "web page" thinking here is 
that the display form is dictated by the HTML sent by the server, but in 
some cases, much of the display is controlled by JavaScript downloaded 
to the client via HTML script elements.  In other cases, such as PDF, 
the raw source is sent to the client and the display intelligence is in 
a browser plug-in on the client side.  What we must realize is that different 
software houses are in the business of making money on different sides 
of the client/server interface.  They use different architectures to 
accomplish that -- smart server (dumb browser client), smart client 
(various servers), paired client/server with some distribution of 
functions.  The AJAX approach John describes is a smart search server 
approach.  Many financial apps use smart client agents that deal with 
multiple server information sources (and formats).  I suppose one may 
see them as personal AJAX agents -- internally they have the same 
architecture, but they are based on a reference ontology and business 
rules partly provided by the designer and partly by the user.

The idea of the Semantic Web technologies is that they are supporting 
technologies for any of several such architectures.  They require some 
agent to mark up the original information sets (like the AJAX adapters), 
and some agent to provide the reference ontologies for the markup and 
interpretation, and some agent to perform the interpretation of multiple 
sources by reasoning.  The reason why this approach has been much slower 
to succeed is that it still takes human experts to do or guide the 
semantic markup, whereas statistical association can be done by simply 
having enough computer power.  The breakthrough that is needed for the 
Web is to do text interpretation automatically and generate the RDF 
markup with results that are not significantly worse than human-directed 
markup.
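
The three agent roles above can be sketched in a few lines, under heavy simplifying assumptions: the markup agents' output is given as hand-written triples, the reference ontology is a tiny subclass hierarchy, and the reasoning agent applies only the RDFS subclass rule until nothing new is derived. The "ex:" names and the function name are hypothetical; a real system would use an RDF store and a full reasoner.

```python
# Markup agent output: hypothetical triples extracted from two sources.
SOURCE_A = {("ex:Rover", "rdf:type", "ex:Dog")}
SOURCE_B = {("ex:Tom", "rdf:type", "ex:Cat")}

# Reference ontology: a small class hierarchy shared by both sources.
ONTOLOGY = {
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Cat", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
}


def infer_types(triples):
    """Reasoning agent: propagate rdf:type up the subclass hierarchy."""
    facts = set(triples)
    while True:
        subclass = {(s, o) for s, p, o in facts if p == "rdfs:subClassOf"}
        new = {
            (s, "rdf:type", sup)
            for s, p, cls in facts
            if p == "rdf:type"
            for sub, sup in subclass
            if sub == cls
        }
        if new <= facts:  # fixed point: nothing new was derived
            return facts
        facts |= new


# Interpret both sources together against the shared ontology.
facts = infer_types(SOURCE_A | SOURCE_B | ONTOLOGY)
animals = sorted(s for s, p, o in facts if p == "rdf:type" and o == "ex:Animal")
```

Neither source says anything about animals; the combined answer comes entirely from the ontology. The expensive part in practice is producing the source triples, which this sketch simply assumes.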

(In 1957, John Backus had to sell the idea of a higher-level programming 
language (FORTRAN) by developing tooling that generated machine code 
that was not significantly worse than that of human experts, and he did 
it by coding every element of smart programmer think.  By comparison, 10 
years later, Fortran H did an algorithmic analysis of the code and the 
resource requirements and then generated an optimal solution.  The 
breakthroughs depend on being able to do complex intellectual tasks as 
well as humans but much faster.  There is no requirement for the 
algorithmic approach to be the same.  So it is not obvious that logic 
will produce better results than statistical association in the general 
case.  The proof is in the tasting.)

> The terms 'Invisible Web', 'Hidden Web', and 'Deep Web' are
> often used for that data.  It is much more voluminous than
> the visible web, and for various reasons, it will never be
> part of the visible web.  Some kinds of data are unintelligible
> without further processing -- for example, the huge volumes of
> data about the universe gathered by NASA.  Other data must be
> kept out of the WWW for reasons of privacy and security.

We have to distinguish here between data that is not accessible over the 
web at all and data that is not usefully accessible to dumb browsers.  
We also have to realize that large volumes of data in specialized forms 
want only a server (with sufficient bandwidth) to make them 
HTTP-accessible.  If they have value, adapter servers or clients will be 
developed to make the information accessible in more conventional ways.  
By comparison, data that is protected will remain invisible, regardless 
of technology, but it may be combined locally, or with Web data, to 
produce valuable information presentations.  Many companies use 
browser-based apps that use only private data sources within the 
organization.  The architectures for viewing and integrating distributed 
information have uses well beyond the Web.

> In general, there is no single format that is ideal for all
> possible uses.  Instead of designing an ideal format for
> everything, we must develop frameworks for relating anything
> in any format and reorganizing it as needed for any possible
> purpose.

The question, however, is how the relational framework will be built.  
Statistical association frameworks and ontological frameworks are 
significantly different in approach and in purpose.  OTOH, it is not 
clear to me that they will in the long run be significantly different in 
the quality of their results.  Ontological frameworks are much more 
reliable in controlled spaces; but it is not clear that they will be 
more effective in producing good information when deployed blindly on 
the Web.

> AJAX and the Giant Global Graph are complementary pieces of
> an even more gigantic puzzle, and nobody knows how many more
> pieces may be discovered or invented in the future.  We must
> develop tools and methodologies that can accommodate and
> integrate anything that might arise.

How can one disagree with this?

-Ed

-- 
Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                FAX: +1 301-975-4694

"The opinions expressed above do not reflect consensus of NIST, 
 and have not been reviewed by any Government authority."

