I don't really disagree with John, but we definitely have different
views of the topic. (01)
John F. Sowa wrote:
> In the subject line, I put "vs." between AJAX and the GGG, but
> they should be considered complementary methods whose greatest
> strength comes from a dynamic combination. What we really need
> is *not* a webified view of all data, but an AJAX-ified way of
> reorganizing the Semantic Web and combining it with other kinds
> of information.
> (02)
I am going to put on my Haim Kilov hat and say I can't agree with this
until you define your terms.
What do you mean by "webified"? Marked up in (X)HTML? Marked up with
OWL or RDF? I assume you don't mean "web-accessible", because AJAX
depends on web-accessibility, just not on HTML. (03)
The whole idea of AJAX is that you preprocess the data using an adapter
to produce a marked-up form that is used by the "combining algorithm".
This is not a new idea. Approximately 80% of the software written
specifically for individual businesses is this same idea on a smaller
scale -- organizing specific combinations of business information in a
specific view for a specific business function. And it seems to me that
Google Maps is just another of these. (04)
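
To make the adapter / combining-algorithm split concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Record type, the two adapters, the "customers" and "orders" sources, and the sample data are all hypothetical, and a real combining algorithm would be far more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Common marked-up form produced by every adapter (hypothetical)."""
    source: str
    key: str
    fields: dict


def csv_adapter(lines):
    """Adapt 'name,city' text rows from a hypothetical customer file."""
    records = []
    for line in lines:
        name, city = [part.strip() for part in line.split(",")]
        records.append(Record("customers", name, {"city": city}))
    return records


def dict_adapter(rows):
    """Adapt rows from a hypothetical orders database."""
    return [Record("orders", row["customer"], {"total": row["total"]})
            for row in rows]


def combine(records):
    """Combining algorithm: merge the fields of records that share a key."""
    view = {}
    for rec in records:
        view.setdefault(rec.key, {}).update(rec.fields)
    return view


# Two sources in different original forms, one combined business view.
customers = csv_adapter(["Acme, Boston", "Zenith, Denver"])
orders = dict_adapter([{"customer": "Acme", "total": 120.0}])
view = combine(customers + orders)
```

The point of the sketch is only that the combining step never sees the original formats; each adapter hides its own source behind the common form.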
The real measure of the AJAX approach is the degree to which it is
possible to produce adequate web page displays with basic support for
the form of the content in its original repository and minimal
understanding of the content itself. The more understanding needed for
the content, the more referential knowledge needed in or available to
the adapter or the combining algorithm. Google's power is in the
proprietary combining algorithm and its derived store of referential
knowledge, where that store has been derived by a preponderance of
association and little injected knowledge. The Semantic Web approach is
to capture the referential knowledge formally and derive it in a trusted
way. And the Wikipedia approach is somewhere in between. (05)
I fully agree with the idea that we need to "combine [RDF-annotated
information] with other kinds of information", but there are two ways to
do that -- derive the semantic markup for the other kinds, or link them
by statistical association. (06)
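
The two ways of linking can be contrasted in a few lines of Python. This is a rough sketch under stated assumptions: the "ex:" concept identifiers, the item dictionaries, the word-overlap (Jaccard) measure, and the 0.3 threshold are all invented for illustration and merely stand in for real semantic markup and real statistical association.

```python
def link_by_markup(items_a, items_b):
    """Link items that carry the same formally assigned concept identifier."""
    return [(a["id"], b["id"])
            for a in items_a for b in items_b
            if a["concept"] == b["concept"]]


def jaccard(text_a, text_b):
    """Word-overlap similarity: shared words divided by all distinct words."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def link_by_association(items_a, items_b, threshold=0.3):
    """Link items whose raw text overlaps enough; no markup is consulted."""
    return [(a["id"], b["id"])
            for a in items_a for b in items_b
            if jaccard(a["text"], b["text"]) >= threshold]


# Hypothetical sample items from two sources.
source_a = [{"id": "a1", "concept": "ex:Bridge",
             "text": "steel bridge over the river"}]
source_b = [{"id": "b1", "concept": "ex:Bridge",
             "text": "the river bridge is closed"},
            {"id": "b2", "concept": "ex:Bank",
             "text": "bank interest rates rise"}]

markup_links = link_by_markup(source_a, source_b)
association_links = link_by_association(source_a, source_b)
```

On this toy data both routes link a1 to b1. The difference is where the cost falls: the first needs someone to have assigned the concept identifiers, the second needs only text and a threshold.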
> Many people have noted that there is vastly more data in the
> world than anything represented on web pages. Much of that
> data is stored in files and databases that are used by AJAX
> methods for generating web pages dynamically. But they are
> not represented as web pages until they are explicitly created
> in response to somebody's request.
> (07)
The term "web pages" in the first sentence means (X)HTML objects
accessed via HTTP. All the data that is used by AJAX methods is
provided by specific HTTP-accessible services on the servers. The data
is web-accessible; the only question is how much of the adapter is
resident on the source server, how much on the search server, and how
much on the client. Ultimately, what is generated dynamically is the
interactive display -- what the technology on the client side does with
the form sent by the search server. The "web page" thinking here is
that the display form is dictated by the HTML sent by the server, but in
some cases, much of the display is controlled by script code (typically
JavaScript) downloaded to the client via the HTML script element. In
other cases, such as PDF, the raw source
is sent to the client and the display intelligence is in a browser
plug-in on the client side. What we must realize is that different
software houses are in the business of making money on different sides
of the client/server interface. They use different architectures to
accomplish that -- smart server (dumb browser client), smart client
(various servers), paired client/server with some distribution of
functions. The AJAX approach John describes is a smart search server
approach. Many financial apps use smart client agents that deal with
multiple server information sources (and formats). I suppose one may
see them as personal AJAX agents -- internally they have the same
architecture, but they are based on a reference ontology and business
rules partly provided by the designer and partly by the user. (08)
The idea of the Semantic Web technologies is that they are supporting
technologies for any of several such architectures. They require some
agent to mark up the original information sets (like the AJAX adapters),
and some agent to provide the reference ontologies for the markup and
interpretation, and some agent to perform the interpretation of multiple
sources by reasoning. The reason why this approach has been much slower
to succeed is that it still takes human experts to do or guide the
semantic markup, whereas statistical association can be done by simply
having enough computer power. The breakthrough that is needed for the
Web is to do text interpretation automatically and generate the RDF
markup with results that are not significantly worse than human-directed
markup. (09)
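
The three agent roles described above (markup, reference ontology, reasoning over multiple sources) can be sketched in plain Python. This is a toy illustration, not RDF tooling: the triples are ordinary tuples, the "ex:" classes and the two dealer sources are hypothetical, and the only inference rule is propagating "type" along "subClassOf".

```python
# Reference ontology agent: a tiny hypothetical class hierarchy.
ontology = {
    ("ex:Sedan", "subClassOf", "ex:Car"),
    ("ex:Car", "subClassOf", "ex:Vehicle"),
}


def mark_up(source_name, rows):
    """Markup agent: turn (item, class) rows into 'type' triples."""
    return {(f"{source_name}:{item}", "type", cls) for item, cls in rows}


def infer_types(triples, ontology):
    """Reasoner: propagate 'type' up 'subClassOf' until nothing new appears."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for subj, pred, obj in list(closed):
            if pred != "type":
                continue
            for sub, rel, sup in ontology:
                new_fact = (subj, "type", sup)
                if rel == "subClassOf" and sub == obj and new_fact not in closed:
                    closed.add(new_fact)
                    changed = True
    return closed


def instances_of(cls, triples):
    """Query: every subject known, directly or by inference, to be a cls."""
    return sorted(s for s, p, o in triples if p == "type" and o == cls)


# Two independently marked-up sources, interpreted against one ontology.
facts = (mark_up("dealerA", [("item1", "ex:Sedan")])
         | mark_up("dealerB", [("item9", "ex:Car")]))
closed = infer_types(facts, ontology)
vehicles = instances_of("ex:Vehicle", closed)
```

Neither source mentions ex:Vehicle, yet both items are found by the query. The expensive part in practice is the mark_up step, which is the one that still needs human guidance.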
(In 1957, John Backus had to sell the idea of a higher-level programming
language (FORTRAN) by developing tooling that generated machine code
that was not significantly worse than that of human experts, and he did
it by coding every element of smart programmer think. By comparison, 10
years later, Fortran H did an algorithmic analysis of the code and the
resource requirements and then generated an optimal solution. The
breakthroughs depend on being able to do complex intellectual tasks as
well as humans but much faster. There is no requirement for the
algorithmic approach to be the same. So it is not obvious that logic
will produce better results than statistical association in the general
case. The proof is in the tasting.) (010)
> The terms 'Invisible Web', 'Hidden Web', and 'Deep Web' are
> often used for that data. It is much more voluminous than
> the visible web, and for various reasons, it will never be
> part of the visible web. Some kinds of data are unintelligible
> without further processing -- for example, the huge volumes of
> data about the universe gathered by NASA. Other data must be
> kept out of the WWW for reasons of privacy and security.
> (011)
We have to distinguish here between data that is not accessible over the
web at all and data that is not usefully accessible to dumb browsers.
We also have to realize that large volumes of data in specialized forms
need only a server (with sufficient bandwidth) to make them
HTTP-accessible. If they have value, adapter servers or clients will be
developed to make the information accessible in more conventional ways.
By comparison, data that is protected will remain invisible, regardless
of technology, but it may be combined locally, or with Web data, to
produce valuable information presentations. Many companies use
browser-based apps that use only private data sources within the
organization. The architectures for viewing and integrating distributed
information have uses beyond the Web. (012)
> In general, there is no single format that is ideal for all
> possible uses. Instead of designing an ideal format for
> everything, we must develop frameworks for relating anything
> in any format and reorganizing it as needed for any possible
> purpose.
> (013)
The question, however, is how the relational framework will be built.
Statistical association frameworks and ontological frameworks are
significantly different in approach and in purpose. OTOH, it is not
clear to me that they will in the long run be significantly different in
the quality of their results. Ontological frameworks are much more
reliable in controlled spaces; but it is not clear that they will be
more effective in producing good information when deployed blindly on
the Web. (014)
> AJAX and the Giant Global Graph are complementary pieces of
> an even more gigantic puzzle, and nobody knows how many more
> pieces may be discovered or invented in the future. We must
> develop tools and methodologies that can accommodate and
> integrate anything that might arise.
> (015)
How can one disagree with this? (016)
-Ed (017)
--
Edward J. Barkmeyer Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263 Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263 FAX: +1 301-975-4694 (018)
"The opinions expressed above do not reflect consensus of NIST,
and have not been reviewed by any Government authority." (019)