
Re: [ontolog-forum] RDF vs. EAR

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Ed Barkmeyer <edbark@xxxxxxxx>
Date: Wed, 07 Dec 2011 14:12:00 -0500
Message-id: <4EDFBA80.7030708@xxxxxxxx>
John,    (01)

We just have different value systems.    (02)

you wrote:
> Ed,
>
> JFS
>   
>>> But XSLT is a horribly inefficient example.
>>>       
>
> EB
>   
>> As compared to what?  And for what purpose?
>>     
>
> For any transformations of XML of any size, Prolog runs circles
> around XSLT in speed, flexibility, and generality.  Many groups
> (including our VivoMind company) use Prolog to process anything
> we get from the SemWeb -- with a huge improvement in speed and
> scalability to large volumes.
>       (03)

I understand.  You are only talking about pure efficiency in terms of 
machine cycles per byte.
And I don't doubt that if you are doing semantic web and Google-like 
applications, high performance is a critical issue.    (04)

I misread your technical characterization as a general condemnation of XSLT.    (05)

All I am saying is that XSLT has its place, and it is commonly used in 
industry where the KPI is dollars per byte.  That is "efficiency" of a 
different kind.    (06)

XSLT is a simple paradigm for doing simple things.  The learning curve 
for adequate competence is days.  Adequate XSLT skills are widely 
available and cheap.  Thus, the engineering cost of doing 
transformations with XSLT is low.  The operating cost may be high in 
terms of machine cycles, but most organizations have more of those than 
they need, and can cheaply acquire more.  XSLT has proved to be too 
clumsy for time-critical message transformation, yes.  There is no value 
to becoming an XSLT expert; anything that requires any really advanced 
features also requires a better tool.    (07)
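The "simple paradigm" point can be illustrated with a minimal sketch (the element names PartNo and PartNumber are invented for the example): an identity transform plus one override, which renames an element while copying everything else unchanged.  This is roughly the level of XSLT that routine message translation requires.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Override: rename every <PartNo> element to <PartNumber>,
       keeping its text content -->
  <xsl:template match="PartNo">
    <PartNumber><xsl:value-of select="."/></PartNumber>
  </xsl:template>
  <!-- Identity template: copy all other nodes and attributes as-is -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

An engineer of average competence can read and modify a stylesheet like this after a few days of exposure, which is the economic point being made above.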

By comparison, every Prolog programmer who is able to do anything 
interesting is well up on the competence scale.  The learning curve 
for Prolog is weeks to months.  Adequate Prolog skills are rare 
and expensive.  Thus the engineering cost of doing transformations with 
Prolog is high.  That cost is justifiable when the problem is hard, or 
high performance is an issue.    (08)

As I said before, we are comparing cheese and chalk.    (09)

> Many gov't agencies are very heavy users of Prolog for many
> purposes, but they're rather quiet about what they do with it.
>       (010)

There is no way that argumentum ad populum is going to result in the 
determination that Prolog is "better" than XSLT.
For every government Prolog-based project, there are 10-100 XSLT 
projects.  But it is probable that the sum of the costs is about equal. ;-)    (011)

XSLT fills a niche.  Prolog fills an entirely different one.  XSLT is a 
tool for the 60% of software engineers who are of average competence, 
and thus of average cost, and perhaps overcompetent for the average 
software development project.  And perhaps half of them are XSLT users.  
Prolog is a tool for the top 20%, who are of extraordinary competence, 
harder to find, and higher in cost.  And perhaps 1% are actual users.    (012)

> The largest commercial user of Prolog is Experian, one of the
> major credit bureaus.  But they are also very quiet about how
> they use it.
>
> But Experian is such a heavy user of Prolog that they bought
> Prologia -- the company that was founded by Alain Colmerauer,
> who implemented the first version of Prolog.  By buying the
> company, they can do all sorts of clever things with it without
> telling anybody what they're doing.  Check your favorite search
> engine with the terms 'experian' and 'prologia'.
>
> Another company that is built on Prolog is Mathematica.
> Early versions of Mathematica explicitly used Prolog for
> their mathematical transformations.  Their current language
> still has a Prolog engine at its core, but they have built
> so many extensions around it that Prolog is no longer visible.
>
> EB
>   
>> Most data exchange standards that preceded XML were "bad"...
>>     
>
> I agree.  And I approve of using XML for many purposes, as I said
> in a previous note.  IBM uses UIMA (also XML based) for representing
> NLP data in Watson, but UIMA is more concise and efficient than RDF.
>
> But the best uses of the *ML family are for the purposes that GML
> was originally designed to support:  marking up documents.  That is
> still their "sweet spot".      (013)

Perhaps, but ten years later XML is only now coming into common use for 
marking up documents, because the tools used to generate documents saw 
no demand for XML markup over, e.g., HTML with embedded scripting 
languages, which provided much more general power -- including lots of 
opportunities for trojans.  The real value of XML is in data 
exchange: it enabled many first-attempt standards.  (That is both a 
good thing and a bad thing, but it prevented proprietary forms from 
dominating exchange.)    (014)
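To make the data-exchange point concrete (the message shape and field names here are hypothetical), consuming an XML business message takes only a few lines in any mainstream language, with no tooling beyond a standard library:

```python
# Sketch: parsing a hypothetical XML order message using only the
# Python standard library.
import xml.etree.ElementTree as ET

message = """
<Order id="42">
  <PartNo>A-100</PartNo>
  <Quantity>7</Quantity>
</Order>
"""

root = ET.fromstring(message)
# Pull the fields out of the markup into an ordinary dictionary.
order = {
    "id": root.get("id"),            # attribute value
    "part": root.findtext("PartNo"),  # child element text
    "qty": int(root.findtext("Quantity")),
}
print(order)
```

The ubiquity of parsers like this, rather than any elegance of the syntax, is what made XML viable as a lingua franca for exchange.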

> For embedding languages in *ML documents,
> the script-tag (or the equivalent) is still the best method.
>
> EB
>   
>> The idea of XML was to discard efficiency for simplicity and clarity
>>     
>
> Discarding efficiency is a very, very bad idea.      (015)

We simply disagree.  Many things that are worth doing are not worth 
doing well, if they can be done adequately and quickly.  The information 
technology world has discarded efficiency in pursuit of other 
engineering goals for over 40 years.  Not all of those were good choices 
or good reasons, but some definitely were.  Java is inefficient by 
design.  Was the JVM a bad idea?    (016)

> Computer speeds are
> growing, but the volumes and complexity of the data are growing even
> faster.  If you want to process data in the browser, JavaScript (or
> the proposals for a next generation script) are vastly better than
> interpreting any extensive amount of XSLT or similar notations.
>       (017)

Of course, but I don't see the relevance.  I don't see XSLT being used 
for SemWeb applications, but I don't see JavaScript or Python or Ruby or 
C# being used extensively for such applications either.  Those languages 
are nonetheless often preferred for "web applications" and browser 
plug-ins.  Conversely, with the possible exception of C#, I wouldn't use 
any of them by choice to implement simple message and file 
transformations.  Match the tool to the job.    (018)

> EB
>   
>> Enter XSLT -- the means of translation between simple XML encodings
>>     
>
> Google has a lot of experience in processing web data, and they have
> chosen JSON as the foundation for Google apps.  They'll index anything
> thrown at them, but they don't use XSLT.
>       (019)

But GM sees 30000 XML messages a day, and they do use XSLT to translate 
them to standard forms.  They don't use JSON because no supply-chain 
tooling currently uses JSON as its base message syntax.  The future may 
be different, but then the Google use of JSON may be just another legacy.    (020)

> EB
>   
>> Never underestimate the value of simple tools for simple tasks.
>>     
>
> Amen, Amen.  We certainly agree on that point.
>
> But JSON is vastly simpler than any XML-based encoding, and every
> browser has JavaScript built in to process it.  Google knows that.
>       (021)

In 2011, yes.  The latest-and-greatest strikes again.  (XML and XSLT are 
10 years old.  By IT standards, that's right up there with Methuselah.  
But by industry standards, that makes them about ripe for reliable use.)    (022)
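John's simplicity point is easy to see with the same hypothetical order record: the JSON encoding carries no markup overhead, and parsing it is a single call in any language with a JSON library.

```python
# Sketch: the same hypothetical order record as JSON.
import json

message = '{"id": 42, "part": "A-100", "qty": 7}'
order = json.loads(message)  # one call: JSON text -> native dict
print(order["part"])
```

Note one subtlety the simplicity hides: JSON distinguishes numbers from strings natively (the id parses as an integer here), whereas XML leaves all typing to the schema or the application.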

> Bottom line:  RDF/XML is the poison pill that is killing the SemWeb,
> and their salvation depends on dumping it ASAP.
>       (023)

So the next great technology is RDF/JSON?  Or is it CL/JSON?  :-)    (024)

-Ed    (025)


> John

-- 
Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                Cel: +1 240-672-5800    (027)

"The opinions expressed above do not reflect consensus of NIST, 
 and have not been reviewed by any Government authority."    (028)


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J    (029)
