ontolog-forum

Re: [ontolog-forum] Foundation Ontology

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>, standard-upper-ontology@xxxxxxxxxxxxxxxxx
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Wed, 27 Aug 2008 10:57:14 -0400
Message-id: <48B56B4A.90903@xxxxxxxxxxx>
As I said in previous notes, the Foundation Ontology should be designed
to use any or every logic-based notation as input, and the internals
should be stored in some suitable version of logic.    (01)

For complex logical expressions, a full version of logic, such as the
Common Logic standard, would be necessary.  But a very large amount of
specification can be done in much simpler notations.  RDF(S) and OWL
are widely used, but their human factors leave much to be desired.    (02)

Recently, Google has released specifications and software for their
_Protocol Buffers_, which use a compact, elegant, humanly readable
notation that is also very efficient for computer processing.
Following is an example from their documentation:    (03)

    person {
      name: "John Doe"
      email: "jdoe@xxxxxxxxxxx"
    }    (04)

Following is an equivalent in XML (an RDF version would look worse):    (05)

    <person>
       <name>John Doe</name>
       <email>jdoe@xxxxxxxxxxx</email>
    </person>    (06)

For such a short example, the Google form is somewhat more readable
than the XML form, but the difference in readability escalates rapidly
for large examples.  Just as important is the computer efficiency.
Compressing XML notations by ZIP or other algorithms takes an enormous
amount of time, but the Google software is extremely fast.  Following
is their comment:    (07)

 > When this message is encoded to the protocol buffer binary format
 > (the text format above is just a convenient human-readable
 > representation for debugging and editing), it would probably be
 > 28 bytes long and take around 100-200 nanoseconds to parse. The
 > XML version is at least 69 bytes if you remove whitespace, and
 > would take around 5,000-10,000 nanoseconds to parse.    (08)
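
The size figures in that comment can be checked by hand. The following
Python sketch is only an illustration, not Google's implementation: it
hand-encodes two string fields in the protocol buffer wire format (one
tag byte, one length byte, then the UTF-8 payload) and compares the
result with the whitespace-free XML. The field numbers 1 and 3, the
helper name encode_string_field, and the address jdoe@example.com
(a placeholder, since the archive masks the original) are assumptions.

```python
def encode_string_field(field_number: int, value: str) -> bytes:
    """Encode one string field in protobuf wire format (short values only)."""
    payload = value.encode("utf-8")
    # This sketch handles only single-byte tags and single-byte lengths.
    assert field_number < 16 and len(payload) < 128
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return bytes([tag, len(payload)]) + payload


name = "John Doe"
email = "jdoe@example.com"  # placeholder address (assumption)

# Field numbers 1 and 3 are assumed; the message's schema is not shown above.
binary = encode_string_field(1, name) + encode_string_field(3, email)
xml = f"<person><name>{name}</name><email>{email}</email></person>".encode("utf-8")

print(len(binary), len(xml))  # prints: 28 69
```

Under those assumptions the binary form is 28 bytes and the XML form is
69 bytes, which matches the numbers quoted. The sketch says nothing
about parse times, which would have to be measured.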

Following is the Google summary, which I find convincing:    (09)

 > Protocol buffers have many advantages over XML for serializing
 > structured data. Protocol buffers:
 >
 >   * are simpler
 >   * are 3 to 10 times smaller
 >   * are 20 to 100 times faster
 >   * are less ambiguous
 >   * generate data access classes that are easier to use
 >     programmatically    (010)

The three primary languages they support are Java, C++, and Python.
But other groups have implemented versions for C, C#, Perl, PHP,
Ruby, LISP, Erlang, Haskell, and ActionScript.    (011)

The above example came from the Google Developer's Guide:    (012)

    http://code.google.com/apis/protocolbuffers/docs/overview.html    (013)

At the end of this note are some quotations from Google developers.
A very important point is that the Google notation with its supporting
software has been implemented and tested on very large applications --
probably some of the largest applications in the world.  And the
software they are now making available (under the Apache license)
is already version 2.0, so the speed bumps have been smoothed out.    (014)

Recommendation: I suggest that we use the Google notation to represent
simple type hierarchies at the level of Aristotle's syllogisms.  That
is the most commonly used subset of OWL, and it can be automatically
translated to Common Logic and to every other notation that anyone has
been using for ontologies.  For more complex expressions, a richer
version of logic could be used.  But I would recommend a version of
controlled English as the *primary* notation for complex logic.  Other
notations, including OWL or Common Logic, would be compiled *from*
controlled English.    (015)
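
As a rough sketch of the translation step for simple type hierarchies,
the following Python fragment turns subtype links into Common Logic
(CLIF) sentences. It assumes the hierarchy has already been read from
records such as  type { name: "Dog" supertype: "Mammal" }  into a
dictionary; those field names, the type names, and the helper functions
are all hypothetical, and the hierarchy is assumed to have no cycles.

```python
# Hypothetical type hierarchy: each subtype maps to its direct supertype.
SUBTYPE_OF = {
    "Dog": "Mammal",
    "Mammal": "Animal",
}


def to_clif(subtype_of):
    """Render each subtype link as a universally quantified CLIF implication."""
    return [
        f"(forall (x) (if ({sub} x) ({sup} x)))"
        for sub, sup in sorted(subtype_of.items())
    ]


def supertypes(type_name, subtype_of):
    """Follow subtype links upward, chaining them as in the syllogism Barbara.

    Assumes the hierarchy is acyclic; a cycle would make this loop forever.
    """
    chain = []
    while type_name in subtype_of:
        type_name = subtype_of[type_name]
        chain.append(type_name)
    return chain


clif_sentences = to_clif(SUBTYPE_OF)
for sentence in clif_sentences:
    print(sentence)
print(supertypes("Dog", SUBTYPE_OF))
```

Each link becomes one sentence of the form
(forall (x) (if (Dog x) (Mammal x))), and following the links upward
gives the inherited supertypes. Anything beyond this Aristotelian
subset would need the richer notations mentioned above.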

John Sowa
_______________________________________________________________________    (016)

http://www.informationweek.com/news/internet/google/showArticle.jhtml?articleID=208803049    (017)

"It's the way we encode almost any sort of structured information which 
needs to be passed across the network or stored on disk," said Chris 
DiBona, Google's open source programs manager, in a blog post. "We 
thought Protocol Buffers might be useful to other people, too, so we've 
decided to release it as open source software."    (018)

Google software engineer Kenton Varda, in a post on the Google open 
source blog, said that Google uses literally thousands of different data 
formats, most of which are structured. Encoding these data formats on a 
massive scale is too much for XML, so Google developed Protocol Buffers.    (019)

Varda compares Protocol Buffers to an Interface Description Language 
(IDL), without the complexity. "[O]ne of Protocol Buffers' major design 
goals is simplicity," said Varda. "By sticking to a simple 
lists-and-records model that solves the majority of problems and 
resisting the desire to chase diminishing returns, we believe we have 
created something that is powerful without being bloated. And, yes, it 
is very fast -- at least an order of magnitude faster than XML."    (020)

For Google's FAQ, see    (021)

http://code.google.com/apis/protocolbuffers/docs/faq.html    (022)



