Re: [ontolog-forum] PDF and the semantic web

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Duane Nickull
Date: Wed, 11 Feb 2009 14:47:45 -0800
Message-id: <C5B89591.1CC0E%dnickull@xxxxxxxxx>

Most of what you wrote below is inaccurate.  PDF is not closed: it is an ISO standard (ISO 32000-1:2008) and completely open.  There are absolutely no limitations on who may implement it.  There are multiple vendors supporting PDF.




Also, PDF, while not concerned with semantic declarations, uses a secondary metadata format called XMP, shared with the Creative Suite products, which is based on RDF and expressed in XML.


There are multiple libraries for dealing with XMP, most of them open source, so metadata can be extracted easily, in multiple languages, for free.
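
A minimal sketch of such an extraction, using only the Python standard library rather than any particular XMP library. It assumes the XMP packet is stored uncompressed in the file (common, but not guaranteed) and that the title is recorded as a Dublin Core dc:title property; the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# An XMP packet is wrapped in <?xpacket begin=...?> ... <?xpacket end=...?>
# processing instructions; capture the XML between them.
XPACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)

# Namespace URIs for RDF and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_xmp_packets(pdf_bytes):
    """Return the raw XML body of every uncompressed XMP packet found."""
    return [m.group(1).strip() for m in XPACKET.finditer(pdf_bytes)]


def dc_titles(packet):
    """Return the dc:title values (one per language alternative) in a packet."""
    root = ET.fromstring(packet)
    return [li.text for li in root.iterfind(".//dc:title/rdf:Alt/rdf:li", NS)]
```

Reading a file with open(path, "rb").read() and passing the bytes to extract_xmp_packets yields zero or more packets; each can then be handed to dc_titles or to any RDF/XML-aware tool. A dedicated XMP library is the better choice when the metadata stream is compressed or encrypted.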



PDF is, as you noted, concerned with document fidelity and layout for consistent, cross-platform rendering.  While it does have structure, it is similar to HTML in terms of semantics (headers, body, paragraphs, etc.).

There are roughly 40-50 vendors with various libraries to access PDF.  True – we think we offer the best but others are based on the ISO standard as well.

This needed to be corrected.



On 11/02/09 2:23 PM, "Pat Hayes" <phayes@xxxxxxx> wrote:

On Feb 11, 2009, at 9:45 AM, Alexander Garcia Castro wrote:

Sorry if this is not the right venue; I decided to send this email because in the past I have seen some semantic web issues being discussed here.

I think one of the W3C mailing lists might be more suitable for this topic. Try semantic-web@xxxxxx

I would like to know how applicable the PDF format could be within the context of the Semantic Web.

Not very: the primary purpose of PDF is making visually accurate documents, rather than conveying semantic information.

The PDF format is closed; annotating PDFs, as in tagging not the file but the information within the file, is not possible by means other than those provided by Adobe. For instance, if I wanted to tag a word or an image inside a PDF, I would have to do it with the latest version of Acrobat Reader. But if I wanted to offer such an operation via the Web, I could do it only if I had an XSLT to transform the PDF into XML.


This limitation is, IMHO, a huge one within the context of the semantic web where we should be able to define links and use them.

I don't quite see why you feel this is a SWeb problem.

Furthermore, being forced to have a third party application just for displaying a file that should be displayed directly by the browser is not a nice feature.

That is an issue for browser implementations.

If PDF were open it could be rendered by the browser.  Aren't closed formats such as PDF viable within the context of the SW?

I'm not sure what exactly you are asking here.

After all the PDF was a solution within the context of portability and exchange of information

... for human readers, yes. But not for software inference agents, which is the point of the SWeb.

; the main problem it was solving was a simple one: "I want my document to look the same everywhere, on display and once printed" and "I want people to be able to read my documents without losing the formatting of the document and without having to consider the OS". Isn't PDF obsolete within this context?

What context? PDF seems to work well for its intended purpose. (?? Maybe I haven't understood your point.)

Seriously, I suggest re-sending your message to the semantic-web@xxxxxx mailing list.

Pat Hayes

