ontolog-forum

Re: [ontolog-forum] fitness of XML for ontology

To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Barkmeyer, Edward J" <edward.barkmeyer@xxxxxxxx>
Date: Wed, 12 Feb 2014 00:00:22 +0000
Message-id: <1b9f1677d0e9407d992028f80735d11f@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>

Duane,

 

I never thought of sand, beaches, or chunks of cement as “content”.  We speak different languages, we understand too many terms very differently.  Continued babbling at each other will be fruitless.

 

-Ed

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Duane Nickull
Sent: Tuesday, February 11, 2014 6:22 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

>Duane,

> With that definition, can you identify any significant content that isn’t “structured”?

 

Interesting to explore. In the abstract, possibly not easily, since there are an infinite number of things that may provide structure to something.  There are definitely some examples, however.

 

Sand on the beach is not structured, other than through the relationship that a "beach" is "an aggregation" of "sand particles".

 

The ocean is not structured other than the ocean is a structure aggregated from unordered water molecules.  My coffee cup is not structured as it has no discernible aggregation that I can see with my eyes on a non-microscopic level.   So what else is unstructured?

 

A block of wood

A chunk of cement

A chunk of plastic

Etc…

 

A binary file may be unstructured given I cannot even read into the contents of it.  A JPEG may have no structure other than being able to go to the byte level.  

 

I generally only like to use the term "structured" within a context.  In the context of a programming language accessing data, it is much easier to use the term "unstructured" if the data cannot be systematically decomposed into smaller components.

 

Duane

***********************************

Technoracle Advanced Systems Inc.

Consulting and Contracting; Proven Results!

i.  Neo4J, PDF, Java, LiveCycle ES, Flex, AIR, CQ5 & Mobile

t.  @duanenickull

 

 

NOTICE: This e-mail and any attachments may contain confidential information. If you are the intended recipient, please consider this a privileged communication, not to be forwarded without explicit approval from the sender.  If you are not the intended recipient, please notify the sender immediately by return e-mail, delete this e-mail and destroy any copies. Any dissemination or use of this information by a person other than the intended recipient is unauthorized and may be illegal. The originator reserves the right to monitor all e-mail communications through its networks for quality control purposes.

 

 

 

From: "Barkmeyer, Edward J" <edward.barkmeyer@xxxxxxxx>
Reply-To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Date: Tuesday, 11 February, 2014 1:42 PM
To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Duane,

 

With that definition, can you identify any significant content that isn’t “structured”?  As Rich pointed out, even natural language content is divided into sentences, and sentences have grammatical structure with elements that are phrases made up of words, all of which contributes to conveying the intent.    In a similar way a binary data stream, e.g., telemetry data or streaming video, has some internal organization as a set of information units, with rules for its interpretation.
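As a minimal sketch of the binary-stream point, assuming a made-up telemetry record layout (the field order, sizes, and the names `RECORD` and `decode_records` are hypothetical, not taken from any real format): the bytes only become information units once the rules for interpreting them are known.

```python
import struct

# Hypothetical record layout, assumed for illustration only:
# little-endian uint32 timestamp, uint16 sensor id, float32 reading.
RECORD = struct.Struct("<IHf")


def decode_records(stream: bytes):
    """Split a byte stream into fixed-size records and decode each one."""
    if len(stream) % RECORD.size != 0:
        raise ValueError("stream length is not a whole number of records")
    return [
        RECORD.unpack_from(stream, offset)
        for offset in range(0, len(stream), RECORD.size)
    ]


# Build a two-record stream, then recover the units by rote.
stream = RECORD.pack(1392076822, 7, 21.5) + RECORD.pack(1392076823, 7, 22.0)
records = decode_records(stream)
```

Without the layout in `RECORD`, the same stream is just a run of bytes, which is roughly the sense in which a binary file can look unstructured to a reader who lacks the interpretation rules.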

 

-Ed

 

15-20 years ago, Haim Kilov succinctly summed up what I was on about:

“I won’t agree with anything you say until you define your terms.”

 

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Duane Nickull
Sent: Tuesday, February 11, 2014 4:08 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Structured, to me, means that there are patterns within the content that can be used to divide it into smaller divisions.  Looking at the examples below, I would state that they are all structured, given you have stated they can be divided into smaller components.

 

I am using this at a purely abstract level, and there is no presumption that the smaller components can be retrieved via any programming construct or language.

 

If you decide to use this in the context of a specific programming language and data chunk, then it would depend on whether or not you can retrieve the content and get it out of the larger corpus.

 

Not sure if this is in alignment with anyone else's definitions.

 

Duane

 

***********************************

Technoracle Advanced Systems Inc.

Consulting and Contracting; Proven Results!

i.  Neo4J, PDF, Java, LiveCycle ES, Flex, AIR, CQ5 & Mobile

t.  @duanenickull

 

From: Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
Reply-To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Date: Monday, 10 February, 2014 6:02 PM
To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Well put, Ed:

A text ‘is structured’ if the relevant information can be extracted by a rote process that interprets the structural elements.  It is ‘semi-structured’, if you can get some of the information that way, but you have to deal with ‘unstructured content’ to extract other important information for your purpose.  It is ‘unstructured’  if you can get little or no useful information by applying a rote interpretation process.

 

Examples:

            DateTime stamps,

            Dates,

            Times,

            3-line city addresses,

            Email addresses,

            URLs

are all structured in the bidirectional conversion to strings and back.

            Patent Claims

are semi-structured, and

            Patent Specifications

are unstructured, for many common application purposes.
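The "bidirectional conversion to strings and back" can be sketched with Python's standard library. This is only an illustration: the helper names `roundtrip_datetime` and `roundtrip_url` are hypothetical, and the round trip is exact only when the input string is already in the canonical form the library emits.

```python
from datetime import datetime
from urllib.parse import urlsplit, urlunsplit


def roundtrip_datetime(text: str) -> str:
    """Parse an ISO 8601 timestamp into fields, then serialize it again."""
    return datetime.fromisoformat(text).isoformat()


def roundtrip_url(text: str) -> str:
    """Split a URL into its components, then reassemble it."""
    return urlunsplit(urlsplit(text))


stamp_text = "2014-02-10T18:02:00"
stamp = datetime.fromisoformat(stamp_text)  # fields: year, month, day, hour, ...

url_text = "http://ontolog.cim3.net/forum/ontolog-forum/"
parts = urlsplit(url_text)  # fields: scheme, netloc, path, query, fragment
```

Every piece of information in these strings is reachable by a fixed parsing rule, and nothing is lost going back to text. No comparable rule exists for the prose of a patent specification.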

 

But as Ed mentioned, it can be highly application dependent, and also technology dependent.  A few years from now, the bar for these terms will likely be raised, depending on what successful new inventions reach the consumer markets over that time.

 

-Rich

 

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2


From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Barkmeyer, Edward J
Sent: Monday, February 10, 2014 11:03 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Duane,

 

It seems to me only that we have different understandings of ‘the structure of a corpus’, and the relationship of the structure to the information content. 

I think:

A text ‘is structured’ if the relevant information can be extracted by a rote process that interprets the structural elements.  It is ‘semi-structured’, if you can get some of the information that way, but you have to deal with ‘unstructured content’ to extract other important information for your purpose.  It is ‘unstructured’  if you can get little or no useful information by applying a rote interpretation process.
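A rough sketch of this three-way distinction, relative to a stated purpose. The patterns in `RULES`, the function names, and especially the cut-offs (all wanted fields found, some found, none found) are assumptions made for illustration, not part of the definition itself.

```python
import re

# Hypothetical rote rules: each named field is extracted by a fixed pattern.
RULES = {
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
    "email": re.compile(r"\b([\w.+-]+@[\w-]+\.[\w.-]+)\b"),
}


def rote_extract(text, wanted):
    """Return the wanted fields that the fixed patterns can find in the text."""
    found = {}
    for name in wanted:
        match = RULES[name].search(text)
        if match:
            found[name] = match.group(1)
    return found


def classify(text, wanted):
    """Label a text by how much of the wanted information rote rules recover."""
    found = rote_extract(text, wanted)
    if len(found) == len(wanted):
        return "structured"
    if found:
        return "semi-structured"
    return "unstructured"


purpose = ["date", "email"]
full = classify("Filed 2014-02-10 by rich@example.com", purpose)
partial = classify("Filed 2014-02-10; the claim covers a widget", purpose)
none = classify("A method comprising a widget", purpose)
```

The same text can land in a different category when `purpose` changes, which is the application dependence the definition implies.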

 

Do you have a definition for your use of ‘is structured’?

 

-Ed

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Duane Nickull
Sent: Monday, February 10, 2014 1:08 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Ed:

 

The case you present would be structured then.  If it has structure, it is structured.  A different question, which may be more relevant here, is whether to tag something as deterministic.  Many structured documents are not deterministic, which is what people usually mean when they say semi-structured.

 

Thoughts?

 

Duane

***********************************

Technoracle Advanced Systems Inc.

Consulting and Contracting; Proven Results!

i.  Neo4J, PDF, Java, LiveCycle ES, Flex, AIR, CQ5 & Mobile

t.  @duanenickull

 

 


 

 

 

From: "Barkmeyer, Edward J" <edward.barkmeyer@xxxxxxxx>
Reply-To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Date: Monday, 10 February, 2014 7:33 AM
To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Duane,

 

> I would like to suggest that structured/unstructured is binary.  "Semi-structured" is ipso facto structured.

 

Well, there is some middle ground.  A lot of nominally structured data has fields whose content may be critical information in an unstructured form.  (This is often the case with ontologies, e.g. OWL annotations.)  Similarly, you can have essentially unstructured text with formal attachments, like a spreadsheet.  I think that is what Rich means by ‘semi-structured’, given his example.

 

-Ed

 

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Duane Nickull
Sent: Sunday, February 09, 2014 3:45 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

I would like to suggest that structured/unstructured is binary.  "Semi-structured" is ipso facto structured.  

 

Duane

***********************************

Technoracle Advanced Systems Inc.

Consulting and Contracting; Proven Results!

i.  Neo4J, PDF, Java, LiveCycle ES, Flex, AIR, CQ5 & Mobile

t.  @duanenickull

 

 


 

 

 

From: Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
Reply-To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Date: Saturday, 8 February, 2014 11:01 PM
To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

At 100-300 words, how little structured does it have to be before it's “unstructured”?  How about patent claim language, which can be up to and beyond 1000 words in a single sentence?  How about Wikipedia articles, which can be several thousand words in dozens of sentences, all encoded as strings?

 

-Rich

 

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2


From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Martijn Tromm
Sent: Saturday, February 08, 2014 10:45 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

I would say this is not unstructured but little structured, similar to a table in a database with one or two fields containing a text blob.

 

Martijn

On Saturday, February 8, 2014, Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx> wrote:

Dear Kingsley,

XML can also represent unstructured text in string format.  For example, the description of a property title's metes and bounds is typically from one hundred to three hundred words of text, as written by the surveyor, and makes up one of the attributes in string form. The XML parser simply passes it as a string to the software.
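A minimal sketch of that situation, assuming a made-up parcel record (the element name, attribute names, and the description text are all hypothetical, and the free text is stored here as an XML attribute purely for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical parcel record; names and values are invented for this sketch.
DOC = (
    '<parcel id="A-17" acres="2.5" '
    'description="Beginning at an iron pin on the north line of Mill Road, '
    'thence north 210 feet to a stone, thence east 180 feet to the creek."/>'
)

parcel = ET.fromstring(DOC)

# The parser gives direct access to each attribute...
acres = float(parcel.get("acres"))

# ...but the surveyor's description arrives as one opaque string.
description = parcel.get("description")
```

The markup locates the description, but says nothing about the corners, bearings, and distances inside it; recovering those is left to whatever software receives the string.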

-Rich

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2

-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Kingsley Idehen
Sent: Saturday, February 08, 2014 10:20 AM
To: ontolog-forum@xxxxxxxxxxxxxxxx
Subject: Re: [ontolog-forum] fitness of XML for ontology

On 2/8/14 12:32 AM, Paul Tyson wrote:

> "-- It is a data base language for text.

That's the crux of the matter! A database is a document comprised of structured data. It isn't a database management system, i.e., an application that provides services such as storage, indexing, declarative query access, etc.

XML is fine as a mechanism for marking up structured data in a document. The utility of this process isn't optimal if the endeavor requires entity relationships and relation semantics to be discernible and comprehensible to human authors and readers.

Links:

[1] http://bit.ly/1ievivx -- Database

[2] http://bit.ly/1d6gvSR -- Database Management System (DBMS)

[3] http://bit.ly/1n1FrMr -- Relational Database Management System (RDBMS) .

--

Regards,

Kingsley Idehen

Founder & CEO

OpenLink Software

Company Web: http://www.openlinksw.com

Personal Weblog: http://www.openlinksw.com/blog/~kidehen

Twitter Profile: https://twitter.com/kidehen

Google+ Profile: https://plus.google.com/+KingsleyIdehen/about

LinkedIn Profile: http://www.linkedin.com/in/kidehen






_________________________________________________________________ Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/ Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/ Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx Shared Files: http://ontolog.cim3.net/file/ Community Wiki: http://ontolog.cim3.net/wiki/ To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J

