
Re: [ontolog-forum] fitness of XML for ontology

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Rich Cooper" <rich@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 10 Feb 2014 18:02:11 -0800
Message-id: <88A4FEDE77CC4BC38092740F4020F61C@Gateway>

Well put, Ed:

A text ‘is structured’ if the relevant information can be extracted by a rote process that interprets the structural elements.  It is ‘semi-structured’, if you can get some of the information that way, but you have to deal with ‘unstructured content’ to extract other important information for your purpose.  It is ‘unstructured’  if you can get little or no useful information by applying a rote interpretation process.

 

Examples:

            DateTime stamps,

            Dates,

            Times,

            3-line city addresses,

            Email addresses,

            URLs

are all structured, in that they can be converted to strings and back in both directions.

            Patent Claims

are semi-structured, and

            Patent Specifications

are unstructured, for many common application purposes.
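
To make the "conversion to strings and back" point concrete, here is a minimal sketch in Python using only the standard library. The sample values and the helper names (roundtrip_datetime, roundtrip_url, roundtrip_email) are assumptions chosen for illustration; they are not taken from anyone's actual software in this thread.

```python
from datetime import datetime
from email.utils import formataddr, parseaddr
from urllib.parse import urlsplit, urlunsplit


def roundtrip_datetime(text):
    """Parse an ISO 8601 stamp into fields, then render it back to text."""
    parsed = datetime.fromisoformat(text)
    return parsed, parsed.isoformat()


def roundtrip_url(text):
    """Split a URL into scheme, host, path, etc., then reassemble it."""
    parts = urlsplit(text)
    return parts, urlunsplit(parts)


def roundtrip_email(text):
    """Split a mailbox into (display name, address), then reassemble it."""
    name, addr = parseaddr(text)
    return (name, addr), formataddr((name, addr))


# Illustrative inputs only (assumed values).
stamp, stamp_text = roundtrip_datetime("2014-02-10T18:02:11-08:00")
url, url_text = roundtrip_url("http://ontolog.cim3.net/forum/ontolog-forum/")
mail, mail_text = roundtrip_email("Rich Cooper <rich@example.org>")

print(stamp.year, stamp.hour, stamp_text)
print(url.netloc, url.path, url_text)
print(mail, mail_text)
```

In each case a rote parser recovers every field and the original string can be regenerated from those fields. No comparable standard routine exists for a patent specification, which is the sense in which it counts as unstructured here.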

 

But as Ed mentioned, it can be highly application dependent, and also technology dependent.  A few years from now, the bar for each of these terms will likely be higher, depending on which new inventions succeed in reaching the consumer markets over that time.

 

-Rich

 

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2


From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Barkmeyer, Edward J
Sent: Monday, February 10, 2014 11:03 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Duane,

 

It seems to me only that we have different understandings of ‘the structure of a corpus’, and the relationship of the structure to the information content. 

I think:

A text ‘is structured’ if the relevant information can be extracted by a rote process that interprets the structural elements.  It is ‘semi-structured’, if you can get some of the information that way, but you have to deal with ‘unstructured content’ to extract other important information for your purpose.  It is ‘unstructured’  if you can get little or no useful information by applying a rote interpretation process.

 

Do you have a definition for your use of ‘is structured’?

 

-Ed

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Duane Nickull
Sent: Monday, February 10, 2014 1:08 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Ed:

 

The case you present would be structured, then.  If it has structure, it is structured.  There is a different argument here which may be more relevant: whether to tag something as deterministic.  Many structured documents are not deterministic, which is what people usually mean when they say ‘semi-structured’.

 

Thoughts?

 

Duane

***********************************

Technoracle Advanced Systems Inc.

Consulting and Contracting; Proven Results!

i.  Neo4J, PDF, Java, LiveCycle ES, Flex, AIR, CQ5 & Mobile

t.  @duanenickull

 

 

NOTICE: This e-mail and any attachments may contain confidential information. If you are the intended recipient, please consider this a privileged communication, not to be forwarded without explicit approval from the sender.  If you are not the intended recipient, please notify the sender immediately by return e-mail, delete this e-mail and destroy any copies. Any dissemination or use of this information by a person other than the intended recipient is unauthorized and may be illegal. The originator reserves the right to monitor all e-mail communications through its networks for quality control purposes.

 

 

 

From: "Barkmeyer, Edward J" <edward.barkmeyer@xxxxxxxx>
Reply-To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Date: Monday, 10 February, 2014 7:33 AM
To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

Duane,

 

> I would like to suggest that structured/unstructured is binary.  "Semi-structured" is ipso facto structured.

 

Well, there is some middle ground.  A lot of nominally structured data has fields whose content may be critical information in an unstructured form.  (This is often the case with ontologies, e.g. OWL annotations.)  Similarly, you can have essentially unstructured text with formal attachments, like a spreadsheet.  I think that is what Rich means by ‘semi-structured’, given his example.

 

-Ed

 

 

From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Duane Nickull
Sent: Sunday, February 09, 2014 3:45 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

I would like to suggest that structured/unstructured is binary.  "Semi-structured" is ipso facto structured.  

 

Duane

***********************************

Technoracle Advanced Systems Inc.

Consulting and Contracting; Proven Results!

i.  Neo4J, PDF, Java, LiveCycle ES, Flex, AIR, CQ5 & Mobile

t.  @duanenickull

 

 


 

 

 

From: Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx>
Reply-To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
Date: Saturday, 8 February, 2014 11:01 PM
To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

At 100-300 words, how little structure does it have to have before it’s “unstructured”?  How about patent claim language, which can run to 1000 words and beyond in a single sentence?  How about Wikipedia articles, which can be several thousand words in dozens of sentences, all encoded as strings?

 

-Rich

 

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2


From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Martijn Tromm
Sent: Saturday, February 08, 2014 10:45 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] fitness of XML for ontology

 

I would say this is not unstructured but little structured. Similar to a table in db with one or two fields containing a text blob.

 

Martijn

On Saturday, February 8, 2014, Rich Cooper <rich@xxxxxxxxxxxxxxxxxxxxxx> wrote:

Dear Kingsley,

XML can also represent unstructured text in string format.  For example, the description of a property title's metes and bounds is typically from one hundred to three hundred words of text, as written by the surveyor, and makes up one of the attributes in string form. The XML parser simply passes it as a string to the software.
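
As a rough illustration of the parser handing free text through untouched, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The record, its element and attribute names (parcel, id, county, description), and the survey wording are all hypothetical, and the description is shown as element text rather than an attribute purely for readability.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: every name and value here is invented for illustration.
RECORD = """<parcel id="APN-0001" county="Orange">
  <description>Beginning at an iron pin on the north line of Lot 4;
  thence north 12 degrees east 150 feet to a stone marker;
  thence west 80 feet to the point of beginning.</description>
</parcel>"""

root = ET.fromstring(RECORD)

# The structured parts come out by rote: named fields with short values.
parcel_id = root.get("id")
county = root.get("county")

# The surveyor's text comes out as one opaque string. The parser knows
# nothing about pins, bearings, or distances inside it.
description = root.findtext("description")

print(parcel_id, county)
print(type(description).__name__, len(description.split()), "words")
```

Anything further, such as pulling out the bearings and distances, would need a separate interpreter for the surveyor's language; the XML layer contributes nothing to that step.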

-Rich

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2

-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Kingsley Idehen
Sent: Saturday, February 08, 2014 10:20 AM
To: ontolog-forum@xxxxxxxxxxxxxxxx
Subject: Re: [ontolog-forum] fitness of XML for ontology

On 2/8/14 12:32 AM, Paul Tyson wrote:

> "-- It is a data base language for text.

That's the crux of the matter! A database is a document comprised of structured data. It isn't a database management system, i.e., an application that provides services such as storage, indexing, declarative query access, etc.

XML is fine as a mechanism for marking up structured data in a document. The utility of this process isn't optimal if the endeavor requires entity relationships and relation semantics to be discernible and comprehensible to human authors and readers.

Links:

[1] http://bit.ly/1ievivx -- Database

[2] http://bit.ly/1d6gvSR -- Database Management System (DBMS)

[3] http://bit.ly/1n1FrMr -- Relational Database Management System (RDBMS) .

--

Regards,

Kingsley Idehen

Founder & CEO

OpenLink Software

Company Web: http://www.openlinksw.com

Personal Weblog: http://www.openlinksw.com/blog/~kidehen

Twitter Profile: https://twitter.com/kidehen

Google+ Profile: https://plus.google.com/+KingsleyIdehen/about

LinkedIn Profile: http://www.linkedin.com/in/kidehen




_________________________________________________________________ Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/ Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/ Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx Shared Files: http://ontolog.cim3.net/file/ Community Wiki: http://ontolog.cim3.net/wiki/ To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J

