
[ontolog-forum] NLP>Ontology>Structured Database

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Deborah MacPherson" <debmacp@xxxxxxxxx>
Date: Wed, 7 Nov 2007 09:04:08 -0500
Message-id: <48f213f30711070604n41c571a9t6d11581ded93bbcc@xxxxxxxxxxxxxx>
I am writing to ask whether anyone can recommend existing tools or processes to help with at least part of a problem I am trying to solve. The solution needs to be low cost or free, and able to work on a Mac.

First, I need to load approximately 220 Word documents, which are the master specifications at WDG Architecture. Each project starts with the masters: sections that are not relevant to the project are eliminated, and the sections that remain are customized. Each section is a Word document, and subject matter is presented in a set order following CSI best practices. Changing performance requirements, products going in and out of favor, and the idiosyncrasies of local authorities having jurisdiction are hidden in the records, waiting to be tapped some day.

For now, my aim is to extract occurrences of the term "As Shown" to build a simple table that coordinates the specifications with the drawings. Some construction tasks, such as concrete finishing, are never shown on the drawings; they are only specified. Other work, such as the components of an assembly, is shown only in the drawings, and the specifications act like a menu describing each individual material. We, like all architecture offices, face problems, risk, and liability when the drawings and specifications conflict, which is why this task is being undertaken. We want a clear distinction about where certain information belongs (drawings or specs, codes or paragraphs) to avoid ambiguity between the construction documents.

It is taking FOREVER to search each specification section for the word "shown". Currently I am opening each document, copying the paragraph by hand, and building a table with the following columns: Keynote (the drawing code), Must Show (entering Yes only where applicable), Subject (broad description), Section Number (the file name), Paragraph Number, Specification Text (entire paragraph in context), and Notes (for example, built as specified unless shown otherwise).
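To make the table layout concrete, here is a minimal sketch in Python that writes rows with these seven columns to a CSV file, which any spreadsheet program on a Mac can open. The function name `write_table` and the choice of CSV are assumptions made for illustration, not an existing tool; columns that still need human judgment (Keynote, Must Show, Subject, Notes) are simply left blank.

```python
import csv
import io

# Column names taken from the table described above.
COLUMNS = [
    "Keynote",
    "Must Show",
    "Subject",
    "Section Number",
    "Paragraph Number",
    "Specification Text",
    "Notes",
]


def write_table(rows, stream):
    """Write dict rows to `stream` as CSV (hypothetical helper).

    Any column missing from a row is left empty so it can be
    filled in by hand later.
    """
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

To write a real file, open it with `open("as_shown.csv", "w", newline="")` and pass the file object as `stream`; the file name is only an example.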

Even if the perfect system were available now, it would still come down to thinking about how to fill in most of the blanks. However, anything that could at least save the step of opening a set of 220 closed, inactive Word documents to extract entire paragraphs, their paragraph numbers, and section numbers would help a lot. The specifications are legal documents, which means the paragraph numbers and dates are important. Does anyone know of a Natural Language Processing system and legal ontology that could be adapted for this purpose?
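The extraction step on its own does not need NLP or an ontology. As a rough sketch, and only under several assumptions, it could be done with the Python standard library. The assumptions: the documents have been saved in the XML-based .docx format (older binary .doc files would need converting first); paragraph numbers are typed as literal text such as "2.3" or "A." at the start of a paragraph (numbers generated by Word's automatic numbering are not stored in the paragraph text and would be missed); and the file name is the section number. The names `paragraphs` and `find_shown` are made up for this illustration.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used in word/document.xml inside a .docx file.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Assumption: paragraph numbers are typed text like "2.3", "2.3.1", "1." or "A."
NUMBER = re.compile(r"^\s*((?:\d+(?:\.\d+)+\.?)|(?:\d+\.)|(?:[A-Za-z]\.))\s+")
SHOWN = re.compile(r"\bshown\b", re.IGNORECASE)


def paragraphs(docx_path):
    """Yield the plain text of each non-empty paragraph in a .docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    for para in root.iter(W + "p"):
        parts = []
        for run in para.iter(W + "r"):
            for child in run:
                if child.tag == W + "t":      # literal text
                    parts.append(child.text or "")
                elif child.tag == W + "tab":  # tab typed between number and text
                    parts.append("\t")
        text = "".join(parts).strip()
        if text:
            yield text


def find_shown(folder):
    """Return (section number, paragraph number, paragraph text) tuples
    for every paragraph in the folder's .docx files containing "shown"."""
    rows = []
    for path in sorted(Path(folder).glob("*.docx")):
        for text in paragraphs(path):
            if SHOWN.search(text):
                match = NUMBER.match(text)
                number = match.group(1) if match else ""
                rows.append((path.stem, number, text))
    return rows
```

Calling `find_shown` on the folder of masters would return one tuple per matching paragraph without opening any document by hand. Each hit still has to be reviewed, since a word search cannot tell whether "shown" refers to the drawings.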

The ultimate goal is a structured pathway database linked to the specifications as they update and, in a completely blue-sky scenario, links to products and manufacturers and the ability to compare projects over time. In the near future, we may be able to use Building Information Modeling (BIM) to confirm the presence of keynotes appearing on the drawings, narrowing down the issues for discussion with the project architects, phase by phase, as the drawings and specifications are completed.

Any pointers would be sincerely appreciated.

Thank you,

Deborah MacPherson


Deborah L. MacPherson
Projects Director, Accuracy&Aesthetics
Specifier, WDG Architecture PLLC

Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
