Don't know if this is relevant, but Google, of course, offers a very fast natural language translator here:
How do people here appraise it? What techniques are they using?
As a test, I put in the text of one of the messages posted here and translated it into German. Then I took the German output and stuck it into the source-text box and translated it back into English. Maybe that's not fair, since it no doubt compounds the errors going both ways like that. It's a little bit like that game where a message is whispered around a circle of people and comes back to the source totally unrecognizable. But this isn't too bad:
In the 1970s, I worked on a project research, called the language of the data. I find no trace of this project or its results on the Web. The problem is that the project was motivated by the observation that in the 1970s during the oil crisis of two government agencies results on oil imports to the United States reported on a monthly basis. The two agencies reported results for the same statement "oil imports to the U.S. in May 1978 ', which differed by a significant amount. Was it because the conditions were poorly defined, the methods of data collection were bad, or what? The problem was to understand why the answers were so different, how to build computers and information systems, which would not suffer from this problem. The investigation revealed that each agency had really good methods of data collection and in each agency's criteria for the interpretation of the results was carefully documented the agencies policies and procedures. For example, each word in the phrase on oil imports had a different meaning to each agency. For example, U.S. Puerto Rico contain in one case and in the other. The definition of oil was a set of hydrocarbons for an agency. The other agency uses a different set.
The problem was that if the information was in computer information systems, which had lost all information about the meaning of the terms. All that remained was column heads like U.S., May, oil, etc. The language of the data project approached the problem from an information and library science perspective. It took Ranganathan work and other metadata methods for classification of data.
I later used to design ideas to a metadata system for the management of large volumes of technical data to an aerospace program, the language of the data. This worked quite well in practice. However, a meta-ontology would work even better. But I never got to implement it.
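For anyone who wants to repeat the round-trip test described above (English to German and back to English) on other messages, here is a minimal sketch. Everything in it is an assumption for illustration: `translate` is a hypothetical callable you would have to supply yourself, `toy_translate` is only a lossy word-for-word stand-in and not Google's service or API, and the word-level similarity score is just one crude way to put a number on how much of the original survives.

```python
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical signature: translate(text, source_lang, target_lang) -> str.
# Plug in whatever translation service you actually have access to.
Translator = Callable[[str, str, str], str]


def round_trip(text: str, translate: Translator,
               source: str = "en", pivot: str = "de") -> str:
    """Translate source -> pivot, then pivot -> source, and return the result."""
    forward = translate(text, source, pivot)
    return translate(forward, pivot, source)


def word_similarity(original: str, returned: str) -> float:
    """Ratio in [0, 1] of matching words, compared in order, ignoring case."""
    a = original.lower().split()
    b = returned.lower().split()
    return SequenceMatcher(None, a, b).ratio()


def toy_translate(text: str, source: str, target: str) -> str:
    """Stand-in translator (assumption): word-for-word lookup, deliberately lossy."""
    table = {
        ("en", "de"): {"oil": "Öl", "imports": "Importe", "data": "Daten"},
        ("de", "en"): {"Öl": "oil", "Importe": "imports", "Daten": "information"},
    }
    mapping = table[(source, target)]
    return " ".join(mapping.get(word, word) for word in text.split())


if __name__ == "__main__":
    original = "oil imports data"
    returned = round_trip(original, toy_translate)
    print(returned)
    print(round(word_similarity(original, returned), 2))
```

A low score does not say which direction introduced the error, which is the same compounding problem mentioned above; it only flags that something was lost along the way.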
(805) 966-9515 Santa Barbara
http://interspirit.net | http://sharedpurpose.net | http://bridgeacrossconsciousness.net
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F Sowa
Sent: Tuesday, August 13, 2013 1:06 PM
To: '[ontolog-forum] '
Subject: [ontolog-forum] Concept dictionaries and interlinguas
The idea of a universal Interlingua as a basis for defining all the concepts of all languages and facilitating translations among them is one of the oldest and fondest hopes for machine translation.
One of the largest and best-funded projects (billions of yen from the Japanese government in the period 1986 to 1994) is EDR. Following is a 4-page summary of their goals, formats, and achievements:
Following is an excerpt from that article:
> The Concept Dictionary contains information on the 400,000 concepts
> listed in the Word Dictionary and is divided according to information
> type into the Headconcept Dictionary, the Concept Classification
> Dictionary, and the Concept Description Dictionary. The Headconcept
> Dictionary describes information on the concepts themselves. The
> Concept Classification Dictionary describes the super/sub relations
> among the 400,000 concepts. The Concept Description Dictionary
> describes the semantic (binary) relations, such as 'agent',
> 'implement', and 'place', between concepts that co-occur in a sentence.
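To make the three-part division in that excerpt concrete, here is a rough sketch of how such a structure might be held in memory. It is not EDR's actual file format: the class, its method names, and the concept identifiers (`c_knife`, `c_cut`, and so on) are all invented for illustration. Only the three components and the relation names 'agent', 'implement', and 'place' come from the excerpt.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ConceptDictionary:
    # Headconcept part: concept id -> information on the concept itself.
    headconcepts: Dict[str, str] = field(default_factory=dict)
    # Concept Classification part: sub-concept id -> ids of its super-concepts.
    supers: Dict[str, Set[str]] = field(default_factory=dict)
    # Concept Description part: (relation, concept, related concept) triples.
    descriptions: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add_concept(self, cid: str, headword: str) -> None:
        self.headconcepts[cid] = headword

    def add_super(self, sub: str, sup: str) -> None:
        self.supers.setdefault(sub, set()).add(sup)

    def relate(self, relation: str, concept: str, other: str) -> None:
        self.descriptions.add((relation, concept, other))

    def ancestors(self, cid: str) -> Set[str]:
        """All super-concepts reachable by following super/sub links upward."""
        seen: Set[str] = set()
        stack = list(self.supers.get(cid, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.supers.get(current, ()))
        return seen

    def relations_of(self, cid: str) -> List[Tuple[str, str]]:
        """Sorted (relation, related concept) pairs recorded for one concept."""
        return sorted((rel, other) for rel, head, other in self.descriptions
                      if head == cid)


# Hypothetical entries, roughly "the cook cuts with a knife in the kitchen".
edr = ConceptDictionary()
for cid, word in [("c_cut", "cut"), ("c_cook", "cook"), ("c_knife", "knife"),
                  ("c_kitchen", "kitchen"), ("c_tool", "tool"),
                  ("c_artifact", "artifact")]:
    edr.add_concept(cid, word)
edr.add_super("c_knife", "c_tool")
edr.add_super("c_tool", "c_artifact")
edr.relate("agent", "c_cut", "c_cook")
edr.relate("implement", "c_cut", "c_knife")
edr.relate("place", "c_cut", "c_kitchen")
```

With entries like these, `edr.ancestors("c_knife")` walks the classification links upward, and `edr.relations_of("c_cut")` lists the binary relations recorded for one concept.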
Table 1 in that article summarizes the users. Among them, it lists 66 Japanese universities and one university "overseas". I suspect that the overseas university is Stanford (CSLI). I had talked with some people at CSLI about EDR. They said that they had a copy, but nobody had found anything useful to do with it.
Following is another article about EDR from a conference in 1997:
That article is from a workshop on Interlinguas. For the table of contents of the proceedings with URLs to the papers presented, see
The series of conferences on Interlinguas continued, but the proceedings from later years were published in book form, and they're not available for free download.
As far as I know, the R&D on Interlinguas has not produced any great breakthroughs in natural language understanding or high-quality machine translation. If anybody knows of even minor breakthroughs (successful commercial applications), please send a note to Ontolog Forum.
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J