The type checker alone on a reasonable system is typically twice as
hard as the lexing and parsing combined. This is especially so if you
need to support any form of type inference (which all modern strongly
typed languages do to some extent).
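To give a feel for where that effort goes, here is a minimal sketch of a checker for a toy expression language. It is not taken from my compiler: the names (TinyChecker, Expr, Let, and so on) are hypothetical, and it assumes Java 16 or later for records and instanceof patterns. Even the simplest inference, working out the type of an unannotated `let` binding from its initializer, means the checker has to compute types and carry an environment around rather than just walk the tree.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy language, for illustration only.
class TinyChecker {
    enum Type { INT, BOOL }

    interface Expr {}
    record IntLit(int value) implements Expr {}
    record BoolLit(boolean value) implements Expr {}
    record Var(String name) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record If(Expr cond, Expr thenBranch, Expr elseBranch) implements Expr {}
    // `let name = bound in body` has no type annotation: the type of
    // `name` is inferred from `bound`.
    record Let(String name, Expr bound, Expr body) implements Expr {}

    // Returns the type of `e` under `env`, or throws IllegalStateException
    // on a type error.
    static Type check(Expr e, Map<String, Type> env) {
        if (e instanceof IntLit) return Type.INT;
        if (e instanceof BoolLit) return Type.BOOL;
        if (e instanceof Var v) {
            Type t = env.get(v.name());
            if (t == null) {
                throw new IllegalStateException("unbound variable: " + v.name());
            }
            return t;
        }
        if (e instanceof Add a) {
            expect(Type.INT, check(a.left(), env), "left operand of +");
            expect(Type.INT, check(a.right(), env), "right operand of +");
            return Type.INT;
        }
        if (e instanceof If i) {
            expect(Type.BOOL, check(i.cond(), env), "if condition");
            Type thenType = check(i.thenBranch(), env);
            // Both branches must agree; the result takes their common type.
            expect(thenType, check(i.elseBranch(), env), "else branch");
            return thenType;
        }
        if (e instanceof Let l) {
            // Inference step: the bound expression decides the variable's type.
            Type inferred = check(l.bound(), env);
            Map<String, Type> inner = new HashMap<>(env);
            inner.put(l.name(), inferred);
            return check(l.body(), inner);
        }
        throw new IllegalArgumentException("unknown expression: " + e);
    }

    private static void expect(Type expected, Type actual, String where) {
        if (expected != actual) {
            throw new IllegalStateException(
                where + ": expected " + expected + " but found " + actual);
        }
    }
}
```

Checking `let x = 1 in x + 2` with an empty environment yields INT, while `true + 1` is rejected. A real checker adds functions, polymorphism, overloading and decent error reporting on top of this, which is where the line count grows.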
As it happens, I am currently working on a compiler. The approximate
line counts (in lines of Java) are:
  Parser (yacc equivalent):                       950
  Type checker:                                  5000
  Code transform (src->src level optimization):  1000
  Compiler (semantic transform):                 1200
The compiler is still in its early stages, with very little code
generation and very little in the way of optimizations implemented.
But already, the post-parsing code outweighs the parsing code by
roughly 7.5 to 1 (7200 lines against 950).
As for my niche :) most of the compilers I have written are in the AI
languages space. Currently I am working on a language for complex
event processing.
On Jan 6, 2009, at 4:23 PM, Ed Barkmeyer wrote:
> you wrote:
>> Could not help jumping in on this one ed...
>> Having written more than 12 compilers in my time; I would have to
>> state that lex and yacc 'solve' an extremely small part of the
>> problem. So small that I actually rarely use them...
> I'm curious to know what you think are the bigger parts of the
> problem.
> I suppose it may depend on the relation of the syntax to the
> semantics. Most of my experience is with languages that are nearly
> one-to-one, at least in "simply recognized" contexts.
> I can only claim 5 commercial compilers, all before 1975, and two
> others since (for research purposes). And I have never used yacc
> and lex, although several of my colleagues did. But it destroyed
> the market for my skills at the time. Like the XML phenomenon, too
> many people thought parsing the surface language was the problem,
> and it was one of the black arts of the 1960s (parsing unformatted
> information, ooh).
> The Gries book did provide basic skills for organizing and searching
> symbol tables, understanding operator overloading, allocating memory
> and generating code. And it emphasized generation of "threaded
> code", which meant generating very little that wasn't calls on
> library routines. (The GE POPS team of 1960 is usually credited with
> inventing the technique.)
> And, as I said, it was the combination of those two factors that
> produced the commodity effect -- any computer science student could
> learn the trade in a few months.
> My specialty was code optimization, and that can be a very complex
> problem, but it was also a completely optional feature. It was
> totally absent in most university compilers (why bother for a one-
> shot homework program?), and in many minicomputer compilers of the
> 1970s as well (probably because of memory limitations). I only
> wrote real optimizers for two compilers, and those were for
> My customers rarely wanted anything the newly minted degrees
> couldn't do. So I am curious as to what niche you found.
> P.S. A correction: The GE/Honeywell NDB product was IDS -- the
> Integrated Data Store. I think IDMS was the Cullinane product for
> the IBM 360/370 series.
> Edward J. Barkmeyer Email: edbark@xxxxxxxx
> National Institute of Standards & Technology
> Manufacturing Systems Integration Division
> 100 Bureau Drive, Stop 8263 Tel: +1 301-975-3528
> Gaithersburg, MD 20899-8263 FAX: +1 301-975-4694
> "The opinions expressed above do not reflect consensus of NIST,
> and have not been reviewed by any Government authority."
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx