But that's not the problem legacy systems need solved.
Legacy systems do not contain public information. They contain corporate proprietary information. The systems themselves are proprietary & the data in the systems is proprietary.
I'm interested in the systems (i.e. the actual software), NOT the data the systems produce & handle.
I have a small collection of 826 applications... how do I find what's related to what? How do I do provenance research in these days of elevated regulation & regulatory scrutiny?
This collection (50,000 programs, 38,000 control cards, 62,000 DB2 columns, 31,000 DB2 tables, 59 other kinds of artifacts for a total of 1.7M artifacts) is behind the corporate firewall & my guess is very little of it has been webified. I don't know specifically, but I'd assume at least some of the data is accessible via internal & external web mechanisms, but the SYSTEMS have not been webified.
A GUI browser is a terrible access mechanism since none of the support staff has three arms. Having to remove hands from the keyboard to mouse around immediately cuts productivity by 33%.
My guess is that suggesting we expose this collection (particularly the relationships between artifacts) to "the web" would get me tossed off the roof. I'm pretty sure I wouldn't be given the time to explain that the objective is to use web mechanisms, BEHIND the company firewall...
Remember... the value that I need here in tending & enhancing legacy systems is to quickly discover the relationships between artifacts...
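To make "quickly discover the relationships between artifacts" a bit more concrete, here is a minimal sketch of the kind of lookup I have in mind. Everything in it is an assumption for illustration: the artifact names (PAYCALC, TAXRPT, EMPLOYEE, PAYCC01), the relation labels, and the functions `build_index` and `related` are hypothetical, not taken from any real inventory or tool. It just shows relationships stored as simple triples and walked outward from one artifact, the sort of thing that could sit behind the firewall and be driven from the keyboard.

```python
from collections import defaultdict, deque

# Hypothetical (source, relation, target) triples; names are illustrative only.
RELATIONS = [
    ("program:PAYCALC", "reads", "table:EMPLOYEE"),
    ("program:PAYCALC", "uses", "controlcard:PAYCC01"),
    ("table:EMPLOYEE", "has_column", "column:EMPLOYEE.SALARY"),
    ("program:TAXRPT", "reads", "table:EMPLOYEE"),
]


def build_index(relations):
    """Index every relationship in both directions so lookups work from either end."""
    index = defaultdict(set)
    for source, relation, target in relations:
        index[source].add((relation, target))
        # Record the reverse edge with a marked label so direction is not lost.
        index[target].add((relation + "^-1", source))
    return index


def related(index, start, max_hops=2):
    """Breadth-first walk: return {artifact: hops} for everything within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for _relation, neighbor in index[current]:
            if neighbor not in seen:
                seen[neighbor] = seen[current] + 1
                queue.append(neighbor)
    del seen[start]  # the starting artifact is not "related to" itself
    return seen


if __name__ == "__main__":
    index = build_index(RELATIONS)
    for artifact, hops in sorted(related(index, "program:TAXRPT").items()):
        print(hops, artifact)
```

Starting from the hypothetical TAXRPT program, two hops reaches the EMPLOYEE table, its SALARY column, and the other program that reads the same table. At 1.7M artifacts the triples would presumably live in a database rather than a Python list, but the question being asked stays the same.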