* part 1 - http://blip.tv/file/1997719/
* part 2 - http://blip.tv/file/1998152
* part 3 - http://blip.tv/file/2000983/
I cannot speak officially for the Genome Center, but I'll throw out there that the ORM that powers much of the GC's analysis platform is out on Github and CPAN.
It's actually more than an ORM in that it also supports features like automated creation/smart rewriting of class files based on database tables, quick and easy command modules that get turned into hierarchical command-line tools for free, and an automated test harness that can even parallelize onto an LSF cluster if you've got one.
Github http://github.com/sakoht/UR
CPAN w/ documentation http://search.cpan.org/dist/UR/lib/UR.pm
My understanding is that all these projects (mine included) were cast adrift when the funding for them evaporated in the post-9/11 climate. In the intervening years, I was aware that Perl was being picked up rapidly at the genomics labs in the nearby university hospital (since we never delivered the FPGA platform to them), and I'm happy to read that Perl has risen to fill this niche.
I've got 3-4 terabytes of storage within a dozen feet of me as I type this; it really drives home the pace of change in computing.
http://oreilly.com/pub/a/oreilly/perl/news/success_stories.h...
O'Reilly also published some of these in at least two folded/stapled pamphlets that were handed out for free, e.g. at conferences. I recall a finance-centered application where the Perl prototype far outperformed the subsequent implementation and ended up taking over the production role.
It looks like maintenance at that URL stopped in about 2004, but in googling "perl success stories" I saw a few more recent articles that might qualify.
I suspect the reason Perl flourished here was a combination of luck and cultural fit. Culture here includes the newbie-friendly online help (e.g. perlmonks) and the ease of "publish and re-use components" (CPAN).
I wonder if Perl would still be used if the project were started today.
Because at the time Perl was probably the only capable dynamic/scripting language.
"Perl is forgiving. Biological data is often incomplete, fields can be missing, a field that is expected to be present once occurs several times (because, for example, an experiment was run in triplicate) or the data gets entered by hand and doesn't quite fit the expected format. Perl doesn't particularly mind if a value is empty or contains odd characters. Regular expressions can be written to detect and correct a variety of common errors in data entry. Of course, this flexibility can also be a curse, as I'll discuss in more detail later."
A few words are different. The article says triplicate, and p3ll0n says duplicate, for example. But they are similar enough to use as testing input to a diff algorithm.
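For anyone who wants to try that, here is a rough sketch of a word-level diff using Python's standard `difflib`. The first string is taken from the article quote above; the second is only an assumed stand-in for the comment's wording, with the single word reported as different swapped in. The `word_diff` helper is a made-up name for illustration.

```python
import difflib

# Phrase as it appears in the article quote above.
article = "because, for example, an experiment was run in triplicate"
# Assumed stand-in for the comment's wording: only the one word
# reported as different ("duplicate") is changed.
comment = "because, for example, an experiment was run in duplicate"


def word_diff(a, b):
    """Return (old, new) pairs for each run of words that differs."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(None, a_words, b_words)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a_words[i1:i2]), " ".join(b_words[j1:j2])))
    return changes


changes = word_diff(article, comment)
# Ratio of matching words: 1.0 means identical, 0.0 means nothing in common.
similarity = difflib.SequenceMatcher(None, article.split(), comment.split()).ratio()

print(changes)
print(round(similarity, 2))
```

Splitting on whitespace keeps the diff at word granularity, which suits near-identical prose better than a line-based diff would.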
EDIT: Also from this guy's comment history:
http://news.ycombinator.com/item?id=1456105
Some of the phrasing looks to have been copied and pasted from this article by Jonathan Ellis:
http://www.rackspacecloud.com/blog/2009/11/09/nosql-ecosyste...
I bet if you could make a bot to do this -- go out and find relevant information, and summarize it -- you could actually provide a serious public service. As long as you cited your sources, so it's not a plagiarism-bot.