When someone finds fault with the way a field conducts itself, I would implore them to constructively influence that field. You might be surprised how many are actually sympathetic to your concerns.
I'm not dismissing this author's concerns: to do that would really require knowing the molecular biology field (which is more than sequencing, it turns out). I do neuroscience right now, and programming can be a problem for some. But a constructive suggestion to change can have much more impact than a long rant.
[0] http://www.runmycode.org/data/MetaSite/upload/nature10836.pd...
It's a similar issue. I think statisticians are taking constructive steps to correct their path, since you know, ML is the new sexy thing. Bioinformatics could take a much longer time to self-correct though.
Although, as I mentioned in an earlier comment, Fred seems to be in a prime position to disrupt the bioinformatics field, since he seems to know all the problems that afflict it.
http://books.google.com/ngrams/graph?content=machine+learnin...
In my experience, what happens is that biologists define the science, and they depend on the computer scientists / engineers to implement solutions to their computational problems. The computational people depend on the biologists to validate whatever results they produce. The iteration cycle can be painfully slow, especially for people used to telling machines what they want them to do, and getting results immediately. The proposition of changing that dynamic is not alluring to most people, but I still hope there will be some who try.
I spent five years working in bioinformatics, and this is exactly the attitude of both the researchers and the other developers on the projects I worked on. It was very frustrating.
My single most limited resource is programmer time. My time and the time of other people who work with me. I have access to loads of computers that sit idle all the time, even if it is on nights and weekends. There is zero opportunity cost to me in using these computers more fully. I have enough human work to do that I can wait for the results without having any wait states.
There can be a big opportunity cost in trying to rework a workflow so that it is more efficient and then test it thoroughly to ensure correctness. Doing this may seem more appealing to someone who is interested primarily in computational efficiency. But I am more interested in research efficiency, and so are my employers and funders.
Hi, I recognize your name as a legit bioinformatician, am a huge fan of the lab that you're currently in, and others should listen to you.
I'd like to add that for many projects, general reusable software engineering is not necessarily a huge advantage. Instead of verifying a single implementation, it's often better for somebody to reimplement the idea from scratch; if a second implementation in a different language written by a different programmer gets the same results, this is a much more thorough validation of the software than going over prototype software line by line.
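A toy sketch of that validation-by-reimplementation idea, assuming a trivial made-up computation (GC content); in practice the second implementation would be written by a different person, ideally in a different language:

```python
# Two independently written implementations of the same computation.
# Agreement on shared inputs is the validation signal, instead of
# reviewing either implementation line by line.

def gc_content_count(seq):
    """First implementation: count G and C explicitly."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)

def gc_content_scan(seq):
    """Second, independent implementation: a single pass over the bases."""
    s = seq.upper()
    return sum(1 for base in s if base in "GC") / len(s)

# Cross-check both implementations on the same test data.
for seq in ["ACGT", "GGGCCC", "atatatgc"]:
    assert abs(gc_content_count(seq) - gc_content_scan(seq)) < 1e-12
```

If the two disagree on any input, at least one of them is wrong, which is exactly the kind of signal a line-by-line review of prototype code often fails to produce.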
Also, I've seen way too many software engineers come in with an enterprisey attitude of establishing all sorts of crazy infrastructure and get absolutely no work done. If Java is your idea of a good time, it's unlikely that you'll be an effective researcher (though it's not unheard of), because it's not good at maximizing single-programmer output, and not good at maximizing I/O or CPU or string processing. In research it's best to get results, fail fast fast fast, and move on to the next idea. If you're lucky, 1 in 20 will work out. Publish your crap, and if it's a good idea, it will be worth polishing the turd later, but it's better to explore the field than to spend too much time on an uninteresting area.
The only time you worry about efficiency is when it enables a whole other level of analysis. So, for example, UCSC does most of their work in C, including an entire web app and framework written in C, because when they were doing the draft assembly of the human genome a decade ago on a small cluster of computers that they scrounged from secretaries' desks over the summer, Perl wouldn't cut it.
I proposed, implemented, and tested an 8-line change to our alignment tool that saved 6% of CPU time. It took me two days, most of which was my spare time at home. This one program was using 15 CPU-years every month. Nobody cared. It never went into production. I started interviewing for a new job and left shortly after that.
The tools are written by (in my experience) very smart bioinformaticians who aren't taught much computer science in school (you get a smattering, but mostly it's biology, math, chemistry, etc.). Ex:
http://catalog.njit.edu/undergraduate/programs/bioinformatic...
http://www.bme.ucsc.edu/bioinformatics/curriculum#LowerDivis...
http://advanced.jhu.edu/academic/biotechnology/ms-in-bioinfo...
The tools themselves are written by smart non-programmers (a very dangerous combination), and so you get all sorts of unusual conventions that make sense only to the author or organization that wrote them, anti-patterns that would make a career programmer cringe, and a design that looks good to no one and is barely usable.
Then, as he said, they get grants to spend millions of dollars on giant clusters of computers to manage the data that is stored and queried in a really inefficient way.
There's really no incentive to make better software because that's not how the industry gets paid. You get a grant to sequence genome "X". After it's done? You publish your results and move on. Sure, you carve out a bit for overhead but most of it goes to new hardware (disk arrays, grid computing, oh my).
I often remarked that if I had enough money, there would be a killing to be made writing genome software with a proper visual and user experience design, combined with a deep computer science background. My perfect team would be a CS person, a geneticist, a UX designer, and a visual designer. Could crank out a really brilliant full-stack product that would blow away anything else out there (from sequencing to assembly to annotation and then cataloging/subsequent search and comparison).
Except, I realized that most folks using this software are in non-profits, research labs, and universities, so - no, there in fact is not a killing to be made. No one would buy it.
I wrote a post about why GATK, one of the most popular bioinformatics tools in next-generation sequencing, should not be put into a clinical pipeline:
http://blog.goldenhelix.com/?p=1534
In terms of your ideal software strategy, I can speak to that as well, as I am actually attempting to do almost exactly what you're suggesting. My team all have master's degrees in CS & stats, with a focus on kick-ass CG visualization and UX.
We released a free genome browser (visualization of NGS data and public annotations) that reflects this:
http://www.goldenhelix.com/GenomeBrowse/
But you're right, selling software in this field is a very weird thing. It's almost B2B, but academics are not businesses and their alternative is always to throw more Post-Doc man-power at the problem or slog it out with open source tools (which many do).
That said, we've been building our business (in Montana) over the last 10 years through the GWAS era selling statistical software and are looking optimistically into the era of sequencing having a huge impact on health care.
I've seen you link to your blog post a couple of times now, and I still think it's misleading. I do wonder whether your conflict of interest (selling competing software) has led you to come to a pretty unreasonable conclusion. (My conflict of interest is that I have a Broad affiliation, though I'm not a GATK developer.)
In your blog post, you received output from 23andme. The GATK was part of the processing pipeline that they used. What you received from 23andme indicated that you had a loss of function indel in a gene. However, it turns out that upon re-analysis, that was not present in your genome; it was just present in the genome of someone else processed at the same time as you.
Somehow, the conclusion that you draw is that the GATK should not be used in a clinical pipeline. This is hugely problematic:
1) It's not clear that there were any errors made by the GATK. Someone at 23andme said it was a GATK error, but the difference between "user error" and "software error" can be blurred for advantage. It's open source, so can someone demonstrate where this bug was fixed, if it ever existed?
2) Now let's assume that there was truly a bug. Is it not the job of the entity using the software to check it to ensure quality? An appropriate suite of test data would surely have caught the erroneous output. Wouldn't it be as fair, if not more so, to say that 23andme should not be used for clinical purposes, since they don't do a good job of paying attention to their output?
Your blog post shows, for sure, a failure at 23andme. Depending on whether the erroneous output was purely due to 23andme or if the GATK had a bug in production code, your post shows an interesting system failure: an alignment of mistakes at 23andme and in the GATK. But I really don't think it remotely supports the argument that the GATK is unsuitable for use in a clinical sequencing pipeline.
In my experience, this applies to accounting software, sensor data, computer-aided design, print manufacturing, healthcare, etc.
I imagine there are phases of maturity, something akin to CMM/SEI. Eventually there are enough people with a foot on both sides to bridge the gap.
It just takes time.
Maybe it's still in the early going, but I do see how it's going to be real difficult making a living doing this. OTOH, companies like CLC Bio seem like they're doing well for themselves...
So I disagree with you on your very last sentence (agree with the rest).
The trick is, academics often have excess manpower capacity in the form of grad students and post-docs. Even though personnel is usually one of the highest expenses on any given grant, they often don't look at ways to improve the efficiency of their research man-hours.
That's not a blanket rule, as we have definitely had success with the value proposition of research efficiency, but in general, a lot of the things businesses adopt to improve project time (like Theory of Constraints project management, or Mindset/Skillset/Toolset matching of personnel) are of no interest to academic researchers.
As for whether there's "a killing to be made", it's kind of unclear so far.
For example, it isn't true at all that microarray data is worthless. The early data was bad, and it was very over-hyped, but with a decade of optimization of the measurement technologies, better experimental designs, and better statistical methods, genome-wide expression analysis became a routine and ubiquitous tool.
The claim that sequencing isn't important is ridiculous. It's the scaffold to which all of biological research can be attached.
However:
There is a great deal of obfuscation, and reinventing well-known algorithms under different names (perhaps often inadvertently). There's also a lot of low-quality drivel on tool implementations or complete nonsense. This is driven largely by the need in academia to publish.
The other side of this problem is that in general, CS and computer scientists don't get much respect in biology. People care about Nature/Science/Cell papers, not about CS conference abstracts. Despite bioinformatics/computational biology not really being a new field anymore, the cultures are still very different.
Bioinformatics is hard, but too many careerists take advantage of difficulties and uncertainty to publish as many papers as they can get away with.
Minor quibble: genome assembly is definitely still an open problem that's computationally difficult. So is robust high dimension inference, but that falls more under statistics.
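A minimal sketch of why assembly stays hard, using a toy de Bruijn graph in Python (the reads and k are invented; real assemblers also handle sequencing errors, reverse complements, and billions of reads):

```python
# Build a de Bruijn graph from short reads: each k-mer becomes an edge
# between its (k-1)-mer prefix and suffix. Assembly then amounts to
# finding a walk through the graph, and repeated k-mers make that walk
# ambiguous, which is the core computational difficulty.
from collections import defaultdict

def de_bruijn(reads, k):
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph

reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = de_bruijn(reads, 4)
# Overlapping reads share nodes; any repeated (k-1)-mer gets multiple
# outgoing edges, so the original sequence is no longer uniquely
# recoverable from the graph alone.
```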
I've wanted to leave at least a dozen times too, for the better pay, for working with programmers that can teach me something, and to not have my work be interrupted by academic politics. But the people pissed at the status quo are the ones that are smart enough to see it's broken and try to fix it, and if we all leave, science is really fucked.
[0] http://www.johndcook.com/blog/2010/10/19/buggy-simulation-co...
"Must be an expert in 18 technologies" "Must have a PHD in Computer Science or Molecular Biology" "Must have 12 years experience and post doctoral training" "Pay: $30,000"
It's delusional because they take the requirements it took for them to get a job in molecular biology (long PhD, post-doc, very low pay for first jobs) and apply them wholesale to all fields that may be able to aid in their pursuits. Especially when it comes to software engineering, where it can often be extremely difficult to explain why you did not pursue a PhD.
In my geographic area, this salary range is somewhat below corporate IT work (say 10% to 15%), but generally higher than the typical university software dev job listing. The university is really bad about listing jobs and job requirements with laughable salaries. I have seen (in other departments) web app dev jobs that require significant front-end and back-end skillsets/experience and then pop a salary that is a full 50% less than entry-level jobs for CS undergrads.
One problem is that hiring departments in that position will find someone to hire at that rate, so they think it was correct. From personal experience, I can verify that "good on-paper" candidates with exceptional credentials (say an MS in CS and a bunch of experience) from other depts who look to join our team are unable to write any code at the whiteboard at all (say a for loop in Java to println something). But to be fair, a recent interview cycle one of my teammates ran produced exactly two candidates out of 16 who could do this, and only one of those could write a SQL statement that required a simple inner join. Most of those folks were external, so it's not just a problem inside the institution.
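For reference, the two whiteboard tasks described above are roughly the following, sketched here in Python with an in-memory SQLite database rather than Java (the table names and data are invented for illustration):

```python
import sqlite3

# Task 1: the trivial loop (print the numbers 0 through 4).
for i in range(5):
    print(i)

# Task 2: the simple inner join, matching people to their departments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp  (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO dept VALUES (1, 'Genomics'), (2, 'IT');
    INSERT INTO emp  VALUES (1, 'Ada', 1), (2, 'Grace', 2);
""")
rows = conn.execute(
    "SELECT emp.name, dept.name FROM emp "
    "INNER JOIN dept ON emp.dept_id = dept.id "
    "ORDER BY emp.id"
).fetchall()
print(rows)  # [('Ada', 'Genomics'), ('Grace', 'IT')]
```

That the majority of credentialed candidates stumble on exercises at this level is what makes the mismatch between job requirements and actual screening so striking.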
I have a number of cynical and embarrassing opinions about this situation.
The whiteboard is only useful as an aid in explaining an algorithm. If a candidate can do that without the whiteboard, even better.
It's only delusional if they can't find people to fill the jobs. The idea that, as an outsider, you know what requirements they should use in their hiring process better than they do is perhaps more delusional.
I worked in bioinformatics for more than 10 years before I moved on, and in my experience they do have a lot of trouble finding people to fill positions, especially outside of massive government-funded groups like the NIH. This often results in passing on competent software engineers with a B.Sc. who don't meet the requirements, in favor of PhD-level biology graduates who have taken a year or so of undergrad computer science courses. In my experience, this leads to many of the problems discussed (and exaggerated) by the OP. While some of these people are smart and produce good work, much of the time they produce poor-quality software that gets the job done, but as inefficiently as possible, and they leave a code base that is virtually unusable. Overall, I mostly just wanted to say that it's a mindset they REALLY need to get past for the long-term success of the industry.
If you really feel strongly about something, write it dispassionately (normally some time after the event) and treat it like a dissertation, backed with case studies and citations.
Sh*tty data? Comes from the community. If the data and algorithms are so poor, and the author so superior, he should have been able to improve the circumstances.
This whole screed reads like an entitled individual who entered a profession, didn't get the glory, oh and yeah, academia doesn't pay well.
In the realm of bioinformatics, let's ignore the work done on the human genome and the like.
Why? Aren't you assuming a lot about the incentives? What if the ground truth is simply that all the results are false due to a melange of bad practices? Do you think he'll get tenure for that? (That was a rhetorical question to which the answer is 'no'.) Then you know there's at least one very obvious way in which he could not improve the circumstances of poor data & algorithms.
He discusses this specifically in the rant. Are you saying he's wrong?
Was anyone asking him to? Was anyone paying him to? No? Then it's an uphill battle and also not his responsibility. Leaving is saner.
Academia rewards journal publication and does not adequately reward programming and data collection and analysis, although these are indispensable activities that can be as difficult and profound as crafting a research paper. At least the National Science Foundation has done researchers a small favor by changing the NSF biosketch format in mid-January to better accommodate the contributions of programmers and "data scientists": the old category Publications has been replaced with Products.
Naming is important to administrators and bureaucrats. It can be easy to underestimate the extent to which names matter to them. Now there is a category under which the contribution of a programmer can be recognized for the purpose of academic advancement. Previously one had to force-fit programming under Synergistic Activities or otherwise stretch or violate the NSF biosketch format. This is a small step, but it does show some understanding that the increasingly necessary contributions of scientific programmers ought to be recognized. The alternative is attrition. Like the author of the article, programmers will go where their accomplishments are recognized.
Still, reforming old attitudes is like retraining Pavlov's dogs. Scientific programmers are lumped in with "IT guys." IT as in ITIL: the platitudinous, highly non-mathematical service as a service as a service Information Technocracy Indoctrination Library. There is little comprehension that computer science has specialized. For many academics, scientific programmers are interchangeable IT guys who do help desk work, system and network administration, build websites, run GIS analyses, write scientific software and get Gmail and Google Calendar synchronization running on Blackberries. It is as if scientists themselves could be satisfied if their colleagues were hired as "scientists" or "natural philosophers" with no further qualification, as opposed to "vulcanologist" or "meteorologist" (to a first order of approximation).
"[bioinformatics] software is written to be inefficient, to use memory poorly, and the cry goes up for bigger, faster machines! [...]"
Well, the author is heading for a very bitter surprise...
- This guy clearly has a limited understanding of the field. This quote is laughable: "There are only two computationally difficult problems in bioinformatics, sequence alignment and phylogenetic tree construction."
- As a bioinformatician, I feel sorry for this guy. Just like any other field, there are shitty places to work. If I was stuck in a lab where a demanding PI with no computer skills kept throwing the results of poorly designed experiments at me and asking for miracles, I'd be a little bitter too.
- Just like any other field, there are also lots of places that are great places to work and are churning out some pretty goddamn amazing code and science. I'm working in cancer genomics, and we've already done work where the results of our bioinformatic analyses have saved people's lives. Here's one high-profile example that got a lot of good press. (http://www.nytimes.com/2012/07/08/health/in-gene-sequencing-...)
- I'm in the field of bioinformatics to improve human health and understand deep biological questions. I care about reproducibility and accuracy in my code, but 90% of the time, I could give a rat's ass about performance. I'm trying to find the answer to a question, and if I can get that answer in a reasonable amount of time, then the code is good enough. This is especially true when you consider that 3/4 of the things I do are one-off analyses with code that will never be used again. (largely because 3/4 of experiments fail - science is messy and hard like that). If given a choice between dicking around for two weeks to make my code perfect, or cranking out something that works in 2 hours, I'll pretty much always choose the latter. ("Premature optimization is the root of all evil (or at least most of it) in programming." --Donald Knuth)
- That said, when we do come up with some useful and widely applicable code, we do our best to optimize it, put it into pipelines with robust testing, and open-source it, so that the community can use it. If his lab never did that, they're rapidly falling behind the rest of the field.
- As for his assertion that bad code and obscure file formats are job security through obscurity, I'm going to call bullshit. For many years, the field lacked people with real CS training, so you got a lot of biologists reading a perl book in their spare time and hacking together some ugly, but functional solutions. Sure, in some ways that was less than optimal, but hell, it got us the human genome. The field is beginning to mature, and you're starting to see better code and standard formats as more computationally-savvy people move in. No one will argue that things couldn't be improved, but attributing it to unethical behavior or malice is just ridiculous.
tl;dr: Bitter guy with some kind of bone to pick doesn't really understand or accurately depict the state of the field.
This is the only bad point that a lot of people are aligned with.
The more time a program needs to finish, the more time you will need to run it again with some other dataset, and in turn - more time to find the right answer.
I really feel that people with scientific and mathematics backgrounds should learn proper programming (not take a course in some language, but have actual experience). Design patterns, data structures, best practices, and memory consumption are all things that should be known before a person starts submitting code for these kinds of projects.
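One concrete example of the memory-consumption habit in question, sketched as a minimal FASTA reader (the format handling is simplified for illustration): stream a large file record by record instead of slurping it all into memory.

```python
# Yield (header, sequence) pairs one at a time, so memory use stays
# roughly constant no matter how large the input file is. Slurping a
# multi-gigabyte FASTA file into one string is a common beginner mistake.

def read_fasta(path):
    header, chunks = None, []
    with open(path) as fh:
        for line in fh:
            line = line.rstrip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
        if header is not None:
            yield header, "".join(chunks)

# Usage: process one record at a time.
# for name, seq in read_fasta("genome.fa"):
#     ...
```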
I'm very interested in bioinformatics, but sadly don't know as much about the field as I'd like.
These are just two of many questions (biased towards my research interests, of course). It is really funny that he mentions sequence alignment and phylogenetics as the two big problems, because people generally consider these to be boring, uncool, solved-well-enough-for-our-purposes problems nowadays and just trust the algorithms described by Durbin decades ago. It sounds like the writer really doesn't know bioinformatics that well...
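To give a sense of how "textbook" that material is: the global-alignment score (Needleman-Wunsch, the dynamic program covered in Durbin et al.) fits in a few lines of Python; the scoring values here are arbitrary toy choices, not anything from a real scoring matrix:

```python
# Global alignment score by dynamic programming: dp[i][j] is the best
# score aligning the first i characters of a with the first j of b.

def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):            # gaps all along a
        dp[i][0] = i * gap
    for j in range(1, cols):            # gaps all along b
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = max(diag, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[-1][-1]

print(nw_score("GATTACA", "GCATGCU"))
```

The hard part in practice isn't this recurrence; it's doing something like it at the scale of billions of reads, which is why production aligners use heuristics and indexes instead.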
Definitely a computationally difficult problem, because while naive approaches work, they produce crappy results, wasting the results of tens of thousands of dollars of experiments. I see a big move towards applying statistical/machine-learning methods and graph-theory stuff in our field.
A lot of the rants in the original article are correct with regard to prototyping and throwaway code. That's because researchers are rushing to get an MVP out. The truly good ones get turned into (usually open-source) products, where the code quality hopefully improves a fair bit.
If you're a CS person who's interested or considering a move into bioinfo, I wrote a blog post about it recently: http://www.joewandy.com/2013/01/getting-into-bioinformatics....
Any solid factual resources besides the references mentioned in this justified rant?
See there for answers to your question, eg:
* Best resources to learn molecular biology for a computer scientist. [1]
* What are the best bioinformatics course materials and videos (available online)? [2]
"A Hitchhikers Guide to Next Generation Sequencing"
Part1: http://blog.goldenhelix.com/?p=423
The fact of the matter is that through high-throughput sequencing, microarrays, what have you, generation of biologically-meaningful results is possible.
There are a lot of problems in bioinformatics that need to be solved. Github has helped. More bioinformaticians are learning about good software development practices, and journal reviewers are becoming more aware of the merits of sharing source code.
I find it curious that he stops to salute ecologists, since I was in an ecology lab. I liked my labmates and our perspective, but we didn't have any magical ability to avoid the problems he alludes to here.
I think a lot of his frustration comes down to not being more involved in the planning process. That's not a new problem. R.A. Fisher put it this way in 1938: “To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.”
Perhaps the idea that we can have bioinformatics specialists who wait for data is just wrong. Should we blame PIs who don't want to give up control to their specialists, or the specialists who don't push harder, earlier? Ultimately the problem will only be solved as more people with these skills move up the ranks. But the whole idea that we need more specialists working on smaller chunks of the problem may be broken from the start (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1183512/).
Surely this means there's a goldmine waiting there for someone to produce a non-broken toolchain for bioinformatics?
Or is it even possible to produce standard tools? Maybe all the labs are too bespoke?
6 years ago using CVS or something like that was novel. Now not using Git is. Big improvement!
Problems are still interesting and challenging.
Biologists are almost never good coders, if they can code at all. But that's not what they do; they signed up for pipettes, not Python.
It's the programmers who wrote said shitty code who are to blame, but you can't hate under-paid and over-worked PhD students who write this code even though it usually has nothing to do with their thesis (the math/algorithm is the main part; the deployable implementation is usually not the most important).
If you want good code and organized/accountable databases, go to industry. There's nothing new about this transition. The IMPORTANT part is that industry gives back to academia. So when you get an office with windows and a working coffee machine, remember to help make some PhD student's life a little easier by making part of your code open source.
Surely they can't get that far without having some kind of sensible method?
1. I agree that SE standards and good coding practice are completely absent in the bioinformatics world. I remember being asked to improve the speed of some sequence alignment tools and realizing that the source code was originally Delphi that had been run through a C++ converter. No comments, single monolithic file. The vast majority of the bioinformatics code I worked with was poorly written/documented Perl. In addition, a lot of bioinformatics guys don't understand SE process, so rather than having a coordinated engineering effort, you end up with a lot of "cowboy coding", with guys writing the same thing over and over.
2. I agree that productivity is very slow. This is a side product of research itself, though. In the "real world" (quoted), where people need to sell software, time is the enemy. It's important to work together quickly to get a good product to market. In the research world, you get 2- to 5-year grants and no one seems to have much of a fire under them to get anything done (hey, we're good for 5 years!). You would think that people would be motivated to cure cancer quickly (etc.), but it's not really the case. Research moves at a snail's pace, and so do the productivity expectations of the bioinformatics group.
3. I disagree that research results from the scientists are garbage. Yes, it's true that some experiments get screwed up. However, if you have a lot of people running those experiments over and over, the bad experiments clearly become outliers. Replication in the scientific community is good because it protects against bad data this way. The author must have had a particularly bad experience.
4. Something the author didn't mention that I think is important to understand: most scientists have no idea how to utilize software engineering resources. The pure biologists are many times the boss, and don't really understand how to run a software division like bioinformatics. Many times a bioinformatics group is run by PhDs in CS who have never worked in industry and don't know anything about good SE practice or how to run a software project. A lot of the problems in the bioinformatics industry are directly related to poor management. Wherever you go, you're going to have team members who have trouble programming, trouble with their work ethic, or trouble following direction. But in a bioinformatics environment where these individuals are given free rein and are not working as a cohesive unit, you can see why there is so much terrible code and duplication.
Yes, industry typically pays more than academia. Yes, most molecular biologists cannot code and rely on bioinformatics support. Yes, biological data is often noisy. Yes, code in bioinformatics is often research-grade (poorly implemented, poorly documented, often not available). These are all good points that have been made many times more potently by others in the field like C. Titus Brown (http://ivory.idyll.org/blog/category/science.html). But they are not universal truths, and exceptions to these trends abound. Show me an academic research software system in any field outside of biology that is as functional and robust as the UCSC genome browser (serving >500,000 requests a day) or the NCBI's PubMed (serving ~200,000 requests a day). To conclude from common shortcomings of academic research programming that bioinformatics is a "computational shit heap" is unjustified and far from an accurate assessment of the reality of the field.
From looking into this guy a bit (I'd never heard of him before today in my 10+ years in the field), my take on what is going on here is that this is the rant of a disgruntled physicist/mathematician and self-proclaimed perfectionist (https://documents.epfl.ch/users/r/ro/ross/www/values.html) who moved into biology but did not establish himself in the field. From what I can tell contrasting his CV (https://documents.epfl.ch/users/r/ro/ross/www/cv.pdf) with his LinkedIn profile (http://www.linkedin.com/pub/frederick-ross/13/81a/47), it does not appear that he completed his PhD after several years of work, which is always a sign of something going awry and of someone having had a bad personal experience in academic research. I think this is the most important light to interpret this blog post in, rather than as an indictment of the field.
That said, I would also like to see bioinformatics die (or at least wither) and be replaced by computational biology (see the differences between the two fields here: http://rbaltman.wordpress.com/2009/02/18/bioinformatics-comp...). Many of the problems that Ross has apparently experienced come from the fact that most biologists cannot code, and therefore two brains (the biologist's and the programmer's) are required to solve problems in biology that require computing. This leads to an abundance of technical and social problems, which, as someone who can speak fluently to both communities, pains me to see happen on a regular basis. Once the culture of biology shifts to see programming as an essential skill (like using a microscope or a pipette), biological problems can be solved by one brain, and the problems created by miscommunication, differences in expectations, differences in background, etc. will be minimized, and situations like this will become less common.
I for one am very bullish that bioinformatics/computational biology is still the biggest growth area in biology, which is the biggest domain of academic research, and highly recommend that students move into this area (http://caseybergman.wordpress.com/2012/07/31/top-n-reasons-t...). Clearly, academic research is not for everyone. If you are unlucky, can't hack it, or greener pastures come your way, so be it. Such is life. But programming in biology ain't going away anytime soon, and with one less body taking up a job in this domain, it looks like prospects have just gotten that little bit better for the rest of us.
Just another data point for someone contemplating a career in BINF, although some purists might say that my work did not really fall under the same category.
"Ept" means effective, as in "inept".
I don't understand this part:
> No one seems to have pointed out that this makes your database a reflection of your database, not a reflection of reality. Pull out an annotation in GenBank today and it’s not very long odds that it’s completely wrong.
In fact this entire article seems to be a rant on why bioinformatics as a field is rotting. But instead of ranting, surely something can be done about it?
Shouldn't we as hackers see this as an opportunity to revolutionize the field?
Rants like this, and providing interviews to third parties, are actually one of the more positive things that he could bring to the table: it provides information to people who aren't aware and inspires motivation in people who aren't entangled.
Then again, I am in no position to judge what Fred should or should not do
http://nsaunders.wordpress.com/2012/10/22/gene-name-errors-a...
Maybe bioinformatics is not the place to aim for great informatics. We do bioinformatics because of love of science first and foremost. This is frontier land, the wild west, and it pays to play quick and dirty. I would suggest to hang on to some best practices, e.g. modularity, TDD and BDD, but forget about appreciation. Dirty Harry, as a bioinformatician you are on your own.
To be honest, in industry it is not much different. These days, coders are carpenters. If you really want to be a diva, learn to sing instead.
More money, good on you. But starting off your critique of your former colleagues with "technically ept people"... you're not going to get a lot of sympathy for the correctness of your work.
from the OED:
ept, adj. Pronunciation: /ɛpt/ Etymology: Back-formation < inept adj.
Used as a deliberate antonym of ‘inept’: adroit, appropriate, effective.
1938 E. B. White Let. Oct. (1976) 183 I am much obliged..to you for your warm, courteous, and ept treatment of a rather weak, skinny subject.
1966 Time 30 Sept. 7/1 With the exception of one or two semantic twisters, I think it is a first-rate job—definitely ept, ane and ert.
1976 N.Y. Times Mag. 6 June 15 The obvious answer is summed up by a White House official's sardonic crack: ‘Politically, we're not very ept.’
Etymology is straight from Latin: ineptus, which is prefix in- plus aptus (fitting or suitable). Interestingly there's also inapt which is quite similar.
edit: aheilbut's research on this is much more thorough.
Have you checked out synthetic biology? Will it be easy to understand when you have a degree in bioinformatics?