As for the cost structure for research computing, the argument that the costs are externalized isn't a good one: the overhead that pays for the facility and the networking comes out of your grant money, and using grad student time to admin your cluster often just causes your grad students to leave for FAAMG.
That has not been my experience. There are lots of scientific workflows that only need 10s of TB at most, yet can still consume lots of cycles.
> As for the cost structure for research computing, the argument that the costs are externalized isn't a good one: the overhead that pays for the facility and the networking comes out of your grant money, and using grad student time to admin your cluster often just causes your grad students to leave for FAAMG.
At the universities I've worked at, equipment (large purchases) is exempt from overhead, or results in a lower overhead charge. (Researchers balk at paying a ~50% overhead rate on a $1 million instrument.) Using grad student time to admin your cluster is dumb, but I'm talking more about users who need single-digit numbers of computers. If you need real HPC, you're in the world of queues, national and regional supercomputers, and so on.
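To put rough numbers on the overhead point, here is a minimal sketch. It assumes a flat 50% overhead rate applied to direct costs, a $1 million instrument, and a made-up $200,000 of other direct costs; the function name and the simplified model are illustrative only, and real institutional rates and exemption rules vary.

```python
def total_cost(direct_costs, equipment, overhead_rate=0.50, equipment_exempt=True):
    """Total charged to a grant under a simplified, hypothetical overhead model."""
    # Overhead is charged on a base; exempt equipment is left out of that base.
    base = direct_costs if equipment_exempt else direct_costs + equipment
    return direct_costs + equipment + overhead_rate * base


# Hypothetical figures: $200k of other direct costs plus a $1M instrument.
exempt = total_cost(200_000, 1_000_000)
charged = total_cost(200_000, 1_000_000, equipment_exempt=False)

print(f"equipment exempt:      ${exempt:,.0f}")
print(f"equipment not exempt:  ${charged:,.0f}")
print(f"difference:            ${charged - exempt:,.0f}")
```

Under these assumptions the exemption is worth the overhead rate times the instrument price, which is why a ~50% rate on a large purchase is the part researchers push back on.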
It does not simulate complex biochemical interactions in different parts of the body.
From the description, they did something that requires a lot more horsepower.
There certainly weren't any "heavy biochemical calculations"; this work is entirely comparative genomics, so it's just operating on DNA strings.
I see this fluff article: https://www.ornl.gov/blog/genomics-code-exceeds-exaops-summi... There may be more detail here: https://www.hpcuserforum.com/presentations/april2019/Joubert... which shows near-linear performance that they ascribe to "Made possible by aggressive communication overlap and low-congestion Mellanox Infiniband fat tree network with adaptive routing".
So there may actually be an HPC/supercomputer story in there, but I'm having trouble figuring out what they did in this most recent work.
> RNA-Seq analysis was performed using the latest version of the human transcriptome
I found this article discussing read mapping for RNA sequence analysis: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4833417/
> In particular, RNA sequencing (RNA-seq) technology,1 which provides a comprehensive profile of a transcriptome, is increasingly replacing conventional expression microarrays.2 Primary data processing in RNA-seq (as well as in other massive sequencing experiments, including genome resequencing) involves mapping reads onto a reference genome. This step constitutes a computationally expensive process in which, in addition, sensitivity is a serious concern
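
For a feel of what "mapping reads onto a reference genome" means, here is a toy seed-and-verify sketch. It is not how the paper or any production aligner does it (real mappers use much more compact indexes and proper alignment scoring); the function names, the k-mer size, and the tiny reference string are all made up for illustration. The `max_mismatches` knob is where the cost vs. sensitivity trade-off shows up.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read, reference, index, k, max_mismatches=1):
    """Return sorted reference positions where the read fits within the mismatch budget."""
    hits = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        # An exact seed match proposes a candidate placement for the whole read.
        for pos in index.get(seed, ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue
            window = reference[start:start + len(read)]
            # Verify the candidate by counting mismatching bases.
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits.add(start)
    return sorted(hits)


# Made-up reference and reads, purely illustrative.
reference = "ACGTACGTTAGCCGATAGGCTA"
index = build_index(reference, k=4)

exact_hits = map_read("TAGCCGAT", reference, index, k=4)
noisy_hits = map_read("TAGCAGAT", reference, index, k=4)  # one substituted base
strict_hits = map_read("TAGCAGAT", reference, index, k=4, max_mismatches=0)
print(exact_hits, noisy_hits, strict_hits)
```

Even in this toy version, every read is checked against every seed hit, so the work grows with the number of reads, the read length, and how repetitive the reference is. Loosening the mismatch budget finds more reads but verifies more candidates.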