To detect cancer we turned genomes into images and put them through DCNNs (opens in new tab)

(medium.com)

34 points8iterations8y ago5 comments

5 comments

5 comments · 1 top-level

urlwolf8y ago· 4 in thread

How come this is not bigger news?

It's the deep learning equivalent of a souped-up Honda Civic. Sure, with enough tweaking it'll eventually be competitive, but you could have just bought a racecar.

Variant calling doesn't look like it needs to be turned into a image in the first place. You'd probably be better off feeding a regular, non-convolutional network some tabular data.

malux858y ago

This is right, a convolutoonal network is the right choice for images, but images are not the right choice for “vectorising” the genomic data, because vectorising it in a 2D grid and then using a convolutional network on it (a network designed to exploit spatial hierarchy) is introducing unnecessary and arbitrary constraints on data locality. Definitely a souped up Honda Civic - love the metaphor

8iterationsOP8y ago

Traditional variant calling, that is bayesian methods, have gotten very good at detecting point mutations, that is single letters. But have hit a wall when it comes to more complex structural mutations, we wanted to build this for example to focus on driver/passenger mutations in tumor/normal samples. (check out the "Preface" post for our LSTM version, and the logic behind this)

8iterationsOP8y ago

Hey, I'm one of the guys working on this. These are just our notes and drafts, and even though we are at ~97%-99.3% the goal is to get to 99.9587% or higher because out of 3-4 billion letters that is still significant. Here's a cool non-technical magazine article about it https://twitter.com/EricTopol/status/922315550054793216

j / k navigate · click thread line to collapse

5 comments

5 comments · 1 top-level

urlwolf8y ago· 4 in thread

How come this is not bigger news?

leblancfg8y ago

It's the deep learning equivalent of a souped-up Honda Civic. Sure, with enough tweaking it'll eventually be competitive, but you could have just bought a racecar.

Variant calling doesn't look like it needs to be turned into a image in the first place. You'd probably be better off feeding a regular, non-convolutional network some tabular data.

malux858y ago

This is right, a convolutoonal network is the right choice for images, but images are not the right choice for “vectorising” the genomic data, because vectorising it in a 2D grid and then using a convolutional network on it (a network designed to exploit spatial hierarchy) is introducing unnecessary and arbitrary constraints on data locality. Definitely a souped up Honda Civic - love the metaphor

8iterationsOP8y ago

Traditional variant calling, that is bayesian methods, have gotten very good at detecting point mutations, that is single letters. But have hit a wall when it comes to more complex structural mutations, we wanted to build this for example to focus on driver/passenger mutations in tumor/normal samples. (check out the "Preface" post for our LSTM version, and the logic behind this)

8iterationsOP8y ago

Hey, I'm one of the guys working on this. These are just our notes and drafts, and even though we are at ~97%-99.3% the goal is to get to 99.9587% or higher because out of 3-4 billion letters that is still significant. Here's a cool non-technical magazine article about it https://twitter.com/EricTopol/status/922315550054793216

j / k navigate · click thread line to collapse