Where I live, far away from the tech hubs, it's almost impossible to land a job in ML / AI / DS unless you have (at minimum) a Masters degree in something relevant, preferably a Ph.D. and solid experience to show for it. I know because I work in the field: lots of F500 dinosaurs are just now waking up, but are unfortunately still clinging to their old ways of hiring people.
Schools all over are also picking up the slack, starting to offer specialized graduate degrees in these domains. When I got my degree in ML, it was a sub-field of my school's engineering department, mixed in with the signal processing and control theory groups.
When I was first trying to get a job, the main problem was explaining what I could actually do and bring to the table; a lot of the recruiters and managers had no idea what machine learning was. Then you said "It's basically Artificial Intelligence," and they were instantly wooed.
I'm a senior data scientist at a VC-backed startup, in a hybrid data scientist / machine learning engineer role: I build and train ML and deep learning models, and also build the scaffolding to support their use in production. But my previous roles included business analyst, project manager, and research analyst, and my undergrad education was in Creative Writing and the social sciences.
While I kind of accidentally transitioned into this career, how I got here is similar to most folks coming from a different background: lots of self-study and experimentation. I think one of the challenges of transitioning into ML and deep learning is that there are so many applications, domains, and input formats. It can be overwhelming to learn about vision, NLP, tabular, time-series, and all the other formats, applications, and domains.
Things solidified for me when I found a space I found compelling and was able to dive deep into it. You learn the fundamentals along the way through experimentation and reflection. My pattern was: pick up a model or architecture, learn to apply it first to get familiar with it, experiment with different data, and then go back and build it from scratch to learn the fundamentals. That, and I read a lot of papers related to problems I was interested in. After a while, I started developing intuitions around classes of problems and how to engage with them (in DS you rarely ever solve the problem; there's always room to improve the model...)
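To make the "build it from scratch" step concrete, here's a minimal sketch of what that might look like (my own illustration, not from the parent comment): logistic regression trained with plain gradient descent in NumPy. The function names and toy dataset are made up for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.1, steps=2000):
    """Batch gradient descent on the logistic loss, from scratch."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)            # predicted probabilities
        grad_w = X.T @ (p - y) / len(y)   # gradient w.r.t. weights
        grad_b = np.mean(p - y)           # gradient w.r.t. bias
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Linearly separable toy data: label is 1 when x0 + x1 > 1.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.], [2., 2.]])
y = np.array([0., 0., 0., 1., 1.])
w, b = train_logreg(X, y)
preds = (sigmoid(X @ w + b) > 0.5).astype(float)
```

Rebuilding even something this small forces you to confront the loss, the gradients, and the update rule directly, which is exactly what the library call hides.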
I have a serious question (not for bashing)
Can you please describe what part of your job CANNOT be automated?
Honestly, none of it is really automatable. I'm working on developing NLP features for our product (question answering, search, neural machine translation, dialog, etc.). Our customer data is diverse and comes in different formats, and their use cases are all distinct, so most of my work is novel applied research and development.
If it's helpful, I dropped out of both schools — the vast majority of my knowledge is self-taught!
This is so incredibly important for me and, based on my conversations, many others as well.
The other thing I struggle with is the feeling that many of the problems I wish to solve are likely also solvable with simpler statistical methods and that I'm just being a poser by trying to pound them home with the ML hammer.
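For what it's worth, one cheap way to test that feeling is to fit the simple statistical method first and see how far it gets. A sketch (synthetic data and made-up numbers, purely illustrative): ordinary least squares as a baseline before reaching for anything heavier.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=200)  # near-linear signal

# Closed-form OLS fit; np.polyfit(x, y, 1) would do the same job.
A = np.column_stack([x, np.ones_like(x)])
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]

residual = y - (slope * x + intercept)
r2 = 1 - residual.var() / y.var()
```

If the baseline already explains most of the variance, the ML hammer probably isn't buying you much; if it doesn't, you now have a number to beat.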
(For reference, I’m an undergrad looking to get into this field)
I think when you shift into pure research, yes, a deep background in probability, information theory, linear algebra, and calculus is needed. But at that level, you're rarely writing code and are more likely working at a theoretical level.
1. Most of your time is spent transforming data. Very little is spent building models.
2. Most of the eye-grabbing stuff that makes headlines is inapplicable. My application involves decisions that are expensive and can be safety critical. The models themselves have to be simple enough to be reasoned about, or they're no use.
You might argue that this means what I'm actually doing is statistics.
Also, most folks I know who are making practical deep learning contributions are doing so by combining their pre-existing domain expertise with their new deep learning skills. E.g. a journalist analyzing a large corpus of text for a story, or an oil & gas analyst building models from well plots, etc.
With math on paper, say, it is hard to tell whether you are doing it right or wrong. You can trick yourself quite easily; a compelling proof can have a huge hole.
You can still trick yourself programming -- in a sense, that is what a bug is -- but it is much harder.
The upshot is, I think it is easier to teach yourself math that is applied to a computer program than math on a piece of paper.
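A small illustration of that point (my example): you can spot-check a claimed identity numerically at random points. It's no substitute for a proof, but it catches exactly the kind of hole that slips past you on paper, because the computer won't let a wrong step through.

```python
import math
import random

# Numerically spot-check the identity sin^2(t) + cos^2(t) == 1
# at many random points. A flawed version of the identity would
# fail an assertion almost immediately.
random.seed(0)
for _ in range(1000):
    t = random.uniform(-100, 100)
    assert abs(math.sin(t) ** 2 + math.cos(t) ** 2 - 1) < 1e-12
```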
Too many people going into ML could skew the supply/demand balance enough to make it a worse job option (more work, less pay), like game programming or academia.
The caveat is that I've worked on ML in the past, and I think the work is maybe less intellectual than software engineering: with complex enough models, they become impossible to understand, and you start just trying out ideas based on random intuitions. What I mostly like about it is the ability to use math and the more independent style of work: no scrum, less need for cooperation with other team members, etc.
What's happening, at least in Australia, is that contract rates (a good indicator of the supply/demand ratio) have halved for ML engineers. Which means (a) a lot of people want to be ML engineers and (b) there aren't that many jobs for them.
It makes finding good positions really hard.
Also thanks for telling us how you became a practitioner. It's definitely relatable and not a humble brag at all.