Not only do I need to perform at the level expected of a Software Engineer for the position (with your standard leetcode-style interviews), but I also need to pass extra ML-specific rounds covering both theory and practice. Meanwhile, the vast majority of my work consists of getting systems production-ready and hunting bugs.
If I have to jump through so many hoops when changing jobs, I'll seriously consider a regular non-ML position.
I went through all that and am a SWE again instead of an ML engineer. The one thing I learned from all that? "The very best models are distilled from postdoc tears".
I feel the biggest problem for people without an ML background is that you'd think "I don't know what I'm doing, I can't get hired for this job!" But the fact is that people with ML backgrounds mostly don't know what they're doing either. They just get standard results by applying standard libraries; any programmer with some math skills could do the same. It is no harder than learning a frontend or backend framework; people just assume it's harder, so they lack confidence about it. There are some gotchas you have to learn, but there are plenty of gotchas in both backend and frontend work as well.
Then again, my company's business model leads to terrible hires anyway.
What about asking for more money at the end? A complex multi-stage interview process eliminates more candidates. Some, as you say, will opt for a developer gig instead, probably because ML wasn't something they were interested in to begin with. That narrows the list of candidates even further. Either "play the game" and ask for more money, or don't play the game at all. Let employers pay extra for polished candidates.
One startup asked me this. They gave me a very vague problem statement, and in 2 days I had to find a couple of recent articles relevant to the problem and prepare a presentation explaining my solution and justifying my decisions.
During the Cold War, the U.S. developed a speech-to-text (STT) algorithm that could theoretically detect the hidden dialects of Russian sleeper agents. These agents (Fig. 3.7) were trained to speak English in Russia and subsequently sent to the US to gather intelligence. The FBI was able to apprehend ten such hidden Russian spies and accused them of being "sleeper" agents.
The algorithm relied on the acoustic properties of the Russian pronunciation of the word (v-o-k-s-a-l), which was borrowed from the English V-a-u-x-h-a-l-l. It was alleged that it is impossible for Russians to completely hide their accent, and hence when a Russian said V-a-u-x-h-a-l-l, the algorithm would yield the text "v-o-k-s-a-l". To test the algorithm at a diplomatic gathering where 20% of participants are sleeper agents and the rest Americans, a data scientist randomly chooses a person and asks them to say V-a-u-x-h-a-l-l. A single letter is then chosen randomly from the word generated by the algorithm, and it is observed to be an "l". What is the probability that the person is indeed a Russian sleeper agent?
base odds: 20:80 = 1:4
likelihood ratio: (1 "l" in 6 letters) : (2 "l"s in 8 letters) = (1/6) : (1/4) = 2:3
posterior odds: 1:4 × 2:3 = 2:12 = 1:6
final probability: 1/(1+6) = 1/7, or roughly 14.3%
Bayes' rule with raw probabilities is a bit more involved. Let A = the event they are a spy, B = the event that an "l" appears, and let ^c denote the complement of these events. Then,
P(A) = 1/5
P(A^c) = 4/5
P(B|A) = 1/6
P(B|A^c) = 1/4
P(A|B) = P(B|A)P(A)/P(B)
By law of total probability,
P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)
Which is a very standard formulation, and really just your equation, as you can rewrite everything I have done as:
P(A|B) = 1/(1 + P(B|A^c)P(A^c) / (P(B|A)P(A)))
Which is the base odds, posterior odds, and odds-to-probability conversion all in one. The reason this method is strictly better, in my opinion, is that the odds version breaks down if we introduce a third type of person who doesn't pronounce l's at all. Also, after doing one homework's worth of these problems, you just skip to the final equation, in which case my post is just as short as yours.
With S = sleeper, and L = letter L, and remembering "total probability":
P(L) = P(L|S)P(S) + P(L|-S)P(-S),
(where -S is not S), we have by Bayes P(S|L)
= P(L|S) P(S) / P(L)
= P(L|S) P(S) / (P(L|S)P(S) + P(L|-S)P(-S))
= 1/6 * 1/5 / (1/6*1/5 + 1/4*4/5)
= 1/30 / (1/30 + 6/30)
= 1/7

Some Japanese-American soldiers would be SOL, though.
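Both routes to 1/7 above can be checked mechanically with exact rational arithmetic (a quick sketch; the numbers come straight from the problem statement):

```python
from fractions import Fraction

# Priors: 20% sleeper agents (S), 80% Americans.
p_s = Fraction(1, 5)
p_not_s = Fraction(4, 5)

# Likelihoods of a random letter being "l":
# "voksal" has 1 "l" in 6 letters; "Vauxhall" has 2 "l"s in 8 letters.
p_l_given_s = Fraction(1, 6)
p_l_given_not_s = Fraction(2, 8)

# Law of total probability, then Bayes' rule.
p_l = p_l_given_s * p_s + p_l_given_not_s * p_not_s
p_s_given_l = p_l_given_s * p_s / p_l

# Odds form: prior odds (1:4) times the likelihood ratio (2:3).
posterior_odds = (p_s / p_not_s) * (p_l_given_s / p_l_given_not_s)

print(p_s_given_l)     # 1/7
print(posterior_odds)  # 1/6
```

Both formulations agree, as they must: posterior odds of 1:6 convert to a probability of 1/7.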
The "model is in place, but I have no clue what it's doing, so it can fail without me understanding when and how" argument is a straw man. Especially for supervised learning, where we have a label for each data point, it is immediately clear whether the output of the model is bunk, useless, or even harmful. There is no "fail silently by design".
I have been working in the field for almost 20 years, in academia and in industry, and it is not that I start every PCA thinking about eigenvectors and eigenvalues; if you asked me now, without preparation, what those are, I would land somewhere between approximately right and wrong. But I have fit many, many very accurate models.
That's why there's so much iteration and feedback gathering (e.g. A/B tests) as a part of DS/ML, which incidentally is rarely a part of the interview loop.
Anyone who claims they can get a good model the first time they train it is dangerously optimistic. Even the "how it works" aspect has become more and more marginal due to black boxing.
My guess would be that more machine learning projects go off the rails for want of understanding the data or the {business, research} problem.
People say things like "you need to know how it works" but "it" doesn't work using your knowledge of eigenvectors. If you want to test how "it" works, test that, literally. Put up a model on the board and a dataset. Ask people about what might happen when you apply one to the other. What changes they would make in response to changes in the data. What they would do in response to the following training curves, budget limitations, etc.
These interviews are terrible and they select for people that regurgitate facts.
But there is a threshold where it stops being a test of foundational knowledge and starts being a test of arbitrary trivia, and it favors whoever has the most free time to study and memorize said trivia.
It’s really looking like another rat race. Especially since there’s no central authority, every hiring manager has the potential to invent their own filter, and make it arbitrarily harder or easier based on supply and demand (and then the filter drifts away from the intended purposes).
Exactly. Whenever eigenvectors come up during interviews, it’s usually in the context of asking a candidate to explain how something elementary like principal components analysis works. If they claim on their CV to understand PCA, then they’d better understand what eigenvectors are. If not, it means they don’t actually know how PCA works, and the knowledge they profess on their CV is superficial at best.
That said, if they don’t claim to know PCA or SVD or other analysis techniques requiring some (generalized) form of eigendecomposition, then I won’t ask them about eigenvectors. But given how fundamental these techniques are, this is rare.
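To make the PCA-eigenvector link concrete, here is a minimal numpy sketch (synthetic data, variable names mine): the principal components are the eigenvectors of the covariance matrix, and you get the same answer from an SVD of the centered data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 3-D data with correlated, unequal-variance features.
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.0, 0.1]])
Xc = X - X.mean(axis=0)  # PCA requires centered data

# PCA via eigendecomposition of the covariance matrix.
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)            # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # descending

# The same components via SVD of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vals = s**2 / (len(Xc) - 1)

# Explained variances agree; eigenvectors agree up to sign.
assert np.allclose(eigvals, svd_vals)
assert np.allclose(np.abs(eigvecs), np.abs(Vt.T), atol=1e-6)
```

A candidate who claims to know PCA should be able to explain why the two halves of this sketch coincide.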
People know pity passes exist for Master's degrees. You can't trust that someone actually knows what they should know just because they have a degree. Ditto professional experience. The entire reason FizzBuzz exists is that people with years of professional experience can't program.
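For anyone unfamiliar, FizzBuzz is roughly this (a minimal sketch; any number of equivalent formulations exist):

```python
def fizzbuzz(n: int) -> str:
    """Return "Fizz" for multiples of 3, "Buzz" for multiples of 5,
    "FizzBuzz" for multiples of both, and the number itself otherwise."""
    out = ""
    if n % 3 == 0:
        out += "Fizz"
    if n % 5 == 0:
        out += "Buzz"
    return out or str(n)

# fizzbuzz(3) == "Fizz", fizzbuzz(5) == "Buzz",
# fizzbuzz(15) == "FizzBuzz", fizzbuzz(7) == "7"
print([fizzbuzz(i) for i in range(1, 16)])
```

That this trivial filter screens anyone out at all is the point of the comment above.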
On top of that, these problems are often poorly selected, poorly communicated, conducted under completely unrealistic time pressure, often as pile-ons (with 3-4 strangers, as if just to add pressure and distraction), and these days over video conferencing (so you have to stare into the camera and pretend to make eye contact with people while supposedly thinking about your problem, on top of shitty acoustics), etc, etc.
It's just fucking ridiculous.
Just as in programming, the world is full of people who can recite facts but don't understand them. There is no point in asking what an L1 norm is and asking for its equation. Or say, giving someone the C++ code that corresponds to computing the norm of a vector and asking them "what does this do". Or even worse, showing them some picture of some cross-validation scheme and asking them to name it. Yes, your candidates should be able to do this, but positive answers to these kinds of questions are nearly useless. These are the kinds of questions you get answers to by Googling.
It's far more critical to know what your candidate can do, practically. Create a hypothetical dataset from your domain where the answer is that they need to use an L1 norm. Do they realize this? Do they even realize that the distance metric matters? Are they proposing reasonable distance metrics? Do they understand what goes wrong with different distance metrics? etc. Or problems where they need to use a network but say, padding matters a lot. Or where the particulars of cross validation matter a lot.
This also gives you depth. "name this cross validation scheme" gives you a binary answer "yes, they can do it, or no they can't" And you're done. If you have a hypothetical dataset, you can keep prodding. "Ok, but how about if I unbalance the data" or "what if we now need to fine tune" or "what if the payoffs for precision and recall change in our domain", "what if my budget is limited", etc. It also lets you transition smoothly to other kinds of questions. And to discover areas of deeper expertise than you expected. For example, even for the cross validation questions, if you ask that binary question, you might never discover that a candidate knows about how to use generalized cross validation, which might actually be very useful for your problem.
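A toy illustration of "the distance metric matters", in the spirit of the hypothetical-dataset questions above (made-up data, helper names mine): with a gross outlier present, the L1-optimal summary of a dataset (the median) and the L2-optimal summary (the mean) behave very differently.

```python
import numpy as np

def l1_dist(a, b):
    # Manhattan distance: sum of absolute coordinate differences.
    return np.sum(np.abs(a - b))

def l2_dist(a, b):
    # Euclidean distance.
    return np.sqrt(np.sum((a - b) ** 2))

# A toy 1-D dataset with one gross outlier.
x = np.array([1.0, 1.1, 0.9, 1.05, 100.0])
mean, median = x.mean(), np.median(x)  # 20.81 vs 1.05

# Total distance from every data point to each candidate summary.
l1_at_mean = l1_dist(x, np.full_like(x, mean))
l1_at_median = l1_dist(x, np.full_like(x, median))
l2_at_mean = l2_dist(x, np.full_like(x, mean))
l2_at_median = l2_dist(x, np.full_like(x, median))

# The median wins under L1 (robust to the outlier);
# the mean wins under L2 (it minimizes squared error).
assert l1_at_median < l1_at_mean
assert l2_at_mean < l2_at_median
```

A candidate who can predict which assertion holds, and why, understands something no "name this norm" question would ever surface.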
The uninformative tedious mess that we see in programming interviews? This is the equivalent for ML/DL interviews!
https://www.deeplearningbook.org/
There are also various courses and lectures, but those take time and effort. There are no shortcuts like the book posted by OP.
For example what is the definition of two events being independent in probability?
Or the L1 norm example: 'Which norm does the following equation represent? |x1 − x2| + |y1 − y2|'
Find the taylor series expansion for e^x (this is highschool maths).
Find the partial derivatives of f(x, y) = 3 sin²(x − y)
Limits etc...
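For reference, sketches of the standard answers to the questions above (stated from memory, worth double-checking):

```latex
% Independence: events A and B are independent iff
P(A \cap B) = P(A)\,P(B)

% The norm question: |x_1 - x_2| + |y_1 - y_2| is the L1 (Manhattan) distance.

% Taylor series of e^x about 0:
e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}
    = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots

% Partial derivatives of f(x, y) = 3\sin^2(x - y), via the chain rule:
\frac{\partial f}{\partial x} = 6\sin(x - y)\cos(x - y) = 3\sin\bigl(2(x - y)\bigr),
\qquad
\frac{\partial f}{\partial y} = -3\sin\bigl(2(x - y)\bigr)
```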
These aren't specific to deep learning or machine learning, not that I claim to be a practitioner.
Maybe those kinds of questions are OK for people without experience, but not for seniors.
The SQL questions can also be a symptom of the type of job - Facebook's first data science round focuses a lot on SQL but that's because it's a very product/analytics/decision-making focused role without that much coding or ML. With data science you have to be more careful about these things when searching for a job; you can't just use the job title as a descriptor.
Edit:
It seems the overlapping text also occurs on some pdf readers: https://github.com/BoltzmannEntropy/interviews.ai/issues/2
Maybe I've just been interviewing at the wrong places, I'd be very curious if anyone here has been asked to even explain Fisher information in any DS interview?
It's not that Fisher information is a particularly tricky topic, but I certainly wouldn't put it as a "must know" for even the most junior of data scientists. Not that I'd mind living in a world where this was the case... I'm just not sure I live in the same world as the authors.
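For anyone curious how untricky it actually is, here is the textbook one-observation Bernoulli case (a standard derivation, sketched from memory):

```latex
% Log-likelihood of one Bernoulli(p) observation x \in \{0, 1\}:
\ell(p) = x \log p + (1 - x)\log(1 - p)

% Score (derivative of the log-likelihood):
\frac{\partial \ell}{\partial p}
  = \frac{x}{p} - \frac{1 - x}{1 - p}
  = \frac{x - p}{p(1 - p)}

% Fisher information = variance of the score
% (using \operatorname{Var}(X) = p(1-p)):
I(p) = \mathbb{E}\!\left[\left(\frac{X - p}{p(1-p)}\right)^{2}\right]
     = \frac{\operatorname{Var}(X)}{p^{2}(1-p)^{2}}
     = \frac{1}{p(1-p)}
```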
I've been looking for something exactly like this – and it's executed better than I could have imagined.
(Needs a good proofreader still, though! Also, whatever custom LaTeX template the authors are using is misbehaving a bit in various places. Still great content.)
I am 99% certain I would not have passed the interview bars set today. More specifically, the breadth they expect you to master is very puzzling (and seemingly unrealistic).
1) Written by people who have no experience in industry, or who are not working on "real" machine learning jobs
2) They think the standard in industry is pretty low and any BS works. For example, the concept of the Lagrange multiplier is missing from the book. One needs this concept to understand training convergence guarantees.

I currently work for a non-profit investigating making a free, high-quality set of courses in this space, and would love to talk to as many people as possible who are either working in ML/DS or looking to get into the field. (I have ideas, but would prefer to ground them in as many real-world experiences as I can collect.)
If anyone here wouldn't mind chatting about this, or even just sharing an experience or opinion, please drop me an email (in my profile).
EDIT: We already have an Intro to DS course and a Deep RL sequence far along in our pipeline, but are looking to see where we can help the most with the available resources.
I really appreciate this Interviews book as an example of what topics might be necessary (and at what level), taking into account the qualifying discussion here, of course.
As someone with a strong background in statistics, please tell me where I can find DS jobs that require this.
For me and all my statistics friends in DS, the bigger frustration is how hard it is to pass DS interviews when you understand problems more deeply than "use XGBoost". I have found that very few data scientists really understand even basic statistics. I once failed an interview because the interviewer did not believe that logistic regression could be used to answer statistical-inference questions (when it, and the GLM more generally, is the workhorse of statistical work).
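The logistic-regression-for-inference point can be sketched from scratch (all data simulated, a plain Newton/IRLS fit rather than any particular library): you recover an effect size with standard errors from the inverse Fisher information, which is exactly the inferential use the interviewer didn't believe in.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
x = rng.normal(size=n)
true_beta = np.array([-0.5, 1.2])  # intercept, slope (the "effect")
X = np.column_stack([np.ones(n), x])
p = 1.0 / (1.0 + np.exp(-X @ true_beta))
y = rng.binomial(1, p)

# Fit by Newton-Raphson (equivalently IRLS for the binomial GLM).
beta = np.zeros(2)
for _ in range(25):
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    W = mu * (1 - mu)
    grad = X.T @ (y - mu)
    hess = X.T @ (X * W[:, None])  # Fisher information matrix
    beta = beta + np.linalg.solve(hess, grad)

# Standard errors from the inverse Fisher information,
# giving approximate 95% Wald confidence intervals.
se = np.sqrt(np.diag(np.linalg.inv(hess)))
print(beta)                                 # close to [-0.5, 1.2]
print(beta - 1.96 * se, beta + 1.96 * se)   # ~95% intervals
```

The estimates land near the true coefficients, and the intervals quantify the uncertainty: inference, not just prediction.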
And to answer your question, whenever I'm in a hiring manager position I very strongly value strong software engineering skills. DS teams made up of people that are closer to curious engineers tend to greatly outperform teams made up of researchers that don't know you can write code outside of a notebook.
It's not really tested for in most places though, where they regard a DS as a service that produces models.
1) the titles will vary a lot (software engineer, ML engineer, research engineer, data scientist etc.) which makes it hard to locate those jobs and to move in the job market in general
2) you still need a reasonable amount of theory (not necessarily too much statistics) to use the tools well. And in all likelihood you will be tested on it in some way during the interviews.
3) the interviews/job descriptions that don't emphasise the theory often will be for jobs where you get a title like Machine Learning Engineer but you focus more on the infrastructure rather than on the ML code
However, one of the important things when interviewing someone is that the person has not seen the question before. So as an interviewer my impulse would be to first ensure that my question is NOT in this book :)
Or perhaps even if it is in the book, if the question is advanced enough, I could test how they articulate and reason through the solution, so I know they are not simply regurgitating the answer?