Instead of doing stuff like this, you can just give candidates realistic programming challenges and have them write code. You don't even need to be in the room with them; after all, you won't be when they're actually doing the work.
I've given this same advice for _years_ for designing technical interviews. Take a problem that you or your team has actually faced in the last n months. Distill it down to remove tribal knowledge and get it to something you or a capable colleague can hammer out in m minutes. Give the candidate 3m minutes to solve it. Typically this means it should take you 15 minutes to solve and give them 45 minutes.
Our biggest hit for a question that followed this for an SRE type role: we had a cron job that was legacy and buggy and caused a couple of issues over 6 months. We went back and created a very similar script with very similar bugs, added some similar data and re-seeding scripts, and told the candidates "customers say that feature $x is not working, and that is ran by this script. Here is how to run it, here is the user's id that can be found in the data seed." We added some other data validation problems. After they fixed it, we asked them what they would do to make this more production safe and dig in on what they have and have not done for improving existing systems. We expected discussions on structured logging, metrics, alerting, testing, code layout, etc.
Over time, the roles that you are looking to fill and the weaknesses on the existing team that you want to shore up will change and you will need different interview questions. Maybe more towards distributed systems or more towards testing or observability.
Importantly, it also allows you to examine fluency. The candidate who can write a code/pseudocode solution as fast as they can type, and explain it verbally at the same time, is much more impressive than the candidate who pauses for a long time before starting haltingly and unconfidently. Fluency in solving leetcode problems just tests for having practiced leetcode, but fluency in writing real code tests for actual productivity.
If you have a question like this, you can drill down into performance characteristics just as the question in TFA does. Not necessarily using big-Oh notation, but "what if you have to do this millions of times? what if you have to provide results immediately? what if some of the files are too big to fit into RAM?". But you can also drill down into other aspects of the problem like changing specs, noisy input, constrained resources, observability.
FWIW, when I came up with a question like this (with the constraint that it had to represent a problem I had solved that morning), ChatGPT was not able to solve it correctly.
No wonder it is a hit. It is an actual problem that makes me want to go for it - I want to take an interview like this, instead of the usual of dreading interviews! Keep it up.
And in my experience listening to auditions over the last several decades, what you have now is a bunch of musicians who can only do what's on the test. When you try to plug them into an orchestra, they are completely fucked.
My experience interviewing and hiring tech people is that this is one of the worst possible approaches. Anyone who needed to could figure this out. It's not relevant. What you need to figure out in an interview is if the person is curious enough to find the answer.
I don't ask for code or algos in my interviews. I ask about the person. What's your favorite programming language? Why is it your favorite? What do you not like about it?
What's the project you are most proud of? What was hard about it? What made you happy? What did you find challenging about that project?
These are 100% totally humane questions that tell you everything you could possibly need to know about a candidate in about a half hour. Maybe less.
If you can't figure out if this person is technically competent from this set of questions then you aren't competent to interview or hire.
Interviews don't have to be only about how good you are at interviews. They could be about trying to understand if the person would be good at the job and fit in well with the team.
Episodic recall and emotional recognition are both talents that vary across the population. If I want to do well at an interview like this, I will have to spend a bunch of time in advance refreshing my memory of various projects I have worked on so that I can simulate a neurotypical response to those questions. If I don't, I may just stare at you nervously and break out in a sweat when I realize I don't have anything filed under, "project I am most proud of". Or that pulling up the narratively engaging details of a project I haven't thought about in years is a process that will take me hours of specific effort. And it's not just me; I've hired great developers who were also terrible at "humane" interview questions but who did great when we just sat down and coded together.
I would still require the candidate however to write some code. Something rather simple, not leetcode, ok, not fizzbuzz either, and something that proves their familiarity with at least one language and platform.
I should be a GM by now simply watching chess analysis videos on Youtube, learning the names of some openings etc. I can talk about it all day long. But since I don't play at all (I'm not interested) when I actually do play on occasions I revert to 500 ELO...
You can easily tell a seasoned veteran from a rookie just by looking at them code for a few minutes, working on a simple task. If you can't, you have no business in interviewing others :)
Someone who can explain why they love Rust and hate Python (or vice versa) is a convincing candidate but I think you need to see a couple of other things from them. In addition, sometimes you just don't hit on the right question to get evidence of their aesthetic judgement. (Like the candidate who doesn't care about Rust vs Python, but who hates Postgres and loves MariaDB..)
IMO, the disdain from LeetCode from developers vs. its continued use disconnect stems from the fact that we developers tend to assume that the goal of the hiring process is to find the very best possible candidate, whereas the goal in reality is to find somebody "good enough" as efficiently as possible.
At the end of the day, you can't test for negatives and just pick a random person that doesn't fail anything. You need to have a defined set of values to look for in a candidate, and to design an interview that tests for those values. That's why I hate DSAs -- for most (all?) companies, they're treated as dogma rather than a pragmatic solution to a difficult problem.
This is also an unproven statement with dubious validity.
I come from the trading world where the leetcode equivalent is a brain teaser. After much deliberation, the only thing we could think of for why a brain teaser was useful was that it often proved someone wanted the job so badly that they memorized all the brain teasers. And that type of dedication and drive is what we really wanted.
To me, I don't understand why someone who develops with advanced IDEs, using complex tools and libraries, is in the least bit judged by putting them in front of what is essentially a text editor, and asking them to solve a problem that's of no relevance to their job, using no tools that they're comfortable with
I suppose that's why mathematicians came up with the idea of "optimal stopping." There are literally tons of scenarios out there where the goal isn't to get "the very best possible candidate," but to get someone "good enough," or maybe even the best person you can get after putting a reasonable amount of time and effort into searching.
Sadly everyone who makes it to an on-site is now a competitive programming enthusiast who is probably a really poor fit to actually working in software development.
Many people are vaguely aware of this problem, but nobody has a strong incentive to put in the work to come up with anything better.
Unfortunately the OP seems to be using it in the former sense (when applied to the tries solution that he things will separate the "good" candidates from the plodding lunkheads).
And most programming jobs involve no complex algorithms at all. At most you have to be a little careful to avoid N^2 occasionally.
Leetcode is great for filtering out bad candidates but if your solution involves dynamic programming it is too hard for an interview.
- While DP might be rare, memoize is quite common alternative. Sticking a @functools.cache in Python for example. They are functionally equivalent in many cases,
- I believe knowing their data-structures and algorithms is the difference between an okay programmer and a good one,
- When interviewing entry-level software-engineers, there's not much else you can ask them except for what they learned in class, which is these kinds of questions.
I worked a SWE job years ago which was heavily reliant on dynamic programming and string search/alignment/etc. It really depends.
Give me any problem related to algorithms+ds to be solved in 45mins without internet access and someone looking over my shoulder and I will give you a poor solution. But give me a day and internet connection and I will solve the problem in the most efficient/readable/maintainable way possible.
The interview question is not just about knowing data structures and is not about Behdad showing off either. It's the starting point for a Socratic discussion.
OTOH re the concern for O(n^2) time when n is bounded by word lengths, and when the inner loop was in already-optimized string comparisons in your language's hashtable library -- I think that part is very artificial.
Yes! The candidate already optimized for the important thing (which the author forgot about and added in a footnote), which is that the list of words could be very long, and that the words in an English dictionary are relatively short.
Optimising for the important factors based on an assessment of the problem domain is actually a more important skill than regurgitating the best algorithm from the book. If the author wanted to suggest a solution suitable for very long needles, he should have used a different domain like, say, gene sequencing.
It was great because he was testing the very thing I was going to be hired for. Additionally, he asked how I would design X or Y and I explained the pros and cons of certain designs and I could show him right there on the screen how I'd do it. This wasn't a whiteboard where I'd have to slowly write out code in a medium I'm not used to. I literally was typing on a keyboard, using intellisense, etc.
As someone who does iOS development day to day, it was a very easy way for me to show that I knew what I was doing and the guy was quite pleased when I plowed through all the requests he had.
Not precisely a SWE, but every single piece of code I write at work is more about knowing how the immense codebases are laid out and where to insert my change which is typically simple.
Given this, any "write self contained thing" is unrealistic. The only thing a coding question tells me is whether someone is happy writing simple code to solve a vaguely fun problem.
That still feels useful?
What is important is that the candidate’s skills generalize well to both the challenge in the interview as well as the day job. A question like this might be a good indicator or not depending on the job.
If people completely fail at this question it can reveal a lot of relevant information.
def is_concatenation_of_two_dict_words(word, dict_words):
"""
Returns True if the input word is a concatenation of two words in the input list of dictionary words,
and False otherwise.
Args:
- word (str): The input string to check.
- dict_words (list[str]): A list of dictionary words to check against.
Returns:
- (bool): True if the input word is a concatenation of two words in the input list of dictionary words,
and False otherwise.
"""
for i in range(1, len(word)):
prefix = word[:i]
suffix = word[i:]
if prefix in dict_words and suffix in dict_words:
return True
return False
In general, I don't think that LeetCode questions have any particular advantage over work sample tests when LLMs are involved. Your questions will end up on LeetCode, where LLMs will index them and will be able to recite the answers.This question could be improved as follows:
* present the naive solution. Explain that this is a PR that was submitted in 20 minutes by our new intern from Waterloo. Ask them to help improve it.
* suggest to them the intern’s argument why the lookup is O(1) and explain that it’s actually O(n). Ask them to write a PR comment to break the news to the intern.
* explain the optimal solution with tries. Ask what happens if the vocabulary is big enough—and the prod environment is resource constrained—so that the trie is rarely in memory when the function is called. How would the trie solution compare to a binary search through a length normalized list of words?
This kind of question is barely useful to interview new hires—when you still may have basic questions about how much undergrad material “stuck”. Within a few years when they hit full-performance level you should care much less about this kind of question and much more about “higher order” thinking about engineering.
How do you communicate bad news? How do you identify, communicate, and mitigate risk? How do you plan large tasks at at appropriate level without over engineering the solution? What questions should you ask at different project stages to capture the right requirements, anticipate issues, and successfully execute over time?
But whatever, I’m sure you get lots of signal out of repeating “a hash table lookup is O(1)” “nuh uh” on loop for 10 years. :eyeroll:
Maybe the problem has something to do with the way they hire. The article mentions that this string-processing problem had to be approved by a committee:
”Twice I asked the hiring committee if I should stop using this and they had no problem with it.”
Not even the Soviet Union had a process where job interview questions were regularly approved by a committee. The level of bureaucracy in administering these brain puzzles at mass scale is absurd.
The article mentions that the candidate who did best on this question was “the type who did high-school competitions.” I wonder if anyone stopped at any point to question whether that’s the ideal type of hire.
I think this is the most telling bit. Time after time, I've seen excellent competitive programmers end up being mediocre engineers. Almost none of them are bad and it's also a very objective (at least on paper) process which is probably why Google is happy to stick with the process but most of them also aren't amazing innovators.
I'm sure that university student who aced the interview turned out to be very smart - they went to a great university and can solve challenging puzzles. But I don't think they'll end up being the engineer that builds Google's competition to ChatGPT. I wonder whether a course correction where Google starts looking for people who create really cool projects even at the risk of not being able to solve any coding challenges would actually help.
On the point of committees, though. Hiring committees make a hire/no-hire decision. They don't approve questions, but they do make decisions based on the results of asking a question. So they rely on the signal that the question generates, and if a question is not giving good signal, they will have an opinion.
There's a different reason for wanting to standardize on interview questions. It's very easy to embed cultural bias. If the topic of a programming problem uses the rules of a game, for example, that will probably mean that people who are familiar with the game will understand the gist of the problem much faster than people who've never played the game.
That being said I am easily the best/fastest programmer in my organization despite there being people who have 20 years more experience.
Not defending Google's practices here, but this claim is nonsense. Many public sector/third sector jobs in Western society will only use interview questions which have been pre-approved by explicit policy.
Also I recommend you try these "high school problems" you seem to be talking down on. I for one greatly respect people who can solve those types of problems, regardless of their age. These are not trivial problems, mind you. They require a lot of mathematical and logical intuition that 99% of people (and I'd wager the majority of SWEs) don't have.
It's unavoidable when you allow too many different interests to have a say in the process.
I think any discussion of complexity for this problem is misguided.
I mean, it's fine to talk about a hash lookup being o(n) for pseudocode in some hypothetical context. But if you're talking about actual code that you intend to run in Python or JS or whatever, it doesn't make sense - are the strings interned or not? When your language hashes the key of a dictionary lookup, does it cache the result? If you're using a JITTed language, won't any native o(n) hash lookups vanish in comparison to your non-native o(n) for-loop?
On top of all of which - why are we even talking about o(n) complexity for an n that cannot grow? If we're given a fixed dictionary then we can ignore any substring longer than its largest element, so the "n" for hash lookups is not the size of the original input - which is the n we realistically care about!
Basically, once TFA mentions complexity I feel like the only correct answer is to say "this isn't a situation where complexity analysis would be useful - we should benchmark the code".
All of these are useful signals for a hiring decision. Much more than quibbling about big-O of hash table lookups.
Like these can’t be prepared for? I think there are now whole chapters in those interview prep books/courses on how to answer these.
Even if there weren’t this is a good way to hire people who sound very professional on slack and in meetings but don’t produce any actual code (unless you got very lucky)
You would be absolutely stunned how many people cannot give you back the correct answer even after you literally give it to them. If you don‘t know some programming and I give you code and explain what it does, that still doesn’t help you.
There’s some tiny signal of your ability by asking you to memorize and regurgitate something; and maybe a tiny bit more signal in this question by implicitly testing “how do you respond to unexpected technical details in a high stress situation”. But they aren’t worth the trouble. You could measure both much more directly. You’d get more detail about their coding ability by directly asking about code they’ve written and know well.
> The fastest I had a candidate solve the entirety of the problem with all bells and whistles was in 20 minutes, by a freshman from the University of Waterloo, the type who did high-school competitions.
This should’ve been the interviewer’s first clue.
I don't get why the dictionary solution is actually O(n).
Surely you can have the hash function be O(1) by only looking at a subsets of characters something like
hash(s) = (s[0 %s.length ] ^ (s[6 %s.length]<<c1) ^ (s[7 %s.length]<<c2) ) % (size of the hash table)
You can trade-off memory for speed to avoid collisions by having a mostly empty big table (like 9/10 empty table). And you can also use multiple hash functions that must both simultaneously collide to avoid having too big a table (kinda like fronting your hash-table with bloom filters).
And you can pick the selected sub-index at random such that they split well your dictionary. Maybe you can even build a "Perfect hash function" so that all your dictionary words have distinct keys.
You only have to check for the whole string when the hash match, probability of which you can make as small as you want.
> > About Part 2 :“Write a function that given a string, determines if it’s a concatenation of a number of dictionary words.”
Isn't it just recognize if a string belongs to a regular language, where the language is defined by words in a dictionary ? So you just build a parser for your language and you test for belonging.
I'm quite rusty in non-deterministic finite automatons and regular languages, but surely there exists algorithm that build an automaton from the dictionary to recognize the language in something better than n^2, no? (We are moving the states of the automaton once per character, and we likely won't have n active states (except maybe for pathological dictionaries), complexity should be something like O(k*n) where k is a constant depends on the dictionary).
What this means in practice is that there is no polynomial curve for very long words. You don't bother hashing them, so the curve peaks and then flattens.
Obviously a trie is an interesting and possibly faster solution, though the hash has the benefit of being more cache friendly. But the big O analysis presented for this problem is flawed.
I seem to remember that in the very early days of Java the standard hashing method for strings only looked at a small prefix - turns out that very common use case was to be storing URLs as keys which, of course, mostly start with the same letters...
Eg you can’t just do equality on collision, you need to do equality on every lookup at least once.
This seems like a really great question, thank you :)
Great prompt!
As I age, one thing that I realise about interviewing: It is scary as hell. It is basically performance art. And, most nerds are terrible about public performance & speaking. Some days you are really on it; other days: way off. Every single job that I ever got in my life feels like a lucky day. Also, the tech universe is infinitely wide at this point, so somebody somewhere can ask a question that you will fail.
Since this industry is so dominated by men, it also feels like a lot of machismo. So many interviews are just a jerk dude on the other side of the table (probably working for an amazing company) that wants to make himself feel better by making me look foolish. There are too many stories to tell. I have also kept my mouth shut when interviewers were insistent about a technical point that was 100% verifiably incorrect with a simple Google search. It helps to stroke their ego during an interview. When they give you a hint, or explain a better solution, try: "Oh, that's really impressive! How did you discover that algorithm?" You are more likely to make it to the next round. The goal of interviewing is to get the offer. You can always turn it down. If you had one jerk in the interview stream, but they work on a different team, it should be OK to accept the offer.
Another thing that I say to all interviewees now: I know that interviewing is scary. We are interested in what you do know, not what you do not know. Together, we will dive as deep as possible on a few topics to find the outer limit of what you know. If we come to a topic that you don't know about (or not deeply), tell us; we will move to a new topic.
I think I read on here about the concept of successful people really just doing more things to increase their luck surface area.
In other words, you set yourself up to interview as best as you can, then make a bunch of connections and schedule a bunch of interviews, and (hopefully) at least one of those will be on a good day.
It's such a weird and, frankly, shitty way to make hiring decisions though. But at the same time, we don't have much better.
The trick for me was diligently applying to as many places as possible (that were at least somewhat of a fit), and eventually coming across one where the main technical part of it was "take home". Coding and discussing thought processes on the spot for an audience isn't something I'm good at it turns out, but solving a practical problem on my own time was fine.
It's ultimately a numbers game, the more places you apply to, the more likely you'll find an interview process that fits. That is the most important thing to keep in mind--shotgun blast those applications. It's hard not to take rejection personally, but don't give up and settle for a toxic job, that's even worse.
After a while, either they will make you offers to work for them on the spot, or they will at least "fast-track" you through the interview process.
It doesn't work in all software areas (it is pretty difficult to do it if you are an embedded systems developer) but it works especially well in others (e.g. cloud)
and you suggest providing free labour for an entire year to some companies. No wonder GP hates this industry.
>For the next year try contributing to open source projects that you like and that have companies behind them.
You studied field X to work in field X, and in a job interview the "smarmy git" considering to employ and pay you has the temerity to ask you things about field X to see whether you've understood the basics of field X, can apply them, explain your reasoning, and think on your feet?
And that engenders (and I quote) hate, sadness, anger and despair?
I have interviewed a lot of people for coding jobs, and I usually try to test ability at doing things like this, and when I find someone junior who already knows how to approach problems like this, I am always very in favor of hiring them. Based on the discussion here I'm definitely not the only one.
That is who you're optimizing for.
(I still think the question is unfair, but my objection is more about the stressful interview situation where the real problem you're solving is "desperately trying to pass this or N other interviews you're currently doing").
It's an extremely simplified but recognizable version of a real problem: natural language input tokenization, which of course has real consequences.
What folks need to remember is the employers who are primarily known for asking this generally have no problem hiring people who are good programmers and ALSO good at e.g. Dynamic Programming. And no, as much as some try to frame it, the two skills are not directly at odds with each other. Therefore, between a good programmer who can “Leetcode” and one who cannot, why wouldn’t they pick ones that are good at both?
And sure, if your company don’t have that luxury, don’t cargo cult. Hire a good enough domain expert for your specific needs.
I only know 1 guy who did coding competitions. He was in fact a good developer.
Why would you think a person that enjoys such a thing wouldn't be?
Its just another signal of aptitude, its not some magical catch all heuristic. However, people here suddenly expect like they are allowed to forget all the (very fundamental) basics and get into a job that involves coding without any coding at all. Now that leads to a bad hire.
Key point: I've never met anyone in the industry whose job it is to implement algorithms without reference materials all day long. It's more like... you might find yourself needing 1-2 clever bits a year, and the rest of the time you're tweaking model classes and hunting down UI bugs.
Algorithms are both easy to Google and these days, CoPilot will do the hard 4/5ths for you if you actually need them. It's also common that the really intense stuff is written by the PhD braintrust cofounder, not the people being hired after the fact.
What I want to know when I'm interviewing is simple: how do you approach solving problems; do you have an interesting personality that I'd like to interact with every day; are you the sort of person who has a strong instinct about what the next thing to do is, and will default to working on that thing without prompting?
Do you think learning new things is exciting? Are you able to admit you don't know something and do you feel comfortable asking for help when you're stuck?
Do you seem reliable? Are you a starter, a finisher or both?
Do you have a sense of humour? What do you do on the weekend?
Ultimately, I care what you know... but I am far more interested in your capacity to quickly master new things.
That's not what the author's question was about. Rather, it presented a problem, and wanted to see some approaches to that problem and a discussion of their properties (runtime, limitations, etc.).
> how do you approach solving problems > Do you think learning new things is exciting? Are you able to admit you don't know something and do you feel comfortable asking for help when you're stuck?
All very reasonable criteria. So how do you evaluate a candidate on those metrics, if not by giving example problems? (ideally different ones: a few theoretical ones like the one under discussion here, a few practical ones with access to google/Stackoverflow/etc.)
> do you have an interesting personality that I'd like to interact with every day
That's your prerogative, of course, but I wouldn't make that a hiring criterion. I have colleagues who are incredibly boring, yet deliver solid consistent results. Who cares about their personality?
I suppose I perceive the narrative differently; it really sounds like the author has a complex roleplaying scenario in mind, where there's a base assumption that he is the smartest person in the room and pre-plans 3-4 moments where he can assert this. If I was the candidate, I would be praying that this guy isn't my direct boss.
> So how do you evaluate a candidate on those metrics, if not by giving example problems?
Simple: I talk to them and get them talking about how they have solved problems in the past. I will dig in on interesting points and take careful note of telling details.
Perhaps this will help: I have enough experience as a developer that I can tell very quickly where the person I'm talking to falls on the skill spectrum. Part of my negative reaction to this post is that I sense the author does as well; however, where he chooses to make them feel small, I just keep asking questions because I want to give them more opportunities to sell me on their strengths. Their ability to whiteboard leetcode in an interview has laughably close to zero relationship to how they will perform on a Wednesday in jeans without feeling like they are being interrogated, so I largely skip that.
If you cannot quickly tell someone's experience (or BS level) from talking with them, then this interview style will not work for you.
> Who cares about their personality?
We have to agree to disagree here. Personality is huge, probably just behind potential. In real terms, if I am going to spend more time with my coworkers than I am with my life partner, they better be trending towards great company.
One of the questions I ask myself during an interview is whether I would want to be in a band with this person. If I wouldn't enjoy driving across the country in a van with them, what makes me think I would be able to do my best work if I'm trapped doing it with someone I don't like?
There are enough brilliant nice people in the world that I can comfortably avoid hiring brilliant jerks and come out way ahead on every metric.
> Do you have a sense of humour? What do you do on the weekend?
It's none of your business what I do on the weekend. It doesn't need to be similar to what you do.
Otherwise you'll tend to only hire people like you and reduce the pool of talent because you can only work with a very specific type of people.
My intention for the question is not to judge the answer, or compare it to what I like to do; the important issue is that they have an answer!
I cannot tell you how many folks I've interviewed (and not hired) who, when asked about their hobbies, immediately start listing programming languages and frameworks. This is not the answer I'm looking for.
In fairness to me, there's an almost infinite number of great answers to the question. You can say "music"! I will really respect someone who tells me that they like television. That person is honest. Cooking! Model rockets. Literally just say anything other than a bunch of acronyms.
I'm specifically not asking about which church or sex club they go to, and that's made contextually apparent from the conversation leading to that point.
I have a strong belief that a well-rounded candidate has interests beyond coding. If they live to code, they probably aren't going to be a good fit, so I save everyone the trouble.
When you never did any hiring, you assume you're mainly looking for good candidates. When you actually have the experience, you know that at least 30% of applicants can't even code the most basic thing.
So I was wondering how you, hiring more than 100 developers, weeded out those candidates who "do the talk but can't walk".
Another thing: how many of those 100 developers you hired were bad hires?
Perhaps this will help: I have enough experience as a developer that I can tell very quickly where the person I'm talking to falls on the skill spectrum. Part of my negative reaction to this post is that I sense the author does as well; however, where he chooses to make them feel small, I just keep asking questions because I want to give them more opportunities to sell me on their strengths. Their ability to whiteboard leetcode in an interview has laughably close to zero relationship to how they will perform on a Wednesday in jeans without feeling like they are being interrogated, so I largely skip that.
If you cannot quickly tell someone's experience (or BS level) from talking with them, then this interview style will not work for you.
I have had a dud rate of about 8-12% over the past 15 years. However, only 1/3rd of those were duds because they fooled me. It was almost always personality issues that I didn't anticipate; not getting on well with other people on the team, etc.
One of those that fooled me came in hot because he had started a popular testing framework, so I didn't dig. I fooled myself by accepting him based on the reputation of the project, not the person. Lesson learned.
However, I would like to turn the question around and brag a moment about how amazing some of the hires were. I hired the creator of Sass, a future Basecamp developer, several future founders. No whiteboards in sight.
Responses are awesome because it gives you an idea of what this person's background is and what breadth of knowledge they have and what they think is cool or exciting. Maybe they start with a contact closing and signals being denounced, or an interrupt being triggered, or the OS and window manager, or the browser and DOM events, or HTTP or TLS or TCP or Ethernet or WiFi or LTE or routing and the tiers of network infrastructure or DNS or load balancing or proxies or web servers and tiers of API services or databases... It's can lead to some really fun technical conversation, follow up about how they learned the things they know and experiences they've had, and most of the time the interviewer learns something too.
of course, the whole point of the question is nobody you're likely to interview can fully answer it. maybe someone like jim keller could but is someone like him really going to know about css stuff? maybe, maybe not.
That's why I really like this question - it's just such a cool jumping off point.
That might be something you are looking for in a member of your team, but it's not a great indicator that they'll bring much additional shareholder value.
That is, if your reaction to the problem is "Eww what a 'clever' problem, are you showing off that you know such stuff?" then you are not someone Google is looking for. (Of course, whether Google is right here is a separate question ...)
The danger here is that when you explain something to someone, and they get it, you get a warm fuzzy rush of happy brain chemicals. You will feel well-disposed towards candidates who appreciate and enjoy taking the breadcrumbs you feed them. Candidates who didn't think your clever insights were as clever as all that, or who fail to respond to your tutelage and try to solve it their own way... you'll get a negative feeling towards them. They're not playing your game.
That's great if you're interviewing a student to figure out if they will enjoy learning from you, and you will enjoy teaching them!
But while teacher/learner relationships might be part of your relationship with colleagues at work, it seems an odd part of the relationship to compatibility filter in an interview.
Google used to look for people who were excited about interesting problems. This is a question about basic complexity analysis and I would have chastised the interviewer if his feedback was ever on one of my interviews.
One interview I had years ago that I really liked was where they gave a sample of code with a number of issues and asked me to describe how I'd improve it. I really appreciated that a) it was much more like actual work than many questions, b) you could start with obvious things and work deeper, and c) there were many valid ways to improve the code, so you weren't just trying to guess the answer the interviewer had in mind.
I strongly suspect many interviewers (myself included) would have a few "top issues" and significantly favour candidates who pointed them out. But as the candidate you don't really know which issues will seem important to the interviewer.
It's not an irrelevant exercise though - code reviews are a real part of many roles and making sensible choices about what to comment on is a necessary skill. But it is hard to avoid turning the interview into an exercise in "guess what's on the top of my list"
This kind of 'let's watch people struggle with my favourite LC puzzle, which I think I know inside-out' just seems very inconsistent, not very predictive, and just quite flaky to me.
On the job, my very first response to any kind of problem that looks algorithmic is to google the hell out of it, even if I think I know the answer already. In future, it will be to get gpt to write it.
I think the actual challenges in real-world software are not algorithmic, but architectural and beyond, and then only after thoroughly understanding the problem in the first place.
Isn't that in itself is a flag that this is in fact not a great interview question, unless it's being used for hiring candidates fresh out of college, say during campus interviews.
It's a whole other debate that most software engineers, even at BigCo end up writing CSS code to move pixels only and lose their DSA knowledge over time...
It's a bad question because _how well_ a candidate does past that minimum threshold has _no correlation_ with their ability to do the job.
You can treat it as a binary question for that minimum floor. To use it for rating, you're doing your org and the candidate a disservice and you're wasting interviewing time that could be spent more productively and humanely.
They probably did know how to code, but they’ve spent the last 5 years working on something quite a bit different from string wrangling in Python. Add in the stress of a (first-time?) interview and a reasonable candidate can look like a total fool…
Is this common at Google?
If by DSA knowledge you mean leetcode puzzle knowledge, yes, quite certainly. But from my experience fresh grads are less effective engineers before they mature by working with industry scale systems for a few years. And at that point they would have been less able to do your problem but a far better engineer.
Which means your interview question is aimed backwards, maybe if you only used it against fresh college grads then the gain function would be positive, but you applied it against everyone.
> I continued to use [the interview question] because I got good signal from the candidates.
Working at Google, this is something I hear a lot: that some question, or other interview thing, is good for getting "signal" from candidates. But how do you actually know?
Like, I feel to definitely know a question is good you would actually need to track candidates after they start working and see if their performance on the question correlates with their job performance. (Unfortunately, it seems impossible to track candidates who don't get an offer.)
As an example, the author of TFA clearly is selecting for people who have the patience to sit around and be told how smart the author is. That is the signal in this “great interview question”: candidate will suck it up for hours while I tell them stuff.
These are hard to do if you think of an interview question like a high school quiz or a college final.
“Tell me about a software project you played a significant role in. What did it do, how did it work, who made the tech choices, what was your role in it, what went right, what went wrong, if you were to rebuild it what would you do differently, what did you learn?”
I know it’s really a lot of questions but it’s still just one in a way “talk to me, in a free form back and forth way about software development”.
That said, I do think it's best to have the person relaxed and at their best, so anything to minimize stress or discomfort will show candidates at their best.
You generally want to cover a range of skills, and technical skills like the ones covered in the article are an important part of that.
The reverse trie is genius, but with dynamic programming do you still need it? I would say that if in the general setting you can afford O(n^2) preparatory steps, and therefore you can just remember which (start, end) pairs correspond to a valid word with regular trie lookups. That is O(n) steps for each start position, giving a total of O(n^2).
But then, that means that my candidate string can only be a maximum of twice the length of the longest dictionary word, so the entire process is O(1).
For realistic English words, we're talking about a typical candidate string of length, what, 20 maybe? I need to do somewhere between 19-38 dictionary lookups of 1-19-letter strings to test every splitting of such a word. Why are we even doing big-O analysis here?
Only when you start looking at arbitrarily long strings to see if they are word sequences does this analysis even make sense.
One is the length of the word. The other is the number of dictionary elements.
If your language is constrained to 1,000 words, then the dictionary will be pretty small and the hash table can be constructed with no chains. That should have constant-time lookup no matter how long the query concatenation is.
I don't see why O(n^2) should be optimal. The question is only to determine if a match exists, and that can be expressed as a regular expression. In the following I omit all 1-letter words since my dictionary includes all the letters.
>>> words = [line.strip() for line in open("/usr/share/dict/words") if len(line.strip()) > 1]
>>> words.sort(key=len, reverse=True) # because Python uses left-to-right match, not longest
>>> pattern = "(" + "|".join(words) + ")+$"
>>> import re
>>> matcher = re.compile(pattern) # takes a few seconds
>>> matcher.match("ducksoup")
<re.Match object; span=(0, 8), match='ducksoup'>
>>> matcher.match("ducksquom")
>>> matcher.match("theremanywordsinthisone")
<re.Match object; span=(0, 23), match='theremanywordsinthisone'>
>>> matcher.match("theremanywordsqinthisone") is None
True
Python uses backtracking, so this probably isn't O(n), especially with the ability to choose the dictionary.But with there are non-backtracking matchers which would make this O(n). Here's re2 from https://github.com/google/re2 :
>>> import re2
>>> opts = re2.Options()
>>> opts.max_mem = 200000000
>>> matcher = re2.compile(pattern, opts)
>>> match = matcher.match("ducksoup")
re2/re2.cc:821: DFA out of memory: pattern length 2493008, program size
1323834, list count 1089251, bytemap range 54
>>> match.start(), match.end()
(0, 8)
>>> match = matcher.match("ducksquom")
re2/re2.cc:821: DFA out of memory: pattern length 2493008, program size
1323834, list count 1089251, bytemap range 54
>>> match is None
True
This is first time I've used re2. There's probably a way to prevent that warning message - I believe this is falling back to an NFA, which might not be O(n)?Anyway, here it is processing the concatenation of 100,000 words selected at random (I've omitted the warning message):
>>> s="".join(random.sample(words, 100000))
>>> len(s)
956249
>>> import time
>>> t1 = time.time(); print(f"is concatenation? {matcher.match(s) is not None} " + f"time: {time.time()-t1:.1f} seconds")
is concatenation? True time: 12.3 seconds
>>> t1 = time.time(); print(f"is concatenation? {matcher.match(s+'Q') is not None} " + f"time: {time.time()-t1:.1f} seconds")
is concatenation? False time: 12.3 seconds
Cut the size by 10 and it's about 10x faster: >>> s="".join(random.sample(words, 10000))
>>> len(s)
95457
>>> t1 = time.time(); print(f"is concatenation? {matcher.match(s) is not None} " + f"time: {time.time()-t1:.1f} seconds")
is concatenation? True time: 1.3 seconds
For this test it appears to be linear in the query size.Edit: if I increase max_mem to 2_000_000_000 then I don't get the warning message. But then the DFA-based search takes 141.5 seconds. That should be O(n) time in the query length, but an order of magnitude slower than the NFA-based search.
That's why O(n) analysis isn't enough to really figure out "a faster solution."
For part 1 using a hashtable to look up the dictionary words, the complexity is not O(n^2) where n is the length of the input string. It should be O(w^2) where w is the length of the longest word in the dictionary.
The hash key material of the dictionary words has at most w characters. When computing the hash key for the input string, it can be truncated at w, as any longer input won't lead to a hash key making a match in the hashtable.
For part 1 using a trie, the complexity is O(w) rather than O(n). This is actually more obvious since walking down a trie tree will stop at the last character of a dictionary word.
Again for part2, the complexity is O(w 2^n) using bruteforce with hashtable, O(w^2 n) using DP with hashtable, and O(w n) using DP with trie.
Edit: Also for short strings like dictionary words, O(w) is kind of meanless since with SIMD or SWAR the short word can be folded into one or two DWORD. You can effectively treat it as O(1).
I don't think the function "inDict" is even defined as either local or remote query (unless I missed it), which really influences the answers. Querying a short string against in-memory structure will be based on "n", but querying a long string against any remote database will be based on "w".
Though I imagine they could just ask "when was your last algorithms course?" (and possibly where, and what grade you got) and save a lot of time. Or they could just ask "what year did you graduate from (whatever school level)?"
But that wouldn't be as useful for making the candidate "jump through more hoops" and reinforcing compliance as well as status hierarchy - important things at most companies.
And some people may have cheated their way through their algorithms class, so maybe they can be detected if it was recent enough.
Unless you're an insecure, narcissistic engineer who feels "smarter" than everyone else or a cargo cult follower of some business fad, there are no magic questions to instantly separate "good" candidates from "bad."
You discover good engineers by whiteboarding and working on challenging problems together than can possibly turn into real commits. Ego and solving manhole cover problems doesn't delver good commits.
> Unless you're an insecure, narcissistic engineer who feels "smarter" than everyone else or a cargo cult follower of some business fad, there are no magic questions to instantly separate "good" candidates from "bad."
without taking a moment to actually think why these type of interviews exist.
> You discover good engineers by whiteboarding and working on challenging problems together than can possibly turn into real commits. Ego and solving manhole cover problems doesn't delver good commits.
You think everybody got time for that? Maybe in your company you get one or two CVS, but when you have 30 it’s impossible to conduct thorough testing. You optimize for false negatives.
I then push them on this until they realize that any implementation of the dictionary lookup function of a string would be o(n) at best, and as such the total runtime of the solution is O(n²).
Surely it's
String -> Hash -> Is hash in hashmap
All of those operations are O(1) so the total operation is O(1)
Well, in theory you may have conflicts in the bucket.
And recovering the bucket means a round-trip to memory, which is much more expensive than running over a list.
And in any case, "n" is the length of the word, but has nothing to do with the distribution of the dictionary. So one should use a "m" to capture that.
However, the whole discussion in the piece suffers from not taking into account the fact that words don't get arbitrarily long. Thus, insofar that for all practical purposes n <= 30, say, it is indeed O(30) = O(1).
I agree with you that this sounds weird, but if you imagine arbitrarily long strings as inputs, and don't want to have a lot of collisions, you need to take into account all of the input string in the hash function.
In reality they are safely assumed to take O(1), but w/e the interviewer is looking for a specific answer.
That being said, I think you could get around this with careful prompting (ie, ask first "what is the Big-O for dictionary lookup with the respect to the number of letters" before asking the overall big O). But I think there is a bit too much wiggle room to make this a completely consistent question, too much variance candidate to candidate.
I like the problem a lot, but wouldn't feel comfortable using it.
The only reason I can think that the optimal solution was immediately obvious to me is perhaps because in a previous life I did a lot of work building convoluted application specific indexing schemes for data in a distributed KV store that had basically zero useful built-in secondary indexing schemes where piles of parallel GETs were about as "free" as just doing one, but chaining strings of requests back & forth to the client iteratively made for sad times. This meant that it was essential to be clever about what could be coded into key names and precomputed from application/use-case context to know what keys to fetch "all at once" to gather as much relevant data as possible with as few serially dependent requests as possible. One such use case depended on a modified C-trie as part of the solution, so I got very familiar with what kinds of problems they're for and what their limitations are.
Given that, I can't tell what that says about the question, about me, or about Google. What I can say for sure is that because I can't tell the above, the technical interview at my company is signing an NDA and sitting down for a couple hours and just actively pairing on real work & real problems with real teammates.
With the nature of software development work being what it is, I really don't understand why we as an industry run interviews like they're game shows or try to manufacture workplace simulators with terrible model conformance.
You could get lucky and have each question have a similar relationship to past knowledge. It's happened to me before, I blew through an Amazon interview once that way, but got knocked for technical leadership. So maybe you mean something like that, but that too is random. I had passed the leadership but failed the algorithms at Amazon a few years prior.
> I really don't understand why we as an industry run interviews like they're game shows or try to manufacture workplace simulators with terrible model conformance.
I believe these tests are more about compliance and baseline cognitive ability than anything else, which is good for the nature of the FAANG role. They are testing that you are willing to study what they tell you and that you are able to retain material of that level of difficultly and demonstrate that in an adversarial situation.
And also:
"What's your favorite algorithm or data structure? It doesn't have to be the fastest or best, just your favorite? Why?" (This does have one right answer: shellsort.)
"Tell me about a time you made a mistake — not a time where one of your weaknesses turned out to be a strength and you saved the day. No. Tell me about a time you really screwed up."
That said, from the article this article links to, I find this puzzling:
> The candidate’s performance on the problem isn’t binary. The worst candidates don’t even manage to implement the fizzbuzz solution in 45 minutes. The best implement a memoized solution in 10 minutes, allowing you to make the problem even more interesting, e.g., asking how they would handle a dictionary too large to fit in main memory. Most candidates perform somewhere in the middle.
Two things I worry about: if the interview isn't "binary", how do you decide whether the candidate passed? Is doing just the simpler version fine as long as you ask for low pay? Or are you still failed? Or does it depend on whether other candidates did better, which is another way of saying you failed? It seems to me it's still binary.
And the other thing is: if the candidate did great in 10', please don't ask additional questions. The candidate is a "hire". Don't make it "interesting", an interview is not the place to shake things up to make it "interesting". Understand that for most people, interviewing is a stressful situation. For "interesting" times they have their friends and Netflix.
When I did interviews, I did a problem in a kind of similar way (although I skipped most of this article, so I dunno). I'd give the same problem to all candidates, but expectations were higher for candidates with more experience.
For any candidate, there was a minimum bar (essentially needed to be able to program a loop that did what they said it would do), and failure was a very clear signal; although I had a couple candidates that the only signal I got was anxiety.
How much over the minimum the candidate got would help decide between weak to strong hire. And then there's a group decision based on the results from everyone.
> And the other thing is: if the candidate did great in 10', please don't ask additional questions.
Part of the job as an interviewer is to fill the whole time though. If we're done in 10 minutes of a 45 minute slot, that's a lot of time to fill. Usually the next interviewer is busy until the next one. A candidate doesn't usually have 35 minutes of questions, and a trip to get snacks or drinks, etc doesn't take that long either. At that point, you just have to ask them about projects and see if they've got anything they want to talk about --- but since that's not my interview role, I won't judge that.
I did have one candidate who I was the last interviewer of the day and they didn't like my question and asked for something else and wouldn't participate at all. We were done in under ten minutes, so I just walked them out and we both got 35 minutes of our life back.
You’ve already lost me, dude. If the question can’t meaningfully differentiate between a junior and a high level senior, it’s a trash question.
How can a word be the combination of two words if the first word isn’t a word?
So, for example, if the first letter isn’t A or I, we can stop doing lookups for any combination of just a single letter and the remainder of the word.
Etc.
[1] This seems to me to be a practical limitation rather than theoretical given strings and numbers are really equivalent - it’s just that the space is mindbogglingly huge so if you wanted to do a general purpose O(1) string to number conversion it would require uncountably infinite space.
Is that true?
I am actually siding towards the candidates on this one
presumably it would be O(n_words_in_dictionary) for the dictionary check, which as the number of words in the dictionary is constant, would make the overall algorithm O(n) ...
Any better than that you can do would be O(min(n, k)) where k is the length of the longest word in the dictionary. But without loss of generality that would be O(n).
Fun write-up though. The comment saying this optimizes for clever freshmen is spot on in my book.
The maximum number of iterations is the length of the longest word in the dictionary.
This means that part 1 of the problem can be solved in O(1)+O(1).
That doesn’t mean you will be happy with it. Not all O(1)s are equal.
Big in space but O(1) (wrt word len at least)
I found one sentence very interesting:
> Moreover, the problem as I present it, has enough range to sift out bad candidates from good ones even if the candidates have seen the problem before, which is why I successfully used it even though it was banned at Google.
How do you measure the question's success? Is there some kind of feedback you get about how the person you made a decision about went on to perform? What about the people that ended up not being hired?
Not the latter. However, there are five interviews that each candidate goes through, and as an interviewer you get to read the interview feedback written by the other interviewers as well. So I can compare my assessment of a candidate against four other interviewers' assessment using their questions and feedback notes and rating assigned.
Is n the length of the word or the length of the dictionary? And if it's the length of the word (<< the length of the dictionary), why does it matter if it's linear or quadratic?
The problem as described:
“Write a function that given a string, determines if it’s a concatenation of two dictionary words.”
given that description I think you can't have left over fragments. But the code seems to allow fragments (don't know Python). Also it just returns a boolean? So we just need to know if there is two words contained within our word within the dictionary.
If that's the case, and the "dictionary" is an actual dictionary used by humans (they mean like a Dictionary like The Oxford American Dictionary right?), then, because the letters of a word are considered nouns (that is to say the letter b is a noun - Merriam-Webster https://www.merriam-webster.com/dictionary/b) then any word with more than one letter is a concatenation of two words. Which is anyway stupid because no one cares about concatenations of words, people care about compound words and this is not how you would find compound words.
I bet that's not what they want to hear though. I generally do really bad on questions.
"lampshade" => true
"grip" => false
"gripshell" => true
"fhtagnblue" => false
"timeblue" => true Some come up with a greedy algorithm which finds the first prefix that is in the dictionary, and then returns whether the rest of the string is in the dictionary
I would do that straight away. I don't see the point of implementing the more complex and longer to code "double dictionary checking for all partitions of the string"I guess this wasn't entirely without sense but sort of seemed to be missing the point.
If it has well-defined success state then it's maybe OK "implement a maze-solving algorithm for a 3^n cube". If there are implementation tests that the candidate can use, even better
To prove that it's optimal you "only" need to construct the pathological case, which I believe should not be hard to do by induction.
I think that pathological case could simply be that the input string is abcdefghij... and the dictionary contains all one-letter strings, plus all strings like ac, abd, abce, ..., bd, bce, bcdf ... and so on.
The base induction case would be xyz as the input and {x,y,z,xz} as the dictionary.
>The fastest I had a candidate solve the entirety of the problem with all bells and whistles was in 20 minutes, by a freshman from the University of Waterloo, the type who did high-school competitions.
>The most depressing failure I saw was a PhD graduate from University of Toronto who could not produce working code for the first section of the problem in 45 minutes.
>I once had a candidate with 15-years work experience give me lots of attitude for asking them such a simple question, while at the same time struggling at it.
All of this data points to the fact that this question may not be good. A phd graduate and a person with 15 years of experience rejected for someone who practices programming for competitions? What gets me is that the data is painting an obvious picture here. A very obvious picture. An obvious picture that we aren't sure what's a good interview and a bad interview question.
But the problem is that most people when looking at this completely miss it. It's not obvious to the interviewer and it's not obvious to alot of people who like google style questions. We literally have not much data and not much science backing any of this up.
It's an illustration of how biased humans are and illustration of how extra biased interviewing for software positions is. If there's anything more unknowingly biased and then the replication crisis in science it's technical interviews for companies. There needs to be real feedback loops that correlate interview question passing with Actual performance.
Google is in a good position to grab this data but I'm not sure they are doing so given how they just okayed this guys gut decision to go against the grain and use this question. I'm not against this question, but certainly to call this great in the face of controversial data that he himself gathered and listed on his post is just a complete blueprint of the extent of human blindness.
The reality of what's going on here is the person here in the interview is just getting off on dominating other people with a hard question. It's not on purpose but he's doing it without realizing it. The blog post in itself is a bit showy. It's like "I can answer this question but a phd graduate can't".
What I’ve seen happen is that there are two narratives they flip on: I hired this great person and they are doing amazing so I am amazing; I hired this person and they aren’t doing great so I fired them so I am amazing.
Of course, they also tend to be terrible at assessing who is actually good/bad.
I call it the “arbitrary onion”: layer after layer of plausible claims that each turn out to be just as arbitrary as the outer claim.
The sad reality was that it was waaaay too effective as a fizzbuzz eliminator (> 80%) which I don't know what it tells me about the candidates in general or just about our recruiting :-)
Actually the best questions I ask, and the ones that give me the best candidate "signal" are what I call the "clean code" style questions (as in code that is simple, easy to extend, not necessarily Uncle Bob Clean Code).
I'd rather not post examples here because I still use these questions, but the theme is as follows:
* These questions ask to implement functionality that you could see in a (super simple) real world application.
* They start super simple, and include several follow ups that increase in complexity.
* No data structures and algos beyond the basics. Loops, lists, hash maps and little else. What most programmers would use on the job on a daily basis.
I found in practice these work really well, specially with experienced programmers who can flex their low level design skills. People who ace leetcode style questions tend to do worse here, but no wonder, since they've been trained in a complete different style of interview.
Good questions allow multiple possible solutions[1], allowing a candidate to arrive at a successful solution via many different paths while still evaluating whether they are going to be able to perform well at the job. I don’t think this question comes close to meeting that standard.
[1] By which I mean fundamentally different approaches rather than syntactic variants on the same basic thing
Bad news: You have to walk this stable genius through 50 years of Suffix Tree and constant time LCA algorithms in 20 minutes.
abbbbbcdddddde => aB5cD6e
The number of developers who struggled or utterly failed with this was depressing.
Is any of this better then randomly selecting? Do we even know if there is a skill set that is optimal, or if multiple complementary skill sets work best? Do we know what they look like, how to select for them? Or are we all just filtering for whatever collective definition of cleverness people involved in a given set of interviewing and hiring tasks prefer?
I suspect on average it's better to filter out good candidates than filter in bad ones so as long as a selection process skews towards eliminating most candidates probabilities will have most teams covered on that front. Of course this only works when salaries in programming are rising faster then the rest of the labor market, attracting a growing cohort each fiscal cycle.
What is the runtime complexity of an algorithm? Set it up with varying amounts of data fed to it (or whatever N is) and then time the real thing (or a suitable minimal but complete example).
Then you can instrument or profile the algorithm in any number of different ways to ask what is being called so often.
If you do it right you can really measure what it is doing, and you're getting closer at least to "production" code where you're not just working with pencil and paper but you've got real algorithms running hardware with finite sized caches.
Ultimately it doesn't matter what anyone writes down on pencil and paper if the hardware doesn't agree with you. Usually I measure it first and then work backwards knowing the answer already.
I've never worked at that company and I don’t know if as an interviewer you pick a question from a catalog. If that’s the case, and if you’re so inclined in asking this type of nonsense, I’d suggest a small tweak: a third party selects the question with some sort of guarantee/constraint that this would be the first time the interviewer asks that question, and so is now in more or less level ground with the interviewee. No other option but to have a discussion among potential colleagues.
For starters, it's staggering how many people can't do it.
But also, it's a great question because it's simple enough that you don't have to be clever but you can get a feeling for all sorts of things. Do they ask questions? Any questions? What possible inputs might cause problems? What do they name their method? What tests do they write? (Project is set up already with an example test class ready to go.) Etc etc
The customer wants to increase their sales. They want something that auto-generate sales responses to incoming emails. Discuss.
The problem with technology workers is that they believe technology is important. The above question allows you, the interviewer, to see if they have actual critical thinking skills.
Annnnnnnnd I’m out.
Good taste — writing cohesive, loosely coupled code with a sensible bank of unit tests an a strong dose of product sensibility — is largely orthogonal to being good at solving algorithmic or ds questions. One of the best developers I know, who makes seven figures a year a year (trading software), would likely bomb the second part of this question. Contrawise, a young researcher I know who has won multiple scholastic programming competitions and crushes leet code, would destroy this problem in seconds … and yet, I pity his coworkers who have to live with his atrocious code.
More accurately, he would first look at the interviewer like he was a dolt, and then bomb the question :)
If they only know simple data structures, going from a hash to a hash of hashes is one of those "eureka" moments that can really shows how a candidate is willing to collaborate on a problem and react to new ideas.
Heaps are an interesting data structure to use for interviewing, but I do find they can be a little less intuitive, and their use cases translate less directly to business logic and are more of an optimization task. It depends on what talents you're looking for on your team!
https://iq.opengenus.org/time-complexity-of-hash-table/#:~:t....
The trie/reversed-trie solution is a little too clever to arise naturally from conversation (and probably not optimal because hardware implementation).
This seems pretty nitpicky, especially considering we're assuming substring creation is O(1). And isn't a hash table lookup of a string O(n) at worst, not at best? And since this seems like a fixed dictionary, we're only doing queries not insertions, couldn't you just choose a different hashtable algorithm like dynamic perfect hashing to guarantee O(1) lookups?
I also don't like how it starts off with the dictionary being a blackbox style function, and a later part of the question requires you to change the internals of this dictionary query function. My very first thought would be about the structure of the dictionary query. And then when told to ignore that, I would assume that's still the case when asking to later optimize the algorithm.
Not surprising. This sort of interview is selecting for people who have exactly this kind of academic knowledge.
Well, that removes most of the fun of it.
I think that’s actually a good example of why this is not a good question. It’s disconnected from any realities.
if i was (am retired) i would typically ask questions about the languages we were using. in my case this would typically be c++ and sql. so i would normally ask things like:
"tell me something about the copy constructor"
or:
"when would you want to do a union?"
amazing how many people were dumbfounded by these. and notice, no leetcode crap needed.
However, software engineering is so much more than writing code to solve a well specified problem.
- oftentimes, the problem is not well specified. The engineer needs to find out what the problem specification really is.
- In a lot of cases, a solution already exists publicly. The engineer needs to find documentation or code that solves the lecified problem or a similar problem. Usually somebody has already thought about optimization of this type of problem and has documented it.
- Often, a feature needs to be added to an existing codebase. Often a similar feature already exists in the codebase and the new feature can be implemented in a similar way. The engineer needs to find the similar feature and has to figure out how to generalize or adapt it into the new feature.
- A lot of times technical debt needs decreasing. This involves understanding feature implementations or models/paradigms in an existing codebase, understand how they are inefficient or needlessly complex. Creativity is the trait mostly needed in this case to address it.
- often new APIs need to be made, needing modeling and documentation. Also often existing APIs or newly imported libraries need to bevused to tackle problems.
- Above points address the purely technical skills. Non-technical skills are just as important. Communication, teamwork, prioritization, time and stress management, customer relations,...
Tldr - this great interview question targets just one aspect of the job.
With performance like that, I'd call it a "trash interview question".
When you use wildly inaccurate processes, you build bad teams and good people know you'd be wasting their time so they don't even go in the door.
Seems like the practical application of this is to quickly disallow strong memorable passwords of the form "correct horse battery staple". https://xkcd.com/936/ If so, deeply misguided.
- part 1: got it in like 20 seconds of thought
- part 2: got a variant with a trie in maybe 5 minutes thinking, wasn't exactly the same as the given one but close enough
- part 3: thought about it for an hour (got nerd sniped). got a trie + recursive dynamic programming solution that's O(N^2) in the size of the input string. the actual solution took me 15 minutes, the rest was trying to prove the the big-O
Let's say the string is wxyz. As we iterate, we'll split into the possible prefixes and suffixes:
w xyz
wx yz
wxy z
wxyz (suffix is empty string, we have to special-case this I think)
If we find a prefix-suffix pair where the prefix is in the trie, and a recursive function call on the suffix returns True, then we return True. If there is no such pair, then we return False.Looking up a string in the trie is O(N). But for the prefixes, we can "remember our place" in the trie, so that the cost of an incremental character is constant. So membership checking wxy only incurs the cost of y, given that we've already just looked up wx.
The prefix is either a word in the dictionary, or it isn't. If the prefix is not in the trie, we advance to the next character and repeat, with the next prefix grown by one character and the next suffix shrunk by one character. If the prefix is a word in the trie, we call our function recursively on the suffix, and memoize the result. If the recursive call comes back true, we're done, and we return true. If it's false, we advance to the next possible prefix and repeat. If we reach the end, the prefix now comprises the whole string, so we just return whether it's a full word in the trie or not.
In the base case of the recursion, the suffix is a single character, which is a constant-time lookup in the trie. So every recursive call after that can just look up that memoized result. Same for the 2-character suffix, 3-character suffix, and so on:
- 1 character: z: z: 1 operation
- 2 character: yz: y, z suffix memoized, yz: 3 operations
- 3 character: xyz: x, yz suffix is memoized, xy, z suffix is memoized, xyz: 5 operations
- 4 character: wxyz: w, xyz(memo), wx, yz(memo), wxy, z(memo), wxyz: 7 operations
- N characters: prefix 1, (N-1)(memo), prefix 2, (N-2)(memo), prefix 3, (N-3)(memo) ...: 2N operations
(Here I've spoken of the "memoized xyz suffix" and so on, so you might think the lookup time grows with the length of the suffix, because you have to hash it. But actually you don't, because you can just lookup by the index of the start of the suffix. So looking up the memoized results is also constant time.)So summing the cost of all those recursive steps, it is O(N^2) in the length of the string.
The word "cantonal" works as a good test case because it has lots of sub-words: can, cant, canto, canton, a, an, ant, anton, ton, tonal, on. The memoization doesn't show up for it though, so I'll append "q" to make "cantonalq", which should return False:
c -> no
ca -> no
can -> yes
t -> no
to -> yes
n -> no
na -> no
nal -> no
nalq -> no
ton -> yes
a -> yes
l -> no
lq -> no
al -> no
alq -> no
tona -> no
tonal -> yes
q -> no
tonalq -> no
cant -> yes
o -> no
on -> yes
(-alq suffix is memoized) -> no
ona -> no
onal -> no
onalq -> no
canto -> yes
(-nalq suffix is memoized) -> no
canton -> yes
(-alq suffix is memoized) -> no
cantona -> no
cantonal -> yes
(-q suffix is memoized) -> no
-> return False
I wouldn't have been able to do this in an interview setting. Most of my time was spent gazing vacantly off into space. Interviewers want you to narrate your thinking but I can't do that .. I have to shut the fuck up for a while as my brain works. But long periods of silence are intensely awkward in such a setting and I'd get very flustered and self-conscious. Maybe I could have narrated the "cantonalq" example, idk.Generally I don't mind "pure" DS/algo interview questions, so long as they're plausibly related to something one might actually do. I have actually used tries before on a real problem. Dynamic programming with memoization tricks? Never. Has anyone ever had something remotely similar come up in real work? I mean I know there are caches everywhere but that's not the same thing. It just seems like a sick hazing ritual that nerds inflict on one another.
Conclusion: bad interview question.
Maybe it is understandable since the article does not clarify any of this, but folks should notice that:
- This type of interview is part of a process, composed of multiple interviews. Not the only one, and not the deciding factor. Questions on behaviors, team dynamics, etc, are also part of the process and covered in other interviews (usually done by managers). You could fail a systems interview and still pass the interview process. The specific interview mentioned in the article seems not too complex, though, so I wouldn't be surprised if it is a screening interview that rejects candidates when not passed.
- People mentioning that the interviewer is wrong on time or space complexity are also missing the point. It doesn't matter. If someone in the interview mentions O(n) instead of O(n2) and can reason that, that's already a good signal. It is not so much about being right or wrong. It is about thinking about those things when coding, keeping them in mind, been able to express that and run someone else through your thought process, etc. Someone plainly saying something is O(n) without any further explanation can give a bad signal, but someone explaining why she thinks so can be a good signal, even if ultimately the solution is not O(n).
- IMO, folks saying that this favors junior engineers and people fresh out of college are wrong. This favors people who ultimately understand the foundations of programming. People fresh out of college had a refresher on that not so long ago, so they've been practicing it more. The thing is, some people in our industry get further away from that as they add seniority to their career, but there's also lots of senior people that are very well versed on this kind of thing. Specifically, the question in this article is extremely simple, so TBH I don't understand why someone senior would fail to know the answer to that question just because that person is senior. Maybe these companies are just not looking for that kind of senior engineer for this specific positions. That's all. There's also a lot of different seniority levels, too. I could see this interview having less weight on a decision as seniority increases, but still important.
- "this does not represent a realistic scenario". That's a given. As a company, you have maybe 8 hours total with the candidate (multiple rounds of interviews plus the on-site). Onboarding into the company and a team, until you're productive, will take weeks/months. There's no way of testing a realistic scenario on an interview. We get the best signal we can with the time we're given.
- Not passing the interview process is very demoralizing, but it does not mean the company thinks you're a bad engineer or that you're not good enough. The process is designed to minimize false positives, so they tend to prefer to reject candidates if there's any question or mixed signal in the interviews.
- I'm not saying the process is perfect, by the way. For example, there's also luck involved, too. It shouldn't, but there is. To give one example: interviewers are people too. They need to be trained in the process of interviewing, what signals to look for, etc. There are good and bad interviewers. Interviewers can have a bad day and judge some answer as unsatisfying when some other day it would be fine, etc.
sure, some will trip over themselves.
20m was the best?
poor screener?