Steve Ballmer's incorrect binary search interview question - https://news.ycombinator.com/item?id=41434637 - Sept 2024 (240 comments)
Ballmer's argument is essentially about tail risk. Expected value is absolutely not a good way to make bets if you value survival, because you only get one shot. Same reason you wouldn't go all in every time you get a poker hand that's "expected" to win. Because you'll (very probably) be bankrupt in a few hands.
Sure the mean is +$0.07 or whatever, but the spread on that surely goes over the 0 line. So there may well be marginally more chance of winning than losing, on average, but you're only gonna get one outcome. So if the goal is to play to win, or else, then you probably shouldn't play unless you like owing Ballmer money.
What would be more interesting is to monte carlo simulate this strategy and look at the win/loss distribution. Presumably the choice is then not so clear cut.
If you're allowed to play the game a few trillion times or so, then by all means bleed him dry :P
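For anyone who wants to try, here's a minimal sketch of such a simulation, assuming the usual formulation (secret number 1-100 chosen uniformly, $5 for a first-guess hit and a dollar less for each extra guess, so a seventh guess leaves you owing $1):

```python
import random

def binary_search_payoff(secret, lo=1, hi=100):
    """Play one round with a plain binary search: $5 for a first-guess hit,
    one dollar less for each extra guess (so finding it on guess 7 pays -$1)."""
    guesses = 0
    while True:
        guesses += 1
        mid = (lo + hi) // 2
        if mid == secret:
            return 6 - guesses
        if mid < secret:
            lo = mid + 1
        else:
            hi = mid - 1

random.seed(0)
rounds = [binary_search_payoff(random.randint(1, 100)) for _ in range(100_000)]
mean = sum(rounds) / len(rounds)
losing = sum(r < 0 for r in rounds) / len(rounds)
print(f"mean payoff ~ ${mean:.2f}")              # ~ $0.20
print(f"share of losing rounds ~ {losing:.0%}")  # ~ 37%
```

Against a uniformly random (non-adversarial) pick the mean is positive, but over a third of single rounds end with you paying Ballmer, which is exactly the one-shot-versus-expectation point above.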
Where are you getting that from? As far as I can tell, he makes no such arguments in the interview. The problem, and his explanation of the answer, are phrased purely in terms of expected value of a single iteration of the game. And the twist is the adversarial selection of the number, not risk of ruin.
It'd be an awful example of tail risk anyway. With the obvious strategy the tail is extremely fat.
Yes! The St. Petersburg "Paradox" shows that we intuitively know that. I put "paradox" in quotes because I don't think it's a paradox, it's just a sane reaction.
(Sam Bankman-Fried was a big fan of EV and famously declared that he would toss a coin where heads would double the "value" (?) of the world but tails would destroy it.)
In short, the St. Petersburg paradox goes as follows: a fair coin is tossed until heads come up, and the player wins $2^n, where n is the number of times the coin was flipped. So for example if heads come up on the first flip the player gets $2, if it comes up on the second they get $4, on the third, $8, on the tenth $1024 (2^10), etc. It's easy to show that the expected value of the game is infinite (approaches infinity).
Therefore, someone perfectly rational (?) should be willing to pay virtually any amount of money to play the game, because any finite amount of money is less than an infinite amount of money, and therefore the expected gain is always positive.
Yet you will probably not find many people (except SBF?) willing to pay millions of dollars to play that game.
It's only a paradox if we think it shows that people are not "rational". But I think it simply shows EV is not a good measure of risk, and everyone knows it.
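A quick simulation shows why intuition balks: despite the unbounded expected value, the typical average payout over any feasible number of plays stays modest (it grows only roughly like the logarithm of the number of plays):

```python
import random

def st_petersburg():
    """One play: flip until heads; heads on flip n pays 2**n dollars."""
    n = 1
    while random.random() < 0.5:   # tails, keep flipping
        n += 1
    return 2 ** n

random.seed(1)
for trials in (100, 10_000, 1_000_000):
    avg = sum(st_petersburg() for _ in range(trials)) / trials
    print(f"average payout over {trials:>9} plays: ${avg:.2f}")
```

The sample average is dominated by the rare astronomically long runs you almost never see, so paying even a few tens of dollars per play is a bad deal at any realistic scale.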
Very complete and fascinating article about the St. Petersburg Paradox here:
Equating money with value is a simple trap as well. Who cares if you can win millions when a single loss wipes out all your savings? Since anything below a certain level of money leaves you trapped with no way out it could be argued that the value of being destitute is not 0 but -infinity which makes any risk of losing all money unacceptable. This is especially true in a world where people are willing to offer strange bets with arbitrarily high expected value as long as you have some money.
The missing context is that in case A, you need in 10 minutes to repay $50 debt to the Sicilian mafia, or else they'll kill you to make an example for others, and you have no other assets or other ways to make money in this short time. In case B, the situation is the same, but you owe $100 instead of $50.
There are standard arguments (e.g. the Von Neumann–Morgenstern utility theorem) that an agent with rational preferences, under remarkably weak definitions of the word “rational”, must have a utility function and a subjective probability function such that their behaviour is always governed by the EV of that utility with respect to that probability.
I’m glad to now know there’s a common example of that weakness.
Unlike most people here I actually think questions like this are a decent way to see how people think. I would expect people with math/stats/cs background to be able to at least start the conversation about this problem.
However, hiding hypotheses or adding your own BS constraints as a gotcha, without explicitly stating them, is where you lose me.
If the question is "would you play this game" the reasonable mathematical translation is "determine if the expected value is greater than zero". If you're going to talk about tail risk you need to specify utility functions (possibly asymmetric for the two players!) and you need to explicitly say that's what you mean.
Not really! And that may be the point of the question. It's not testing if you can pattern match to plausible CS concepts.
If you get one play, and the goal is to win, do you take the chance? The whole question is about the difference in likelihood in the limit (expected value, infinite plays) and what is a likely outcome _of one round_.
To be honest, I think Steve just didn't grasp the mathematical depth of the problem.
Which makes me wonder if it's related to another 'simple' game theory problem that came up in Matt Levine's money stuff:
"They made me do the math on 1000 coin flips. EV(heads) (easy), standard deviation (slightly harder), then they offered me a +EV bet on the outcome. I said “let’s go.”
They said “Wrong. If we’re offering it to you, you shouldn’t take it.”
I said “We just did the math.”
They said “We have a guy on the floor of the Amex who can flip 55% heads.”"
I like that anecdote and the takeaway, especially with regards to trading: if someone's offering you what seems obviously a +EV trade, why are they offering it to you and what are you missing? Whether that was Ballmer's intended lesson is another matter...
[0]https://www.bloomberg.com/opinion/articles/2024-05-14/amc-is...
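For concreteness, here's my own reconstruction of the math the interviewers presumably walked through: 1000 fair flips have a mean of 500 heads with standard deviation sqrt(1000 * 0.5 * 0.5) ~ 15.8, so the 55% flipper's 550 expected heads sits more than 3 sigma away from fair:

```python
from math import sqrt

n = 1000
mean_fair = n * 0.5                # expected heads for a fair coin: 500
sd_fair = sqrt(n * 0.5 * 0.5)      # binomial standard deviation: ~15.8
mean_rigged = n * 0.55             # the 55%-heads flipper's expectation: 550
z = (mean_rigged - mean_fair) / sd_fair
print(f"fair mean {mean_fair:.0f}, sd {sd_fair:.1f}, "
      f"rigged mean {mean_rigged:.0f}, z ~ {z:.2f}")  # z ~ 3.16
```

A 3-sigma edge turns any bet priced off the fair-coin distribution from +EV on paper into sharply −EV in practice.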
Betting more than the Kelly fraction increases the risk of ruin, especially in the long run.
https://en.m.wikipedia.org/wiki/Kelly_criterion
Note: Not saying that this is applicable in the original post's situation. It's relevant to the parent comment though, and very useful in many situations, such as investing.
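For an even-money bet with win probability p, the Kelly fraction works out to f* = 2p − 1. A minimal sketch (the numbers are my own, not from the thread) of the expected log-growth per bet at different staking fractions:

```python
from math import log

def growth_rate(p, f):
    """Expected log-growth per even-money bet when staking fraction f of bankroll."""
    return p * log(1 + f) + (1 - p) * log(1 - f)

p = 0.55
kelly = 2 * p - 1                  # f* = p - q = 0.10 for even-money odds
for f in (0.05, kelly, 0.20, 0.35):
    print(f"f = {f:.2f}: growth rate {growth_rate(p, f):+.4f} per bet")
```

The growth rate peaks at the Kelly fraction, falls to roughly zero at double Kelly, and goes negative beyond that: you can lose money in the long run on a bet that is +EV every single time.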
If he was trying to make that point, why set the bet at $1 - a loss that wouldn't imperil anyone?
The situation is entirely fictional, why not fictionally gamble with a five-figure sum?
> the spread on that surely goes over the 0 line.
Do you imagine starting with $1 or $1000? :)
Let's add a condition: Ballmer has infinite money, we start with a specific budget, and we can't continue playing once our losses exceed that budget.
In the game where you start with $N, win $1 with probability p > 0.5 and lose $1 otherwise, the chance of eventually losing all your money is ((1-p)/p)^N. [1]
So, the ruin chance actually becomes exponentially lower the more money you have at the start.
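The gambler's-ruin formula — ruin probability ((1-p)/p)^N for p > 0.5 and a $N bankroll — is easy to sanity-check by simulation (the ceiling below is my own device: once you're far enough ahead, ruin is negligibly likely, so the walk can stop):

```python
import random

def ruin_probability(p, start, trials=20_000, ceiling=60):
    """Estimate the chance a +/-$1 random walk with win probability p
    hits $0 before getting `ceiling` dollars ahead (from which point
    ruin is negligibly likely when p > 0.5)."""
    ruined = 0
    for _ in range(trials):
        money = start
        while 0 < money < start + ceiling:
            money += 1 if random.random() < p else -1
        ruined += (money == 0)
    return ruined / trials

random.seed(2)
p, start = 0.6, 5
exact = ((1 - p) / p) ** start     # (q/p)^N = (2/3)^5, about 0.13
print(f"formula: {exact:.3f}, simulated: {ruin_probability(p, start):.3f}")
```

Even a modest edge (p = 0.6) and a $5 bankroll keep the ruin chance around 13%, and each extra starting dollar multiplies it by another factor of 2/3.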
The steps in the random walk above follow a simple Bernoulli-type distribution, while the mixed strategy yields a more complex discrete random variable, because its steps aren't limited to +1 and -1.
However, I believe that the same principle applies for the mixed strategy.
If you zoom out and consider "batches" of steps, you can apply the Central limit theorem and see that all these random walks work roughly the same. The caveat being that you need a large enough starting budget to "zoom out" :)
Granted, the standard deviation for the mixed strategy is ~$1. I would guesstimate that if you start with ~$1000, there's no way you will ever lose your money.
> What would be more interesting is to monte carlo simulate this strategy and look at the win/loss distribution. Presumably the choice is then not so clear cut.
Agree, this would be a nice demonstration! I will think about doing this next time I get a couple of hours of free time.
Setting the limit at 5 brings you to the interesting point of there being a good mix of win/loss outcomes. 4 would be too few guesses and you'd very likely lose, and 7+ you'd definitely win. So the question is only interesting _because_ the limit is chosen so that the spread puts your odds on both sides of the 0 line. Otherwise it'd be clear cut.
The standard deviation being ~$1 is interesting. To me that suggests that with a mean of $0.07 and a deviation of +/- $1, it's essentially 50/50 odds. There's technically a slight edge in your favor, probably 53/47, but barely. So given a game with essentially no edge, would you play? Framing it that way - deciding to what degree the game is winnable - it's essentially not. You should not particularly expect to win, no matter your strategy.
I think part of the trick with the Ballmer question as well is the question is not necessarily about 'can you find an optimal strategy?' - it's 'do you play the game or not?'. The paths chosen within the round don't ultimately matter to that question. It's only intermediately necessary to model the intra-round decision paths in order to get to the overall win/loss distribution for a single round.
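For the plain-binary-search-versus-uniform case, the spread can be computed exactly rather than guesstimated (assuming the $5-minus-a-dollar-per-guess payoffs): a balanced search resolves 1, 2, 4, 8, 16, and 32 numbers in 1-6 guesses, leaving 37 that need 7.

```python
from math import sqrt

# Numbers resolved in k guesses by a balanced binary search over 1..100,
# and the payoff for each: $5 on the first guess, one dollar less per extra guess.
counts = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16, 6: 32, 7: 37}
payoffs = {k: 6 - k for k in counts}

ev = sum(counts[k] * payoffs[k] for k in counts) / 100
var = sum(counts[k] * (payoffs[k] - ev) ** 2 for k in counts) / 100
print(f"EV = ${ev:.2f}, std dev = ${sqrt(var):.2f}")  # EV = $0.20, std ~ $1.32
print(f"P(lose money) = {counts[7] / 100:.0%}")       # 37%
```

So against a uniform pick the mean is $0.20 with a standard deviation around $1.32, and 37% of single rounds lose money — consistent with the point about the spread sitting on both sides of the 0 line.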
If you do end up getting the time, do make another blog post!
You're calling an all-in 100% of the time in a cash game if your expected value is positive. If you don't, you can't afford to play at that table.
You're not going all-in with any hand expected to win because that's not how you maximize profit. It has nothing to do with the risk of going bankrupt. Because again, if that's a concern you shouldn't be sitting at the table.
Tournament poker is a bit different because there are certain points where you have positive chip EV and negative $ EV and the math changes.
The whole poker thing was merely an analogy in the first place.
This is absolutely true. All it takes is understanding that, for a single person, the process isn't ergodic, while expected-value-based models assume ergodicity.
See [section 4 here](https://www.jasoncollins.blog/posts/ergodicity-economics-a-p...) on losing wealth on a positive value bet.
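The classic illustration (I believe the linked post uses a similar bet): each flip multiplies your wealth by 1.5 on heads or 0.6 on tails, for a +5% expected value per flip, yet the typical individual trajectory shrinks, because the time-average growth factor is sqrt(1.5 * 0.6) < 1:

```python
import random
from math import exp, log

up, down = 1.5, 0.6                 # heads: +50%, tails: -40%
ev_factor = 0.5 * up + 0.5 * down   # ensemble average per flip: 1.05 (+5% EV)
time_avg = exp(0.5 * log(up) + 0.5 * log(down))  # typical growth: ~0.949 per flip

random.seed(3)
finals = []
for _ in range(10_000):             # 10,000 players, 100 flips each, $1 start
    wealth = 1.0
    for _ in range(100):
        wealth *= up if random.random() < 0.5 else down
    finals.append(wealth)
finals.sort()

print(f"EV factor per flip: {ev_factor:.2f}")
print(f"time-average growth per flip: {time_avg:.3f}")
print(f"median wealth after 100 flips: {finals[5000]:.4f}")  # ~ 0.005
```

The ensemble average really does grow 5% per flip — driven by a vanishing minority of lucky trajectories — while the median player ends up with about half a cent on the dollar.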
And some of the patterns are just obviously suboptimal if this is your only chance:
> With probability 0.9686%: Binary search, first guess is 1.
(I wonder what Ballmer would think if, when invited to play this game, you first manually rolled dice to draw a random number in the range 1 - 1,000,000 and, if it came up 9,686 or less, started your binary search at 1. He might be impressed by your dedication to the mixed strategy.)
I think people are missing the forest for the trees on whether it matters if Ballmer was "right" or "wrong".
It's his interview question. He's using it as a way to see your thought process more than the answer you arrive at at the end.
I imagine if interviewees had these thoughtful disagreements, he'd either guide them to the reason he had a different answer or value their input.
that said, the problem screams binary search and you know your opponent is a computer person, so i guess the question is: if you make a bet that your opponent is making an adversarial choice that assumes you're going to do a vanilla binary search, can you improve your odds of coming out ahead by modifying your own binary search to always assume the target is an adversarial one?
The OP has a complex randomized strategy that guarantees to average at least $0.07 against any adversary; meanwhile, just by delaying his "pick" and stringing you along, Ballmer makes you take seven guesses and owe him a dollar each time.
If you were expecting to win $0.07 on average, how many rounds would you play before you realise you're being scammed?
The OP's article is interesting, but it assumes a very weak notion of "adversarial", in which Ballmer still commits to some initial choice.
Interestingly it's actually possible for a player to know this is the case if Ballmer uses a commitment scheme [1]. For example, at the start of the game Ballmer could generate 500 random bits, append his chosen number in the range 1-100 to this, hash the result and then send you that hash: At the conclusion of the game, he sends you the 500 random bits, and you can check that concatenating his chosen number (now revealed) to those bits and hashing the result produces the hash he originally sent. (If Ballmer lies and changes his number, he would need to somehow come up with 500 bits that when concatenated with this different number still produce the original hash. This is hard.)
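A sketch of that commitment scheme in Python, with SHA-256 standing in for the hash and 64 random bytes for the ~500 bits:

```python
import hashlib
import secrets

def commit(number: int) -> tuple[str, bytes]:
    """Commit to a number: hash random bytes + the number, reveal both later."""
    nonce = secrets.token_bytes(64)   # ~500 random bits
    digest = hashlib.sha256(nonce + str(number).encode()).hexdigest()
    return digest, nonce              # send the digest now, keep the nonce secret

def verify(digest: str, nonce: bytes, number: int) -> bool:
    """Check that the revealed nonce and number reproduce the original digest."""
    return hashlib.sha256(nonce + str(number).encode()).hexdigest() == digest

# Ballmer commits before the game starts...
digest, nonce = commit(42)
# ...and at the end reveals (nonce, 42); the player checks:
print(verify(digest, nonce, 42))   # True  - the number wasn't changed
print(verify(digest, nonce, 43))   # False - any other claim fails
```

Changing the number after the fact would require finding a different (nonce, number) pair hashing to the same digest, i.e. breaking the hash function's collision resistance.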
It is by the author of HATETRIS, a variant of Tetris that always gives you the worst piece.
Well, rereading what he (was reported to have) said, I now think that probably was his intent, and he was just sloppy. At least, he can't have it both ways: Either he genuinely commits to a number at the outset, and uses the word "adversarial" to mean a very weak form of adversary (one that is defeated in expectation by TFA's mixed strategy), or he is using "adversarial" in the standard (strong) sense, in which case he must be lying about committing to a number, which is a shifty mind game as you say.
I feel like there's an even simpler proof that you can beat adversarial-ballmer, with exactly the same expected positive outcome as binary search vs random ballmer.
I call my algorithm "randomly offset binary search". It goes like this:
1. Pick a random number between 0-100, call this 'offset'
2. Perform the binary search algorithm, except at each step add 'offset' to the value and mod by 100.
That's it. Now, even if Ballmer knows you're using this strategy, he can't make it perform any worse by selecting any specific number. Therefore your expected outcome is still $0.20 per game, beating the strategy proposed in this blog post.
That's what I get for not thinking it through properly, thank you for pointing that out!
Please please stop with this "if he's rich he must be smart" argument. Please?
I worked on Windows Mobile at the time the iPhone came out. We were all shitting ourselves.
Ballmer's question seems fair for the complexity of the answer he was expecting.
As the interviewee you would presumably provide the (mathematically) wrong answer, but you'd show your thinking along the way, including a small demonstration of CS principles.
Keep in mind that Ballmer had a long career, so if he ever asked this question, it was probably back in the 80s when no one expected you to come up with the complex solution outlined in the post.
If you did outline the correct answer, that would be amazing, and you'd be an instant hire. But the question doesn't fundamentally seem broken to me because either answer (taking the bet or not) needs to be well justified.
What you're saying sounds to me like "your answer doesn't need to be correct, it just needs to sound reasonable". What you're filtering on with this question is good bullshitters.
To me, the only reasonable answer to this question is "I don't know". I think even a mathematical genius like Terence Tao would not be able to give you the answer to this on the spot. (Although I can also totally believe that he would instantly see this from some obscure theorem that only like 5 people on the planet know.)
And, indeed, what Microsoft really needed in the 80s was people who truly understood memory management in C, not gamblers left to hack their way into something that kind of worked sometimes. Microsoft's need to correct that hiring mistake later set them back significantly. Had they asked about the intricacies of C as it directly pertained to the job instead of unrelated trivia, they would be in a much stronger position now.
It doesn't matter that it was difficult to prove he was wrong. The issue is that it was impossible to prove he was right. And if anyone ever tried to bring that up to him, he never once heard them.
I believe an interviewer who is wrong and does not listen to you is a perfect example of the broken process. Especially given that he was an industry leader - in this interview he was providing a historical example of the process' merits - all while being entirely incorrect.
(typically there's discussion with all the interviewers and it isn't just "did the candidate get the question right or not"). I personally think a lot of big tech interview questions are dumb but I think the process isn't as broken as I thought, seeing it from both sides.
The question is whether Ballmer's ego would allow him this flexibility if it's his own question.
Some people might be very emotionally attached to the questions they created, but not so much to those they've been given as an interviewer.
I've fared well pointing out issues with questions in the past and gotten the job. I'd try to be diplomatic about it though and not outright say they're wrong. Instead pointing out how with a classical binary search the expected value is negative, but there are strategies from game theory to deal with adversarial picks and here you could reach a positive expected value.
Kind of a "yes, and..." approach. You acknowledge their view, and then you add a new perspective. But don't say they're wrong.
Funny enough in situations where I suspect the interviewer was given the question it probably wouldn't have helped, not due to emotional attachment, but because the interviewer had a tenuous grasp of the topic themselves and couldn't stray from the script they were given.
I'd say it's impossible to answer this question conclusively within the time frame of an interview. That makes it, in my opinion, a bad interview question.
My answer to this question would be to show that I understand what it would take to answer it correctly (you'd have to find a mixed strategy that has a positive expected value for every choice of number), but I wouldn't be able to give a confident "yes" or "no" answer on the spot. I think that's the only correct answer.
In practice, I think this question is advantageous for those who confidently blurt out an answer and then make up a heuristic argument for it. But a heuristic argument can be found for both "yes" and "no".
Presumably Ballmer did ask this question, at least a few times. And yet he never heard the correct answer, and believed the incorrect answer to be correct.
That tells you that if anyone did say "actually, you're wrong" he never listened to them.
In this case, it would just be showing that you can reason about binary search and showing that the mean profit is 0.20 dollars
At least we get some quality fiction like https://aphyr.com/posts/340-reversing-the-technical-intervie... and its sequels.
Is it fair? Does he change his choice or pre-record it? Can you play multiple times?
Purely random distribution, totally fair? Sure play the game every time, the math pans out. It’s not that though.
It’s about showing your work
This looks right. Good work!
This is a very nice book covering mixed strategy in game theory.
A very nice motivating example from the book: "There are two cards, an ace and a deuce. Player A draws either of the two at random; B does not see which card is drawn. If A has drawn the ace, he says "I've got the ace" and demands a dollar from his opponent. If A has drawn the deuce, then he may either (A1) say "I've got the ace" and demand a dollar from his opponent or (A2) confess that he has got the deuce and pay his opponent a dollar. The opponent, if he is paid the dollar voluntarily, can only accept it. If, however, a dollar is demanded from him, then he may either (B1) believe that player A has got the ace and give him the dollar or (B2) demand a check so as to see whether A's statement is true or not. If it is found that A does have the ace, B must pay A two dollars. If, however, it is found that A is bluffing B and has the deuce, player A pays B two dollars. Analyze the game and find the optimal strategy for each player and the expected payoff."
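Under my reading of those payoffs, A only has a real choice when holding the deuce (bluff or confess) and B only when a dollar is demanded (believe or check), so the game collapses to a 2x2 zero-sum matrix of expected payoffs to A that can be solved directly:

```python
from fractions import Fraction as F

# Expected payoff to A per deal for each pure-strategy pair (the deuce is
# drawn half the time): rows = A's play with the deuce (bluff, confess),
#                       cols = B's response to a demand (believe, check).
M = [[F(1), F(0)],     # bluff:   .5*(+1) + .5*(+1) = 1 ;  .5*(+2) + .5*(-2) = 0
     [F(0), F(1, 2)]]  # confess: .5*(+1) + .5*(-1) = 0 ;  .5*(+2) + .5*(-1) = 1/2

# Mixed-strategy equilibrium of a 2x2 zero-sum game with no saddle point:
a, b, c, d = M[0][0], M[0][1], M[1][0], M[1][1]
denom = a - b - c + d
p = (d - c) / denom           # P(A bluffs when holding the deuce)
q = (d - b) / denom           # P(B believes a demand)
value = (a * d - b * c) / denom
print(f"A bluffs {p}, B believes {q}, value to A = ${value} per deal")
```

Which says A should bluff with the deuce a third of the time, B should believe (rather than check) a third of the time, and the game is worth a third of a dollar per deal to A.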
People are unable to think randomly. They'll avoid the obvious "not random" numbers 2 and 99, for example. I read somewhere that most people, asked to pick a number between 0 and 10, will pick 7. And the next digit would probably be odd, and not 5, because 5 is not random. That leaves you with 71, 73, 77 and 79. 77 is not random, so 71, 73 or 79. I'd pick 73 as my first guess.
I'd say those were good odds!
(That's why when you're picking a "random" number, it's best to use an actual dice.)
This is how you win at rock-paper-scissors, too.
Ballmer could also change the number he's thinking of as you make guesses, so part of the game would be guessing what he's thinking.
If I were to take the bet with him, I'd make him write down the number first and hide it (turn the paper over / put it under a book, whatever).
I guess this is part of the clarifications one normally asks in an interview setting, but he specified numbers, not integers. One could choose (pi*2)/2 and you would owe a lot of money.
https://docs.google.com/spreadsheets/d/e/2PACX-1vThljkK2nUIL...
I find it interesting: it is definitely symmetrical, but I did not expect that in the final result 1 and 98 could matter as starting values, while 2-17 and 82-97 are not used at all.
The initial set of strategies wasn't very diverse and compensated for the binary search "weaknesses" on the ends of the spectrum by sometimes guessing 1 and 98.
But after adding some more pure strategies to the set, we've got a far better mixed strategy that prefers the numbers between 28-70 as the first pick: https://github.com/gukoff/ballmer_puzzle#winning-strategy
> Avg win if Ballmer chooses randomly: $0.16247848000093376
> Win if Ballmer chooses adversarially: $0.14657033010415976
So the goal is to find a set of strategies where the adversarial avg win == random avg win? Or will these numbers never be equal?

You can get the EV a lot closer to the optimal +$0.20 (although I was unable to prove how close) by dropping the requirement "do not increase worst-case complexity for the binary search", as this is lost with initial guesses outside 36-64 anyway. Deviating at a higher depth makes punishing specific guesses in the tails a lot cheaper, only giving up 1-2 cents of EV.
Otherwise it is a hidden mutable-information game where Ballmer dynamically changes higher/lower for maximum tree depth and always makes you lose.
As others said, if you don't expect adversary behavior in your data, it should be good enough.
> If Ballmer is choosing his secret number uniformly at random, then the expected value of the game is [that you win $0.20]. But, as Ballmer points out in the linked video, if he knows you’re going to do a plain old binary search, then he certainly won’t ever choose 50 as his secret number. In fact, he has no reason to choose 25 or 75 either. Or 12, 13, 37, 38, 62, 63, 87, or 88. If Ballmer avoids those eleven numbers, and chooses uniformly at random from among the rest, that alone is enough to lower the expected value of the game from [$0.20] to about [−$0.0045].
So I think Ballmer was being perfectly honest in what he said: he does know a strategy that makes the expected value of binary search counterintuitively negative, and that strategy is (as he says explicitly) to avoid the first few numbers that you're going to guess. No further speculation about errors or deception on his part is needed.
Huh? I have 100 - (1+2+4+8+16+32) = 100 - 63 = 37, where 2^i numbers can be guessed after exactly i wrong guesses plus one correct guess.
(Also, since six bits will serve to identify 64 different numbers, it should be impossible to have more than 36 numbers that can't be identified that way.)
I'll update with boring manual data later.
---- update ----
That was wrong; using a naive guessing method, I found 37 values that require 7 guesses.
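A quick check, assuming a standard floor-midpoint binary search over 1..100:

```python
def guesses_needed(secret, lo=1, hi=100):
    """Count the guesses a floor-midpoint binary search takes to hit `secret`."""
    count = 0
    while True:
        count += 1
        mid = (lo + hi) // 2
        if mid == secret:
            return count
        if mid < secret:
            lo = mid + 1
        else:
            hi = mid - 1

depths = [guesses_needed(n) for n in range(1, 101)]
print({k: depths.count(k) for k in range(1, 8)})
# {1: 1, 2: 2, 3: 4, 4: 8, 5: 16, 6: 32, 7: 37}
```

Levels 1-6 of the implicit search tree are full (1 + 2 + 4 + 8 + 16 + 32 = 63 numbers), so the remaining 100 − 63 = 37 numbers need a seventh guess.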
It seems the real moral here is: The best time to plant a tree was 20 years ago.
a better moral of the story would be "a billion dollars does not guarantee that someone is right"
- Nowhere does it say he has to choose a whole number; he could choose a fraction (55.25) or even an irrational like pi. The number of questions could be infinite.
- Nowhere does it say he may not change his number while the game runs.
You pay upfront for each question, and you hope the game is not somehow rigged. It is not just a question of algorithms.
Also, money you win is taxable income, while what you pay out gambling is not a deductible expense...
I'd recommend never learning about philosophy as you'll disappear into nihilsm.
And lottery wins aren't taxable everywhere on the planet (e.g. the UK), so you made the same "mistake" as the author too!