> [...]
> But Goldman Sachs' misfire is perhaps the most curious.
The model said there was a lot of uncertainty, and as it happens, it was entirely correct. A World Cup chance of 18.5 percent means that 4 out of 5 times the team will not win, and the fact that this was the highest chance of any team does not say much about the model.
And in general this is one instance of the well-practiced journalistic technique of waiting for the results first and then defining a bar afterwards, so the results can be criticized against standards that did not exist when the performance happened. (I guess in this case it is even worse: we could construct a reasonable test of how the model performed. I suspect such a test was in the original paper, and that the journalist either did not understand it or, more likely, chose to ignore it in favor of writing a better story.)
They overranked Germany, and underranked Croatia. Nearly every other person in the world did the same.
Look how disingenuous the Bloomberg article is. "Goldman Sachs updated the model throughout the tournament. It predicted a Brazil-Spain final on June 29 and Brazil-France on July 4. Its most recent prediction had England and Belgium squaring off for the cup. Both were eliminated in the semifinals." But their actual Brazil-France prediction had 8 teams left, and the winners of that round were all in the top 5. https://twitter.com/GoldmanSachs/statuses/101448576794142720... They even had Croatia over England, and France over Belgium.
A modern model would account for the fact that those numbers alone mean nothing, because they don't. Those are the numbers broadcasters reluctantly put on a screen for entertainment value, but they don't have real analytical power because they have no comparative metric.
How up or down were each of those numbers against previous wins and losses for each team?
What was Brazil's conversion from on-target shots before the tournament?
What was Belgium's success/failure rate on on-target shots they were defending against?
Likewise the other way around: were Brazil guilty of particularly poor defending? Were Belgium finding ways of making on-target shots count against all opposition, or was it luck on this game?
Any human analyst could tell you going into that game that Belgium were "lucky" and freely scoring beyond expectations, able to make more of fewer opportunities. Likewise, the consensus from most experts was that Brazil were guilty of mild complacency, that the team was young and not yet formed into a strong unit (rather still just 11 strong individuals at any one point in time), and that their on-target shots - whilst frequent - had a lower probability of turning into goals due to distance, power, position, etc.
So why did the Bloomberg model not pick that up?
I actually think they did pretty well, all things considered, but I'd love to see whether they did any runs on previous World Cups to check their thinking and whether they over-fitted a little to a couple of key metrics. I think the lack of metrics from previous games might mean they relied on some headline numbers, but there's more that they could have done to get a better model here...
Still, it's not their job, is it? Just a bit of fun... which is just as well, because I find it all a little bit amusing.
But do you need a sophisticated model and lots of so-called "AI" to arrive at the conclusion that there's a lot of uncertainty?? The point of the model is to reduce uncertainty, not find that it's there and do nothing about it.
And no, you don’t need statistics or machine learning to say “there is a lot of uncertainty”, but you do in order to quantify that uncertainty.
When predicting the result of an A-or-B contest, the bar is already defined. Either the system gets it right or it doesn't; if it gets it right more often than not, then (despite this being poor grounds mathematically, on a small result pool) the popular press will report it as successful.
IMO if matches become easy to predict then rules will change to reduce that predictability.
I disagree: If team A has a 10-30% chance of winning, and A pulls off the upset, the correct answer was not "A Wins" it was "B has a 70-90% chance of winning".
For Goldman Sachs' investments, the bar is not to predict that A wins or that B wins, it's to predict the probability and variance regarding which team will win. Of course, from a single upset game, it's impossible to tell whether these estimates are correct. You'd need to see the success or failure of many trials.
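To make the "many trials" point concrete, here's a minimal simulation sketch (all numbers invented): a single upset cannot falsify a probability estimate; only the long-run frequency over many trials can.

```python
import random

random.seed(0)

def observed_frequency(true_prob, n_trials=10_000):
    """Fraction of simulated matches in which team A actually wins."""
    return sum(random.random() < true_prob for _ in range(n_trials)) / n_trials

stated = 0.25  # forecaster says "A has a 25% chance"
print(f"stated {stated:.2f}, observed {observed_frequency(stated):.3f}")
# After one match, A winning or losing tells you almost nothing; after
# thousands of matches, the observed frequency should sit near 0.25 if
# the estimate was any good.
```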
Score it yourself against implied probabilities from Betfair for example and marvel at the suckage.
I think this is a smoke signal. Soccer is corrupt; you can't predict the winner unless you know what's being passed around under the table. Goldman Sachs does these predictions so people read between the lines to see how corrupt it is.
My argument is: "Goldman is amazing at statistical analysis and they routinely practice it on much tougher models (the global economy), so they should have no problem predicting a simpler model (soccer). But since they drastically failed at predicting soccer, then there must be an equally drastic variable missing from their predictions. Since we can trust Goldman to use all available public information in their analysis, there must be critical information that is hidden from the public which affects the outcomes". I make some assumptions, but it's fairly sound, no?
In the world of sports betting/analytics, you have baseball and basketball at the forefront, and then American football, soccer, and hockey (roughly in that order).
Off the top of my head, there are several reasons why the latter three sports have all lagged behind:
-Lack of data
It wasn't until the last 4-5 years that widely available, affordable, and accurate data for soccer matches existed. Companies like Opta have accomplished this by outsourcing the watching of games and the manual tagging of events, made possible by the advent of cheap cloud computing.
It should be self-evident why tracking the position and actions of 22 players is more complicated than something like baseball, where for the most part you are looking at one pitcher vs. one batter, much of which can be automated with computer vision that tracks pitch position, speed, and spin.
-Complexity
It's no accident that baseball was the first sport to be revolutionized by analytics. Most of the time, it's a static game with a clearly defined action set: do I swing at the pitch or not? Do I throw a fastball or not? Do I attempt to steal a base or not?
In games like American football, soccer, and hockey, you have anywhere from 12-22 players on the field at a time. Tracking what the players without the ball or the puck are doing is a difficult task technically, as is quantifying their impact. Concepts like expected goals and expected goals added are recent ones.
-Sample size
Typical elite soccer leagues see each team play every other team twice. In England and Spain, where the top flight has 20 teams, this means 38 games per season.
Baseball has a 162-game season plus playoff games, basketball has an 82-game season plus playoff games, etc. Couple that with the fact that quality data has only been collected for a few years, and you get other problems.
In basketball and baseball, the effects of aging on player performance and statistics are fairly well understood now. We can generally calculate the 5-year market value of a player, etc. In the other sports I mentioned, we don't yet have that kind of time series data to be able to make those judgements.
--
Specific to the World Cup, there are other reasons why you may find it hard to predict results.
-Team chemistry and style
Even though the World Cup is the most high-profile soccer event in the world, most players spend only 1-3 months a year with their national teams. Their "day jobs" with their club teams take up most of their playing time and attention.
As anyone who has played the game Football Manager will know, managing a national team is a tough job. You have no say over how the players are practicing when they're away from you, and no control over the physical condition in which they arrive at the World Cup. This year, there was barely a month between the end of the regular European seasons and the start of the World Cup.
In that month, you have to get at least 11 players who have barely played with each other to learn your style of play. Do you want to play a pressing style? Are you attempting a slow buildup, or trying long balls? Etc. etc.
-Home field advantage
In baseball and basketball, most modern statistical models account for home field advantage. Having 60,000 Russian fans chanting and heckling likely played a role in the team's ability to upset Spain, particularly during penalty kicks.
This goes back to the sample issue. How many times before have Spain played Russia IN Russia in front of a large crowd? Probably never.
---
All this is to say, cut Goldman some slack. There are a number of non-nefarious reasons why you may expect a soccer model to produce some spectacular miscues.
Pretty sure you just inadvertently identified why GS is so “great” at predicting economic movements.
The World Cup predictions from Goldman Sachs (and also UBS) are a form of recreation and entertainment with machine learning. It's an expression of quant nerd humor.
Analogous intellectual games would be engineers devising ridiculous Rube Goldberg contraptions[1] or programmers building "enterprise" FizzBuzz[2].
(I think it would add to the fun if GS uploaded their raw data and models to Github for others to play with.)
>It certainly didn't predict the final opposing France and Croatia on Sunday.
True, but it did predict France as having a better chance of winning overall, handicapped though it was by a tougher draw. It also predicted France beating Croatia in the round of 16 instead of the final. The pdf says:
>While Germany is more likely to get to the final, France has a marginally higher overall chance of winning the tournament,
[1] https://en.wikipedia.org/wiki/Rube_Goldberg_Machine_Contest#...
[2] https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...
It's also Goldman Sachs and UBS choosing to attach their names to these and stake some reputation on these predictions. If they had hit the bullseye, they would be lauding these results.
For example, imagine a tournament with a large number of participants, where the winner is picked simply by fairly choosing a single random participant.
If I then gave you all the perfect historical data going back decades, you could do statistical analysis and determine that the winner is completely random and therefore the probability of success, for any particular participant, is p ≈ 1/n, where n is the number of participants. Your confidence in correctly predicting any particular outcome will drop as n rises.
Not everything can be easily predicted just because you have enough data.
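A toy version of that thought experiment (parameters invented): even with a long, perfectly recorded history, no analysis can beat p ≈ 1/n.

```python
import random
from collections import Counter

random.seed(1)
n = 32  # participants

# Decades of "perfect" history: each winner drawn uniformly at random.
history = [random.randrange(n) for _ in range(10_000)]

# The best any model can recover is the uniform distribution itself:
# every participant wins about 1/n of the time, so no prediction of a
# single tournament can be confident.
most_common_rate = Counter(history).most_common(1)[0][1] / len(history)
print(f"best empirical rate {most_common_rate:.3f} vs 1/n = {1/n:.3f}")
```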
It was not wrong to say Hillary had a 95% chance of winning the presidential election, but the confidence was low and that value still allowed for the opposite result to happen.
Also, football has a lot of variance between team capability and end results. The better team might (and often does) lose, especially when it goes to a penalty shootout.
In basketball, the stronger team will easily score more in most cases.
So she had a 95% chance of winning with 50% probability, or what?
Say you make the assumption that the quantity being estimated is truly fixed: that there's some true value for the force of gravity or some true value for the number of people that vote for X or Y.
The second assumption that comes along is that the observed stochasticity comes from your perspective of observation, not from the ground truth. To be more blunt: of all the observations you could make, 95% of them would yield the result observed... but the ground truth is still fixed. Gravity has a fixed value, despite your experimental error, and you may have been lucky enough to observe it in your sample.
Predicting elections with frequentist methods has this same characteristic, except the observed quantity itself shapeshifts and even lies... so then there are other complications that need to be dealt with.
This is where that 50% feeling comes from. There are two outcomes, and one will be true. Your data analysis just tells you that if you repeat your procedure, you'd expect 95% of those results to give you the outcome you observed.
Consider: if someone offered to give you $2 every time a fair coin toss came up heads, and take $0.50 every time it came up tails, you'd be foolish not to take that bet as many times as you can, because you know the coin has exactly a 50% chance of coming up heads (the expected value is (0.5 x $2) - (0.5 x $0.50) = $0.75 per toss).
However, if it was an unfair coin, you'd want to know the degree to which it was unfair, and you'd have to measure it. How much do you trust those measurements? You might say that you're 90% sure that the coin has a 40-60% chance of coming up heads, or give a probability of 2% that a $1.04 to $0.96 wager would be profitable while a $1.03 to $0.97 wager would be unprofitable.
Hillary had a 95% chance to win the election. But on top of the fact that 1 in 20 times she'd lose that election if that really was the probability, the 95% number was uncertain because the measurements were difficult to pin down - maybe she'd have lost 1 in 40 times, or maybe she'd have lost 1 in 5 times. All we know now is that she lost, and that many of the assumptions and measurements the pollsters had to make concerning factors like voter turnout, nationalism, corruption, foreign interference, debate results, and fundraising turned out to be inaccurate.
With unfair coin measurements, you can get fairly accurate numbers with just a handful of tests. When predicting election results or World Cup games, you're much less likely to make an accurate estimate. The confidence is an estimate of how likely that estimate is to be accurate.
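A rough sketch of that measurement step, with invented numbers: estimate the coin's bias and how wide the uncertainty around the estimate is.

```python
import math

# Hypothetical data: 43 heads in 100 tosses of a possibly unfair coin.
heads, tosses = 43, 100
p_hat = heads / tosses

# Normal-approximation 95% interval (assumes the sample is large enough
# for the approximation to be reasonable).
se = math.sqrt(p_hat * (1 - p_hat) / tosses)
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"p ≈ {p_hat:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
# With only 100 tosses the interval is roughly ±0.10 wide: we can only
# call the coin unfair if the interval clearly excludes 0.50.
```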
What's wrong is thinking that a 95% chance of winning means they will win.
For example, if you are estimating the height of a male in the US, you would collect data on US males and take the average. But unless you surveyed every male in the US, there is some error associated with your estimate. So you would construct either error bounds (a frequentist approach) or a probability distribution (a Bayesian approach) around the mean height. Your results might then say that the mean height of the American male is 5'11, plus or minus 2 inches. Those two inches represent uncertainty in your data collection. That's exactly what is done here, but with a percentage instead of a height. Outlets may predict Hillary winning at 95%, but the reality is their methodology should provide a plus-minus value around that number. The problem is that few of them actually report it.
But it gets more confusing. That error bound is only around the mean. Pick a random guy out and not only will he likely not be 5'11, there is a decent chance he will be outside that range of 5'9 - 6'1. You will get 5'7 guys and 6'4 guys pretty commonly. In the case of the election, it may actually be true that Hillary had a chance between, say, 93% and 97% of winning. But even if that is the case, she will still lose between 3-7% of the time. And since we only have one reality to observe, we can't know if she lost simply because we saw that 3-7% realized, or because the people coming up with that number screwed up. That's why groups like 538 deserve more leeway. When they say that Donald Trump has a 30% chance of winning and he wins, that's not that crazy. And therefore there is much less reason to assume they screwed something up than the people who predicted a 5% chance of Trump winning. It's possible those models were right, but it's much less likely.
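A minimal sketch of the height example (the sample data is made up), showing the difference between uncertainty about the mean and the spread of individuals:

```python
import math
import statistics

# Hypothetical sample of US male heights, in inches.
heights = [69, 71, 74, 66, 70, 72, 68, 73, 67, 70]

mean = statistics.mean(heights)
sd = statistics.stdev(heights)
se = sd / math.sqrt(len(heights))  # standard error of the mean

# The error bound around the *mean* shrinks with sample size...
print(f"mean {mean:.1f} in, ± {1.96 * se:.1f} (uncertainty about the mean)")
# ...but *individuals* still scatter with the full standard deviation.
print(f"individuals spread roughly ± {1.96 * sd:.1f} around the mean")
```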
The reason events like the World Cup are far more interesting is that they take place over a shorter period of time.
I think the problem here isn't the event but rather the sport. Something like snooker or tennis will offer the same brevity over the period but with chance playing a less significant role due to the number of games played per match.
That all said, if my years of watching snooker have taught me anything, it's that people are not machines and thus will perform vastly differently from day to day depending on what mood they're in.
Analytically, the difference between the Premier League and the World Cup is that you have momentum and continuity in the Premier League, while the World Cup is essentially one shot. So in the PL, team A will play teams E and G and H before it plays team B; team B may play teams E and H and Q (which played G). Team A may be winning games that your strength model shows they should lose, team B may be losing games... and so on. There is more evidence that might matter. More importantly, you can be wrong quite a lot of the time in a season and still be right at the end of it (as the bounces of the ball even out over time). Not so in the World Cup - one goal knocks you out and there is no coming back! Basically, the World Cup demands an algorithm that works with less evidence and with a much higher degree of accuracy.
Can you give us some examples?
In the NBA, NHL, or MLB, seven-game series tend to even out the variance, so the best team usually wins. And even in NCAA basketball, there's enough scoring that any individual play loses significance.
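As a quick sanity check on the series point, here's a hedged sketch (the per-game edge is invented) of how a best-of-7 amplifies the better team's advantage:

```python
from math import comb

def series_win_prob(p, games=7):
    """Probability the better team (per-game win prob p) takes a best-of-N
    series; winning the series is equivalent to winning a majority of N."""
    need = games // 2 + 1
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(need, games + 1))

p = 0.55  # hypothetical per-game edge
print(f"single game: {p:.3f}, best-of-7: {series_win_prob(p):.3f}")
# Roughly 0.550 vs 0.608: repetition evens out the variance, which is
# exactly what a one-shot knockout format never gets to do.
```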
Also, I've seen some people say (not in this forum) that banks now look stupid because they're in the business of making predictions and they can't even get the world cup right. Guess what? Banks make no money on predictions. They make money on flows and taking spreads on trades they do with clients. Any research or prediction is meant to be a catalyst for that trade.
You're mostly right but to further clarify, an investment bank like Goldman Sachs has revenue from mostly "market making" spreads but it does also have activities that depend on predictions such as their proprietary trading (before the Volcker Rule shut them down) and their GSAM (Goldman Sachs Asset Management) fund. The GSAM is basically a hedge fund for their wealthy clients' money. They will run predictions on macro trends on data like interest rates, commodities, indexes, etc to help them pick stocks for their portfolio.
As the pdf noted, the World Cup data models and simulations came from Adam Atkins of GSAM.
The rule was too complex and onerous to be implementable. Case in point: it's already being rolled back... partly because of the current administration, but more because it was just a poorly written and poorly thought out idea to start with.
This is an enterprise for bookies, not Goldman Sachs.
> The ludic fallacy, identified by Nassim Nicholas Taleb in his 2007 book The Black Swan, is "the misuse of games to model real-life situations."
...
> The alleged fallacy is a central argument in the book and a rebuttal of the predictive mathematical models used to predict the future – as well as an attack on the idea of applying naïve and simplified statistical models in complex domains. According to Taleb, statistics is applicable only in some domains, for instance casinos in which the odds are visible and defined.
Both of Taleb's books, "The Black Swan" and "Fooled by Randomness", offer an interesting take on such models. Meanwhile, most economists know about "Knightian Uncertainty" [1], which concerns the distinction between risk and uncertainty.
> "Uncertainty must be taken in a sense radically distinct from the familiar notion of Risk, from which it has never been properly separated.... The essential fact is that 'risk' means in some cases a quantity susceptible of measurement, while at other times it is something distinctly not of this character; and there are far-reaching and crucial differences in the bearings of the phenomena depending on which of the two is really present and operating.... It will appear that a measurable uncertainty, or 'risk' proper, as we shall use the term, is so far different from an unmeasurable one that it is not in effect an uncertainty at all."
In [0] you have the following:
> The ludic fallacy, identified by Nassim Nicholas Taleb in his 2007 book The Black Swan, is "the misuse of games to model real-life situations."
And he gives an example of this:
> One example given in the book is the following thought experiment. Two people are involved:
> Dr. John who is regarded as a man of science and logical thinking
> Fat Tony who is regarded as a man who lives by his wits
> A third party asks them to "assume that a coin is fair, i.e., has an equal probability of coming up heads or tails when flipped. I flip it ninety-nine times and get heads each time. What are the odds of my getting tails on my next throw?"
> Dr. John says that the odds are not affected by the previous outcomes so the odds must still be 50:50.
> Fat Tony says that the odds of the coin coming up heads 99 times in a row are so low that the initial assumption that the coin had a 50:50 chance of coming up heads is most likely incorrect. "The coin gotta be loaded. It can't be a fair game."
> The ludic fallacy here is to assume that in real life the rules from the purely hypothetical model (where Dr. John is correct) apply. Would a reasonable person bet on black on a roulette table that has come up red 99 times in a row (especially as the reward for a correct guess is so low when compared with the probable odds that the game is fixed)?
So Nassim Taleb wanted to discuss "using games to model real-life situations" and to demonstrate the pitfalls he uses two characters. He _portrays_ the characters as "man of logical thinking" vs "man who lives by his wits", but as we'll see he's missing one dimension to his characterization.
The first problem here is that he's implicitly suggesting to the reader that the decisions of the "man of logical thinking" represent the pitfalls of "applying games to model real-life situations", whereas what the other guy's decisions represent is... not specified, but clearly has a better outcome.
The second problem is that he conflates "applying something you read in some textbook to real life without thinking" with "modelling real life". He suggests to the reader that these two people are "logic" vs "instinct", but they're not. They're a dumb guy who knows math vs a smart guy who doesn't know math. _Obviously_ real life is more complex than your textbook examples, and so the smart guy is going to win, because his fuzzy heuristics beat the first guy's decisions, which are optimal only within his flawed model. An actually smart and logical person would update his model based on new evidence (i.e. "I was told that this coin was 50-50, but the chance of what I just saw is so small that it's more likely I was lied to") and then use math to make predictions and beat the guy who's smart but doesn't know math.
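For what it's worth, the update described above is a two-line Bayes computation (the prior and the loaded-coin bias here are my assumptions):

```python
# Compare "the coin is fair" against "the coin is loaded" after 99 heads.
prior_fair = 0.999        # start out very trusting (assumption)
prior_loaded = 1 - prior_fair
p_heads_if_loaded = 0.95  # hypothetical bias of a loaded coin

like_fair = 0.5 ** 99
like_loaded = p_heads_if_loaded ** 99

posterior_fair = (like_fair * prior_fair) / (
    like_fair * prior_fair + like_loaded * prior_loaded)
print(f"P(fair | 99 heads) ≈ {posterior_fair:.2e}")
# Even a 99.9% prior belief in fairness collapses to ~1e-25: Fat Tony's
# conclusion is what the math gives, once the model itself is questioned.
```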
So ironically, he wants to portray the dangers of using over-simplified models and to do that he uses an example where he obscured one dimension.
Nassim Taleb is really good at rhetoric but light on substance.
Basically a book by Nassim Taleb is an incoherent summary of the books that Nassim Taleb has read within the past year, with a few morsels of recycled insight here and there.
I’m not sure why there are so many people who take him seriously.
There's no downside, only free publicity. If they, by good fortune and a following wind, get it right - then the publicity is incredible. If it's wrong they laugh and say "well, better stick to predicting what we're good at!" and they still get a shitload of headlines and awareness of their product.
This was not a mistake.
I compared the logloss for their predictions with the "uniform" benchmark (giving each team 1/32 probability of winning, 1/16 probability of getting to the finals, etc) and the results are the following (if I transcribed the data properly):
Getting to second round:
GS: 0.495 UBS: 0.495 bench: 0.693
Getting to quarter-finals:
GS: 0.463 UBS: 0.459 bench: 0.562
Getting to semi-finals:
GS: 0.310 UBS: 0.327 bench: 0.377
Getting to final:
GS: 0.231 UBS: 0.269 bench: 0.234
World Cup winner:
GS: 0.097 UBS: 0.113 bench: 0.139
The performance of the models was OK until Croatia got to the final. This especially hurt UBS, which predicted less than a 0.9% probability of that event (compared to 2.1% in Goldman's model).
Edit: these would have been the "best case" scores (if the highest-probability teams had advanced to each round, ignoring that this may be impossible due to the structure of the tournament):
GS: 0.432 0.302 0.220 0.141 0.079
UBS: 0.365 0.251 0.176 0.111 0.070
UBS could potentially achieve lower logloss metrics because it had more extreme predictions.
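For anyone who wants to reproduce this kind of comparison, a minimal sketch of the logloss computation (the four predictions are a made-up mini-example; only the uniform benchmark matches the numbers above, since ln 2 ≈ 0.693):

```python
import math

def log_loss(predictions):
    """Mean negative log-likelihood over (predicted_prob, outcome) pairs."""
    return -sum(math.log(p if won else 1 - p)
                for p, won in predictions) / len(predictions)

# Each pair: (model's probability that the team reaches the round, did it?).
model = [(0.875, True), (0.498, True), (0.132, False), (0.723, False)]
# Uniform benchmark for the round of 16: 16 of 32 teams advance, so p = 1/2.
uniform = [(0.5, won) for _, won in model]

print(f"model: {log_loss(model):.3f}, uniform: {log_loss(uniform):.3f}")
```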
The author entirely misses the way these models work: the larger the entity, the more statistics and averages kick in, and as a result the better a model can be built.
- motivation (Germany and Croatia were the two extremes here, no idea how to measure it)
- team cohesion (number of articles in a few journals questioning the team cohesion, maybe also articles about individual players)
- creativity in offense (maybe measurable via "target missed from close distance" + "ball passed front of the goal")
- number of errors in defense that didn't lead to a goal
- percentage of times ball possession was lost from own goal to enemy's area (England was really bad here against Croatia)
This one would benefit possession-based teams, so it would fail to give decent odds to the current world and European champions (France and Portugal respectively), who don't play possession. Of course it's possible they're outliers, but we'll never know.
As you identified, motivation could be pretty hard to measure ... but even if we could it might be a pretty poor predictor anyway. France in the early stages didn't look very motivated, while England and Colombia looked pretty lively.
Team cohesion - the German team were pretty consistent (not dazzling, but consistent) and we know how that ended. Again France didn't really impress until the latter stages of the WC.
Creativity in offense - I guess it can indicate a sort of calm or confidence in front of goal, but it can actually be seen as pretty negative. For example, Arsenal a few years back came under fire for having plenty of possession in the 18-yard box but failing to convert. Spain's confident quick pass-and-move "tiki-taka" was ever-present and has, in my eyes, been impotent in the last few years (and, more importantly for a neutral viewer, very frustrating to watch).
Defensive errors that didn't lead to a goal could be a nice indicator of the ability of a defence to pick up after each other's mistakes - but at the same time, the errors that do lead to goals (i.e. Croatia's second goal in the final) are relatively rare, and the lack of a goal could just point to the opposing team's inability to convert, due to a poorly organised attack or a lack of opportunism from their strikers.
I'm not sure what you mean by the last one, but I think this could be a nice one - if you mean "times you lost possession in your own half". A profligate midfield and defence is bound to ship goals; I doubt there are many teams that can either fight back after trailing by a goal or two, or score enough to maintain a reasonable buffer.
I applaud the effort though - it takes more creativity and care to think of some new angles (like you did) than to think of some possible counter examples (like I did)!
> I'm not sure what you mean with the last one, but I think this could be a nice one - if you mean "times you lost possession in your own half"
Almost: England lost the ball frequently (> 50+x% with a large x, as far as I could see) due to the keeper sending out long balls. I'd like to measure that somehow. It could be done via the number of seconds in possession after a goal kick, an indicator of whether a hypothetical 85% marker of the field was reached, or by measuring whether the ball was successfully passed at least five times (or resulted in a goal) - a rough sketch of that last metric is below.
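Here's what computing that from tagged event data might look like - the data layout and field names are entirely invented for illustration:

```python
def buildup_quality(possessions, team, min_passes=5):
    """Share of a team's goal-kick possessions kept for >= min_passes passes."""
    own = [p for p in possessions
           if p["team"] == team and p["start"] == "goal_kick"]
    if not own:
        return 0.0
    kept = sum(p["passes_completed"] >= min_passes for p in own)
    return kept / len(own)

# Hypothetical events: one record per possession starting with a goal kick.
sample = [
    {"team": "ENG", "start": "goal_kick", "passes_completed": 1},
    {"team": "ENG", "start": "goal_kick", "passes_completed": 7},
    {"team": "CRO", "start": "goal_kick", "passes_completed": 6},
]
print(buildup_quality(sample, "ENG"))  # 0.5
```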
If anything, it worked worse.
"If anything"? All the results are available, so it would be easy to put a precise number on this. Measure the Bayesian regret, or just report the winnings if you had used the GS model to bet on the outcomes. Unless it reports some concrete numbers, this article is garbage.
It doesn't report any concrete numbers.
The reason is easy to see. The game can be decided by one, two, or three key plays. Compare that to basketball: to win a game you have to consistently score more and defend better. Rarely is the game decided by one or two plays; that only happens when the game is already very tight.
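A small simulation sketch of that point (all rates invented): give two teams the same sort of relative edge, but vary how many scoring events a game contains.

```python
import random

random.seed(2)

def better_team_win_rate(chances, p_better, p_worse, trials=20_000):
    """How often the better side outscores the worse one, given `chances`
    scoring opportunities each per game."""
    wins = 0
    for _ in range(trials):
        a = sum(random.random() < p_better for _ in range(chances))
        b = sum(random.random() < p_worse for _ in range(chances))
        wins += a > b
    return wins / trials

# Soccer-like: a handful of scoring chances per game.
print(better_team_win_rate(chances=15, p_better=0.12, p_worse=0.08))
# Basketball-like: a similar edge spread over ~100 scoring events.
print(better_team_win_rate(chances=100, p_better=0.55, p_worse=0.45))
# The low-event game leaves far more room for the better team to lose.
```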
The odds shortened as the tournament progressed, I was able to hedge as the shortened odds made lay betting profitable.
(High variance in football outcomes means there's no guarantee of profit, I don't bet big sums.)
If someone were to bet during the round of 16, say $1 on each of the bottom 8 and $2 on each of the top 8, the strategy would most likely yield a small profit or a small loss, rather than a total loss.
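A toy check of that strategy with invented decimal odds (real Betfair prices would differ):

```python
# $2 on each of the 8 favourites, $1 on each of the 8 underdogs.
favourite_stake, favourite_odds = 2, 1.6  # assumed decimal odds
underdog_stake, underdog_odds = 1, 4.0

total_staked = 8 * favourite_stake + 8 * underdog_stake  # $24

# One plausible round of 16: 6 favourites and 2 underdogs go through.
payout = (6 * favourite_stake * favourite_odds
          + 2 * underdog_stake * underdog_odds)
print(payout - total_staked)  # 19.2 + 8.0 - 24.0 = +3.2, a small profit
```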
https://news.ycombinator.com/item?id=17509407
Did this investment bank use the same set of algorithms that they use for financial predictions?
...And then I remember there was this Octopus[1] who used to predict winners with 85% accuracy
It's as silly as saying my claim for the odds of a nearly perfectly modelled coin toss (approximately 50/50) is wrong because a series of 10 coin tosses shows different results from my model. The model is not any less correct.
If I toss a fair coin, you cannot predict the next outcome. You can only say that if I toss the coin 1,000 times, then close to 500 are going to turn up heads, and another 500 are going to turn up tails.
It was stupid of Goldman Sachs or whoever to predict an outcome. It was stupid of anyone else to lend credence to that prediction.
Hopefully, Goldman Sachs is not relying on prediction of singular outcomes to make their investment decisions. I don't think they are. Probably just marketing brouhaha to ride the soccer wave. Although I'm not sure if that worked as expected.
If you read the actual report they did[0], they never claimed that any single outcome was more than 18.5% likely.
[0]: http://www.goldmansachs.com/our-thinking/pages/world-cup-201...
>"You can only say that if I toss the coin a 1000 times, then close to 500 are going to turn up heads, and another 500 are going to turn up tails."
Sometimes you can do that and every single flip will be heads. It's unlikely, and across zillions of universes you'd only find it once - but we don't have a pool of universes that we can sample statistically.
Team Groups Round_16 Quarters Semis Finals Out_In Probability
Brazil 87.5% 60.8% 42.0% 27.9% 18.5% Quarters 18.8%
France 81.4% 58.4% 36.6% 19.9% 11.3% Won 11.3%
Germany 80.5% 49.5% 30.5% 18.8% 10.7% Groups 19.5%
Portugal 75.2% 52.8% 32.2% 17.3% 9.4% Round_16 22.4%
Belgium 78.5% 51.1% 27.7% 15.8% 8.2% Semis 11.9%
Spain 72.3% 50.1% 28.8% 15.4% 7.8% Round_16 22.2%
England 73.1% 46.6% 24.4% 13.4% 6.5% Semis 11.0%
Argentina 79.7% 44.2% 24.1% 11.8% 5.7% Round_16 35.5%
Colombia 74.9% 37.3% 17.0% 8.5% 3.7% Round_16 37.6%
Uruguay 74.4% 34.6% 17.2% 7.2% 3.2% Quarters 17.4%
Poland 68.5% 30.5% 12.8% 5.8% 2.3% Groups 31.5%
Denmark 47.8% 26.3% 12.4% 5.2% 2.0% Round_16 21.5%
Mexico 52.0% 23.2% 10.5% 4.9% 1.9% Round_16 28.8%
Sweden 45.9% 19.4% 8.3% 3.7% 1.3% Quarters 11.1%
Iran 35.4% 18.1% 7.2% 2.6% 0.8% Groups 64.6%
Peru 37.3% 17.2% 6.8% 2.5% 0.8% Groups 62.7%
Australia 33.5% 15.4% 6.3% 2.3% 0.7% Groups 66.5%
Russia 47.9% 16.3% 6.0% 2.0% 0.7% Quarters 10.3%
Croatia 49.8% 16.9% 6.3% 2.1% 0.6% Finals 4.2%
Switzerland 52.8% 15.9% 6.1% 2.0% 0.6% Round_16 36.9%
Iceland 45.2% 15.1% 5.6% 1.8% 0.5% Groups 54.8%
Costa_Rica 36.8% 13.3% 4.7% 1.6% 0.5% Groups 63.2%
Serbia 32.9% 12.1% 4.5% 1.5% 0.5% Groups 67.1%
Japan 36.5% 12.8% 3.8% 1.3% 0.4% Round_16 23.7%
Saudi_Arabia 43.4% 12.7% 4.2% 1.3% 0.4% Groups 56.6%
Tunisia 35.2% 13.3% 4.1% 1.3% 0.4% Groups 64.8%
Egypt 34.4% 8.7% 2.5% 0.7% 0.2% Groups 65.6%
South_Korea 21.6% 5.9% 7.1% 0.5% 0.2% Groups 78.4%
Morocco 17.1% 6.8% 1.8% 0.5% 0.1% Groups 82.9%
Nigeria 25.2% 6.5% 1.7% 0.4% 0.0% Groups 74.8%
Senegal 20.1% 4.9% 1.2% 0.3% 0.0% Groups 79.9%
Panama 13.2% 3.3% 0.5% 0.1% 0.0% Groups 86.8%
[0]: Exhibit 2 in http://www.goldmansachs.com/our-thinking/pages/world-cup-201...
Edit: Fix copy-paste errors and atrocious maths.
Croatia went out in Finals
And I do not understand what the last column means (except for France and teams out in group phase)
The first two were just me making a mistake, because I wrote that in manually.
That last column makes no sense. It was supposed to be the probability that the model gave to the outcome that occurred, but I got the maths wrong.
So all in all, the only teams for which the prediction was more than 1/2 were teams out in groups. That is a little underwhelming.
Ah, for Croatia, I believe it should read 1.5% (the 2.1% chance of reaching the final minus the 0.6% chance of winning it).
Cheating.
Before people start booing, let's not forget where this tournament is being held, and all the other nefarious things that country has been up to recently.