> [...]
> But Goldman Sachs' misfire is perhaps the most curious.
The model said there was a lot of uncertainty, and as it happens, it was entirely correct. A World Cup chance of 18.5 percent means that 4 out of 5 times the team will not win, and the fact that this was the highest chance of any team does not say much about the model.
And in general this is one instance of the well-practiced journalistic technique of waiting for the results first and then defining a bar afterwards, so the results can be criticized against standards that did not exist when the performance happened. (I guess in this case it is even worse: we could construct a reasonable test of how the model performed. I suspect such a test was in the original paper, and that the journalist either did not understand it or, more likely, chose to ignore it in favor of writing a better story.)
They overranked Germany, and underranked Croatia. Nearly every other person in the world did the same.
Look how disingenuous the Bloomberg article is. "Goldman Sachs updated the model throughout the tournament. It predicted a Brazil-Spain final on June 29 and Brazil-France on July 4. Its most recent prediction had England and Belgium squaring off for the cup. Both were eliminated in the semifinals." But their actual Brazil-France prediction had 8 teams left, and the winners of that round were all in the top 5. https://twitter.com/GoldmanSachs/statuses/101448576794142720... They even had Croatia over England, and France over Belgium.
A modern model would account for the fact that those numbers alone mean nothing, because they don't. Those are the numbers broadcasters reluctantly put on a screen for entertainment value, but they don't have real analytical power because they have no comparative metric.
How up or down were each of those numbers against previous wins and losses for each team?
What was Brazil's conversion from on-target shots before the tournament?
What was Belgium's success/failure rate on on-target shots they were defending against?
Likewise the other way around: were Brazil guilty of particularly poor defending? Were Belgium finding ways of making on-target shots count against all opposition, or was it luck on this game?
Any human analyst could tell you going into that game that Belgium were "lucky" and freely scoring beyond expectations, able to make more of fewer opportunities. Likewise, the consensus from most experts was that Brazil were guilty of mild complacency, that the team was young and not yet formed into a strong unit (rather still just 11 strong individuals at any one point in time), and that their on-target shots - whilst frequent - had a lower probability of turning into goals due to distance, power, position, etc.
So why did the Bloomberg model not pick that up?
I actually think they did pretty well, all things considered, but I'd love to see whether they did any runs on previous World Cups to check their thinking and whether they over-fitted a little to a couple of key metrics. I think the lack of metrics from previous games might mean they relied on some headline numbers, but there's more that they could have done to get a better model here...
Still, it's not their job, is it? Just a bit of fun... which is just as well, because I find it all a little bit amusing.
But do you need a sophisticated model and lots of so-called "AI" to arrive at the conclusion that there's a lot of uncertainty?? The point of the model is to reduce uncertainty, not find that it's there and do nothing about it.
And no, you don’t need statistics or machine learning to say “there is a lot of uncertainty”, but you do in order to quantify that uncertainty.
When predicting the result of an A-or-B contest, the bar is already defined. Either the system gets it right or it doesn't; if it gets it right more often than not, then (despite this being poor grounds mathematically, on a small result pool) the popular press will report it as successful.
IMO if matches become easy to predict then rules will change to reduce that predictability.
I disagree: If team A has a 10-30% chance of winning, and A pulls off the upset, the correct answer was not "A Wins" it was "B has a 70-90% chance of winning".
For Goldman Sachs' investments, the bar is not to predict that A wins or that B wins, it's to predict the probability and variance regarding which team will win. Of course, from a single upset game, it's impossible to tell whether these estimates are correct. You'd need to see the success or failure of many trials.
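To make the "many trials" point concrete, here's a minimal simulation sketch (all numbers invented): a single upset cannot falsify a probability estimate; only the long-run frequency over many trials can.

```python
import random

random.seed(0)

def observed_frequency(true_prob, n_trials=10_000):
    """Fraction of simulated matches in which team A actually wins."""
    return sum(random.random() < true_prob for _ in range(n_trials)) / n_trials

stated = 0.25  # forecaster says "A has a 25% chance"
print(f"stated {stated:.2f}, observed {observed_frequency(stated):.3f}")
# After one match, A winning or losing tells you almost nothing; after
# thousands of matches, the observed frequency should sit near 0.25 if
# the estimate was any good.
```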
Score it yourself against implied probabilities from Betfair for example and marvel at the suckage.
I think this is a smoke signal. Soccer is corrupt; you can't predict the winner unless you know what's being passed around under the table. Goldman Sachs does these predictions so people read between the lines to see how corrupt it is.
My argument is: "Goldman is amazing at statistical analysis and they routinely practice it on much tougher models (the global economy), so they should have no problem predicting a simpler model (soccer). But since they drastically failed at predicting soccer, then there must be an equally drastic variable missing from their predictions. Since we can trust Goldman to use all available public information in their analysis, there must be critical information that is hidden from the public which affects the outcomes". I make some assumptions, but it's fairly sound, no?
In the world of sports betting/analytics, you have baseball and basketball at the forefront, and then American football, soccer, and hockey (roughly in that order).
Off the top of my head, there are several reasons why the latter three sports have all lagged behind:
-Lack of data
It wasn't until the last 4-5 years that widely available, affordable, and accurate data for soccer matches existed. Companies like Opta have accomplished this by outsourcing the watching of games and the manual tagging of events, made possible by the advent of cheap cloud computing.
It should be self-evident why tracking the position and actions of 22 players is more complicated than something like baseball, where for the most part you are looking at one pitcher vs. one batter, much of which can be automated with computer vision that tracks pitch position, speed, and spin.
-Complexity
It's no accident that baseball was the first sport to be revolutionized by analytics. Most of the time, it's a static game with a clearly defined action set: do I swing at the pitch or not? Do I throw a fastball or not? Do I attempt to steal a base or not?
In games like American football, soccer, and hockey, you have anywhere from 12-22 players on the field at a time. Tracking what the players without the ball or the puck are doing is a difficult task technically, as is quantifying their impact. Concepts like expected goals and expected goals added are recent ones.
-Sample size
Typical elite soccer leagues see each team play every other team twice. In England and Spain, where the top flight has 20 teams, this means 38 games per season.
Baseball has a 162-game season plus playoff games, basketball has an 82-game season plus playoff games, etc. Couple that with the fact that quality data has only been collected for a few years, and you get other problems.
In basketball and baseball, the effects of aging on player performance and statistics are fairly well understood now. We can generally calculate the 5-year market value of a player, etc. In the other sports I mentioned, we don't yet have that kind of time series data to be able to make those judgements.
--
Specific to the World Cup, there are other reasons why you may find it hard to predict results.
-Team chemistry and style
Even though the World Cup is the most high-profile soccer event in the world, most players spend only 1-3 months a year with their national teams. Their "day jobs" with their club teams take up most of their playing time and attention.
As anyone who has played the game Football Manager will know, managing a national team is a tough job. You have no say over how the players are practicing when they're away from you, and no control over the physical condition in which they arrive at the World Cup. This year, there was barely a month between the end of the regular European seasons and the start of the World Cup.
In that month, you have to get at least 11 players who have barely played with each other to learn your style of play. Do you want to play a pressing style? Are you attempting a slow buildup, or trying long balls? Etc. etc.
-Home field advantage
In baseball and basketball, most modern statistical models account for home field advantage. Having 60,000 Russian fans chanting and heckling likely played a role in the team's ability to upset Spain, particularly during penalty kicks.
This goes back to the sample issue. How many times before have Spain played Russia IN Russia in front of a large crowd? Probably never.
---
All this is to say, cut Goldman some slack. There are a number of non-nefarious reasons why you may expect a soccer model to produce some spectacular miscues.
Pretty sure you just inadvertently identified why GS is so “great” at predicting economic movements.
The World Cup predictions from Goldman Sachs (and also UBS) are a form of recreation and entertainment with machine learning. It's an expression of quant nerd humor.
Analogous intellectual games would be engineers devising ridiculous Rube Goldberg contraptions[1] or programmers building "enterprise" FizzBuzz[2].
(I think it would add to the fun if GS uploaded their raw data and models to Github for others to play with.)
>It certainly didn't predict the final opposing France and Croatia on Sunday.
True, but it did predict France as having a better chance of winning overall, handicapped though it was by a tougher draw. It also predicted France beating Croatia in the round of 16 instead of the final. The pdf says:
>While Germany is more likely to get to the final, France has a marginally higher overall chance of winning the tournament,
[1] https://en.wikipedia.org/wiki/Rube_Goldberg_Machine_Contest#...
[2] https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...
It's also Goldman Sachs and UBS choosing to attach their names to these and stake some reputation on these predictions. If they had hit the bullseye, they would be lauding these results.
For example, imagine a tournament with a large number of participants, where the winner is picked simply by fairly choosing a single random participant.
If I then gave you all the perfect historical data going back decades, you could do statistical analysis and determine that the winner is completely random and therefore the probability of success, for any particular participant, is p ≈ 1/n, where n is the number of participants. Your confidence in correctly predicting any particular outcome will drop as n rises.
Not everything can be easily predicted just because you have enough data.
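A toy version of that thought experiment (parameters invented): even with a long, perfectly recorded history, no analysis can beat p ≈ 1/n.

```python
import random
from collections import Counter

random.seed(1)
n = 32  # participants

# Decades of "perfect" history: each winner drawn uniformly at random.
history = [random.randrange(n) for _ in range(10_000)]

# The best any model can recover is the uniform distribution itself:
# every participant wins about 1/n of the time, so no prediction of a
# single tournament can be confident.
most_common_rate = Counter(history).most_common(1)[0][1] / len(history)
print(f"best empirical rate {most_common_rate:.3f} vs 1/n = {1/n:.3f}")
```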
It was not wrong to say Hillary had a 95% chance of winning the presidential election, but the confidence was low and that value still allowed for the opposite result to happen.
Also, football has a lot of variance between team capability and end results. The better team might (and often does) lose, especially when it goes to a penalty shootout.
In basketball, the stronger team will easily score more in most cases.
So she had a 95% chance of winning with 50% probability, or what?
Say you make the assumption that the quantity being estimated is truly fixed: that there's some true value for the force of gravity or some true value for the number of people that vote for X or Y.
The second assumption that comes along is that the observed stochasticity comes from your perspective of observation, not from the ground truth. To be more blunt: of all the observations you could make, 95% of them would yield the result observed... but the ground truth is still fixed. Gravity has a fixed value, despite your experimental error, and you may have been lucky enough to observe it in your sample.
Predicting elections with frequentist methods has this same characteristic, except the observed quantity itself shapeshifts and even lies... so then there are other complications that need to be dealt with.
This is where that 50% feeling comes from. There are two outcomes, and one will be true. Your data analysis just tells you that if you repeat your procedure, you'd expect 95% of those results to give you the outcome you observed.
Consider: if someone offered to give you $2 every time a fair coin toss came up heads, and take $0.50 every time it came up tails, you'd be foolish not to take that bet as many times as you can, because you know the coin has exactly a 50% chance of coming up heads (the expected value is (0.5 x $2) - (0.5 x $0.50) = $0.75 per toss).
However, if it was an unfair coin, you'd want to know the degree to which it was unfair, and you'd have to measure it. How much do you trust those measurements? You might say that you're 90% sure that the coin has a 40-60% chance of coming up heads, or give a probability of 2% that a $1.04 to $0.96 wager would be profitable while a $1.03 to $0.97 wager would be unprofitable.
Hillary had a 95% chance to win the election. But on top of the fact that 1 in 20 times she'd lose that election if that really was the probability, the 95% number was uncertain because the measurements were difficult to pin down - maybe she'd have lost 1 in 40 times, or maybe she'd have lost 1 in 5 times. All we know now is that she lost, and that many of the assumptions and measurements the pollsters had to make concerning factors like voter turnout, nationalism, corruption, foreign interference, debate results, and fundraising turned out to be inaccurate.
With unfair coin measurements, you can get fairly accurate numbers with just a handful of tests. When predicting election results or World Cup games, you're much less likely to make an accurate estimate. The confidence is an estimate of how likely that estimate is to be accurate.
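A rough sketch of that measurement step, with invented numbers: estimate the coin's bias and how wide the uncertainty around the estimate is.

```python
import math

# Hypothetical data: 43 heads in 100 tosses of a possibly unfair coin.
heads, tosses = 43, 100
p_hat = heads / tosses

# Normal-approximation 95% interval (assumes the sample is large enough
# for the approximation to be reasonable).
se = math.sqrt(p_hat * (1 - p_hat) / tosses)
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"p ≈ {p_hat:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
# With only 100 tosses the interval is roughly ±0.10 wide: we can only
# call the coin unfair if the interval clearly excludes 0.50.
```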
What's wrong is thinking that a 95% chance of winning means they will win.
For example, if you are estimating the height of a male in the US, you would collect data on US males and take the average. But unless you surveyed every male in the US, there is some error associated with your estimate. So you would construct either error bounds (a frequentist approach) or a probability distribution (a Bayesian approach) around the mean height. Your results might then say that the mean height of the American male is 5'11, plus or minus 2 inches. Those two inches represent uncertainty in your data collection. That's exactly what is done here, but with a percentage instead of a height. Outlets may predict Hillary winning at 95%, but the reality is their methodology should provide a plus-minus value around that number. The problem is that few of them actually report it.
But it gets more confusing. That error bound is only around the mean. Pick a random guy out and not only will he likely not be 5'11, there is a decent chance he will be outside that range of 5'9 - 6'1. You will get 5'7 guys and 6'4 guys pretty commonly. In the case of the election, it may actually be true that Hillary had a chance between, say, 93% and 97% of winning. But even if that is the case, she will still lose between 3-7% of the time. And since we only have one reality to observe, we can't know if she lost simply because we saw that 3-7% realized, or because the people coming up with that number screwed up. That's why groups like 538 deserve more leeway. When they say that Donald Trump has a 30% chance of winning and he wins, that's not that crazy. And therefore there is much less reason to assume they screwed something up than the people who predicted a 5% chance of Trump winning. It's possible those models were right, but it's much less likely.
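A minimal sketch of the height example (the sample data is made up), showing the difference between uncertainty about the mean and the spread of individuals:

```python
import math
import statistics

# Hypothetical sample of US male heights, in inches.
heights = [69, 71, 74, 66, 70, 72, 68, 73, 67, 70]

mean = statistics.mean(heights)
sd = statistics.stdev(heights)
se = sd / math.sqrt(len(heights))  # standard error of the mean

# The error bound around the *mean* shrinks with sample size...
print(f"mean {mean:.1f} in, ± {1.96 * se:.1f} (uncertainty about the mean)")
# ...but *individuals* still scatter with the full standard deviation.
print(f"individuals spread roughly ± {1.96 * sd:.1f} around the mean")
```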
The reason events like the World Cup are far more interesting is that they take place over a shorter period of time.
I think the problem here isn't the event but rather the sport. Something like snooker or tennis will offer the same brevity over the period but with chance playing a less significant role due to the number of games played per match.
That all said, if my years of watching snooker have taught me anything, it's that people are not machines and thus will perform vastly differently from day to day depending on what mood they're in.
Analytically, the difference between the Premier League and the World Cup is that you have momentum and continuity in the Premier League, while the World Cup is essentially one shot. So in the PL, team A will play teams E and G and H before it plays team B; team B may play teams E and H and Q (which played G). Team A may be winning games that your strength model shows they should lose, team B may be losing games... and so on. There is more evidence that might matter. More importantly, you can be wrong quite a lot of the time in a season and still be right at the end of it (as the bounces of the ball even out over time). Not so in the World Cup - one goal knocks you out and there is no coming back! Basically, the World Cup demands an algorithm that works with less evidence and with a much higher degree of accuracy.
Can you give us some examples?
In the NBA, NHL, or MLB, seven-game series tend to even out the variance, so the best team usually wins. And even in NCAA basketball, there's enough scoring that any individual play loses significance.
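As a quick sanity check on the series point, here's a hedged sketch (the per-game edge is invented) of how a best-of-7 amplifies the better team's advantage:

```python
from math import comb

def series_win_prob(p, games=7):
    """Probability the better team (per-game win prob p) takes a best-of-N
    series; winning the series is equivalent to winning a majority of N."""
    need = games // 2 + 1
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(need, games + 1))

p = 0.55  # hypothetical per-game edge
print(f"single game: {p:.3f}, best-of-7: {series_win_prob(p):.3f}")
# Roughly 0.550 vs 0.608: repetition evens out the variance, which is
# exactly what a one-shot knockout format never gets to do.
```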
Also, I've seen some people say (not in this forum) that banks now look stupid because they're in the business of making predictions and they can't even get the world cup right. Guess what? Banks make no money on predictions. They make money on flows and taking spreads on trades they do with clients. Any research or prediction is meant to be a catalyst for that trade.
You're mostly right but to further clarify, an investment bank like Goldman Sachs has revenue from mostly "market making" spreads but it does also have activities that depend on predictions such as their proprietary trading (before the Volcker Rule shut them down) and their GSAM (Goldman Sachs Asset Management) fund. The GSAM is basically a hedge fund for their wealthy clients' money. They will run predictions on macro trends on data like interest rates, commodities, indexes, etc to help them pick stocks for their portfolio.
As the pdf noted, the World Cup data models and simulations came from Adam Atkins of GSAM.
The rule was too complex and onerous to be implementable. Case in point: it's already being rolled back... partly because of the current administration, but more because it was just a poorly written and poorly thought out idea to start with.
This is an enterprise for bookies, not Goldman Sachs.
> The ludic fallacy, identified by Nassim Nicholas Taleb in his 2007 book The Black Swan, is "the misuse of games to model real-life situations."
...
> The alleged fallacy is a central argument in the book and a rebuttal of the predictive mathematical models used to predict the future – as well as an attack on the idea of applying naïve and simplified statistical models in complex domains. According to Taleb, statistics is applicable only in some domains, for instance casinos in which the odds are visible and defined.
Both of Taleb's books, "The Black Swan" and "Fooled by Randomness", offer an interesting take on such models. Meanwhile, most economists know about "Knightian Uncertainty" [1], which concerns the distinction between risk and uncertainty.
> "Uncertainty must be taken in a sense radically distinct from the familiar notion of Risk, from which it has never been properly separated.... The essential fact is that 'risk' means in some cases a quantity susceptible of measurement, while at other times it is something distinctly not of this character; and there are far-reaching and crucial differences in the bearings of the phenomena depending on which of the two is really present and operating.... It will appear that a measurable uncertainty, or 'risk' proper, as we shall use the term, is so far different from an unmeasurable one that it is not in effect an uncertainty at all."
In [0] you have the following:
> The ludic fallacy, identified by Nassim Nicholas Taleb in his 2007 book The Black Swan, is "the misuse of games to model real-life situations."
And he gives an example of this:
> One example given in the book is the following thought experiment. Two people are involved:
> Dr. John who is regarded as a man of science and logical thinking
> Fat Tony who is regarded as a man who lives by his wits
> A third party asks them to "assume that a coin is fair, i.e., has an equal probability of coming up heads or tails when flipped. I flip it ninety-nine times and get heads each time. What are the odds of my getting tails on my next throw?"
> Dr. John says that the odds are not affected by the previous outcomes so the odds must still be 50:50.
> Fat Tony says that the odds of the coin coming up heads 99 times in a row are so low that the initial assumption that the coin had a 50:50 chance of coming up heads is most likely incorrect. "The coin gotta be loaded. It can't be a fair game."
> The ludic fallacy here is to assume that in real life the rules from the purely hypothetical model (where Dr. John is correct) apply. Would a reasonable person bet on black on a roulette table that has come up red 99 times in a row (especially as the reward for a correct guess is so low when compared with the probable odds that the game is fixed)?
So Nassim Taleb wanted to discuss "using games to model real-life situations" and to demonstrate the pitfalls he uses two characters. He _portrays_ the characters as "man of logical thinking" vs "man who lives by his wits", but as we'll see he's missing one dimension to his characterization.
The first problem here is that he's implicitly suggesting to the reader that the decisions of the "man of logical thinking" represent the pitfalls of "applying games to model real-life situations", whereas what the other guy's decisions represent is... not specified, but clearly has a better outcome.
The second problem is that he conflates "applying something you read in some textbook to real life without thinking" with "modelling real life". He suggests to the reader that these two people are "logic" vs "instinct", but they're not. They're a dumb guy who knows math vs a smart guy who doesn't know math. _Obviously_ real life is more complex than your textbook examples, and so the smart guy is going to win, because his fuzzy heuristics beat the first guy's decisions, which are optimal only within his flawed model. An actually smart and logical person would update his model based on new evidence (i.e. "I was told that this coin was 50-50, but the chance of what I just saw is so small that it's more likely I was lied to") and then use math to make predictions and beat the guy who's smart but doesn't know math.
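For what it's worth, the update described above is a two-line Bayes computation (the prior and the loaded-coin bias here are my assumptions):

```python
# Compare "the coin is fair" against "the coin is loaded" after 99 heads.
prior_fair = 0.999        # start out very trusting (assumption)
prior_loaded = 1 - prior_fair
p_heads_if_loaded = 0.95  # hypothetical bias of a loaded coin

like_fair = 0.5 ** 99
like_loaded = p_heads_if_loaded ** 99

posterior_fair = (like_fair * prior_fair) / (
    like_fair * prior_fair + like_loaded * prior_loaded)
print(f"P(fair | 99 heads) ≈ {posterior_fair:.2e}")
# Even a 99.9% prior belief in fairness collapses to ~1e-25: Fat Tony's
# conclusion is what the math gives, once the model itself is questioned.
```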
So ironically, he wants to portray the dangers of using over-simplified models and to do that he uses an example where he obscured one dimension.
Nassim Taleb is really good at rhetoric but light on substance.
Basically a book by Nassim Taleb is an incoherent summary of the books that Nassim Taleb has read within the past year, with a few morsels of recycled insight here and there.
I’m not sure why there are so many people who take him seriously.
There's no downside, only free publicity. If they, by good fortune and a following wind, get it right - then the publicity is incredible. If it's wrong they laugh and say "well, better stick to predicting what we're good at!" and they still get a shitload of headlines and awareness of their product.
This was not a mistake.
I compared the logloss for their predictions with the "uniform" benchmark (giving each team 1/32 probability of winning, 1/16 probability of getting to the finals, etc) and the results are the following (if I transcribed the data properly):
Getting to second round:
GS: 0.495 UBS: 0.495 bench: 0.693
Getting to quarter-finals:
GS: 0.463 UBS: 0.459 bench: 0.562
Getting to semi-finals:
GS: 0.310 UBS: 0.327 bench: 0.377
Getting to final:
GS: 0.231 UBS: 0.269 bench: 0.234
World Cup winner:
GS: 0.097 UBS: 0.113 bench: 0.139
The performance of the models was OK until Croatia got to the final. This especially hurt UBS, which predicted less than a 0.9% probability of that event (compared to 2.1% in Goldman's model).
Edit: these would have been the "best case" scores (if the highest-probability teams had advanced to each round, ignoring that this may be impossible due to the structure of the tournament):
GS: 0.432 0.302 0.220 0.141 0.079
UBS: 0.365 0.251 0.176 0.111 0.070
UBS could potentially achieve lower logloss metrics because it had more extreme predictions.
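For anyone who wants to reproduce this kind of comparison, a minimal sketch of the logloss computation (the four predictions are a made-up mini-example; only the uniform benchmark matches the numbers above, since ln 2 ≈ 0.693):

```python
import math

def log_loss(predictions):
    """Mean negative log-likelihood over (predicted_prob, outcome) pairs."""
    return -sum(math.log(p if won else 1 - p)
                for p, won in predictions) / len(predictions)

# Each pair: (model's probability that the team reaches the round, did it?).
model = [(0.875, True), (0.498, True), (0.132, False), (0.723, False)]
# Uniform benchmark for the round of 16: 16 of 32 teams advance, so p = 1/2.
uniform = [(0.5, won) for _, won in model]

print(f"model: {log_loss(model):.3f}, uniform: {log_loss(uniform):.3f}")
```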
The author entirely misses the way these models work: the larger the entity, the more statistics and averages kick in, and as a result the better a model can be built.
- motivation (Germany and Croatia were the two extremes here, no idea how to measure it)
- team cohesion (number of articles in a few journals questioning the team cohesion, maybe also articles about individual players)
- creativity in offense (maybe measurable via "target missed from close distance" + "ball passed front of the goal")
- number of errors in defense that didn't lead to a goal
- percentage of times ball possession was lost from own goal to enemy's area (England was really bad here against Croatia)
This one would benefit possession-based teams, so it would fail to give decent odds to the current world and European champions (France and Portugal respectively), who don't play possession. Of course it's possible they're outliers, but we'll never know.
As you identified, motivation could be pretty hard to measure ... but even if we could it might be a pretty poor predictor anyway. France in the early stages didn't look very motivated, while England and Colombia looked pretty lively.
Team cohesion - the German team were pretty consistent (not dazzling, but consistent) and we know how that ended. Again France didn't really impress until the latter stages of the WC.
Creativity in offense - I guess it can indicate a sort of calm or confidence in front of goal, but it can actually be seen as pretty negative. For example, Arsenal a few years back came under fire for having plenty of possession in the 18-yard box but failing to convert. Spain's confident quick pass-and-move "tiki-taka" was ever-present and has, in my eyes, been impotent in the last few years (and, more importantly for a neutral viewer, very frustrating to watch).
Defensive errors that didn't lead to a goal could be a nice indicator of the ability of a defence to pick up after each other's mistakes - but at the same time, the errors that do lead to goals (i.e. Croatia's second goal in the final) are relatively rare, and the lack of a goal could just point to the opposing team's inability to convert, due to a poorly organised attack or a lack of opportunism from their strikers.
I'm not sure what you mean by the last one, but I think this could be a nice one - if you mean "times you lost possession in your own half". A profligate midfield and defence is bound to ship goals; I doubt there are many teams that can either fight back after trailing by a goal or two, or score enough to maintain a reasonable buffer.
I applaud the effort though - it takes more creativity and care to think of some new angles (like you did) than to think of some possible counter examples (like I did)!
> I'm not sure what you mean with the last one, but I think this could be a nice one - if you mean "times you lost possession in your own half"
Almost: England lost the ball frequently (> 50+x% with a large x, as far as I could see) due to the keeper sending out long balls. I'd like to measure that somehow. It could be done via the number of seconds in possession after a goal kick, an indicator of whether a hypothetical 85% marker of the field was reached, or by measuring whether the ball was successfully passed at least five times (or resulted in a goal) - a rough sketch of that last metric is below.
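Here's what computing that from tagged event data might look like - the data layout and field names are entirely invented for illustration:

```python
def buildup_quality(possessions, team, min_passes=5):
    """Share of a team's goal-kick possessions kept for >= min_passes passes."""
    own = [p for p in possessions
           if p["team"] == team and p["start"] == "goal_kick"]
    if not own:
        return 0.0
    kept = sum(p["passes_completed"] >= min_passes for p in own)
    return kept / len(own)

# Hypothetical events: one record per possession starting with a goal kick.
sample = [
    {"team": "ENG", "start": "goal_kick", "passes_completed": 1},
    {"team": "ENG", "start": "goal_kick", "passes_completed": 7},
    {"team": "CRO", "start": "goal_kick", "passes_completed": 6},
]
print(buildup_quality(sample, "ENG"))  # 0.5
```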
If anything, it worked worse.
"If anything"? All the results are available, so it would be easy to put a precise number on this. Measure the Bayesian regret, or just report the winnings if you had used the GS model to bet on the outcomes. Unless it reports some concrete numbers, this article is garbage.
It doesn't report any concrete numbers.
The reason is easy to see. The game can be decided by one, two, or three key plays. Compare that to basketball: to win a game you have to consistently score more and defend better. Rarely is the game decided by one or two plays; that only happens when the game is already very tight.
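A small simulation sketch of that point (all rates invented): give two teams the same sort of relative edge, but vary how many scoring events a game contains.

```python
import random

random.seed(2)

def better_team_win_rate(chances, p_better, p_worse, trials=20_000):
    """How often the better side outscores the worse one, given `chances`
    scoring opportunities each per game."""
    wins = 0
    for _ in range(trials):
        a = sum(random.random() < p_better for _ in range(chances))
        b = sum(random.random() < p_worse for _ in range(chances))
        wins += a > b
    return wins / trials

# Soccer-like: a handful of scoring chances per game.
print(better_team_win_rate(chances=15, p_better=0.12, p_worse=0.08))
# Basketball-like: a similar edge spread over ~100 scoring events.
print(better_team_win_rate(chances=100, p_better=0.55, p_worse=0.45))
# The low-event game leaves far more room for the better team to lose.
```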
The odds shortened as the tournament progressed, I was able to hedge as the shortened odds made lay betting profitable.
(High variance in football outcomes means there's no guarantee of profit, I don't bet big sums.)
If someone were to bet during the round of 16, say $1 on each of the bottom 8 and $2 on each of the top 8, the strategy would most likely yield a small profit or a small loss, rather than a total loss.
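A toy check of that strategy with invented decimal odds (real Betfair prices would differ):

```python
# $2 on each of the 8 favourites, $1 on each of the 8 underdogs.
favourite_stake, favourite_odds = 2, 1.6  # assumed decimal odds
underdog_stake, underdog_odds = 1, 4.0

total_staked = 8 * favourite_stake + 8 * underdog_stake  # $24

# One plausible round of 16: 6 favourites and 2 underdogs go through.
payout = (6 * favourite_stake * favourite_odds
          + 2 * underdog_stake * underdog_odds)
print(payout - total_staked)  # 19.2 + 8.0 - 24.0 = +3.2, a small profit
```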
https://news.ycombinator.com/item?id=17509407
Did this investment bank use the same set of algorithms that they use for financial predictions?
...And then I remember there was this Octopus[1] who used to predict winners with 85% accuracy
It's as silly as saying my claim for the odds of a nearly perfectly modelled coin toss (approximately 50/50) is wrong because a series of 10 coin tosses shows different results from my model. The model is not any less correct.
If I toss a fair coin, you cannot predict the next outcome. You can only say that if I toss the coin 1,000 times, then close to 500 are going to turn up heads, and another 500 are going to turn up tails.
It was stupid of Goldman Sachs or whoever to predict an outcome. It was stupid of anyone else to lend credence to that prediction.
Hopefully, Goldman Sachs is not relying on prediction of singular outcomes to make their investment decisions. I don't think they are. Probably just marketing brouhaha to ride the soccer wave. Although I'm not sure if that worked as expected.
If you read the actual report they did[0], they never claimed that any single outcome was more than 18.5% likely.
[0]: http://www.goldmansachs.com/our-thinking/pages/world-cup-201...
>"You can only say that if I toss the coin a 1000 times, then close to 500 are going to turn up heads, and another 500 are going to turn up tails."
Sometimes you can do that and every single flip will be heads. It's unlikely, and across zillions of universes you'd only find it once - but we don't have a pool of universes that we can sample statistically.
Team Groups Round_16 Quarters Semis Finals Out_In Probability
Brazil 87.5% 60.8% 42.0% 27.9% 18.5% Quarters 18.8%
France 81.4% 58.4% 36.6% 19.9% 11.3% Won 11.3%
Germany 80.5% 49.5% 30.5% 18.8% 10.7% Groups 19.5%
Portugal 75.2% 52.8% 32.2% 17.3% 9.4% Round_16 22.4%
Belgium 78.5% 51.1% 27.7% 15.8% 8.2% Semis 11.9%
Spain 72.3% 50.1% 28.8% 15.4% 7.8% Round_16 22.2%
England 73.1% 46.6% 24.4% 13.4% 6.5% Semis 11.0%
Argentina 79.7% 44.2% 24.1% 11.8% 5.7% Round_16 35.5%
Colombia 74.9% 37.3% 17.0% 8.5% 3.7% Round_16 37.6%
Uruguay 74.4% 34.6% 17.2% 7.2% 3.2% Quarters 17.4%
Poland 68.5% 30.5% 12.8% 5.8% 2.3% Groups 31.5%
Denmark 47.8% 26.3% 12.4% 5.2% 2.0% Round_16 21.5%
Mexico 52.0% 23.2% 10.5% 4.9% 1.9% Round_16 28.8%
Sweden 45.9% 19.4% 8.3% 3.7% 1.3% Quarters 11.1%
Iran 35.4% 18.1% 7.2% 2.6% 0.8% Groups 64.6%
Peru 37.3% 17.2% 6.8% 2.5% 0.8% Groups 62.7%
Australia 33.5% 15.4% 6.3% 2.3% 0.7% Groups 66.5%
Russia 47.9% 16.3% 6.0% 2.0% 0.7% Quarters 10.3%
Croatia 49.8% 16.9% 6.3% 2.1% 0.6% Finals 4.2%
Switzerland 52.8% 15.9% 6.1% 2.0% 0.6% Round_16 36.9%
Iceland 45.2% 15.1% 5.6% 1.8% 0.5% Groups 54.8%
Costa_Rica 36.8% 13.3% 4.7% 1.6% 0.5% Groups 63.2%
Serbia 32.9% 12.1% 4.5% 1.5% 0.5% Groups 67.1%
Japan 36.5% 12.8% 3.8% 1.3% 0.4% Round_16 23.7%
Saudi_Arabia 43.4% 12.7% 4.2% 1.3% 0.4% Groups 56.6%
Tunisia 35.2% 13.3% 4.1% 1.3% 0.4% Groups 64.8%
Egypt 34.4% 8.7% 2.5% 0.7% 0.2% Groups 65.6%
South_Korea 21.6% 5.9% 7.1% 0.5% 0.2% Groups 78.4%
Morocco 17.1% 6.8% 1.8% 0.5% 0.1% Groups 82.9%
Nigeria 25.2% 6.5% 1.7% 0.4% 0.0% Groups 74.8%
Senegal 20.1% 4.9% 1.2% 0.3% 0.0% Groups 79.9%
Panama 13.2% 3.3% 0.5% 0.1% 0.0% Groups 86.8%
[0]: Exhibit 2 in http://www.goldmansachs.com/our-thinking/pages/world-cup-201...
Edit: Fix copy-paste errors and atrocious maths.
Croatia went out in Finals
And I do not understand what the last column means (except for France and teams out in group phase)
The first two were just me making a mistake, because I wrote that in manually.
That last column makes no sense. It was supposed to be the probability that the model gave to the outcome that occurred, but I got the maths wrong.
So all in all, the only teams for which the prediction was more than 1/2 were teams out in groups. That is a little underwhelming.
Ah, for Croatia, I believe it should read 1.5% (the 2.1% chance of reaching the final minus the 0.6% chance of winning it).
Cheating.
Before people start booing, let's not forget where this tournament is being held, and all the other nefarious things that country has been up to recently.