> 2) The main way the forecasts failed to be useful was that the questions themselves weren't capturing anything interesting.
I agree, having used PredictionBook [2] in the past, though the essay doesn't address what I think is a better solution. Predictions that don't feed into a decision aren't worth anything from a decision analysis perspective, so that's one heuristic I keep in mind when making predictions. Why should I care, for example, that Angry Birds AIs are getting better? If the information isn't a factor in any decision, its value of information [3] is zero.
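To make the "value of information is zero" point concrete, here is a minimal sketch of the expected value of perfect information for a tiny decision problem. The function name, the action labels, and all payoffs and probabilities are made up for illustration; nothing here comes from the essay or from PredictionBook.

```python
def expected_value_of_perfect_information(p_states, payoffs):
    """EVPI for a discrete decision problem.

    p_states: probabilities of each state of the world (should sum to 1).
    payoffs:  dict mapping action name -> list of payoffs, one per state.
    """
    # Best we can do by committing to one action under current beliefs.
    best_without_info = max(
        sum(p * v for p, v in zip(p_states, values))
        for values in payoffs.values()
    )
    # Best we can do if we learn the state first, then pick the best action.
    best_with_info = sum(
        p * max(values[s] for values in payoffs.values())
        for s, p in enumerate(p_states)
    )
    return best_with_info - best_without_info


beliefs = [0.5, 0.5]  # hypothetical: the forecast question resolves yes / no

# Case 1: "ship" is best whichever way the question resolves,
# so knowing the answer changes nothing.
irrelevant = {"ship": [10, 8], "wait": [3, 2]}

# Case 2: the best action depends on how the question resolves.
relevant = {"ship": [10, -20], "wait": [3, 2]}

evpi_irrelevant = expected_value_of_perfect_information(beliefs, irrelevant)
evpi_relevant = expected_value_of_perfect_information(beliefs, relevant)
print(evpi_irrelevant, evpi_relevant)
```

In the first case the same action wins in every state, so the forecast is worth nothing to this decision-maker. In the second it is worth paying for, up to the computed amount.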
Perhaps I haven't paid close enough attention to this, but in AI safety I never got a sense for what people would do with the forecasts.
[1] https://news.ycombinator.com/item?id=22127537
There's the idea that forecasting is only valuable if it's decision-relevant, or action-guiding, and so far no forecasting org has solved this problem. But I think this is the wrong bar to beat. Making something action-guiding is really hard -- and lots of things which we do think of as important don't meet this bar.
For example, think of research. I think many people who have made important progress didn't set out to write documents that would change how their boss/policymakers make decisions. Rather, they sought out something that they were curious about, or that seemed interesting, or just generally important... and mostly just tried to have true beliefs, more than having impactful actions. They're doing research, not making decisions.
Similarly, many people think essays can be important and that helping people write better essays has high impact. But if you pick a random essay by a public intellectual, and ask what decision was improved as a result, I think you'll be disappointed (though not as disappointed as with forecasting questions). And this is fine. Decision-making takes in a large number of inputs, considerations, emotions, etc. which influence it in strange, non-linear ways. It's mostly just a fact about human decision-making being complex, rather than a fact about essays being useless.
So I'm thinking that the evidence that should suggest to us that forecasting is valuable is not hearing an impactful person say "I saw forecast X which caused me to change decision Y", but rather "I saw forecast X which changed my mind about topic Y". Then, downstream, there might be all sorts of actions which changed as a result, and the forecast-induced mind-change might be one out of a hundred counterfactually important inputs. Yet we shouldn't impose an isolated demand for rigor by requiring forecasting to solve the credit assignment problem any better than the other 99 inputs do.
To use an example from your article, an essay helps you organize your thoughts, but in my view a simple decision tree would typically help more, when one is applicable. Essays are more general, of course, as not all predictions are decision-relevant. In the decision analysis class I took (which, incidentally, Vaniver from your link also took) you typically had to do both for the projects.
Understanding a problem space is absolutely necessary for setting up the decision analysis. So yes, research is important! And there is value (not considered in a typical decision analysis) in making predictions for their own sake. A large fraction, perhaps even the majority, of the predictions I've made on PredictionBook were more for my own education than anything else. And I learned a lot from that.
Looks cool, but what I'd be interested in would be a rigorous method of combining the outputs from multiple questions.
Given comments on the Good Judgment Project, a lot of people are using the wrong reference class, and so have bad base rates. The work could easily be split: some forecasters define the right reference class, while others refine the base rates for each.
I'd also like to have easy ways to make conditional forecasts. One of the questions on which I did much better than average in the GJP concerned the likelihood of a US-China deal on emissions. Many forecasters assumed China would never agree to such a deal, because of how much they need to emit.
I knew China had been working on renewables, and would be likely to agree to a deal. I just didn't know about the diplomatic world. A way to tell those who know diplomacy that the economic and technology landscape had changed would have changed the calculus.
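As a rough sketch of what that kind of conditional forecast could look like, one forecaster supplies the probability of the condition and another supplies the outcome probabilities given the condition, and the two are combined with the law of total probability. The function name and every number below are hypothetical, not actual GJP figures, and this is not how any existing platform works as far as I know.

```python
def combine_conditional(p_condition, p_outcome_if_true, p_outcome_if_false):
    """P(outcome) = P(C) * P(outcome | C) + (1 - P(C)) * P(outcome | not C)."""
    for p in (p_condition, p_outcome_if_true, p_outcome_if_false):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return p_condition * p_outcome_if_true + (1.0 - p_condition) * p_outcome_if_false


# Hypothetical inputs from people who know diplomacy:
# chance of a deal if China is willing, and if it is not.
p_deal_if_willing = 0.60
p_deal_if_unwilling = 0.05

# Hypothetical inputs for "China is willing to cut emissions":
# the crowd's assumption vs. someone tracking its renewables build-out.
p_willing_crowd = 0.10
p_willing_informed = 0.80

p_deal_crowd = combine_conditional(
    p_willing_crowd, p_deal_if_willing, p_deal_if_unwilling
)
p_deal_informed = combine_conditional(
    p_willing_informed, p_deal_if_willing, p_deal_if_unwilling
)
print(round(p_deal_crowd, 3), round(p_deal_informed, 3))
```

The diplomacy experts' conditional estimates are the same in both runs. Only the estimate of the condition changes, which is exactly the piece a different kind of expert is better placed to supply.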
Averaging is great and all, but platforms should make collaboration easier.