I'll just answer your question of why there can't be a null model. You can have a hypothesis stating that all differences between groups are due to chance. To make this a statistical model, something that can calculate the probability of an event, you have to make assumptions. Maybe it's just about the distribution. Maybe it's independence. But it's always something. You said it yourself: "You don’t assume any belief on the probability of a specific model to be true." To be a statistical model, to calculate the probability of an event, to calculate marginal likelihoods, to calculate Bayes factors, you have to do that.
This is largely a philosophical point. You can have a null model, something you pick to represent "no effect". But there isn't "the" null, a belief-free model that's categorically different from a model with priors.
If there's a belief-free model that can give a marginal likelihood, then I'm wrong. I'd also very much like to know about it.
Let’s say you study P. You know that P belongs to the family 𝒫. For example, 𝒫 = {N(μ,σ²): μ∈ℝ, σ²>0}. To be aware of 𝒫 is a prerequisite to do any sort of testing. For example, to test H0: μ=0 vs H1: μ≠0. After all, a typical hypothesis test is just a likelihood ratio test.
What you don’t have to know, or even to assume the existence of, if you don’t do Bayesian stuff, is a probability measure Π on 𝒫 (and its appropriate sigma-field). That’s the philosophical difference. But you have to have a well-defined 𝒫 either way.
The prior is Π. The existence of 𝒫 means that you have a family of models, but it doesn’t force you to assume any prior over those models. I don’t see how not having Π makes P∈𝒫 less of a model. You are still allowed to do conditional reasoning, e.g. about the aforementioned type I and type II errors.
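As a rough illustration of testing inside 𝒫 without any Π, here is a minimal sketch of the likelihood ratio test of H0: μ=0 in the Gaussian family above. The function name `lrt_mean_zero` and the sample data are made up for illustration, and the p-value relies on the asymptotic chi-square approximation, so treat it as approximate for small samples.

```python
import math

def lrt_mean_zero(xs):
    """Likelihood ratio test of H0: mu = 0 vs H1: mu != 0 for i.i.d. N(mu, sigma^2) data.

    Returns (statistic, approximate p-value). No prior on (mu, sigma^2) is used anywhere.
    Assumes the data are not all identical (otherwise the H1 variance estimate is 0).
    """
    n = len(xs)
    mean = sum(xs) / n
    # Maximum likelihood estimates of sigma^2 under each hypothesis.
    var_h0 = sum(x * x for x in xs) / n            # mu fixed at 0
    var_h1 = sum((x - mean) ** 2 for x in xs) / n  # mu estimated by the sample mean
    # -2 log(likelihood ratio) simplifies to n * log(var_h0 / var_h1).
    # max(...) guards against a tiny negative value from floating-point rounding.
    stat = max(0.0, n * math.log(var_h0 / var_h1))
    # Asymptotically chi-square with 1 degree of freedom under H0;
    # for 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

if __name__ == "__main__":
    sample = [0.5, -0.2, 1.1, 0.3, 0.9, -0.4, 0.7, 1.3]  # hypothetical data
    print(lrt_mean_zero(sample))
```

Everything here is conditional on a given P∈𝒫; at no point does the code need a probability for H0 itself.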
> Let’s say you study P. You know that P belongs to the family 𝒫. For example, 𝒫 = {N(μ,σ²): μ∈ℝ, σ²>0}. To be aware of 𝒫 is a prerequisite to do any sort of testing.
Okay, sure: you’ve decided on a family of probability distributions. It’s surely an approximation (very few things you would test are actually Gaussian — for one thing, the negative tail often makes no sense).
> For example, to test H0: μ=0 vs H1: μ≠0. After all, a typical hypothesis test is just a likelihood ratio test.
This is indeed the usual formulation.
> What you don’t have to know or even to assume existence of—if you don’t do Bayesian stuff—is a probability measure Π on 𝒫 (and its appropriate sigma-field). … But you have to have well-defined 𝒫 either way.
If 𝒫 is well defined, then there is some probability that H0 is true. But H0 occupies a lower-dimensional space than 𝒫: it’s a measure-zero subset. Most probability measures on 𝒫 (and all measures that are continuous in the parameters) give zero probability to H0. So (in Bayesian terms) H0 is a priori wrong w.p. 1. And in non-Bayesian terms, you’re calculating the likelihood of your measurements under two competing hypotheses, one of which is correct w.p. 0 even conditioned on one of the two hypotheses being correct.
And this results in what I consider to be useless headline results:
“This intervention has an effect with significance 0.02”: great, of course it has an effect. What is the effect? Can you say anything intelligent about effect size? Did you even try?
“We did not find significant evidence that some intervention causes some undesirable effect”: great, but that’s actually a statement about your trial and conveys essentially no information about whether the effect is there. I can do a study with n=1 and fail to find significant evidence of anything! But I also learn nothing! Why didn’t you either (a) come up with an actual reasonable hypothesis and test that or (b) put some confidence bounds on the size of the undesirable effect?
And you can do (b) without a Bayesian prior as long as you choose your hypothesis well: “Our data is inconsistent with the intervention causing the undesired effect in more than 0.001% of cases”, with some clarification as to what “inconsistent” means.
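For what it's worth, (b) is cheap to compute in the simplest case. A minimal sketch, assuming a trial where each of n independent subjects either shows the undesired effect or not, and where zero events were observed; the function name is hypothetical. It returns the exact one-sided upper confidence bound on the event rate, i.e. the largest rate p for which seeing zero events in n subjects still has probability at least alpha.

```python
import math

def upper_bound_zero_events(n, alpha=0.05):
    """Exact one-sided (1 - alpha) upper confidence bound on an event rate
    when 0 events were observed in n independent trials.

    Solves (1 - p)^n = alpha for p, giving p = 1 - alpha^(1/n).
    """
    # expm1 keeps precision when the bound is tiny (large n).
    return -math.expm1(math.log(alpha) / n)

if __name__ == "__main__":
    for n in (1, 100, 300_000):
        print(n, upper_bound_zero_events(n))
```

With n=1 the bound is 0.95, which is the "I learn nothing" case. Getting the bound under 0.001% at the 95% level takes roughly 300,000 subjects with no events, in line with the usual "rule of three" approximation 3/n.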
> But H0 occupies a lower-dimensional space than 𝒫 — it’s a measure-zero subset.
In a non-Bayesian framework your hypotheses don't have to be part of a measurable structure at all. And even if you do have a measure, it doesn't have to be zero at every point. I think this is quite intuitive. Consider two questions: (1) Does X have an effect? (2) How large is that effect? If your prior puts non-zero probability on the answer to the first question being "No", then the prior for the second question has non-zero mass at the point 0, even though the probability of any other single point may be zero.
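A quick way to see this is to sample from such a prior. A minimal sketch, assuming a two-part prior on the effect size: with probability `p_null` the effect is exactly 0 (no effect), otherwise it is a standard normal draw. The names and the values 0.5 and N(0, 1) are arbitrary choices for illustration.

```python
import random

def sample_effect_prior(rng, p_null=0.5):
    """Draw an effect size from a mixed prior: a point mass at 0 plus a continuous part."""
    if rng.random() < p_null:
        return 0.0               # answer to question (1) is "No": X has no effect
    return rng.gauss(0.0, 1.0)   # X has an effect; its size is drawn from N(0, 1)

def point_frequencies(n_draws=100_000, p_null=0.5, seed=0):
    """Empirical frequency of hitting exactly 0 and exactly 1 (an arbitrary other point)."""
    rng = random.Random(seed)
    draws = [sample_effect_prior(rng, p_null) for _ in range(n_draws)]
    at_zero = sum(d == 0.0 for d in draws) / n_draws
    at_one = sum(d == 1.0 for d in draws) / n_draws
    return at_zero, at_one

if __name__ == "__main__":
    print(point_frequencies())
```

The frequency at 0 comes out close to `p_null`, while any other single value is essentially never hit, which is the sense in which H0 need not have prior probability zero.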
> And this results in what I consider to be useless headline results
These headlines don't have anything to do with Bayesian vs non-Bayesian as far as I see. People not doing power analysis is a people problem, not an issue with a statistical framework.
And some don't.
> So (in Bayesian terms) H0 is a priori wrong w.p. 1.
Or not.
> And in non-Bayesian terms, you’re calculating the likelihood of your measurements under two competing hypotheses, one of which is correct w.p. 0 even conditioned on one of the two hypotheses being correct.
If one of the two (H0 or H1) is correct and you don't know which one, then it could be either... Of course, if you knew a priori which one is correct, you wouldn't be considering the other at all.
Google “the null hypothesis”.
If you mean null model, then I’m not fighting against anyone. We all agree which null model to use is a choice to be made.
Otherwise, I’m not even sure what you’re trying to convince me of at this point. I’ll restate the essence of my first comment more concisely.
Bayes factors are a method of model comparison. You take the ratio of marginal likelihoods for two models given the data. Choosing a null model for this purpose requires more assumptions than doing null hypothesis testing with frequentist statistics. Mixing the schools of thought of Bayesian and frequentist makes things more confusing than operating within them individually. Bayes factors have other uses than null hypothesis testing.
Maybe you straight up disagree with one of those sentences to the point you could quote it and say “this is wrong because…”.
I have this feeling you got to the second paragraph of my first comment and started quoting stuff before reading it all the way through. Or maybe you just had some stuff you really wanted to talk about. Because my whole point was how calculating Bayes factors and the normal mentality of null hypothesis testing don’t play nice together, but can have other benefits. So a line by line comparison doesn’t really make sense.
> If you mean null model, then I’m not fighting against anyone. We all agree which null model to use is a choice to be made.
I don’t understand what difference you are trying to imply by drawing a distinction between a null model and a null hypothesis.
> Otherwise, I’m not even sure what you’re trying to convince me of at this point. I’ll restate the essence of my first comment more concisely.
I will try to make it as clear as possible.
> Bayes factors are a method of model comparison.
Are you implying that hypothesis testing isn’t? That’s just false. And I’ve explained why.
> You take the ratio of marginal likelihoods for two models given the data. Choosing a null model for this purpose requires more assumptions than doing null hypothesis testing with frequentist statistics.
And in frequentist statistics you just calculate the likelihood: you can’t integrate over your model probabilities to get a marginal likelihood, because you don’t assume your models have a probability of being true. That’s the only extra assumption you have in Bayesian statistics. Everything else is the same. If you are saying that there are some other extra assumptions, that’s just false, as I’ve explained in my previous comments. There are no extra assumptions for a “null model” beyond putting a prior on it.
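To make the comparison concrete, here is a minimal sketch under simplifying assumptions that are mine, not from the thread: data are i.i.d. N(μ, 1) with known variance, the null is the point μ=0, and the Bayesian alternative puts a N(0, τ²) prior on μ. The likelihood ratio needs only the likelihood at μ=0 and at its maximum; the marginal likelihood under the alternative additionally needs τ.

```python
import math

def normal_pdf(x, var):
    """Density of N(0, var) at x."""
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def likelihood_ratio(xbar, n):
    """Likelihood at mu = 0 divided by the maximized likelihood (mu = xbar).

    Uses the sampling distribution of the sample mean, N(mu, 1/n). No prior involved.
    """
    return normal_pdf(xbar, 1 / n) / normal_pdf(0.0, 1 / n)

def bayes_factor_01(xbar, n, tau):
    """Ratio of marginal likelihoods, null over alternative.

    Under the point null the marginal likelihood is just the likelihood at mu = 0.
    Under the alternative, integrating N(xbar; mu, 1/n) against mu ~ N(0, tau^2)
    gives N(xbar; 0, tau^2 + 1/n).
    """
    return normal_pdf(xbar, 1 / n) / normal_pdf(xbar, tau * tau + 1 / n)

if __name__ == "__main__":
    xbar, n = 0.5, 10  # hypothetical sample mean and sample size
    print("likelihood ratio:", likelihood_ratio(xbar, n))
    for tau in (1.0, 10.0):
        print("tau =", tau, "BF01 =", bayes_factor_01(xbar, n, tau))
```

With xbar = 0.5 and n = 10 the likelihood ratio is a single number, while `bayes_factor_01` moves from about 1.06 at tau = 1 to about 9.1 at tau = 10. The prior is the one ingredient the frequentist calculation does not have.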
> Mixing the schools of thought of Bayesian and frequentist makes things more confusing than operating within them individually. Bayes factors have other uses than null hypothesis testing.
There isn’t any confusing “mixing”. It’s just statistical decision theory. In the frequentist approach you calculate the risk of your decision rule for each model and call it a day. In the Bayesian approach you go one step further and average your risks using your priors to get the “total” Bayes risk.
Both approaches have uses other than null hypothesis testing. Null hypothesis testing is just a particular case of a decision problem with a 0-1 loss function. The loss is 0 if you have chosen the correct hypothesis and 1 if you have committed a type I or type II error.
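Spelled out in the notation used earlier in the thread (P ∈ 𝒫, Π a prior on 𝒫), with δ denoting the decision rule and L the loss. The symbols δ, L, R and r are the usual textbook choices rather than anything fixed above, so read this as a sketch of the standard definitions:

```latex
R(P, \delta) = \mathbb{E}_{P}\big[ L(P, \delta(X)) \big]
\qquad \text{(frequentist risk, one value per model } P \in \mathcal{P}\text{)}

r(\Pi, \delta) = \int_{\mathcal{P}} R(P, \delta)\, d\Pi(P)
\qquad \text{(Bayes risk, the average of the risks under the prior } \Pi\text{)}

% 0-1 loss for testing H_0 vs H_1, with \delta(X) \in \{0, 1\} the chosen hypothesis:
R(P, \delta) =
\begin{cases}
P(\delta(X) = 1), & P \in H_0 \quad \text{(type I error probability)} \\
P(\delta(X) = 0), & P \in H_1 \quad \text{(type II error probability)}
\end{cases}
```

The first line is all the frequentist approach uses; the second line is the extra averaging step, and it is the only place Π appears.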
How so?
I could choose the same null model that predicts that the observation is distributed as, say, a standard Gaussian. What additional assumptions are required?
it says it right there in the blog post...