Just some observable metric.
If they literally can't come up with a single observable predictive difference, then the predictive parts of their models are actually equivalent; the models differ only in narrative, and the two sides don't "really disagree". Like the Copenhagen interpretation vs many-worlds.
If "democracy" is just metaphysics then it's irrelevant. But if it has actual tangible effects, such as "can you vote?", "can you protest the government?", "is the leader of the opposition arrested?", "do most people think they live in a democracy?", "how popular is new legislation compared to previous years?", etc., then you can make predictions about it and test them!
You can even make local predictions, if both sides can agree on them, such as "will the combined income of my family be higher or lower in 4 years' time?", as loosely coupled proxies for GDP. (Ideally you'd use probabilities for loosely linked proxies like that, and use the difference between the probabilities the two theories assign to get bits of evidence for one over the other. Each such proxy only carries a little evidence, so you'd want many, many of them, ideally uncorrelated ones.)
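To make the "bits of evidence" idea concrete, here is a rough sketch of the bookkeeping. It assumes each theory assigns a probability strictly between 0 and 1 to every proxy, and that the proxies are independent enough for their evidence to simply add up. The function name `evidence_bits` and all the numbers are made up for illustration.

```python
import math


def evidence_bits(p_a, p_b, happened):
    """Bits of evidence for theory A over theory B from one observed outcome.

    p_a and p_b are the probabilities each theory assigned to the event
    happening (assumed strictly between 0 and 1). A positive result
    favours A, a negative result favours B.
    """
    like_a = p_a if happened else 1.0 - p_a
    like_b = p_b if happened else 1.0 - p_b
    return math.log2(like_a / like_b)


# Hypothetical proxies: (probability under A, probability under B, did it happen?)
proxies = [
    (0.60, 0.50, True),   # e.g. "family income higher in 4 years"
    (0.70, 0.40, False),
    (0.55, 0.45, True),
]

# Summing is only valid if the proxies are (roughly) uncorrelated.
total_bits = sum(evidence_bits(a, b, h) for a, b, h in proxies)
print(f"net evidence for A over B: {total_bits:+.2f} bits")
```

If the two theories assign the same probability to a proxy, it contributes exactly zero bits, which is the "they don't really disagree" case in miniature. A single confident miss can also outweigh several mild hits, which is why you want lots of proxies rather than a handful.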