> Also depends what you count as correct
I feel it should be the authors' job to think about what correctness means for their application, and under what circumstances they can give a correct answer. Ideally, they would also warn the user when the inputs fall outside that domain, or give them options to handle such cases.
Let's take an example: say you have a neat algorithm for fast multiplication of sparse matrices, correct up to a stated bound on the Frobenius-norm error. Say it already has reasonable handling of dimension mismatches, NaN, inf, and so on, controlled by option flags. Now if I wanted to know whether your algorithm gives the same numerical guarantee for a dense matrix, I'd have to read your paper, run some tests, ask around, and I might still never be quite sure.
It seems much more sane to put the onus on the authors to be explicit about the behaviour in that case. Your comments and others' seem to put the onus mostly on the user to reason about the behaviour of other people's code. That does not seem like good practice, especially with complex math and nested composition, where every layer adds assumptions of its own.
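To make that concrete, here is a rough Python sketch of the kind of explicitness I have in mind. Everything in it is hypothetical: the function name `approx_sparse_matmul`, the `on_dense` flag, the `MAX_DENSITY` threshold and the warning class are made up for illustration, and the product itself is just an exact naive multiplication standing in for the imagined fast algorithm. The only point is that the author, not the user, decides and documents what happens outside the domain where the guarantee holds.

```python
import warnings

# Hypothetical threshold: the fraction of nonzero entries above which the
# (imagined) error guarantee is no longer documented to hold.
MAX_DENSITY = 0.1


class OutsideGuaranteeWarning(UserWarning):
    """Input lies outside the domain where the error bound is documented."""


def density(m):
    """Fraction of nonzero entries in a list-of-lists matrix."""
    total = sum(len(row) for row in m)
    nonzero = sum(1 for row in m for x in row if x != 0)
    return nonzero / total if total else 0.0


def approx_sparse_matmul(a, b, on_dense="warn"):
    """Stand-in for a fast approximate sparse matrix product.

    on_dense controls what happens when an input is denser than MAX_DENSITY:
    "raise" refuses, "warn" computes but warns, "allow" computes silently.
    """
    if on_dense not in ("raise", "warn", "allow"):
        raise ValueError(f"unknown on_dense option: {on_dense!r}")

    worst = max(density(a), density(b))
    if worst > MAX_DENSITY and on_dense != "allow":
        msg = (
            f"input density {worst:.2f} exceeds {MAX_DENSITY}; "
            "the documented error bound is not guaranteed here"
        )
        if on_dense == "raise":
            raise ValueError(msg)
        warnings.warn(msg, OutsideGuaranteeWarning, stacklevel=2)

    # Placeholder only: an exact naive product, not a real fast algorithm.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
```

With something like this, a caller who passes a dense matrix learns immediately that they have left the documented domain, and has to opt in with `on_dense="allow"` to silence it. Nobody needs to read the paper to find that out.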