I didn't read the paper, but it seems they're trying to fix an ML model with another ML model. I'm not sure that's a good idea, but I digress. Besides, how do they know what counts as a hallucination and what doesn't (cf. the similar debate about disinformation)?