The problem lies with the test itself, which might have an unknown false positive rate. In this case, though, we're basically looking at what I understand to be the gold standard in RNA/DNA evidence, combined with matching symptoms.
Also, apparently they had two separate teams test the samples using different methodologies, so we can be reasonably confident that something is going on with that sample, although that doesn't rule out systematic bias.
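
To put rough numbers on why a second, independent positive helps so much, here is a quick Bayes' rule sketch. Every number in it (the prior, the sensitivity, the false positive rate) is made up purely for illustration, since the real false positive rate is exactly the thing we don't know, and the `posterior` helper is just a name I picked.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition is real | positive test), via Bayes' rule."""
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)

# Hypothetical values, NOT taken from the actual case.
prior = 0.01   # assumed chance the condition is real before any testing
sens = 0.95    # assumed sensitivity of each test
fpr = 0.05     # assumed false positive rate of each test

# One positive result.
after_one = posterior(prior, sens, fpr)

# A second positive, ASSUMING it is independent of the first:
# the first posterior becomes the prior for the second update.
after_two = posterior(after_one, sens, fpr)

print(f"after one positive:  {after_one:.3f}")
print(f"after two positives: {after_two:.3f}")
```

With these invented inputs, one positive only gets you to about 16%, while two independent positives get you to about 78%. The catch is the independence assumption: if both teams are working from the same contaminated sample, or share some other systematic bias, the second update is worth much less than this suggests.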