Extraordinary claims require extraordinary evidence.
That's not to say that a 70B model is necessary, but surely something larger than 3B is doable, especially given that the results of the paper directly imply a significant reduction in memory requirements for training such a model.