There is essentially no way for two trials of that size (in the thousands) to come out at 60% and 90% effectiveness, respectively, by chance alone. I can't quite do a t-test in my head anymore, but I'd guess that difference passes the 5% significance threshold somewhere around n=30.
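
For anyone who wants to check that guess rather than trust mental arithmetic, here is a rough sketch. It swaps the t-test for a pooled two-proportion z-test (the usual approximation for comparing two rates), and it assumes the 60% and 90% figures can be treated as plain success rates in two equal-sized groups of n each. Both are simplifications, and the function names are made up for illustration.

```python
import math


def two_proportion_p_value(p1, p2, n):
    """Two-sided p-value of a pooled two-proportion z-test, n per group.

    Assumption: both groups have the same size n, and p1/p2 are
    observed success rates (a simplification of "effectiveness").
    """
    pooled = (p1 + p2) / 2  # equal group sizes, so the pooled rate is the mean
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)  # standard error of the difference
    z = abs(p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(z / math.sqrt(2))


def smallest_significant_n(p1, p2, alpha=0.05):
    """Smallest per-group n at which the difference has p < alpha."""
    n = 2
    while two_proportion_p_value(p1, p2, n) >= alpha:
        n += 1
    return n


if __name__ == "__main__":
    print("threshold n per group:", smallest_significant_n(0.6, 0.9))
    for n in (30, 1000):
        print(n, two_proportion_p_value(0.6, 0.9, n))
```

Under these assumptions the threshold lands in the high teens per group, so n=30 is on the safe side, and by the time n reaches the thousands the p-value is far too small to be worth writing out.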