It shows that Julia's tests are systematically finding numerical bugs that are pervasive throughout the rest of the LLVM ecosystem, and leading to fixes for them. And since Julia's LLVM is patched to solve these while other variants of LLVM are not, Julia is more correct in these respects than other languages which rely on the base build of LLVM. Of course Julia doesn't solve "all bugs", but some of them (like the correctness of certain math library implementations) really make you question how hard other languages' test suites are hammering on those for correctness. Julia, for example, has a lot of numerical tests that check the precision of such methods against higher-precision MPFR BigFloats to ensure ~X ulp correctness. Julia definitely spends a lot more time testing numerical correctness than it does testing something like a web server. It's just a prioritization thing.
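To make the "check against a higher-precision reference" idea concrete, here is a minimal sketch of that style of test. It is not Julia's actual test code: it is a Python analogue that uses the standard library's `decimal` module as the high-precision reference in place of MPFR, and the helper names `ulp_error` and `max_ulp_error_exp` are hypothetical, made up for illustration. It assumes Python 3.9+ for `math.ulp`.

```python
import math
from decimal import Decimal, localcontext


def ulp_error(approx: float, reference: Decimal) -> float:
    """Distance from approx to the reference, measured in ulps of approx."""
    # Decimal(float) converts exactly, so no error is introduced here;
    # the subtraction and division round at the active context precision.
    diff = abs(Decimal(approx) - reference)
    return float(diff / Decimal(math.ulp(approx)))


def max_ulp_error_exp(xs, digits=50):
    """Worst-case ulp error of math.exp over xs (hypothetical helper)."""
    worst = 0.0
    with localcontext() as ctx:
        ctx.prec = digits  # far more digits than a 64-bit float carries
        for x in xs:
            # High-precision reference value for exp(x), playing the
            # role that an MPFR BigFloat plays in the comment above.
            reference = Decimal(x).exp()
            worst = max(worst, ulp_error(math.exp(x), reference))
    return worst


if __name__ == "__main__":
    sample_points = [i / 10 for i in range(-50, 51)]
    print("max ulp error of math.exp:", max_ulp_error_exp(sample_points))
```

A real test would assert that the worst-case error stays under whatever ulp bound the library promises, and would sample far more points (including edge cases near overflow and subnormals) than this sketch does.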