/home/bjourne/p/laser/benchmarks/gemm/gemm_bench_float32.nim(77, 8) Warning: use `std/os` instead; ospaths is deprecated [Deprecated] /home/bjourne/p/laser/benchmarks/gemm/gemm_bench_float32.nim(101, 8) template/generic instantiation of `bench` from here /home/bjourne/p/laser/benchmarks/gemm/gemm_bench_float32.nim(106, 21) template/generic instantiation of `gemm_nn_fallback` from here /home/bjourne/p/laser/benchmarks/gemm/arraymancer/blas_l3_gemm.nim(85, 34) template/generic instantiation of `newBlasBuffer` from here /home/bjourne/p/laser/benchmarks/gemm/arraymancer/blas_l3_gemm_data_structure.nim(30, 6) Error: signature for '=destroy' must be proc[T: object](x: var T) or proc[T: object](x: T)
Anyway the reason for your competitive performance is likely that you are benchmarking with very small matrices. OpenBLAS spends some time preprocessing the tiles which doesn't really pay off until they become really huge.