Feature detection for WebAssembly[0] is stuck in spec discussions, and SIMD general availability is blocked on either that or its own mechanism for backwards compatibility[1].
The issue is that a WebAssembly binary containing instructions unknown to the engine (e.g. SIMD instructions a particular engine doesn't support) won't validate, even if the functions containing them are never called at runtime. The only workaround is to compile your binary NxMx... times, once per feature-set combination, and detect which feature set the engine supports before loading a binary. It's a real pain in the tail when trying to support new WebAssembly features.
e.g. check out this snippet from canvas.apps.chrome which supports WebAssembly threads on Chrome with a non-thread fallback for e.g. mobile / Firefox:
var X;
try {
  // A shared WebAssembly.Memory is only permitted when the engine
  // supports threads, so constructing one doubles as the feature probe.
  X = (new WebAssembly.Memory({
    initial: 1,
    maximum: 1,
    shared: !0
  })).buffer instanceof SharedArrayBuffer ? !0 : !1
} catch (a) {
  // Engines without thread support throw on the shared memory request.
  X = !1
}
// Load the threaded or non-threaded build depending on the probe result.
var ua = r(X ? ["js/threads/ink.js", "defines_threads.js"] : ["js/nothreads/ink.js", "defines.js"])
  , va = ua.next().value
  , wa = ua.next().value;
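A less brittle variant of the same trick (the approach taken by libraries like wasm-feature-detect) is to hand WebAssembly.validate a tiny module that uses the feature's opcodes: validate returns false for unknown instructions instead of throwing, which makes it usable as a probe. A minimal sketch — the byte array below is only the 8-byte module header (a valid empty module), standing in for a real probe module whose function body would additionally encode, say, a v128 SIMD instruction:

// Feature probing via WebAssembly.validate. A real probe would append
// type/function/code sections using the feature's opcodes; here we
// validate only the header: "\0asm" magic plus binary format version 1.
const EMPTY_MODULE = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // "\0asm" magic number
  0x01, 0x00, 0x00, 0x00, // binary format version 1
]);

function engineValidates(bytes) {
  // Returns false (rather than throwing) for modules containing
  // instructions this engine doesn't know about.
  return WebAssembly.validate(bytes);
}

With one such probe per feature you can pick the right build out of the NxM matrix up front, instead of catching instantiation errors after the fact.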
[0]: https://github.com/WebAssembly/conditional-sections
[1]: https://github.com/WebAssembly/simd/issues/356

If the CPU really is faster than the GPU, that demonstrates just how inefficient the WebGL backend is compared to something like CUDA.
This is especially painful on mobile where GPU and CPU memory are the same physical RAM, and the "map buffer" operation corresponds to an actual instruction to the memory controller rather than synchronizing memory across PCIe lanes.
[0]: https://www.khronos.org/registry/webgl/specs/latest/2.0/#5.1...
[1]: https://www.khronos.org/registry/webgl/specs/latest/2.0/#3.7... - Note the "non-normative" block describing the potential to bypass the specified blocking behavior for getBufferSubData.
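The escape hatch that non-normative block hints at looks roughly like this — a sketch, assuming a WebGL2 context gl, a PIXEL_PACK_BUFFER-bound buffer the GPU has already written, and a destination TypedArray out (all hypothetical names). The idea is to poll a fence yourself, so that by the time getBufferSubData runs it no longer has to stall the pipeline:

// Non-blocking readback sketch for WebGL2: poll a fence until the GPU
// has finished writing `buffer`, then map it with getBufferSubData.
function readbackWithoutStall(gl, buffer, out, onDone) {
  const sync = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
  gl.flush(); // ensure the fence command is actually submitted

  function poll() {
    const status = gl.clientWaitSync(sync, 0, 0); // timeout 0: just peek
    if (status === gl.TIMEOUT_EXPIRED) {
      setTimeout(poll, 0); // GPU not done yet; check again later
      return;
    }
    gl.deleteSync(sync);
    gl.bindBuffer(gl.PIXEL_PACK_BUFFER, buffer);
    gl.getBufferSubData(gl.PIXEL_PACK_BUFFER, 0, out);
    gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
    onDone(out);
  }
  poll();
}

On unified-memory mobile hardware the final getBufferSubData then reduces to the cheap memory-controller mapping described above, rather than a pipeline flush.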
(Yes, there are unofficial wheels for various CPUs, but I'm not sure whether those pass your security requirements.)
You need TensorFlow itself to actually use models trained with TensorFlow.