That sounds like a big issue, but surely assuming either case is a bad idea.
I expect OSes will expose an API that, when queried, reports the level of AI inference available.
That would be similar to video decoding/encoding, where clients can check whether hardware acceleration is available.
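
To make that concrete, here is a rough sketch of how a client might use such a query instead of assuming either way. Everything in it is hypothetical: `InferenceLevel`, its tiers, and the `query` callable stand in for whatever an OS might eventually expose, and no real OS API is being described.

```python
from enum import IntEnum
from typing import Callable


class InferenceLevel(IntEnum):
    """Hypothetical capability tiers an OS might report."""
    NONE = 0         # no usable local inference
    CPU = 1          # software-only inference
    ACCELERATED = 2  # GPU/NPU-backed inference


def choose_backend(
    query: Callable[[], InferenceLevel],
    minimum: InferenceLevel = InferenceLevel.ACCELERATED,
) -> str:
    """Pick 'local' or 'remote' from the reported level instead of assuming."""
    try:
        level = query()  # stand-in for the hypothetical OS capability query
    except OSError:
        # A failed query is treated as "no local inference" rather than guessed at.
        level = InferenceLevel.NONE
    return "local" if level >= minimum else "remote"
```

The client only decides after asking, and falls back to the remote path when the query fails or reports too low a level, much like a video player falling back to software decoding.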