When we do need A/B testing, we’ll probably use something like Seldon. As for predictions/second, not very many at the moment: maybe 1 every 30 seconds? It’s deployed into a Kubernetes cluster not because of scaling requirements, but because that’s where all our other services get deployed to, and it’s more beneficial (ops- and cost-wise) to deploy there too than to bother with a separate workflow for deploying to Lambdas or SageMaker.