Ask HN: Cheap hosting for ML API with intermittent usage

1 point · mtoohig · 5y ago · 2 comments

Does a VPS/cloud provider exist that charges by the minute or offers flexible RAM/CPU allocation? Currently, I have a $5/month VM on Linode for my app, but now I want to add some ML features. Running the ML API on the same VM as my app causes OOM errors, and the process gets killed.

Of course, I could buy a larger VM at a higher fixed monthly cost, but since this API will only be called a few hundred times per month, there must be an option for per-minute or per-call pricing while the API mostly sits idle waiting for requests.
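To make the trade-off concrete, here is a rough back-of-the-envelope sketch. Every price and timing in it is a hypothetical placeholder, not a quote from any provider; plug in real numbers before drawing conclusions.

```python
def monthly_cost_usage_billing(calls_per_month, seconds_per_call, price_per_second):
    """Cost when you pay only for the seconds each call actually runs."""
    return calls_per_month * seconds_per_call * price_per_second


def break_even_calls(fixed_monthly_price, seconds_per_call, price_per_second):
    """Calls per month at which usage billing costs the same as the fixed price."""
    return fixed_monthly_price / (seconds_per_call * price_per_second)


# Hypothetical numbers, chosen only for illustration.
FIXED_VM_PRICE = 20.0       # USD per month for a larger always-on VM
PRICE_PER_SECOND = 0.0001   # USD per second of usage-billed compute
SECONDS_PER_CALL = 10       # model load plus inference, per call
CALLS_PER_MONTH = 300       # "a few hundred" calls

usage_cost = monthly_cost_usage_billing(CALLS_PER_MONTH, SECONDS_PER_CALL, PRICE_PER_SECOND)
threshold = break_even_calls(FIXED_VM_PRICE, SECONDS_PER_CALL, PRICE_PER_SECOND)

print(f"usage-billed: ${usage_cost:.2f}/month, fixed VM: ${FIXED_VM_PRICE:.2f}/month")
print(f"break-even at about {threshold:.0f} calls/month")
```

With low call volumes, the per-call time (including model loading) matters far less to the bill than whether the provider charges anything while the service is idle.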

I am aware of serverless, but loading the ML models on each call seems like it would take far too long to return a response. If I have misunderstood how serverless works, please let me know.
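One common way to soften the per-call load cost, on a VM or in a serverless function, is to load the model once per process and reuse it for as long as that process stays warm. A minimal sketch of the pattern follows; `get_model` and `handle_request` are hypothetical names, and the `time.sleep` merely stands in for reading large weight files.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model():
    """Load the model on first use, then return the cached object."""
    time.sleep(0.2)  # stand-in for loading large weight files from disk
    return {"name": "placeholder-model"}  # hypothetical model object


def handle_request(image_bytes):
    """Hypothetical request handler: only the first call pays the load cost."""
    model = get_model()
    return {"model": model["name"], "input_size": len(image_bytes)}


# Time a cold call (model not yet loaded) and a warm call (model cached).
start = time.perf_counter()
handle_request(b"fake image")
cold = time.perf_counter() - start

start = time.perf_counter()
handle_request(b"fake image")
warm = time.perf_counter() - start

print(f"cold call: {cold:.3f}s, warm call: {warm:.6f}s")
```

The cache only helps while the process is alive. If the platform shuts the instance down between requests, the next request is a cold start and pays the full load time again.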

And if it matters for any of the answers: I'm using FastAPI for the web side and Celery for the task queue. YOLOv3 detects objects of interest in an image, then each detected object is cropped out and passed to a second model that runs OCR and predicts the text it finds. I'm new to ML, so I've got a lot to learn and appreciate all the feedback.
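The two-stage flow described above can be sketched with plain functions. Everything here is a stand-in: `detect_objects` and `read_text` are hypothetical stubs where the real YOLOv3 and OCR models would be called, and the "image" is a toy grid of characters rather than pixels.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    box: tuple  # (x, y, width, height)


def detect_objects(image):
    """Stub detector: a real one would run YOLOv3 on the image."""
    return [Detection(label="sign", box=(0, 0, 2, 1))]


def crop(image, box):
    """Cut the box region out of an image stored as a list of rows."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def read_text(region):
    """Stub OCR: a real one would run the second model on the crop."""
    return "".join(str(cell) for row in region for cell in row)


def process_image(image):
    """Detect objects, crop each one, and run OCR on the crop."""
    return [(d.label, read_text(crop(image, d.box))) for d in detect_objects(image)]


image = [["A", "B", "C"], ["D", "E", "F"]]  # toy 2x3 "image"
results = process_image(image)
print(results)
```

In a FastAPI plus Celery setup, a function like `process_image` would typically be the body of the queued task, so both models live in the worker process rather than in the web process.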

mindhash · 5y ago

Check out Algorithmia and see if that works for you.

Instead of looking for ML-specific services, look at Docker or other container options. It's better if they support instance warm-ups; that way you can launch the VM only when necessary.

Also see if you can split your solution into two parts: one that gathers intelligence in an offline batch mode, and a second that uses that data to respond faster. This only works with some algorithms, and not so much for deep learning.

Explore deep learning compression techniques, such as quantization or pruning, to reduce the size of the model.
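One compression technique is quantization: storing weights as 8-bit integers instead of 32-bit floats, which cuts their storage to roughly a quarter at the cost of a small rounding error. Below is a minimal pure-Python sketch of the idea with made-up weights; it is not how any particular framework implements it, and real frameworks ship their own quantization tooling.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.82, -1.27, 0.003, 0.5, -0.41]  # made-up values for illustration
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Whether the accuracy loss is acceptable depends on the model, so it is worth measuring detection and OCR quality before and after.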

There aren't many services for one-off models. It doesn't make economic sense for cloud providers.

mtoohig (OP) · 5y ago

Thank you for your response.

Algorithmia doesn't seem like what I was looking for, but I'm glad to know it exists in case I ever get to that level.

If I understand you correctly, splitting my solution into two parts with an offline batch step will not fit my needs, since I want responses as soon as possible.

I had not heard of model compression before and will look into it further. But for now, it seems I would be best off just paying for a larger VM that can handle my needs. I didn't want to pay a higher fixed monthly cost for my little fun project, but I guess I have to.
