So I built https://gpu.land/.
It’s a simple service that only does one thing: rents out Tesla V100s in the cloud.
Why is it awesome?
- It’s dirt-cheap. You get a Tesla V100 for $0.99/hr, which is 1/3 the cost of AWS/GCP/Azure/[insert big cloud name].
- It’s dead simple. It takes 2 minutes from registration to a launched instance. Instances come pre-installed with everything you need for Deep Learning, including a 1-click Jupyter server.
- It sports a retro, MS-DOS-like look. Because why not:)
The most common question I get is: how is this so cheap? The answer is that AWS/GCP are charging you a huge markup and I’m not. In fact I’m charging just enough to break even, and I really built this project to give back to the community (and to learn some of the tech in the process).
HN special: email me a few lines about yourself and what you’re working on and get $10 in free credit. I’m at hi@gpu.land.
Otherwise I’m around for any questions!
Although lately they have had demand outstrip supply.
Hetzner used to have GTX 1080 instances for about $100 a month, no longer though, and I'm lucky to be grandfathered in to about 8 of them.
I told myself a couple years ago that compute will only get cheaper over time. In reality, compute has gotten MORE expensive over time! I cannot match or scale the compute price I locked in a couple years ago with Hetzner. Some of that is the crypto market raising GPU prices, but it is also NVIDIA's licensing making their cheaper cards unavailable in cloud servers...
But I still don't know what it would cost to get something useful out of it. Can you (or anybody who knows about ML) give me a very rough ballpark of what it costs to train a model like www.remove.bg, which automatically removes the background from photos? I'm not trying to build a clone, but I'm curious what sort of financial investment it takes to make such things.
Edit: maybe you can get some decent results just with COCO segmentation labels: https://towardsdatascience.com/background-removal-with-deep-...
Sorry if I missed it, but one thing I couldn't find on the website is what legal entity is behind the service. It would be important for me and the organisations I work with to know who we are trusting with our data.
It might even be a legal obligation to have that info on the site depending on which country you are located in (Germany for sure, not sure about others).
Do you support multi-node training? Particularly, can I reserve for example a 64-GPU instance via 8 nodes and perform distributed training?
See the "Are instances within my account connected?" item in the FAQ - https://gpu.land/faq
If you're going to need 64 GPUs I'll have to increase your account limit (currently 16 GPUs per account). Email me at hi@gpu.land
Can you share more about data persistence / checkpointing? If I have a job that requires 8 V100s for 3 days, what kind of reliability am I looking at?
I run a credit card processor on AWS, my important personal websites on Linode, and my fun time websites and video conferencing on a small Chicago host called Genesis Hosting with a website out of the 90s and dirt cheap pricing (but excellent support). Match your price and SLA with how much pain it's going to cause you if it goes down, and don't pay extra to put all your stuff on the same 5 9s host if you don't really have to.
I would love to hear your thoughts on this
So the way I would describe your progression as a student (at least from my experience):
- Traditional ML -> probably can run on your laptop
- Simple DL -> colab is great. Cheapest there is.
- SOTA DL -> you're probably looking at training times well into 10s of hours / days, so you'll need something that can last longer than 12h of colab time. Plus at this point you're probably sophisticated enough that you want to set up your instance once and start/stop it rather than going through setup with colab every time. That's where https://gpu.land/ fits in.
So tl;dr; - absolutely use colab first (it's cheapest), but when you outgrow it, consider gpu.land.
I will add it to the FAQ, thanks for pointing it out.
An AVX-512 Skylake-X cloud compute instance costs $10 per CPU-core per month at Vultr (https://www.vultr.com/products/cloud-compute/), and you can do about 18 DenseNet121 inferences per CPU-core per second (in series, not batched) using tools like NN-512
GPU cloud compute is almost unbelievably expensive. Even Linode charges $1000 per month, or $1.50 per hour (look at the GPU plans: https://www.linode.com/pricing/#row--compute). It's really hard to keep that GPU saturated, which is what you need to do to get your money's worth
As AVX-512 becomes better supported by Intel and AMD chips, it becomes more attractive as an alternative to expensive GPU instances for workloads with small amounts of inference mixed with other computation
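To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above ($10 per CPU-core per month, about 18 inferences per core per second, $1.50 per GPU-hour). The 30-day month, the assumption that the CPU core is busy around the clock, and the function names are mine, so treat the output as a ballpark rather than a benchmark.

```python
# Assumption: a 30-day billing month with the CPU core busy the whole time.
SECONDS_PER_MONTH = 30 * 24 * 3600


def cpu_cost_per_million(price_per_core_month, inferences_per_sec):
    """USD per one million inferences on a single, fully used CPU core."""
    inferences_per_month = inferences_per_sec * SECONDS_PER_MONTH
    return price_per_core_month / inferences_per_month * 1_000_000


def gpu_break_even_rate(gpu_price_per_hour, cpu_usd_per_million):
    """Inferences per second a GPU must sustain to match the CPU's cost per inference."""
    inferences_per_hour = gpu_price_per_hour / cpu_usd_per_million * 1_000_000
    return inferences_per_hour / 3600


if __name__ == "__main__":
    cpu = cpu_cost_per_million(10.0, 18)
    print(f"CPU: ${cpu:.3f} per million inferences")
    print(f"GPU break-even: {gpu_break_even_rate(1.50, cpu):.0f} inferences/sec")
```

Under these assumptions the CPU works out to roughly $0.21 per million inferences, and the $1.50/hr GPU has to sustain on the order of 1,900 inferences per second, continuously, before it is cheaper per inference. That is the "keep the GPU saturated" problem in numbers.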
Perhaps readers would benefit from an apples-to-apples comparison of V100 to some CPU in a training per dollar metric? Preferably using something like MLPerf. You do mention inference but I think most people looking at https://gpu.land/ will be far more interested in training rather than inference.
I think the most direct competitor to this ShowHN would be https://vast.ai/
The biggest difference is probably security / guaranteed uptime. With vast you're getting what it says on the tin - a machine from a marketplace. Could come from anyone / anywhere. No idea what else is running on it. Ours are hosted in a professional DC, managed and secured as they should be.
If anyone's curious, there's a detailed comparison page with other platforms here - https://gpu.land/versus
https://blog.roblox.com/2020/05/scaled-bert-serve-1-billion-...
They're not training so I think that's the difference.
How much does this apply to the Xeon CPUs found in servers?
So it is best to forbid it in the TOS and just autokill the miners, rather than wait for the chargebacks. Especially if the compute is being sold at cost, since there is no buffer of profits to balance out the abuse.
So we actually explicitly prohibit mining in the T&Cs, it's just that in the FAQ I wanted to be a bit more human and dissuade people.
Also, there are quite a few protections built in at the network layer to make sure mining isn't possible. Ports, IPs, DNS, even DPI. I learnt a lot about the early days of bitcoin and mining protocols when I was building gpu.land:)
But again, thanks for sharing this. Really good to know.
This was my first front-end project so any feedback 100% welcome.
Actually, I'm renting V100s. Got lucky to know the right person at the right time:)
For the same reason, I'm only getting value out of the GPUs when I am actually ready to train. So I would be much happier if I could push a Docker image, or maybe even a conda environment spec, along with my code, attach data storage, and run e.g. train.py (or more likely a shell script that calls it) to completion and then immediately release the GPUs. Everything else is just overhead: trying to get the environment right as quickly as possible, run my script, and then shut down the instance as soon as it's done. It would be awesome to have this kind of functionality, or is there a way to do that which I missed?
Regarding environments, it's funny, when I was starting out I would kill for a pre-configured instance because I wanted to focus on modeling and didn't care if package X was version x.x.y or x.x.z. But as you grow as an ML engineer and develop your own toolkit these things start to matter.
So when creating a machine on gpu.land you have the choice of going pre-configured or just having a clean Ubuntu image. The former is meant for newcomers while the latter is for pros. That was my thinking!
I saw someone ask on reddit about separating storage from the GPU instance, so that one could do data transfer and other setup without reserving a GPU. I want to echo how important this is. Another case is where I might want to use only one GPU and then scale up to 8 for training, or I might want to have N GPU instances attached to the same storage to run jobs in parallel. There are lots of other examples, but overall it would add much flexibility.
On the other hand, it might encourage people to be more "peaky" in their use, which could be a challenge for you. From what I understand, you are much better off if I want 1 GPU for 70 hrs vs. 70 GPUs for an hour, in which case I understand how you might want to encourage steady use.
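A tiny Python sketch of why the two usage patterns look the same on the bill but very different for capacity planning. The $0.99 per GPU-hour price is the one quoted in this thread; the function and field names are just for illustration.

```python
PRICE_PER_GPU_HOUR = 0.99  # price quoted in this thread


def workload(gpus, hours, price=PRICE_PER_GPU_HOUR):
    """Summarise a job: total GPU-hours, what it costs, and how many GPUs it holds at once."""
    gpu_hours = gpus * hours
    return {
        "gpu_hours": gpu_hours,
        "cost_usd": round(gpu_hours * price, 2),
        "peak_gpus": gpus,
    }


if __name__ == "__main__":
    steady = workload(gpus=1, hours=70)
    burst = workload(gpus=70, hours=1)
    print("steady:", steady)
    print("burst: ", burst)
```

Both jobs are 70 GPU-hours and cost the same, but the burst job needs 70 GPUs to be free at the same moment, while the steady one only ever occupies a single card.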
(Least) P100 < V100 < A100 (Most)
thanks!
Added to feature requests!
And (this may sound naive) I work with other companies' data, I have a responsibility to them to keep it safe, do you have any concerns with this being used for "professional" applications or are you targeting research / hobby?
To be clear, I am dealing with things that my clients are comfortable with me working with on mainstream cloud providers, not state secrets or data with legislative requirements.
1) There's a finite capacity in the high 10s of GPUs. Currently the service is utilized at <10% capacity, so there's a lot of room to grow before we run out. If that was to happen (or even if we get to 50%), I could go and request more and (hopefully) expand the capacity quickly.
2) When building gpu.land I specifically wanted to make it safe for users to upload / store sensitive data. Of course the service is more geared towards hobbyists / researchers, just because that's an easier market to reach for a solo dev - but there is nothing at the technical level that sacrifices data privacy. For example, data on instances is encrypted both at rest and in transit within the DC, and only you have the SSH keys to your instance, so nobody else can access it. Check out the security & privacy section in our FAQ - https://gpu.land/faq
*edit: I see that only Linux hosts are available at the moment so that kills my use case but I'll keep the question up for grins and giggles
You should check out some of the services people mention in https://www.reddit.com/r/cloudygamer/
EDIT: found in the FAQ:
Compute: $0.99/hr / 1x Tesla V100 (running instance only) Storage: $0.02/GB/month (running and stopped instances)
If you have, say, a 200GB hard drive you're only paying $4/month for storage.
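For anyone who wants to estimate a monthly bill, here is a minimal Python sketch using the two prices from the FAQ quote above. It assumes a single flat rate for each (no tiers, taxes or minimums, which I have not checked), and the function name is made up for the example.

```python
COMPUTE_USD_PER_GPU_HOUR = 0.99   # running instances only
STORAGE_USD_PER_GB_MONTH = 0.02   # running and stopped instances


def monthly_cost(gpu_hours, storage_gb):
    """Estimated monthly bill in USD: GPU time actually used plus disk kept around."""
    compute = gpu_hours * COMPUTE_USD_PER_GPU_HOUR
    storage = storage_gb * STORAGE_USD_PER_GB_MONTH
    return compute + storage


if __name__ == "__main__":
    # A stopped instance with a 200GB disk: storage only.
    print(f"${monthly_cost(gpu_hours=0, storage_gb=200):.2f}")
    # The same disk plus 40 hours of training on one V100.
    print(f"${monthly_cost(gpu_hours=40, storage_gb=200):.2f}")
```

The first line reproduces the $4/month figure; the second shows that the GPU hours dominate the bill as soon as the instance is actually running.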
> Is my instance guaranteed?
> Yes, unless you run out of credit. To prevent that from happening be sure to setup automatic top ups.
For on-demand this is true, but spot GPU instances can have competitive pricing.
At this moment, a p3.2xlarge instance (1 Tesla V100) is 92 cents/hr on the AWS spot market in us-east-1. The p3.8xlarge (4 GPUs) is $3.62/hr.
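Normalising those spot prices to a per-GPU hourly rate makes the comparison with the flat $0.99/hr easier. A quick Python sketch, using only the prices quoted in this thread (spot prices move around, so these are a snapshot, and the helper name is mine):

```python
GPU_LAND_USD_PER_GPU_HOUR = 0.99

# Spot prices as quoted above: name -> (USD per instance-hour, number of V100s).
SPOT_QUOTES = {
    "p3.2xlarge": (0.92, 1),
    "p3.8xlarge": (3.62, 4),
}


def per_gpu_hour(instance_price, gpus):
    """Instance price divided evenly across its GPUs."""
    return instance_price / gpus


if __name__ == "__main__":
    for name, (price, gpus) in SPOT_QUOTES.items():
        rate = per_gpu_hour(price, gpus)
        diff = GPU_LAND_USD_PER_GPU_HOUR - rate
        print(f"{name}: ${rate:.3f} per GPU-hour (${diff:.3f} below $0.99)")
```

At these quotes the spot instances come out a few cents per GPU-hour cheaper, with the usual caveat that spot capacity can be reclaimed.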
Also, I am assuming that requesting more GPU instances allows me to access more persistent storage?