Show HN: Attaching to a virtual GPU over TCP

(thundercompute.com)

344 points · bmodel · 1y ago · 108 comments

We developed a tool to trick your computer into thinking it’s attached to a GPU which actually sits across a network. This allows you to switch the number or type of GPUs you’re using with a single command.

steelbrain1y ago· 9 in thread

Ah this is quite interesting! I had a use case where I needed a GPU-over-IP but only for transcoding videos. I had a not-so-powerful AMD GPU in my homelab server that somehow kept crashing the kernel any time I tried to encode videos with it and also an NVIDIA RTX 3080 in a gaming machine.

So I wrote https://github.com/steelbrain/ffmpeg-over-ip and had the server running on the Windows machine and the client on the media server (could be Plex, Emby, Jellyfin, etc.) and it worked flawlessly.

toomuchtodo1y ago

Have you done a Show HN yet? If not, please consider doing so!

https://gist.github.com/tzmartin/88abb7ef63e41e27c2ec9a5ce5d...

https://news.ycombinator.com/showhn.html

https://news.ycombinator.com/item?id=22336638

bhaney1y ago

This is more or less what I was hoping for when I saw the submission title. Was disappointed to see that the submission wasn't actually a useful generic tool but instead a paid cloud service. Of course the real content is in the comments.

As an aside, are there any uses for GPU-over-network other than video encoding? The increased latency seems like it would prohibit anything machine learning related or graphics intensive.

tommsy641y ago

There is a GPU-over-network software called Juice [1]. I've used it on AWS for running CPU-intensive workloads that also happen to need some GPU without needing to use a huge GPU instance. I was able to use a small GPU instance, which had just 4 CPU cores, and stream its GPU to one with 128 CPU cores.

I found Juice to work decently for graphical applications too (e.g., games, CAD software). Latency was about what you'd expect for video encode + decode + network: 5-20ms on a LAN if I recall correctly.

[1] - https://github.com/Juice-Labs/Juice-Labs

johnisgood1y ago

I am increasingly growing tired of these "cloud" services, paid or not. :/

trws1y ago

Some computation tasks can tolerate the latency if they’re written with enough overlap and can keep enough of the data resident, but they usually need more performant networking than this. See older efforts like rcuda for remote cuda over infiniband as an example. It’s not ideal, but sometimes worth it. Usually the win is in taking a multi-GPU app and giving it 16 or 32 of them rather than a single remote GPU though.

lostmsu1y ago

How do you use it for video encoding/decoding? Won't the uncompressed video (input for encoding or output of decoding) be too large to transmit over network practically?

Fnoord1y ago

I mean, anything you use a GPU/TPU for could benefit.

IPMI and such could use it. Like, for example, Proxmox could use it. Machine learning tasks (like Frigate) and hashcat could also use such. All in theory, of course. Many tasks use VNC right now, or SPICE. The ability to extract your GPU in the Unix way over TCP/IP is powerful. Though Node.js would not be the way I'd want such to go.

crishoj1y ago

Interesting. Do you know if your tool supports conversions resulting in multiple files, such as HLS and its myriad of timeslice files?

steelbrain1y ago

Since it’s sharing the underlying file system and just running ffmpeg remotely, it should support any variation of outputs
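A minimal sketch of why a shared file system makes multi-file outputs such as HLS segments unproblematic. This is not taken from ffmpeg-over-ip; it only assumes that both machines see the same files under different mount points, so the client can rewrite path arguments before handing the command to the remote side. The function name `rewrite_args` and both mount points are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def rewrite_args(args, client_root, server_root):
    """Map every argument under client_root to the same file under server_root.

    Assumption: the client is a POSIX machine and the server is a Windows
    machine that mounts the same share at server_root.
    """
    client_root = PurePosixPath(client_root)
    rewritten = []
    for arg in args:
        path = PurePosixPath(arg)
        # Only absolute paths inside the shared root are translated;
        # flags and codec names pass through untouched.
        if path.is_absolute() and client_root in (path, *path.parents):
            relative = path.relative_to(client_root)
            rewritten.append(str(PureWindowsPath(server_root, *relative.parts)))
        else:
            rewritten.append(arg)
    return rewritten


# Example ffmpeg-style argument list with an HLS playlist as output.
example = rewrite_args(
    ["-i", "/mnt/media/in.mkv", "-c:v", "h264_nvenc", "/mnt/media/out/stream.m3u8"],
    client_root="/mnt/media",
    server_root="Z:\\media",
)
```

Because the playlist and its segment files are written to the shared directory by the remote ffmpeg process, the client never needs to know how many files were produced.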

radarsat11y ago· 4 in thread

I'm confused: if this operates at the CPU/GPU boundary, doesn't it create a massive I/O bottleneck for any dataset that doesn't fit into VRAM? I'm probably misunderstanding how it works, but if it intercepts GPU I/O then it must stream your entire dataset to a remote machine on every epoch, which sounds wasteful.

bmodelOP1y ago

That understanding of the system is correct. To make it practical we've implemented a bunch of optimizations to minimize I/O cost. You can see how it performs on inference with BERT here: https://youtu.be/qsOBFQZtsFM?t=69.

The overheads are larger for training compared to inference, and we are implementing more optimizations to approach native performance.
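A rough back-of-the-envelope sketch of the training concern raised above, assuming the whole dataset is re-sent over the network once per epoch and the link is fully utilized. The dataset size, link speed, and epoch count are illustrative placeholders, not measurements of Thunder's system, and the optimizations mentioned above are ignored.

```python
def total_transfer_seconds(dataset_gb: float, link_gbps: float, epochs: int = 1) -> float:
    """Lower bound on time spent moving data, ignoring latency and protocol overhead.

    dataset_gb is in gigabytes, link_gbps in gigabits per second (decimal units).
    """
    seconds_per_epoch = dataset_gb * 8 / link_gbps  # bytes -> bits, then divide by rate
    return seconds_per_epoch * epochs


# Hypothetical 100 GB dataset, 3 epochs, on a 1 Gb/s and a 10 Gb/s link.
slow = total_transfer_seconds(100, 1, epochs=3)
fast = total_transfer_seconds(100, 10, epochs=3)
```

Inference avoids most of this cost because the weights are sent once and only small inputs and outputs cross the network afterwards.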

semitones1y ago

> to approach native performance.

The same way one "approaches the sun" when they take the stairs?

radarsat11y ago

Aah ok thanks, that was my basic misunderstanding, my mind just jumped straight to my current training needs but for inference it makes a lot of sense. Thanks for the clarification.

ranger_danger1y ago

Is DirectX support possible any time soon? This would be huge for Windows VMs on Linux...

Cieric1y ago· 4 in thread

This is interesting, but I'm more interested in self-hosting. I already have a lot of GPUs (some running some not.) Does this have a self-hosting option so I can use the GPUs I already have?

cpeterson421y ago

We don't support self hosting yet but the same technology should work well here. Many of the same benefits apply in a self-hosted setting, namely efficient workload scheduling, GPU-sharing, and ease-of-use. Definitely open to this possibility in the future!

donnygreenberg1y ago

If you want a PyTorch-like experience on your own GPUs (either static or cloud), see https://github.com/run-house/runhouse

covi1y ago

If you want to use your own GPUs or cloud accounts but with a great dev experience, see SkyPilot.

ellis0n1y ago

You can rent out your GPUs in the cloud with services like Akash Network and rent GPUs at thundercompute.com.. manager's path, almost like self-hosting :)

doctorpangloss1y ago· 4 in thread

I don't get it. Why would I start an instance in ECS, to use your GPUs in ECS, when I could start an instance for the GPUs I want in ECS? Separately, why would I want half of Nitro, instead of real Nitro?

bmodelOP1y ago

Great point, there are a few benefits:

1. If you're actively developing and need a GPU then you typically would be paying the entire time the instance is running. Using Thunder means you only pay for the GPU while actively using it. Essentially, if you are running CPU-only code you would not be paying for any GPU time. The alternative is to manually turn the instance on and off, which can be annoying.

2. This allows you to easily scale the type and number of GPUs you're using. For example, say you want to do development on a cheap T4 instance and run a full DL training job on a set of 8 A100s. Instead of needing to swap instances and set everything up again, you can just run a command and then start running on the more powerful GPUs.
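The first point can be made concrete with a small cost comparison. This is only a sketch: the hourly rates and session lengths are made-up placeholders, not Thunder's or any cloud provider's pricing, and `session_cost` is a hypothetical helper.

```python
def session_cost(total_hours, gpu_active_hours, cpu_rate, gpu_rate, pay_per_use):
    """Cost of one development session.

    With pay_per_use=False the GPU is billed for the whole session
    (a regular GPU instance); with pay_per_use=True only the hours in
    which the GPU is actually used are billed.
    """
    billed_gpu_hours = gpu_active_hours if pay_per_use else total_hours
    return total_hours * cpu_rate + billed_gpu_hours * gpu_rate


# Hypothetical 8-hour session with 1 hour of actual GPU work,
# at placeholder rates of 0.10/h for CPU and 0.50/h for GPU.
always_on = session_cost(8, 1, cpu_rate=0.10, gpu_rate=0.50, pay_per_use=False)
on_demand = session_cost(8, 1, cpu_rate=0.10, gpu_rate=0.50, pay_per_use=True)
```

The gap between the two numbers grows with the share of the session spent on CPU-only work, which is the case the first point describes.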

doctorpangloss1y ago

Okay, but your GPUs are in ECS. Don't I just want this feature from Amazon, not you, and natively via Nitro? Or even Google has TPU attachments.

> 1. If you're actively developing and need a GPU [for fractional amounts of time]...

Why would I need a GPU for a short amount of time during development? For testing?

I don't get it - what would testing an H100 over a TCP connection tell me? It's like, yeah, I can do that, but it doesn't represent an environment I am going to use for real. Nobody runs applications to GPUs on buses virtualized over TCP connections, so what exactly would I be validating?

billconan1y ago

it's more transparent to your system, for example, if you have a gui application that needs gpu acceleration on a thin client (Matlab, solidworks, blender), you can do so without setting up ECS. you can develop without any gpu, but suddenly have one when you need to run simulation. this will be way cheaper than AWS.

I think essentially this is solving the same problem Ray (https://www.ray.io/) is solving, but in a more generic way.

it potentially can have finer grained gpu sharing, like a half-gpu.

I'm very excited about this.

bmodelOP1y ago

Exactly! The finer grain sharing is one of the key things on our radar right now

billconan1y ago· 4 in thread

is this a remote nvapi?

this is awesome. can it do 3d rendering (vulkan/opengl)

bmodelOP1y ago

Thank you!

> is this a remote nvapi

Essentially yes! Just to be clear, this covers the entire GPU, not just the NVAPI (i.e. all of CUDA). This functions as if you had the physical card directly plugged into the machine.

Right now we don't support Vulkan or OpenGL since we're mostly focusing on AI workloads, however we plan to support these in the future (especially if there is interest!)

billconan1y ago

sorry, I didn't mean nvapi, I meant rmapi.

I bet you saw this https://github.com/mikex86/LibreCuda

they implemented the cuda driver by calling into rmapi.

My understanding is if there is a remote rmapi, other user mode drivers should work out of the box?

czbond1y ago

I am not in this "space", but I second the "this is cool to see", more stuff like this needed on HN.

cpeterson421y ago

Appreciate the praise!

dishsoap1y ago· 3 in thread

For anyone curious about how this actually works, it looks like a library is injected into your process to hook these functions [1] in order to forward them to the service.

[1] https://pastebin.com/raw/kCYmXr5A

almostgotcaught1y ago

How did you figure out these were hooked? I'm assuming some flag that tells ld/ldd to tell you when some symbol is rebound? Also I thought a symbol has to be a weak symbol to be rebound and assuming nvidia doesn't expose weak symbols (why would they) the implication is that their thing is basically LD_PRELOADed?

yarpen_z1y ago

Yes. While I don't know what they do internally, API remoting has been used for GPUs since at least rCUDA - that's over 10 years ago.

LD_PRELOAD trick allows you to intercept and virtualize calls to the CUDA runtime.
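A toy illustration of API remoting in general, not of Thunder's or rCUDA's internals: a client-side stub exposes the same function names as the real API, serializes each call, and a server holding the device executes it and returns the result. Real systems hook C symbols (for example via LD_PRELOAD) and use far more efficient encodings; here plain Python functions stand in for GPU calls, and `HANDLERS`, `RemoteStub`, and `serve` are hypothetical names.

```python
import json
import socket
import threading


# Stand-ins for device-side API functions (assumption: real ones would be GPU calls).
def vector_add(a, b):
    return [x + y for x, y in zip(a, b)]


def scale(a, k):
    return [x * k for x in a]


HANDLERS = {"vector_add": vector_add, "scale": scale}


def serve(conn):
    """Server side: read one JSON call per line, execute it, reply with the result."""
    reader, writer = conn.makefile("r"), conn.makefile("w")
    for line in reader:
        call = json.loads(line)
        result = HANDLERS[call["fn"]](*call["args"])
        writer.write(json.dumps({"result": result}) + "\n")
        writer.flush()


class RemoteStub:
    """Client side: any unknown attribute becomes a call forwarded over the socket."""

    def __init__(self, conn):
        self._reader, self._writer = conn.makefile("r"), conn.makefile("w")

    def __getattr__(self, name):
        def forward(*args):
            self._writer.write(json.dumps({"fn": name, "args": args}) + "\n")
            self._writer.flush()
            return json.loads(self._reader.readline())["result"]

        return forward


def demo():
    # socketpair() keeps the sketch self-contained; a real setup would use TCP.
    client_sock, server_sock = socket.socketpair()
    threading.Thread(target=serve, args=(server_sock,), daemon=True).start()
    gpu = RemoteStub(client_sock)
    return gpu.vector_add([1, 2], [3, 4]), gpu.scale([1, 2, 3], 2)
```

The calling code only ever sees `gpu.vector_add(...)`, which is the sense in which the application "thinks" the device is local.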

the84721y ago

Ah, I assumed/hoped they had some magic that would manage to forward a whole PCIe device.

cpeterson421y ago· 2 in thread

Given the interest here we decided to open up T4 instances for free. Would love for y'all to try it and let us know your thoughts!

dheera1y ago

What is your A100 and H100 pricing?

cpeterson421y ago

We are super early stage and don't have A100s or H100s live yet. Exact pricing TBD but expect it to be low. If you want to use them today, reach out directly and we can set them up :)

tptacek1y ago· 2 in thread

This is neat. Were you able to get MIG or vGPUs working with it?

bmodelOP1y ago

We haven't tested with MIG or vGPU, but I think it would work since it's essentially physically partitioning the GPU.

One of our main goals for the near future is to allow GPU sharing. This would be better than MIG or vGPU since we'd allow users to use the entire GPU memory instead of restricting them to a fraction.

tptacek1y ago

We had a hell of a time dealing with the licensing issues and ultimately just gave up and give people whole GPUs.

What are you doing to reset the GPU to clean state after a run? It's surprisingly complicated to do this securely (we're writing up a back-to-back sequence of audits we did with Atredis and Tetrel; should be publishing in a month or two).

the_reader1y ago· 2 in thread

Would it be possible to use it with Blender?

bmodelOP1y ago

At the moment our tech is Linux-only so it would not work with Blender.

Down the line, we could see this being used for batched render jobs (i.e. to replace a render farm).

comex1y ago

Blender can run on Linux…

teaearlgraycold1y ago· 2 in thread

This could be perfect for us. We need very limited bandwidth but have high compute needs.

bmodelOP1y ago

Awesome, we'd love to chat! You can reach us at founders@thundercompute.com or join the discord https://discord.gg/nwuETS9jJK!

goku-goku1y ago

Feel free to reach out www.juicelabs.co

talldayo1y ago· 2 in thread

> Access serverless GPUs through a simple CLI to run your existing code on the cloud while being billed precisely for usage

Hmm... well I just watched you run nvidia-smi in a Mac terminal, which is a platform it's explicitly not supported on. My instant assumption is that your tool copies my code into a private server instance and communicates back and forth to run the commands.

Does this platform expose eGPU capabilities if my host machine supports it? Can I run raster workloads or network it with my own CUDA hardware? The actual way your tool and service connects isn't very clear to me and I assume other developers will be confused too.

bmodelOP1y ago

Great questions! To clarify the demo, we were ssh'd into a linux machine with no GPU.

Going into more details for how this works, we intercept communication between the CPU and the GPU so only GPU code and commands are sent across the network to a GPU that we are hosting. This way we are able to virtualize a remote GPU and make your computer think it's directly attached to that GPU.

We are not copying your CPU code and running it on our machines. The CPU code runs entirely on your instance (meaning no files need to be copied over or packages installed on the GPU machine). One of the benefits of this approach is that you can easily scale to a more or less powerful GPU without needing to set up a new server.

billconan1y ago

does this mean you have a customized/dummy kernel gpu driver?

will that cause system instability, say, if the network suddenly dropped?

throwaway888abc1y ago· 2 in thread

Does it work for gaming on windows ? or even linux ?

cpeterson421y ago

In theory yes. In practice, however, latency between the CPU and remote GPU makes this impractical

boxerbk1y ago

You could use a remote streaming protocol, like Parsec, for that. You'd need your own cloud account and connect directly to a GPU-enabled cloud machine. Otherwise, it would work to let you game.

Zambyte1y ago· 2 in thread

Reminds me of Plan9 :)

K0IN1y ago

can you elaborate a bit on why? (noob here)

Zambyte1y ago

In Plan 9 everything is a file (for real this time). Remote file systems are accessible through the 9P protocol (still used in modern systems! I know it's used in QEMU and WSL). Every process has its own view of the filesystem called a namespace. The implication of these three features is that remote resources can be transparently accessed as local resources by applications.

bkitano191y ago· 2 in thread

this is nuts

cpeterson421y ago

We think so too, big things coming :)

goku-goku1y ago

www.juicelabs.co

mmsc1y ago· 1 in thread

What's it like to actually use this for any meaningful throughput? Can this be used for hash cracking? Every time I think about virtual GPUs over a network, I think about botnets. Specifically from https://www.hpcwire.com/2012/12/06/gpu_monster_shreds_passwo... "Gosney first had to convince Mosix co-creator Professor Amnon Barak that he was not going to “turn the world into a giant botnet.”"

cpeterson421y ago

This is definitely an interesting thought experiment, however in practice our system is closer to AWS than a botnet, as the GPUs are not distributed. This technology does lend itself to some interesting applications with creating very flexible clusters within data centers that we are exploring.

orsorna1y ago· 1 in thread

So what exactly is the pricing model? Do I need a quote? Because otherwise I don't see how to determine it without creating an account which is needlessly gatekeeping.

bmodelOP1y ago

We're still in our beta so it's entirely free for now (we can't promise a bug-free experience)! You have to make an account but it won't require payment details.

Down the line we want to move to a pay-as-you-go model.

kawsper1y ago· 1 in thread

Cool idea, nice product page!

Does anyone know if this is possible with USB?

I have a DaVinci Resolve license USB dongle that I'd like to not have to plug into my laptop.

kevmo3141y ago

You can do that with USB/IP: https://usbip.sourceforge.net/

rubatuga1y ago· 1 in thread

What ML packages do you support? In the comments below it says you do not support Vulkan or OpenGL. Does this support AMD GPUs as well?

bmodelOP1y ago

We have tested this with PyTorch and Hugging Face and it is mostly stable (we know there are issues with PyCUDA and JAX). In theory this should work with any library, however we're still actively developing this so bugs will show up.

winecamera1y ago· 1 in thread

I saw that in the tnr CLI, there are hints of an option to self-host a GPU. Is this going to be a released feature?

cpeterson421y ago

We don't support self-hosting yet but are considering adding it in the future. We're a small team working as hard as we can :)

Curious where you see this in the CLI, may be an oversight on our part. If you can join the Discord and point us to this bug we would really appreciate it!

tamimio1y ago· 1 in thread

I’m more interested in using tools like hashcat. Are there any benchmarks for these? The docs link returns an error.

bmodelOP1y ago

We haven't tested it with hashcat yet but plan on doing so. If you get to it before us please let us know how it works!

m3kw91y ago· 1 in thread

So won’t that make the network the prohibitive bottleneck? Your memory bandwidth is 1 Gbps max

teaearlgraycold1y ago

Cloud hosts will offer 10Gb/s. Anyway, in my experience with training LoRAs and running DINOv2 inference you don’t need much bandwidth. We are usually sitting at around 10-30MB/s per GPU.
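To put the figures above side by side, here is a small sketch that converts link speed to megabytes per second and asks how many GPUs a link could feed at a steady per-GPU rate. It assumes decimal units and sustained averages, and it ignores latency and protocol overhead; `gpus_per_link` is a hypothetical helper.

```python
def gpus_per_link(link_gbps: float, per_gpu_mb_s: float) -> float:
    """Number of GPUs a link can feed if each needs per_gpu_mb_s on average."""
    link_mb_s = link_gbps * 1000 / 8  # gigabits/s -> megabytes/s
    return link_mb_s / per_gpu_mb_s


# Using the 10-30 MB/s per GPU range quoted above:
worst_case_1g = gpus_per_link(1, 30)    # 1 Gb/s link, heaviest quoted load
best_case_10g = gpus_per_link(10, 10)   # 10 Gb/s link, lightest quoted load
```

Even a 1 Gb/s link carries 125 MB/s, several times the quoted per-GPU traffic, so for workloads like these bandwidth alone is unlikely to be the limit; latency per call is a separate question.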

test202408091y ago· 1 in thread

pocl (Portable Computing Language) [1] provides a remote backend [2] that allows for serialization and forwarding of OpenCL commands over a network.

Another solution is qCUDA [3] which is more specialized towards CUDA.

In addition to these solutions, various virtualization solutions today provide some sort of serialization mechanism for GPU commands, so they can be transferred to another host (or process). [4]

One example is the QEMU-based Android Emulator. It is using special translator libraries and a "QEMU Pipe" to efficiently communicate GPU commands from the virtualized Android OS to the host OS [5].

The new Cuttlefish Android emulator [6] uses Gallium3D for transport and the virglrenderer library [7].

I'd expect that the current virtio-gpu implementation in QEMU [8] might make this job even easier, because it includes the Android's gfxstream [9] (formerly called "Vulkan Cereal") that should already support communication over network sockets out of the box.

[1] https://github.com/pocl/pocl

[2] https://portablecl.org/docs/html/remote.html

[3] https://github.com/coldfunction/qCUDA

[4] https://www.linaro.org/blog/a-closer-look-at-virtio-and-gpu-...

[5] https://android.googlesource.com/platform/external/qemu/+/em...

[6] https://source.android.com/docs/devices/cuttlefish/gpu

[7] https://cs.android.com/android/platform/superproject/main/+/...

[8] https://www.qemu.org/docs/master/system/devices/virtio-gpu.h...

[9] https://android.googlesource.com/platform/hardware/google/gf...

fpoling1y ago

Zscaler uses a similar approach in their remote browser. WebGL in the local browser exposed as a GPU to a Chromium instance in the cloud.

somat1y ago

What makes me sad is that the original sgi engineers who developed glx were very careful to use x11 mechanisms for the gpu transport, so it was fairly trivial to send the gl stream over the network to render on your graphics card. "run on the supercomputer down the hall, render on your workstation". More recent driver development has not shown such care and this is usually no longer possible.

I am not sure how useful it was in reality (usually if you had a nice graphics card you also had a nice CPU) but I had fun playing around with it. There was something fascinating about getting accelerated graphics on a program running in the machine room. I was able to get glquake running like this once.

userbinator1y ago

It's impressive that this is even possible, but I wonder what happens if the network connection goes down or is anything but 100% stable? In my experience drivers react badly to even a local GPU that isn't behaving.

delijati1y ago

Even a directly attached eGPU via Thunderbolt 4 was, after some time, too slow for machine learning (i.e. training). As I now work fully remote, I just have a beefy midi tower. Some context about eGPU [1].

But hey, I'm happy to be proven wrong ;)

[1] https://news.ycombinator.com/item?id=38890182#38905888

ellis0n1y ago

In 2008, I had a powerful server with XEON CPU, but the motherboard had no slots for a graphics card. I also had a computer with a powerful graphics card but a weak Core 2 Duo. I had the idea of passing the graphics card over the network using Linux drivers. This concept has now been realized in this project. Good job!

xyst1y ago

Exciting. But would definitely like to see a self hosted option.

cpeterson421y ago

We created a discord for the latest updates, bug reports, feature suggestions, and memes. We will try to respond to any issues and suggestions as quickly as we can! Feel free to join here: https://discord.gg/nwuETS9jJK
