I think this is an important thing to remember/consider. I can't tell you how many personal projects I've stalled on while worrying about costs, "XYZ service/platform/API is expensive", without ever considering what "expensive" actually means.
Yes, they could have used OCR/image-recognition software, but what's easier than piping an image to an API and asking it?
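For concreteness, "piping an image to an API and asking it" is roughly one function. This is a hedged sketch that only builds the request body in the shape of Anthropic's Messages API image blocks; the model id and the question are placeholders, not the post author's actual values, and actually sending it would need an API key and HTTP call.

```python
import base64


def build_vision_request(image_bytes: bytes, question: str) -> dict:
    """Build a Messages-API-style JSON body asking a question about an image.

    The structure follows Anthropic's documented image content blocks;
    the model name below is a placeholder you'd swap for a real one.
    """
    return {
        "model": "claude-3-5-sonnet-latest",  # placeholder model id
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/png",
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        },
                    },
                    {"type": "text", "text": question},
                ],
            }
        ],
    }
```

The whole "computer vision pipeline" collapses into one JSON payload plus a one-line prompt, which is the cost/effort trade-off being discussed.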
LLMs frustrate me with their inconsistency/"fuzziness" (repeating instructions, putting them in all caps, saying "please" just rubs me the wrong way), but I know I personally have a bad habit of thinking "that would be too expensive" or "how does it scale to X" when neither the cost nor the scale would ever be a real issue in the thing I'm writing.
With that said, I wonder why they used AI at all here. Could they not have keyed off certain keywords or other information present in a screen scrape, rather than rely on Claude to parse it?
I’m reminded of the guy who set up an iPhone farm to use the iOS on-device OCR because he couldn’t find anything better.
If you literally just need to detect whether it's at the firmware splash screen or not, simply checking whether enough pixels in the image are white would detect that splash screen just fine.
Maybe it will need more complex logic/detection down the road. But it's easy and cheap OCR for now.
What was kinda funnier was that I tried to get Claude to generate its own Go client code to upload the image and run the prompt; it totally hallucinated on that part :).
And of course the brilliant use of AI and discussion of how cost-effective it is. “Hook it up to an AI to save money” is the world we can look forward to. In this case the problem is recognizing which state a thing is in from a list of known states. Once the LLM gives the state in text form, all kinds of automation are unlocked. I think that class of problem - converting a state based on an image into text form - is wildly common, and will be on the lookout for it in my own automation work!
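The "known list of states" part is what makes this class of problem automatable: if you ask the model to answer with exactly one state name, the fuzzy text reply can be normalized into something a state machine can act on. A small hypothetical sketch (the state names and the 'unknown' fallback are my invention, not from the post):

```python
# Hypothetical set of device states the prompt would list for the model.
KNOWN_STATES = {"splash", "menu", "booting", "error"}


def parse_state(llm_reply: str) -> str:
    """Map a free-text model reply onto one of the known states.

    Assumes the prompt asked the model to answer with exactly one
    state name; anything else falls back to 'unknown'.
    """
    word = llm_reply.strip().lower().strip(".!?\"'")
    return word if word in KNOWN_STATES else "unknown"
```

Once the reply is a clean enum-like string, "if state == 'splash': press_button()" style automation follows naturally.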
that's on me (author); I tried to cut the content down to a manageable post size that covered some interesting stuff - but probably dropped the connective tissue in the process. We'll keep this in mind for next time.
In this case, you’re assuming a huge number of things like infrastructure and other requirements are in place, and all of those things take a lot of time and work, if they’re even appropriate at all.
But even if that's the case, couldn't they use something like VRAM if they're running out of memory, or are we back to square one?