Anything can be a message queue if you use it wrongly enough

(xeiaso.net)

624 points · cendyne · 3y ago · 239 comments

137 comments · 37 top-level

tschumacher3y ago· 23 in thread

I once used a MySQL database as a replacement for a message queue. This was the easiest solution to implement since all the servers were already connected to the database anyways. A server would write a new row to the table and all the servers would remember the last row they had already seen. Occasionally the table is cleared. I'm sure there are some race conditions in the system but its only purpose is to send Discord notifications when someone breaks a highscore in a video game, so it's not really critical. It's still working that way today.
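A minimal sketch of the pattern described above, assuming an in-memory SQLite database in place of the shared MySQL server. The table name `events`, the `Consumer` class, and the message text are all hypothetical; the comment does not show the actual schema.

```python
import sqlite3

# In-memory SQLite stands in for the shared MySQL database (assumption).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
)


def publish(body):
    """Any server appends a row; the auto-increment id orders the events."""
    with db:
        db.execute("INSERT INTO events (body) VALUES (?)", (body,))


class Consumer:
    """Each server remembers the id of the last row it has already seen."""

    def __init__(self):
        self.last_seen = 0

    def poll(self):
        rows = db.execute(
            "SELECT id, body FROM events WHERE id > ? ORDER BY id",
            (self.last_seen,),
        ).fetchall()
        if rows:
            self.last_seen = rows[-1][0]
        return [body for _, body in rows]


a, b = Consumer(), Consumer()
publish("new highscore: alice 9001")
first_a = a.poll()
publish("new highscore: bob 9002")
second_a = a.poll()
all_b = b.poll()  # b never polled before, so it sees both rows
```

One plausible source of the race conditions mentioned: on a real multi-writer database, a row with a lower id can become visible after a consumer has already advanced past a higher id, and that row is then skipped.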

hardwaresofton3y ago

See also: postgres with SKIP LOCKED.

Great talk on it from Citus Con 2022:

Queues in PostgreSQL | Citus Con: An Event for Postgres 2022 -- https://www.youtube.com/watch?v=WIRy1Ws47ic&list=PLlrxD0Htie...

Citus is a horizontal scale out extension for postgres -- imagine the power of those two things together!

JK you don't have to imagine:

https://www.citusdata.com/blog/2018/01/24/citus-and-pg-partm...

andrewstuart3y ago

I wrote an implementation in Python.

https://github.com/starqueue/starqueue

The code is in there for Postgres, MS SQL and MySQL (which all support SKIP LOCKED) though at some point I abandoned all but Postgres.

If I were to write another message queue then I wouldn’t use a database, I’d use the file system based around Linux file moves, which are atomic. What I really want is a message queue that is fast and requires zero config; file-based message queues are both... better than a database.

xvinci3y ago

Since you seem to be from citusdata: I used cstore_fdw 2 - 3 years back and at least when paired with TPC-H it was horrendously broken for both small (10 gig) and large (100 gig) datasets. It has since been integrated into some other product; I hope you managed to improve it.

Ozzie_osman3y ago

This is actually pretty common, and usually a "good enough" solution. You can also add things like scheduling (add a run_at column), at-least-once execution (mark a row when it is being processed, delete it only when successful), topics, etc. with minor modifications to your table.
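The two modifications mentioned above can be sketched as follows, assuming SQLite and a single connection. The `jobs` table, the `claimed_at` column, and the function names are hypothetical; only `run_at` comes from the comment. With several workers on a real database, the select-then-mark step would additionally need a row lock (for example `SELECT ... FOR UPDATE SKIP LOCKED`), and a reaper that resets stale claims is omitted here.

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE jobs (
           id INTEGER PRIMARY KEY,
           payload TEXT NOT NULL,
           run_at REAL NOT NULL,   -- earliest time the job may run
           claimed_at REAL         -- NULL until a worker picks it up
       )"""
)


def enqueue(payload, delay=0.0, now=None):
    """Schedule a job to become due `delay` seconds from `now`."""
    now = time.time() if now is None else now
    with db:
        db.execute(
            "INSERT INTO jobs (payload, run_at) VALUES (?, ?)", (payload, now + delay)
        )


def claim(now=None):
    """Mark the oldest due, unclaimed job as in progress; return it or None."""
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT id, payload FROM jobs WHERE run_at <= ? AND claimed_at IS NULL "
        "ORDER BY run_at, id LIMIT 1",
        (now,),
    ).fetchone()
    if row is None:
        return None
    with db:  # commit the claim mark
        db.execute("UPDATE jobs SET claimed_at = ? WHERE id = ?", (now, row[0]))
    return row


def complete(job_id):
    """Delete only after the work succeeded; a crash leaves the row behind."""
    with db:
        db.execute("DELETE FROM jobs WHERE id = ?", (job_id,))


enqueue("send welcome email", now=100.0)
enqueue("send reminder", delay=60.0, now=100.0)
job = claim(now=101.0)      # only the first job is due
nothing = claim(now=101.0)  # first job is claimed, reminder is not due yet
complete(job[0])
later = claim(now=161.0)    # the reminder is due now
```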

If you want something that works "well enough" I'd say it's a reasonable choice.

larperdoodle3y ago

Yeah, I'm using it as a transactional outbox to ensure at least once delivery to SNS.

Can't really think of a better way to ensure that a message is always sent if the DB transaction succeeds and is never sent if the DB transaction fails.
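A minimal sketch of a transactional outbox, assuming SQLite in place of the production database and a plain callback in place of the SNS publish call. The `orders` and `outbox` tables and both function names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, message TEXT NOT NULL)")


def place_order(item, fail=False):
    """Write the business row and the outgoing message in one transaction."""
    with db:  # commits on success, rolls back if an exception escapes
        db.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        db.execute(
            "INSERT INTO outbox (message) VALUES (?)", (f"order placed: {item}",)
        )
        if fail:
            raise RuntimeError("simulated failure before commit")


def relay(send):
    """Publish pending messages; delete each row only after send() returns."""
    pending = db.execute("SELECT id, message FROM outbox ORDER BY id").fetchall()
    for msg_id, message in pending:
        send(message)  # may repeat after a crash, hence at-least-once
        with db:
            db.execute("DELETE FROM outbox WHERE id = ?", (msg_id,))


place_order("keyboard")
try:
    place_order("mouse", fail=True)  # rolled back: no order, no message
except RuntimeError:
    pass

sent = []
relay(sent.append)
```

Because the message row commits or rolls back together with the order row, the relay can never see a message for a transaction that failed.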

waplot3y ago

Nothing wrong with using the DB as an MQ, especially if the load is small enough. Plenty of tools are built on that; these two come to mind:

https://github.com/procrastinate-org/procrastinate

https://github.com/bensheldon/good_job

renewiltord3y ago

Segment did so quite successfully https://segment.com/blog/introducing-centrifuge/

hu33y ago

> We decided to store Centrifuge data inside Amazon’s RDS instances running on MySQL. RDS gives us managed datastores, and MySQL provides us with the ability to re-order our jobs.

Interesting. Thanks for sharing

pinkcan3y ago

every half year or so I remember centrifuge, and get a little sad they didn't write more about it

glun3y ago

If you don't want to publish events from uncommitted transactions you'll have to first store them in a local table and then move them to the queue after the commit. But if all consumers have direct access to the database anyway...

bob10293y ago

I am doing the same with SQL Server. The messages table is more of a bus than a queue in our case (columns like ReplyToId, etc). Using it for RPC communication between cloud bits. Much cheaper than Azure Service Bus and friends.

Digit-Al3y ago

Just out of interest: any reason you're not using the SQL Server Service Broker?

_yb2s3y ago

Why not sqlite with a lockfile? /s

moron4hire3y ago

I briefly worked for a major corporation 15 years ago that did this with SQL Server to create distributed worker processes to handle all the AI-generated used car listings and photo recolorings [0] for almost all of the used car lots in the country.

[0] Why take hundreds of photos of Honda Civics in red, green, blue, and black when you already have a dozen in white?

dylan6043y ago

Why even take the dozen in white when they have a model you can render in any manner? Most car commercials do not have real cars in them. Maybe the shots of a car actually in motion, but most of the static shots are 3D models placed onto backgrounds. I don't know why, but I was surprised by this when I worked in a post house that did a lot of car commercials. One of the roles for a coworker was to get flown around to locations to take the images for the background plates using photogrammetry. "Can't fly an Alexa through the back glass to zoom in on the dash now can we" was one comment.

rollcat3y ago

I've built a hybrid task queue/process supervisor on top of SQL. Classical task queues like Celery didn't exactly fit our use case: a single process could run for hours or days, but in case of a node failing, it must be resurrected elsewhere as soon as possible (within seconds). I didn't have the time to re-architect everything for Kubernetes, or rewrite half the product in Erlang; so I built that weird thing. It's been super stable, running mission critical code, and making us money - for several years now.

wombatpm3y ago

I’ve yet to find the project where Celery is the best solution, despite using it on several

takinola3y ago

I implemented a message queue in MySQL too and it worked pretty well. Incoming messages would be written to the table and the workers would poll the database each cron period and process whatever rows were in the queue. To avoid race conditions, the workers would lock the records they were working on and then delete them as soon as the work was complete. It was simple but it worked just fine for my purposes

aa-jv3y ago

This has been a thing since before databases were relational. 4G languages (Progress, etc.) were especially nice for their ability to wrap a queue table around a series of reversible transactions, if you coded things right .. meaning a lot of modules written for app infrastructure were based on an 'inbox table' methodology ..

scarface743y ago

I’ve run into all sorts of database locking issues and concurrency issues when using a database as a queue. I saw that mistake made a long time ago and I would never do it myself.

wolfgang423y ago

Database engines are getting features like SELECT FOR UPDATE SKIP LOCKED, so what were once serious blockers on this idea may no longer be as much of a problem.

avereveard3y ago

Nuxeo unironically used a db table as a pub sub system between the cluster nodes for cache invalidations.

ljm3y ago

It’s not so out of the ordinary. A few libraries in Rails create message queues in Postgres using advisory locks and listen/notify.

Hell, if it’s not an RDBMS then it’ll be Redis (at a much greater expense for a managed instance). I’ve seen that setup in the Ruby world far more often than using a dedicated message queue.

whartung3y ago· 12 in thread

I know of a Famous Large Company that used email as their message queue to synchronize the data across two of their large systems.

It's a perfectly apt message queue, just a bit heavyweight. But if it's "light enough", it comes "for free" with many OSes.

anotherevan3y ago

One of my customers used email (a gmail account, no less) as a message queue between their front end site and the back-office processor. This worked quite well for close to a decade I think.

It basically evolved from when applications from their original customer facing site were emailed and manually entered into the back-office system by a human. They were looking to automate this with a minimum of changes to too many of the moving parts at once, so I reformatted the sent email to contain an XML payload so the new back-office automation could read and process it (and in a pinch, a human could still review any problem applications) using Java's mail APIs.

Things evolved, the front-end web site got replaced with a Wordpress site, but the email message queue kept working for a long time. In the last year or so it was getting more and more onerous though. Reconciling information between the front-end and the mail box showed not all emails were being delivered, and authentication to gmail was becoming more and more of a burdensome moving target.

I just recently replaced the whole thing with an API call made from the back-office to the Wordpress site to access the stored data. (The original site didn't store, just emailed, which was why this was not an option historically.)

schwartzworld3y ago

I wrote a static site generator with comments that used email as an ingestion queue.

EvanAnderson3y ago

Active Directory is (was?) capable of running replication traffic (for very particular use cases) over SMTP. I always thought that was ingenious.

doubled1123y ago

Email is a message queue in the most literal of senses, isn't it?

xena3y ago

Yes, it's actually somewhat of a decent (if cursed) message queue for many usecases too. Not to mention the debuggability (you already know how to use an email client).

KaiserPro3y ago

A famous financial newspaper used an FTP server as its Kafka "fan in" node.

It was so old that the 1U case started to droop in the middle. (It was a SPARC something or other with real SCSI Ultra 320 drives in it.)

simonjgreen3y ago

Before there were REST and other HTTP-based standards, ebXML was the hotness, along with tools like bizspark etc. You could see the logical progression through:

I fill in the order form and post it. They mail back an invoice.

I fax the order form, they fax back an invoice.

My computer sends the order form in a structured way over email.

My computer sends the order form in a structured way over http.

And that's why many large telcos and banks still use ebXML in their B2B transactions. It's fundamentally the same business process and logic it's always been, with glacially slow improvement in performance over time.

vidarh3y ago

I co-founded a webmail provider in '99, and when we needed a message queue reaching for our heavily customized Qmail setup was a relatively natural choice. I've mentioned it here before. Provides all the routing and retries we needed, and made debugging trivial (e-mail from our desktop clients to the queues worked; cc:'ing a real mailbox with copies worked; checking out queues with POP3 worked...)

ahefner3y ago

I always thought using e-mail in this way would be good for systems that occasionally need a human in the loop. Normally one service processes and mails results to the next, occasionally something is forwarded to a human, who can make some decision, edit the mail, then forward it along back into the automated path.

lucianbr3y ago

Using SMTP to transfer messages seems just plain common sense. Far different than IPv6-over-S3.

c543y ago

Ha! I think I know what company you’re referring to.

SteveNuts3y ago

Oracle came to my mind but I'm sure it's in common use in many places

jamest3y ago· 9 in thread

The satire in the title is reminiscent of how Firebase was born.

We were previously working on a chat system called Envolve (https://www.envolve.com), that was 'Facebook Chat for any website'. A game that was using us for in-game chat created channels, used display: none on them, and passed game state through the chat.

We scratched our heads, asked them why, and learned they wanted to focus on the frontend, not to deal with realtime message passing.

This led us to create a 'headless version' of our chat infra (re-written in Scala) that became the Firebase Realtime Database.

bombcar3y ago

This is an important lesson - if someone is using your tool in unexpected ways don’t just shut them down; there’s likely a business case that could be identified and specialized in.

dylan6043y ago

Nobody wants the product you want to make. They all want you to make the product they need, and will do things the devs of your product could never imagine with your product.

fatnoah3y ago

> This is an important lesson - if someone is using your tool in unexpected ways don’t just shut them down; there’s likely a business case that could be identified and specialized in.

One of the craziest cases of this I've seen was with a web-based survey application I was the principal engineer on in the mid to late 2000's. A big feature we implemented was the ability to create surveys to support multiple languages. To make it easier for our customers to translate surveys outside of the application, there were ways to export/import the text strings as well as a standalone screen that allowed finding and editing them. Both of these were made easier by the fact that the strings had semantic identifiers like "/survey/question/choice" or similar.

Since this functionality worked so well, we also used it internally for all text in the application. As a convenience, both the import/export and edit screen were capable of editing these strings. One customer figured this out and, due to the naming, was easily able to identify the text for all screens and dialogs. They ended up completely modifying the application interface through this interface by editing text, adding HTML elements, and injecting CSS blocks. They added descriptive tutorial text, guides for users, and branded the application well beyond built-in functionality. It was pretty amazing.

agumonkey3y ago

I saw the opposite happen so often, and every time it's very painful to see people have a knee-jerk reaction instead of thinking outside the box.

NovemberWhiskey3y ago

... and this is great if you're a business, but if you're an internal platform team then it really sucks.

kitd3y ago

Yes.

If people CAN do something with your tool, they will.

The moment you say "yeah, no one is stupid enough to do that", they will.

Be ready for it!

1nighthawk3y ago

It's amazing with how much creativity users will abuse your creations. ;) And in many cases, something new is born out of it. The problem is getting the information about just how people are using your service differently than you've intended. Sometimes it's impossible to tell from traces, analytics or logfiles alone. But finding out can be quite an advantage, especially if you're a startup probing for PMF. The best thing that can happen is that you have a channel to your customers that's constantly open. At WunderGraph, we use Slack for that, and it considerably lowers the barrier to just check in and have a quick discussion. The sooner you find out about use cases, the better - and ideally create a product around it. :)

james-revisoai3y ago

Did you detect this with MixPanel or something?

jamest3y ago

We were talking to (and using the sites of) all our customers of any reasonable size. This was one of them.

mikece3y ago· 9 in thread

As a junior dev I used folder pairs (to_process and processed) as a way to move messages between loosely coupled systems with file system watchers picking up new files in the to_process folder. Very light weight, got the job done, and was told it was in production for over a decade (even after better best practices came into use).

wolfgang423y ago

This is a very sensible design for small-scale queueing systems. A few implementation notes I’ve learned along the way:

- You’ll want writers to create a temp file and then rename it into place; otherwise your reader might pick up a file that’s only half-written. (POSIX rename() is atomic within a filesystem.)

- It’s also helpful to have an `in_process` folder, and a cronjob to send alerts if a task sits in that place too long. That way you can quickly catch crashes and other failures.

- You can have multiple readers; they cooperate by rename()ing the input file into in_process/ before they start working on it, and ignoring ENOENT (which indicates some other process got to it first).

- This pattern is great for loosely-coupled systems. In my case I was importing orders into a system that ran very slowly; once a reliable import queue was available new use cases kept presenting themselves, and it was very easy to add them; since the API was just “write a file in the correct format”, individual scripts could be written in whatever way was easiest for the particular task at hand. (It helped that the input format hadn’t changed in a couple decades; if it had needed changing a lot you’d want a more structured system. But this is the same as any distributed systems interface design.)
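The first three notes above can be sketched as follows, assuming a POSIX filesystem where all directories live on the same mount. The directory name `tmp` and the function names are hypothetical; `to_process` and `in_process` follow the names used in the thread.

```python
import os
import tempfile


def make_queue(root):
    """Create the spool directories used by this sketch."""
    for name in ("tmp", "to_process", "in_process"):
        os.makedirs(os.path.join(root, name), exist_ok=True)


def enqueue(root, name, data):
    """Write to a temp file first, then rename it into place."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.join(root, "tmp"))
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # Same filesystem, so readers never observe a half-written file.
    os.rename(tmp_path, os.path.join(root, "to_process", name))


def claim(root):
    """Move one pending file into in_process/; return its path or None."""
    pending = os.path.join(root, "to_process")
    for name in sorted(os.listdir(pending)):
        target = os.path.join(root, "in_process", name)
        try:
            os.rename(os.path.join(pending, name), target)
        except FileNotFoundError:  # ENOENT: another reader got there first
            continue
        return target
    return None


root = tempfile.mkdtemp()
make_queue(root)
enqueue(root, "order-0001.txt", b"widget x 3")
path = claim(root)
with open(path, "rb") as f:
    body = f.read()
os.remove(path)  # done; a file lingering here is what the cron alert catches
empty = claim(root)
```

Several reader processes can call `claim` concurrently; whichever rename succeeds owns the file, and the others simply move on to the next name.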

vidarh3y ago

With respect to how to move things into folders, the simplest advice is to tell people to just read the Maildir spec. And read Qmail as well. While Qmail is dated as a mailsystem, as a description of how to compose a decoupled system that reliably processes filesystem based queues, it contains lots of little nuggets.

otagekki3y ago

3 years ago, I extensively used the all-company's shared filesystem to pass information between 2 independent Jenkins instances (One on Windows for jobs that worked best under a Windows machine and one on Redhat which was considered as the "main" instance); and between the Jenkins and target application servers (Windows and Redhat) which all had the company's filesystem mounted. It took time to perfect, but worked wonderfully once I adopted the rename-in-place technique as described by parent.

I wonder if the system is still in place. Last time I checked the Jenkins folder was occupying 270 GB (!) of the 10 TB shared FS, most likely because FS block size was 1 MB.

derefr3y ago

FYI, this is an old Unixism called a "spool directory" :)

WeAddValue3y ago

FYI, which was probably derived from the old mainframe-ism which had spooling before Unix was written. See https://en.wikipedia.org/wiki/Spooling

mikece3y ago

So having "discovered" this approach as a young .NET dev retroactively gives me a smug smile that I came up with such a battle-tested pattern. Then again, it's also quite an obvious and simple pattern which is why it was used in the first place, I'm sure.

albrewer3y ago

At my old company, we had a system where we had a limited number of physical machines that multiple testers would need to remote into. We didn't want to disallow two people logging in simultaneously because sometimes that was necessary, and if someone stayed logged in then went to lunch, there was no way to kick them off to get access without walking to the server closet, pulling up the machine on the KVM, and kicking them that way.

I wrote some VBScript that wrote a locking folder into a sub-folder in a network folder, then launched VNC targeting the proper VM. When VNC was closed, the script would complete and delete the locking folder. That let us know what was available and when. I always thought it was hacky, but at ~60 lines it's incredibly simple and has never really failed.
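A rough Python sketch of the same idea, not the original VBScript. It assumes one marker directory per user under a per-machine folder, which is a hypothetical layout; the comment does not describe the actual one. The marker is advisory, so two people can still be on the same machine at once.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def session_marker(lock_root, machine, user):
    """Keep a marker folder in place for as long as the session runs."""
    path = os.path.join(lock_root, machine, user)
    os.makedirs(path, exist_ok=True)
    try:
        yield path
    finally:
        os.rmdir(path)  # removed when the viewer exits, even on error


def users_on(lock_root, machine):
    """List who currently has a marker for this machine."""
    path = os.path.join(lock_root, machine)
    return sorted(os.listdir(path)) if os.path.isdir(path) else []


root = tempfile.mkdtemp()  # stands in for the shared network folder
before = users_on(root, "test-vm-1")
with session_marker(root, "test-vm-1", "alice"):
    during = users_on(root, "test-vm-1")  # the VNC viewer would run here
after = users_on(root, "test-vm-1")
```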

florbo3y ago

Sometimes a filesystem-based queue like that is all you really need. Recently I used S3 API compatible storage for something similar, as introducing AMQP or pubsub would have been immensely overkill for this low volume component.

fukawi23y ago

I'm doing the same right now. The incoming messages are coming from a radio network at 1200 baud, so I don't foresee capacity problems :D

d0gsg0w00f3y ago· 5 in thread

I always thought it would be fun to host a Rube Goldberg competition for systems engineering. Whoever could accomplish a simple task with the most ridiculous system would win.

Something like: read the 54th line of this file hosted at xx address.

Then a submission could look like:

    1. FTP a file to a server with the address.
    2. the server reads the file and spins up a VM.
    3. The VM polls an endpoint to download code to pull the address.
    4. The code downloads the file and splits it into a single file per line.
    5. A script loads all the files into an array and accesses array[53] to get the answer.

...you get the idea

myself2483y ago

I used to host a competition called Anything But Ethernet.

(You could use ethernet hops, they just didn't score points, so say you had some ethernet-to-whatever bridge, only the 'whatever' segment would count.)

Over the years, we saw physical proof that a T1 circuit would in fact run over barbed wire, we had a human-in-the-loop following a DDR-esque stream of arrows to stomp on a pad that encoded data from one hop to the next, we had a writable RFID tag stuck to the front bumper of an R/C car that would drive back and forth between readers...

promiseofbeans3y ago

A video of the barbed wire one: https://youtu.be/MYRJ76RMAQY

d0gsg0w00f3y ago

That's awesome. Was this part of a company? University? I've brought the idea up to coworkers but nobody wants to waste extra cycles for something this dumb.

Nextgrid3y ago

> Whoever could accomplish a simple task with the most ridiculous system would win.

See also:

* Microservices

* "Web scale"

* Kubernetes

* Cloud computing

quickthrower23y ago

Use Nix to install Conda to install NVM to install Node to install …

headPoet3y ago· 5 in thread

"anything stored in that pointer to memory you got back from malloc() is stored in an area of ram called "the heap", which is moderately slower to access than it is to access the stack." Is this true, or a myth? Ignoring the allocation cost and access patterns making cache misses more likely, surely memory is just memory

twoodfin3y ago

By definition, everything on the heap got where it is dynamically, so the minimum number of pointers to chase to find it is 1.

In contrast, what’s where on the stack can often be (and often must be) known statically by the compiler and accessed directly, even moved entirely out of memory and into a register. (It’s possible to do this with suitably constrained dynamic allocations, but the optimization is much harder.)

cdcarter3y ago

Some architectures (e.g. the 65816, the 6502's "big brother" used in the Apple IIgs) include stack pointer relative addressing modes, which will complete the memory read and return data faster than the same instruction with a full address encoded.

proto_lambda3y ago

Many architectures have hardware support for stacks, which could be slightly faster than arbitrary load/stores. Only works in the function owning the stack frame of course, if you pass a pointer to a stack object somewhere else, it's back to being normal memory.

xena3y ago

I'm pretty sure this is still the case. I'm not sure how cross-platform that assumption is (it won't work in Go where it has heapstacks I don't think), but classically yeah the stack is put in slightly faster memory with fewer access barriers.

comex3y ago

I believe that’s inaccurate, at least on a modern CPU. The bookkeeping for the stack is faster, since ‘allocating’ and ‘deallocating’ is just subtracting from and adding to a register. And the area of the stack in active use at any given time is usually tiny (well under a kilobyte, even though the full stack is usually several megabytes), so it’s likely to stick around in L1 cache. And return addresses on the stack get special treatment by the branch predictor. But other than that, it’s treated the same as any other memory.

Veserv3y ago· 4 in thread

Very funny.

Also, you can actually make it cost competitive if the object you store is the last n milliseconds of packets instead of one packet each. So, instead of incurring two API calls per packet, you incur two API calls per minimum buffering time. If S3 is zero-rated for any ingress/egress then you get “infinite” bandwidth for 4.22$ * 3 (for the active case) = 12.66$ a day if you are willing to accept 500 ms minimum latency, or ~600$ a day at a more reasonable 10 ms. If you are saturating even just a 1 Gb link for a whole day that is ~10,000 GB which would be ~700$ via the blessed channel, so you could very well come out ahead.

You could do even better if you out-of-band signal the readiness so you do not need to poll while idle. Then you only incur a cost while actively transmitting so as long as you average 1 Gb/s on the channel you should be coming out even or ahead with minimal latency impact.
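The arithmetic in this comment can be checked with a short sketch. It takes the quoted $4.22 * 3 per day at 500 ms as given and assumes request cost scales inversely with the buffering interval; the per-GB price behind the ~700$ figure is not stated in the comment, so it is left out.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figure quoted in the comment: $4.22 * 3 per day at a 500 ms interval.
BASE_INTERVAL_MS = 500
BASE_COST_PER_DAY = 4.22 * 3


def cost_per_day(interval_ms):
    """Request cost scales with object count, i.e. inversely with the interval."""
    return BASE_COST_PER_DAY * BASE_INTERVAL_MS / interval_ms


def gigabytes_per_day(link_gbit_per_s):
    """Data moved by a saturated link in one day, in decimal gigabytes."""
    return link_gbit_per_s / 8 * SECONDS_PER_DAY


cost_500 = cost_per_day(500)   # about 12.66 dollars a day
cost_10 = cost_per_day(10)     # 50 times as many requests
volume = gigabytes_per_day(1)  # a saturated 1 Gb/s link
```

Under these assumptions the 10 ms case comes to about $633 a day and the link moves 10,800 GB a day, in line with the rounded "~600$" and "~10,000 GB" above.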

derefr3y ago

This isn't theoretical; many companies do PostgreSQL async 1:N physical replication, by using e.g. https://pgbackrest.org/ to have the primary push WAL segment files (a.k.a. "the last n milliseconds of packets" in the write-ahead log) as objects to S3. All the read-replicas then independently discover the new objects in S3 as they become available; fetch them; and replay them.

> You could do even better if you out-of-band signal the readiness so you do not need to poll while idle.

S3 and its clones have "object lifecycle notifications", where you can be informed by a push-based mechanism whenever a new object is put into the bucket.

But — what do you have to do, to get these notifications?

Subscribe to a real message queue, that S3 puts these notifications into.

So using it here would be somewhat cheating ;)

Karrot_Kream3y ago

Yup we do something very similar for MySQL replication ourselves.

candiddevmike3y ago

Can you failover/promote using pgbackrest?

KRAKRISMOTT3y ago

Try it with CloudFlare's S3 equivalent offering

xena3y ago· 4 in thread

I was hoping someone would notice this by arguing about my pricing estimates in the arguments, but nobody did so I'm gonna spoil the surprise: if you have JavaScript enabled your browser will render slightly different prices if you view this post from Hacker News. This will persist to when you visit from not-Hacker News too, just so you don't notice the difference. The prices range from 0.8x to 5x the ones I figured out in the AWS calculator.

If you want to see how I did it, view source and search "gaslight".

candiddevmike3y ago

If you have to point out how clever you are, you're probably not as clever as you think you are

tpxl3y ago

Intentionally misleading your readers is exceptionally poor writing.

mkl3y ago

Um... why?

xena3y ago

Because I thought it could be fun if people took the bait lol

andrewstuart3y ago· 4 in thread

Could someone explain this please?

wolfgang423y ago

This is one of those articles where the journey is far more interesting than the destination.

(Edit: this comment made more sense when it was replying to a different complaint about the article; the parent comment seems to have been edited in the interim.)

xena3y ago

Here's the key paragraph in the article:

> In Linux, you can create a TUN/TAP device to let applications control how network or datagram links work. In essence, it lets you create a file descriptor that you can read packets from and write packets to. As long as you get the packets to their intended destination somehow and get any other packets that come back to the same file descriptor, the implementation isn't relevant. This is how OpenVPN, ZeroTier, FreeLAN, Tinc, Hamachi, WireGuard and Tailscale work: they read packets from the kernel, encrypt them, send them to the destination, decrypt incoming packets, and then write them back into the kernel.

ignoramous3y ago

Set up a TUN/TAP device, which is a file one can read (egress packets) / write to (ingress packets). Set up appropriate IP routes.

Read (egress) packets from the TUN/TAP device.

Serialize them, assign sequence numbers, and upload them to S3.

On the receive side, poll for newer objects (packets) in S3.

Write them to your TUN/TAP device.

---

Here, the TUN device is a router using S3 as gateway.
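The serialize, upload, and poll steps above can be sketched as follows. This is not the article's implementation: `ObjectStore` is a hypothetical in-memory stand-in for an S3 bucket (not the boto3 API), the TUN/TAP device is left out, and the key layout and length-prefix framing are assumptions.

```python
import struct


class ObjectStore:
    """In-memory stand-in for an S3 bucket (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, body):
        self.objects[key] = body

    def list_after(self, prefix, last_key):
        """Keys under `prefix` that sort after `last_key`, in order."""
        return sorted(
            k for k in self.objects if k.startswith(prefix) and k > last_key
        )

    def get(self, key):
        return self.objects[key]


class Sender:
    def __init__(self, store, prefix):
        self.store, self.prefix, self.seq = store, prefix, 0

    def send(self, packet):
        """Upload one egress packet under a zero-padded, sortable key."""
        self.seq += 1
        key = f"{self.prefix}{self.seq:012d}"
        # Length-prefixed framing (2-byte big-endian length, then payload).
        self.store.put(key, struct.pack("!H", len(packet)) + packet)


class Receiver:
    def __init__(self, store, prefix):
        self.store, self.prefix, self.last_key = store, prefix, ""

    def poll(self):
        """Fetch objects newer than the last one seen; return packets in order."""
        packets = []
        for key in self.store.list_after(self.prefix, self.last_key):
            body = self.store.get(key)
            (length,) = struct.unpack("!H", body[:2])
            packets.append(body[2 : 2 + length])
            self.last_key = key
        return packets


store = ObjectStore()
tx, rx = Sender(store, "a-to-b/"), Receiver(store, "a-to-b/")
tx.send(b"\x60" + b"\x00" * 39)  # placeholder bytes, not a real IPv6 packet
tx.send(b"ping")
got = rx.poll()    # in the real setup these would be written to the TUN fd
again = rx.poll()  # nothing new since the last poll
```

Zero-padded sequence numbers keep lexicographic key order equal to send order, which is what lets the receiver ask only for keys after the last one it processed.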

dikei3y ago

So the premise of the problem is the need to egress traffic to the internet from a private VPC, without using an expensive NAT gateway. The solution is to do NAT manually by having an EC2 instance in a public VPC, and tunneling traffic from the private VPC through that instance.

The meat of the article is how to create that tunnel using S3 for data transfer instead of using a more traditional VPN service.

mtlmtlmtlmtl3y ago· 3 in thread

I love this! Reminds me of Harder Drive by Tom 7:

https://youtube.com/watch?v=JcJSW7Rprio

It's better to go in unspoiled so I won't reveal any details.

Sniffnoy3y ago

I'm going to have to spoil some of it to point out that part of what he's done is reinventing the idea of delay-line memory, although he makes no mention of this: https://en.wikipedia.org/wiki/Delay-line_memory

...of course if that's all there were to it it wouldn't be interesting, but I won't reveal what the rest is as per your suggestions. :)

mtlmtlmtlmtl3y ago

True, I suppose. Though the implementation details are certainly refreshing :)

KineticLensman3y ago

Spoiler: "we can of course ignore air resistance because chainsaws cut through the air like butter"

h2odragon3y ago· 3 in thread

What could be worse than IPv6? this

Someone submit it as "IPv8" immediately.

TobTobXX3y ago

> What could be worse than IPv6?

IPv4?

But seriously, I appreciate that the blog was written using IPv6, not some other old and deprecated legacy tech.

ranger_danger3y ago

IPv9 is already a thing

h2odragon3y ago

Should this be IPv13 then?

Codesleuth3y ago· 3 in thread

> What if there was a way you could reduce that cost for your own services by up to 700%?

How can something be reduced over 100%? What is it that they actually mean here?

Kye3y ago

Like any good extended comedy bit, you have to read to the punchline to get it.

Codesleuth3y ago

Ah, good call. I'll continue...

hdjjhhvvhga3y ago

I believe they meant "7 times", so around 85%.

sam1r3y ago· 3 in thread

>> Read time in minutes: 40

I wonder if this part is satire, if so, how come? I definitely feel it should be much less.

jerf3y ago

Estimate includes the additional latency you will experience while downloading the post through the mechanism described in the post, which is the only true way to read it.

xena3y ago

My read time estimate code is here: https://github.com/Xe/site/blob/aa3608afa6c62695ca0ab139f823...

I've been trying to play with the constants over the years to make the read time estimate more "accurate", but it's a tough nut to crack in general. So I can go over my numbers more accurately, how long did it take you to read it?

ignoramous3y ago

fwiw, it took 5m to read. Nb: I was already familiar with a lot of the terms in the post (partly because I've already experimented relaying IP over Cloudflare's DurableObjects instead of S3), and skipped the dialectics.

loeg3y ago· 3 in thread

Tl;dr: S3-based tun/tap virtual network device.

KaiserPro3y ago

My dear child, the fun is not the destination; the fun is the slow descent into madness to get there. For other people: don't read the summary, read the whole thing. It's a work of art.

loeg3y ago

You don't need to condescend. Knowing the subject matter in advance doesn't lessen the experience.

revskill3y ago

Wait, to understand all of the SRE things, one might need another 6 months.

pavlov3y ago· 2 in thread

The only thing worse would be this “on the blockchain.”

On second thought it seems likely that some charlatan has already years ago raised a couple of millions in an ICO for such a project. “We’re making the entire Internet web3-compatible!”

xg153y ago

I mean. Chain, queue, it's not that different if you squint hard enough...

Izkata3y ago

2018: Modeling of Blockchain Based Systems Using Queuing Theory Simulation

https://ieeexplore.ieee.org/document/8632560

Kye3y ago· 2 in thread

Such dark works from such adorable creatures. The nature of furry.

disruptiveink3y ago

I'm not part of this subculture, but I honesty enjoy this type of idonsyncratic presentation which always was, and hopefully always will be part of hacker culture.

The content is not only relevant and hilarious, but I'd much rather personal blogs like this keep being updated instead of seeing the sterile Medium or LinkedIn Pulse versions, usually written to impress future employers.

Zany visuals, doing things with your friends just because you can and writing up detailed accounts of it to share with everyone is the quintessential hacker culture. If Hackers' Cyberdelia was set in the 2020s, no doubt Mara would fit in.

xena3y ago

Author here. Thank you for liking this. One of the most amusing things is that the zany visuals, detailed writeups, and unabashed character in my writing actually make me more impressive to future employers than any sterile LinkedIn posts ever will. I estimate that at this point if I wanted to get hired I could literally post a banger like this, add a banner to it halfway in that says "by the way, I'm looking for gainful employment, if you want someone doing DevRel that writes like this and gives talks like [link], please get in contact" and I could probably get a job in a week or two.

I want to keep hacker culture alive by not accepting the gentrification of it. Hacker culture is queer, neurodiverse, furry, weaboo, and more. As a philosopher of the arts, that is the kind of culture I want to create more of. Not platitudes on LinkedIn. I want to create the kind of culture that celebrates art for the sake of art.

I may be "cringe", but I am free.

hashhar3y ago· 1 in thread

Wow, this reminds me of https://www.youtube.com/watch?v=JcJSW7Rprio (Harder Drive: Hard drives we didn't want or need - suckerpinch)

Impractical, yet possible ways to store data is an exciting satire genre for me now.

koromak3y ago

tom7 is a national treasure. Someday I'll have an idea as absurd as this.

KaiserPro3y ago· 1 in thread

This, this is what the internet used to be like.

ok, what the good parts of the internet used to be like.

I also love that it highlights an AWS dark pattern. more, more I say.

account423y ago

Really? This is the old internet to you?

> <Cadey> Hello! Thank you for visiting my website. You seem to be using an ad-blocker. I understand why you do this, but I'd really appreciate if it you would turn it off for my website. These ads help pay for running the website and are done by Ethical Ads. I do not receive detailed analytics on the ads and from what I understand neither does Ethical Ads. If you don't want to disable your ad blocker, please consider donating [snip] or sending some extra cash to [snip] or [snip]. It helps fund the website's hosting bills and pay for the expensive technical editor that I use for my longer articles. Thanks and be well!

atmavatar3y ago· 1 in thread

> The bytes are stored in the cloud, which is slightly slower to read from than it would be to read data out of the heap.

Given latency and bandwidth differences, that's like saying that it's slightly slower to transport water by driving standard, 1-gallon jugs between the US east and west coasts than it is to transport it a few miles using a tanker truck.

AceJohnny23y ago

that's the joke.

ipython3y ago· 1 in thread

Given that Corey Quinn is involved, I’m surprised that route 53 isn’t included somewhere (or maybe it is, I’m still reading through it)

xena3y ago

I was gonna add Route 53 (to have each node set its own DNS record), but I'm saving that for part 2. Be afraid, part 2 is coming.

cosmolev3y ago· 1 in thread

At some point you can also discover it is Turing complete.

xena3y ago

Wait, are you saying that AWS S3 is Turing complete? How the hell if so.

artem_dev3y ago· 1 in thread

DigitalOcean had a nice blog post on this, "From 15,000 database connections to under 100: DigitalOcean's tale of tech debt": https://www.digitalocean.com/blog/from-15-000-database-conne... They used MySQL as a message queue and later migrated away from it.

hu33y ago

> ...it was simple and it worked – especially for a short-staffed technical team facing tight deadlines and a rapidly increasing user base.

> For four years, the database message queue formed the backbone of DigitalOcean’s technology stack.

The lesson I got from that article is that a database messaging system got them very far and was simple.

sitkack3y ago

I initially expected a horrible amount of flamewar tinder, but it is an Overton Window expanding Art Project and I love it!

The Cardio rate control library is cool, https://pkg.go.dev/within.website/x/cardio this library would be a great problem for using a neural net to model a PID loop.

https://www.semanticscholar.org/paper/A-Design-of-FPGA-Based...

https://www.semanticscholar.org/paper/Self-Tuning-Neural-Net...

XorNot3y ago

This is the best worst thing in software that I have ever seen. I love it so much.

cratermoon3y ago

This is astoundingly well-written and informative. It's the sort of thing I come here for. It's also engaging and charming in a way rarely seen in tech writing.

I like that they started out by talking about $0.07/G cost and went through the whole exercise before pointing out what immediately came to mind for me when they started pushing bytes in and out of S3.

JoelMcCracken3y ago

This is the first hn headline that I can remember actually making me laugh out loud. Bravo.

peter_d_sherman3y ago

>"Access to S3 is zero-rated in many cases with S3, however the real advantage comes when you are using this cross-region."

Or cross-Country... as in two Countries that are geoblocked from one another...

>"This lets you have a worker in us-east-1 communicate with another worker in us-west-1 without having to incur the high bandwidth cost per gigabyte when using Managed NAT Gateway."

Simplified algorithm for bypassing geoblock via the above method:

1) Select a cloud storage provider (could be any cloud storage provider or shared persistent storage platform; doesn't necessarily need to be Amazon/S3) that works or does business in two countries, Country A and Country B, where normal IP traffic is geoblocked between Country A and Country B.

2) Use shared cloud storage objects/buckets/rows (call them whatever you will, "keyed persistent storage discrete thingies" for lack of a better term!) as the article suggests, to emulate IP traffic between user A in Country A and user B in blocked country B...

3) Combined with a P2P or other front end app that knows how to use this method of communication (along with code stubs, such that people could customize it to their own cloud storage provider or platform) if/when normal country-to-country IP communication is blocked for whatever reason (zombie apocalypse? <g>) could make a powerful future P2P communications tool, for lawful purposes...
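Steps 1 to 3 can be sketched as a pair of one-way mailboxes on top of a shared key-value store. This is a minimal sketch under stated assumptions: `FakeObjectStore` is an in-memory stand-in for whatever storage provider is chosen (a real provider's API will differ), `Mailbox` and the `a-to-b/` key prefixes are hypothetical names, and each direction is assumed to have a single sender.

```python
from typing import Dict, List


class FakeObjectStore:
    """In-memory stand-in for a shared bucket (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, body: bytes) -> None:
        self._objects[key] = body

    def list_keys(self, prefix: str) -> List[str]:
        # Sorted so that zero-padded sequence numbers come back in send order.
        return sorted(k for k in self._objects if k.startswith(prefix))

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        del self._objects[key]


class Mailbox:
    """One direction of traffic: the sender writes numbered objects,
    the receiver lists, reads, and deletes them."""

    def __init__(self, store: FakeObjectStore, prefix: str) -> None:
        self.store = store
        self.prefix = prefix
        self._next_seq = 0  # assumes exactly one sender per prefix

    def send(self, packet: bytes) -> None:
        key = f"{self.prefix}{self._next_seq:012d}"
        self.store.put(key, packet)
        self._next_seq += 1

    def receive_all(self) -> List[bytes]:
        packets = []
        for key in self.store.list_keys(self.prefix):
            packets.append(self.store.get(key))
            self.store.delete(key)  # consume so it is not delivered twice
        return packets


if __name__ == "__main__":
    store = FakeObjectStore()
    a_to_b = Mailbox(store, "a-to-b/")
    b_to_a = Mailbox(store, "b-to-a/")
    a_to_b.send(b"hello from A")
    b_to_a.send(b"hello from B")
    print(a_to_b.receive_all(), b_to_a.receive_all())
```

The customizable "code stub" from step 3 would be the store class: swapping in a different provider means reimplementing `put`, `list_keys`, `get`, and `delete`, while the mailbox logic stays the same.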

Anyway, great article!

sitkack3y ago

I just had a flashback of creating a forwarding-buffer-queue on memcached, a complete abomination but it was able to drastically reduce the load on the endpoints that saved game progress in a casual game.

It updated or created an entry for each call. But if a buffer was already in the queue, it would write the most recent one into the existing slot. This had the effect of reducing the load by the multiple of the update rate. So if you had clients sending save game payloads every 30s and your queue depth is 2.5mins, then your write rate to disk is 1/5. I think.

But memcached wasn't Redis, this was pre-Redis, and memcached could have evicted any of those keys at any time. We gave it lots of space, it never GCd, we never needed to fix it. The game slid from above the fold and wasn't fun enough to be viable long term.

One of my proudest Scotty Engineering moments.
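The 1/5 figure above can be checked with a small simulation of the overwrite-in-place idea. This is a minimal in-memory sketch, not the memcached setup described; `CoalescingBuffer`, `simulate`, and the `player-1` key are hypothetical names, and the 150 s flush interval stands in for the 2.5 minute queue depth.

```python
from typing import Dict, List, Tuple


class CoalescingBuffer:
    """Keeps only the most recent payload per key until the next flush."""

    def __init__(self) -> None:
        self._pending: Dict[str, bytes] = {}
        self.writes_received = 0
        self.writes_flushed = 0

    def save(self, player_id: str, payload: bytes) -> None:
        # A newer payload overwrites the one already waiting in the slot.
        self._pending[player_id] = payload
        self.writes_received += 1

    def flush(self) -> List[Tuple[str, bytes]]:
        # Everything returned here would go to disk in the real system.
        batch = list(self._pending.items())
        self._pending.clear()
        self.writes_flushed += len(batch)
        return batch


def simulate(save_interval_s: int = 30,
             flush_interval_s: int = 150,
             duration_s: int = 600) -> float:
    """Return disk writes divided by client saves for one client."""
    buf = CoalescingBuffer()
    for t in range(1, duration_s + 1):
        if t % save_interval_s == 0:
            buf.save("player-1", f"state@{t}".encode())
        if t % flush_interval_s == 0:
            buf.flush()
    return buf.writes_flushed / buf.writes_received


if __name__ == "__main__":
    print(simulate())  # 4 flushed writes for 20 saves
```

With one save every 30 s and one flush every 150 s, five saves collapse into each disk write, which matches the 1/5 estimate.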

JenrHywy3y ago

The title reminds me that, in a previous life, our "architecture" team implemented a service bus using Lotus Notes email.

svilen_dobrev3y ago

I used CouchDB as an "RPC" pseudo-HTTP request-response transport layer, essentially two queues, one in each direction. Especially useful for mobile devices, where connectivity is a random thing. It hugely simplified both client and server: let the db handle all the multiplication, connections, switching on and off, timeouts, retries, delays, repeats, etc.

cosmolev3y ago

I think we can safely expand it even further: Anything can be anything if you use it wrongly enough.

xg153y ago

Too late, already deployed to prod :)

rootw0rm3y ago

The title of this article is what malware authors say every day =)

jheriko3y ago

meanwhile in the minds of ZeroMQ devs...

rejectfinite3y ago

I clicked the site and wow...

devdiary3y ago

Been there, done that

retrocryptid3y ago

Sadly, they fundamentally misunderstood the "everything is a file" paradigm of *nix. It's not that everything is an extent of octets, it's that everything has a directory entry so C programs can use the open call to create a file handle. It might be more appropriate to say "Everything looks like a file in the file system and most every operation on a thing represented by a directory entry goes through a filehandle."
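The directory-entry reading can be illustrated with a few lines of Python. This is a minimal sketch that assumes a Unix-like system where `/dev/null` exists; `describe` and `roundtrip_dev_null` are hypothetical helper names.

```python
import os
import stat


def describe(path: str) -> str:
    """Classify whatever a directory entry points at."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


def roundtrip_dev_null():
    """Open a device node with the same open() call used for regular files,
    then write to and read from it through the resulting file handle."""
    fd = os.open("/dev/null", os.O_RDWR)
    try:
        written = os.write(fd, b"discarded")  # the device accepts and drops it
        data = os.read(fd, 16)                # reads return end-of-file at once
    finally:
        os.close(fd)
    return written, data


if __name__ == "__main__":
    print(describe("/dev/null"), roundtrip_dev_null())
```

The device is not an extent of octets on disk, yet it has a name in the file system and answers to `open`, `write`, `read`, and `close` like any other entry.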
