The trick is basically that you have to eschew the last 15 years of "productivity" enhancements. Pretty much any dynamic language is out; if you must use the JVM or .NET, store as much as possible in flat buffers of primitive types. I ended up converting order books from the obvious representation (a hashtable mapping prices to a list of Order structs) to a pair of SortedMaps from fastutil, which provides an unboxed float representation with no pointers. That change reduced memory usage by about 4x.
You can fit a lot of ints and floats in today's 100GB+ machines, way more than needed to represent the entire cryptocurrency market. You just can't do that when you're chasing three pointers, each with its associated object header, to store 4 bytes.
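For anyone who wants to see what "flat buffers of primitive types" can look like, here is a rough sketch in plain Java. It is not the fastutil-based code described above: it swaps in two parallel float arrays kept sorted by price, and it assumes the book only needs the total quantity per price level rather than individual orders. The class name FlatBookSide and its methods are made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch: one side of an order book as two parallel primitive arrays. */
final class FlatBookSide {
    private float[] prices = new float[16];     // sorted ascending, only [0, size) is live
    private float[] quantities = new float[16]; // quantities[i] is the total size at prices[i]
    private int size = 0;

    /** Adds delta to the level at price; creates the level if missing, drops it at zero or below. */
    void add(float price, float delta) {
        int i = Arrays.binarySearch(prices, 0, size, price);
        if (i >= 0) {
            quantities[i] += delta;
            if (quantities[i] <= 0f) removeAt(i);
            return;
        }
        if (delta <= 0f) return;                // nothing to reduce at a missing level
        int at = -i - 1;                        // insertion point reported by binarySearch
        if (size == prices.length) {            // grow both arrays together
            prices = Arrays.copyOf(prices, size * 2);
            quantities = Arrays.copyOf(quantities, size * 2);
        }
        System.arraycopy(prices, at, prices, at + 1, size - at);
        System.arraycopy(quantities, at, quantities, at + 1, size - at);
        prices[at] = price;
        quantities[at] = delta;
        size++;
    }

    private void removeAt(int i) {
        System.arraycopy(prices, i + 1, prices, i, size - i - 1);
        System.arraycopy(quantities, i + 1, quantities, i, size - i - 1);
        size--;
    }

    /** Total quantity resting at price, or 0 if there is no such level. */
    float quantityAt(float price) {
        int i = Arrays.binarySearch(prices, 0, size, price);
        return i >= 0 ? quantities[i] : 0f;
    }

    /** Lowest price in the book (best ask on the ask side), NaN if empty. */
    float lowestPrice() { return size == 0 ? Float.NaN : prices[0]; }

    /** Highest price in the book (best bid on the bid side), NaN if empty. */
    float highestPrice() { return size == 0 ? Float.NaN : prices[size - 1]; }

    int levels() { return size; }
}
```

Each level costs 8 bytes here (one float price, one float quantity) with no per-entry object header or pointer. The trade-off is that inserting or removing a level shifts the tail of both arrays, so it is O(n) per change; whether that matters depends on how deep the book is and how often levels appear and disappear.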
Does the $4K include the cost of the RAM? Where can I find these servers? Thanks!
To answer your question, you can't find these servers because they don't exist. A server with 4TB of RAM will cost you at least $20,000, and that will be for some really crappy low-grade RAM. Realistically, for an actual server that one would use in a semi-production setting, you're looking at a minimum of $35,000 for 4TB of RAM, and that's for the RAM alone. To be fair, that $35,000 ends up dominating the cost of the entire system.
https://www.broadberry.com/dual-amd-epyc-rackmount-servers/a...
Looking on eBay, I can find some pretty decent R820s with 512GB each for right around $1,500 apiece. That's not counting any storage; even if they come used with some spinning hard drives, I would end up replacing those with SSDs. So more like three servers, 1.5TB of RAM, for $4,500.
My data sets are far too big to fit into memory/cache. Disk pressure can be alleviated by optimizing queries but it's a game of whack-a-mole.
I have exhausted EBS I/O and been forced to resort to dirty tricks. With RDS you can just pay more, but that only scales up to a point, and that point is normally the budget.