undefined | Better HN

0 pointsstephen1231y ago0 comments

What do you do to then query the data? I usually need indexes so queries are not slow. Perhaps I could insert into a staging table then bulk copy the data over to an indexed table, but that seems silly.

0 comments

ndriscoll1y ago

If your application language/framework allows, you can do the batching there. e.g. have your single request handler put work into an (in-memory) queue. Then another thread/async worker pull batches off the queue and do your db work in batch, and trigger the response to the original handler. In an http context, this is all synchronous from the client perspective, and you can get 2-10x throughput at a cost of like 2 ms latency under load.

I gave more detail with a toy example here: https://news.ycombinator.com/item?id=39245416

I've since played around with this a little more and you can do it pretty generically (at least make the worker generic where you give it a function `Chunk[A] => Task[Chunk[Result[B]]]` to do the database logic). I don't have that handy to post right now, but probably you're not using Scala anyway so the details aren't that relevant.

I've tried out a similar thing in Rust and it's a lot more finicky but still doable there. Should be similar in go I'd think.

lmz1y ago

Isn't that basically the idea behind the "lambda architecture"? Of course you typically don't use the same product for both the real time and the batch aspects.

unixhero1y ago

You said you struggled with writes... so I mentioned an advice on how to speed up writes... the internet know a lot more about this than me tho

phantompeace1y ago

Could replicating to a DB with indexing (purely for queries) work?

remram1y ago

If one can't keep up, the other one can't either.

You could use partitions though.

j / k navigate · click thread line to collapse