* Daily import of 500+ million rows of data, covering ~250 million unique IDs.
* I only need to keep the latest X entries per unique ID. Once an ID has X entries, its oldest entries are discarded as new ones arrive.
* Once a month, I will read out the entire dataset for processing.
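To make the retention rule concrete, here is a minimal in-memory sketch of the behaviour I am after. It is only an illustration of the rule, not a proposed implementation at this scale; the names `make_store` and `add_entry` and the tiny value of `X` are made up for the example.

```python
from collections import defaultdict, deque

X = 3  # illustrative only; the real setting would be between 1000 and 3000


def make_store(max_entries):
    """Map each ID to a bounded buffer holding at most max_entries entries."""
    return defaultdict(lambda: deque(maxlen=max_entries))


def add_entry(store, uid, entry):
    """Append an entry; a full deque silently drops its oldest item."""
    store[uid].append(entry)


store = make_store(X)

# Five entries for one ID: only the latest three should survive.
for i in range(5):
    add_entry(store, "id-1", (i, 1.5, 2.5))  # (integer, decimal, decimal)

# A second ID with fewer than X entries keeps everything it has.
add_entry(store, "id-2", (0, 0.1, 0.2))

print(list(store["id-1"]))
print(list(store["id-2"]))
```

Whatever data store is recommended would need to support this "append and trim to the latest X" pattern per ID, either at write time or as a periodic cleanup.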
X can be anywhere from 1000 to 3000. It is the same across the entire DB; the exact value just depends on what we determine to be the best setting. Since I only access the data once a day and at the end of the month, I would prefer not to pay for storage. There are over a billion unique IDs in total, which I can partition by prefix or by range. Each individual entry per ID is fairly small: just an integer and two decimals.
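For a sense of scale, here is a rough back-of-envelope estimate of the raw data size. The byte widths are assumptions (a 4-byte integer and two 8-byte decimal values per entry), the ID count uses 1 billion as a round figure, and the totals assume every ID is already full at X entries, so they are an upper bound for that ID count and ignore any index or storage-engine overhead.

```python
# Assumed widths: 4-byte integer + two 8-byte decimals (not confirmed sizes).
BYTES_PER_ENTRY = 4 + 8 + 8
UNIQUE_IDS = 1_000_000_000   # "over a billion", taken as a round 1e9
DAILY_ROWS = 500_000_000     # "500+ million" rows imported per day


def raw_size_tb(entries_per_id, ids=UNIQUE_IDS, entry_bytes=BYTES_PER_ENTRY):
    """Raw payload size in terabytes if every ID holds entries_per_id entries."""
    return ids * entries_per_id * entry_bytes / 1e12


low_tb = raw_size_tb(1000)    # X at the low end of the range
high_tb = raw_size_tb(3000)   # X at the high end of the range
daily_gb = DAILY_ROWS * BYTES_PER_ENTRY / 1e9

print(f"Full dataset: {low_tb:.0f} to {high_tb:.0f} TB raw")
print(f"Daily import: about {daily_gb:.0f} GB raw")
```

Under those assumptions the full dataset lands in the tens of terabytes, while each daily import is on the order of 10 GB of raw payload.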
What would you recommend as a data store for this?
Thanks!