The attributes of one record are as follows:
* 1-4 character string
* float
* float
* integer
* integer
* integer
* integer
* integer
* 1 character
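
To get a rough sense of the raw storage involved, here is a minimal sketch of the record as a packed binary layout. The field widths (4-byte floats, 4-byte integers) and the names `RECORD`, `pack_record`, `unpack_record`, and `monthly_bytes` are assumptions for illustration, not the real schema.

```python
import struct

# Assumed layout (not the real schema): little-endian, no padding.
# 4s = key padded to 4 bytes, f = 4-byte float, i = 4-byte integer, c = 1 character.
RECORD = struct.Struct("<4sffiiiiic")


def pack_record(key, f1, f2, i1, i2, i3, i4, i5, flag):
    """Pack one record; struct pads a short key with NUL bytes up to 4 bytes."""
    return RECORD.pack(
        key.encode("ascii"), f1, f2, i1, i2, i3, i4, i5, flag.encode("ascii")
    )


def unpack_record(blob):
    """Reverse pack_record, stripping the NUL padding from the key."""
    key, f1, f2, i1, i2, i3, i4, i5, flag = RECORD.unpack(blob)
    return (key.rstrip(b"\0").decode("ascii"), f1, f2, i1, i2, i3, i4, i5,
            flag.decode("ascii"))


def monthly_bytes(record_count):
    """Raw payload size for a number of records, ignoring index and engine overhead."""
    return record_count * RECORD.size
```

Under these assumptions one record is 33 bytes, so the example volume given below (18 million records in a month for one string) comes to roughly 594 MB of raw data before any index or storage-engine overhead.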
For each 1-4 character string, there are many records, sometimes several per second. As an example, in the span of a month, one of these strings can be associated with 18 million records. There are about 10,000 unique 1-4 character strings, but not all are as active as the previous example. The data is queried by two attributes: the 1-4 character string and the timestamp.

Potential solutions I've come up with (feel free to debate any of these):
* Put everything (or just the hotspot) in a MyISAM compressed database.
* Put everything (or just the hotspot) in an InnoDB database with a proper clustered index.
* Put everything (or just the hotspot) into CouchDB with proper views.
* Put everything (or just the hotspot) into MongoDB with proper indexes.
* Put everything (or just the hotspot) into Redis ZSETs with timestamp as SCORE and distribute across nodes.
* Load all of the data into a long-running Hadoop job.
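
Whichever store is chosen, the access pattern stays the same: pick one string, then scan a timestamp range. Below is a toy in-memory model of that pattern. It is roughly what a clustered index on (string, timestamp) or a sorted set scored by timestamp would provide. The class `SeriesStore` and its methods are hypothetical names, timestamps are assumed to be integers, and this is not any database's actual API.

```python
import bisect
from collections import defaultdict


class SeriesStore:
    """Toy model: per key, timestamps are kept sorted so a range is a slice."""

    def __init__(self):
        self._ts = defaultdict(list)    # key -> sorted timestamps
        self._rows = defaultdict(list)  # key -> payloads, parallel to _ts

    def add(self, key, ts, payload):
        # Insert at the sorted position; equal timestamps keep arrival order.
        i = bisect.bisect_right(self._ts[key], ts)
        self._ts[key].insert(i, ts)
        self._rows[key].insert(i, payload)

    def query(self, key, start, end):
        """Return (timestamp, payload) pairs with start <= timestamp <= end."""
        ts = self._ts.get(key, [])
        rows = self._rows.get(key, [])
        lo = bisect.bisect_left(ts, start)
        hi = bisect.bisect_right(ts, end)
        return list(zip(ts[lo:hi], rows[lo:hi]))
```

The point of the sketch is that locating a range costs two binary searches plus a contiguous read, which is the property to look for in any of the options above. The in-memory list insert here is linear time and would not scale to the real volume.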
Feel free to ask any questions too.