Ask HN: Recommendation for a time-series database for ML?
We have a system with a lot of telemetry: logs of "device use" with many log fields (positions, speeds...).
We want to make them easily available for reads and writes, but also for machine learning. Specifically, it should be easy to extract the parts of logs that correspond to specific events, a bit like "np.where((field1 == 3) & (field2 == 5))".
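To make the kind of selection concrete, here is a minimal NumPy sketch. The field names and values are made up for illustration and are not our real schema; it only assumes each log field can be loaded as one array with one entry per sample.

```python
import numpy as np

# Hypothetical log: one array per field, one entry per sample.
field1 = np.array([3, 3, 1, 3, 2, 3])
field2 = np.array([5, 4, 5, 5, 5, 5])
speed = np.array([0.1, 0.4, 0.9, 1.2, 1.5, 1.7])

# Element-wise comparison uses ==, and conditions are combined with &.
mask = (field1 == 3) & (field2 == 5)

# Indices of the samples where the event occurs.
event_idx = np.where(mask)[0]

# The same mask selects the matching samples from any other field.
event_speed = speed[mask]

print(event_idx)
print(event_speed)
```

Ideally the database would let us express the same filter server-side, so that only the matching rows are transferred.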
Another challenge is that the logs are messy: some fields are present in old logs but no longer recorded, or the reverse. Some values are also constant for a whole log, and it would be nice to exploit that structure to reduce the size on disk.
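As a rough illustration of the structure I would like the storage layer to exploit, here is a plain-Python sketch. The function name `split_constant_fields` and the sample records are hypothetical; it assumes a log is a list of dicts whose keys may differ from record to record.

```python
def split_constant_fields(records):
    """Split a log into constant fields (stored once) and varying fields."""
    # Collect every field name that appears in at least one record.
    all_fields = set()
    for rec in records:
        all_fields.update(rec)

    constants, varying = {}, {}
    for name in sorted(all_fields):
        # Missing values become None so all records share one schema.
        column = [rec.get(name) for rec in records]
        if all(value == column[0] for value in column):
            constants[name] = column[0]
        else:
            varying[name] = column
    return constants, varying


# Hypothetical log: "device" never changes, "x" is missing in the last record.
log = [
    {"device": "A7", "speed": 1.0, "x": 0.0},
    {"device": "A7", "speed": 1.5, "x": 0.4},
    {"device": "A7", "speed": 1.5},
]
constants, varying = split_constant_fields(log)
print(constants)
print(varying)
```

A columnar store with compression would presumably get a similar effect automatically, which is part of what I am trying to find out.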
Which database system would be flexible enough in terms of both field values and queries? PostgreSQL, InfluxDB, Hadoop, Elastic? I have seen good reviews of all of them, and I am not yet convinced that any one has a clear advantage over the others, because I guess the devil is in the details.
Thanks!