Was there any other reason you chose kestrel over alternatives like kafka? Did you test any others, or where you just that satisfied with kestrel?
So far our decision is to keep the raw events in Cassandra, and pre-aggregate most data for fast reads. Just wondering about your decision to not store raw events in Cassandra, and use raw files for that, and using Cassandra only for storing Hadoop analysis results. Do you think this decision may affect you later if you ever decide to support real-time analytics?
This applies for any sort of goal/process/?, whether programmatic or personal.
Very cool story, I'm looking forward to additional features. We pull a lot of data about Docker from GitHub that could be more readily available. We'd be more than happy to discuss or beta any new features, if you're interested.