https://blog.twitter.com/2015/introducing-practical-and-robu...
And the previous discussion is here:
The time series is decomposed with STL (the stl function in R's stats package), and this is the first part of what they call "Seasonal Hybrid ESD (S-H-ESD)" (sounds impressive, right?). The rest apparently just involves taking the point with the maximum absolute difference from the detrended sample mean, measured in standard deviations, removing it, and repeating until you have your collection of x outliers. If they wanted to, this could be explained in a few sentences, and the underlying code is really simple [0], but for whatever reason it's been written up as advanced analytics, as if decomposing a time series were a major challenge.
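To make the "few sentences" concrete, here is a rough Python sketch of the procedure as described in this comment. It is not the package's code: the per-phase mean is a crude stand-in for STL, the names seasonal_residuals and top_k_outliers are made up for illustration, and it simply returns a fixed number k of points with no significance test.

```python
from statistics import mean, stdev


def seasonal_residuals(series, period):
    """Subtract the mean of each seasonal phase (a crude stand-in for STL)."""
    phase_means = [mean(series[p::period]) for p in range(period)]
    return [x - phase_means[i % period] for i, x in enumerate(series)]


def top_k_outliers(series, period, k):
    """Repeatedly remove the residual farthest from the mean, in std units.

    Returns the indices of the removed points, most extreme first.
    Hypothetical helper; assumes len(series) is well above period and k.
    """
    remaining = dict(enumerate(seasonal_residuals(series, period)))
    outliers = []
    for _ in range(k):
        if len(remaining) < 3:
            break  # too few points left for a meaningful std
        values = list(remaining.values())
        mu, sigma = mean(values), stdev(values)
        if sigma == 0:
            break  # all remaining residuals are identical
        # Largest |r - mu| / sigma; sigma is constant within one pass.
        idx = max(remaining, key=lambda i: abs(remaining[i] - mu) / sigma)
        outliers.append(idx)
        del remaining[idx]
    return outliers
```

Calling top_k_outliers(series, period=24, k=5) on hourly data would return the indices of the five most extreme points after removing the daily pattern, assuming a 24-point season.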
[0] https://github.com/twitter/AnomalyDetection/blob/master/R/de...
At Data Driven NYC: https://www.youtube.com/watch?v=AfSM45ncAT8
Keynote at Strata+Hadoop World 2014: https://www.youtube.com/watch?v=5Dnw46eC-0o
Luckily my employer encourages learning, and it helps that the class is mostly during lunch.
Keep on learning!
To monitor important time series with this code, they would presumably need to run it every n minutes on the entire time series, or at least on the recent part of it. It seems an anomaly detection system operating on streaming data would make more sense.
Perhaps their real-time anomaly detection system uses simpler logic on streaming data?
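As an illustration of what "simpler logic on streaming data" could look like, here is a minimal sketch of a running z-score check that updates its mean and variance one point at a time (Welford's method). This is purely an assumption for discussion, not anything Twitter has described; the class name, threshold, and warmup values are hypothetical.

```python
import math


class StreamingZScore:
    """Flag points far from the running mean, using O(1) memory per series."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score above which a point is flagged
        self.warmup = warmup        # points to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Return True if x looks anomalous given earlier points, then absorb it."""
        is_anomaly = False
        if self.n >= self.warmup and self.n > 1:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford's online update of the mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly
```

Note that this sketch ignores seasonality and folds flagged points into the running statistics, so a large spike inflates the variance for a while afterwards. That is the kind of trade-off a batch method over the whole series avoids.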
It would be interesting to see this package hooked up to streaming data and to monitor its performance.