Ugh, I thought I'd answered this, but it looks like HN lost my response :-(
> How big is the pool of variables it can choose from?
Currently it is just: screenWidth/Height, browserWidth/Height, browser, browserVersion, os, referrer, city, region, country.
It picks the variables based on how well they partition the datasets into diverse sub-groups. This is the code that does that job:
https://github.com/sanity/quickdt/blob/master/src/main/java/...
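To give a flavor of the general idea without reading the linked file, here is a rough, simplified sketch. It is not the quickdt code: the class and method names (SplitChooser, gini, impurityAfterSplit, bestAttribute) are made up for illustration, and I'm assuming Gini impurity as the score, which may not match what the real scorer does. It tries each candidate variable, groups the rows by that variable's value, and keeps the variable whose groups are the most uniform in outcome.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from quickdt.
public class SplitChooser {

    // Gini impurity of a list of outcome labels: 0.0 when every label is the same.
    static double gini(List<String> labels) {
        if (labels.isEmpty()) return 0.0;
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) counts.merge(label, 1, Integer::sum);
        double sumSquares = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / labels.size();
            sumSquares += p * p;
        }
        return 1.0 - sumSquares;
    }

    // Size-weighted average impurity of the sub-groups produced by splitting on one attribute.
    static double impurityAfterSplit(List<Map<String, String>> rows, String attribute, String labelKey) {
        Map<String, List<String>> groups = new HashMap<>();
        for (Map<String, String> row : rows) {
            groups.computeIfAbsent(row.get(attribute), k -> new ArrayList<>()).add(row.get(labelKey));
        }
        double total = 0.0;
        for (List<String> labels : groups.values()) {
            total += gini(labels) * labels.size() / rows.size();
        }
        return total;
    }

    // Returns the attribute with the lowest post-split impurity (first one wins on ties),
    // or null if no attributes are given.
    static String bestAttribute(List<Map<String, String>> rows, List<String> attributes, String labelKey) {
        String best = null;
        double bestScore = Double.MAX_VALUE;
        for (String attribute : attributes) {
            double score = impurityAfterSplit(rows, attribute, labelKey);
            if (score < bestScore) {
                bestScore = score;
                best = attribute;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy data: the outcome depends entirely on os, not on browser.
        List<Map<String, String>> rows = List.of(
            Map.of("os", "mac", "browser", "chrome", "outcome", "buy"),
            Map.of("os", "mac", "browser", "firefox", "outcome", "buy"),
            Map.of("os", "windows", "browser", "chrome", "outcome", "leave"),
            Map.of("os", "windows", "browser", "firefox", "outcome", "leave"));
        System.out.println(bestAttribute(rows, List.of("browser", "os"), "outcome")); // prints "os"
    }
}
```

In the toy data, splitting on os gives two perfectly uniform groups (impurity 0), while splitting on browser leaves each group half buy and half leave (impurity 0.5), so os wins. A real tree builder would repeat this choice recursively inside each sub-group.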