"uniform distribution of request types" - okay, it makes sense in that context. Although if that assumption breaks down, your thread limits may become under or over provisioned.
I'm wondering though - how do you pick the right alpha and beta values? It seems like you'd need to do some testing/validation to confirm you've chosen good values, right?
Sorry if I'm sounding critical, by the way. I think this is a really cool project - thanks for open sourcing it!