Not that PID is better, not that bang-bang is better, just that.. learning from control theory is likely a fruitful direction to go in.
Most autos along algorithms have heuristically hacked hysteresis built in to stop them oscillating between provisioning a node, dropping a node, provisioning, dropping. That kind of threshold tuning stuff is ripe for replacement with something more rigorous. PID control is an example of a rigorous way to solve that in continuous control. It likely doesn’t directly work for capacity, but control theory in general does have better answers for this kind of thing than just PID.