My understanding is that producing shorter responses proportionally reduces the amount of computing—and thus the cost—required to produce it.