Let's say you have N qubits in a superposition.
A superposition is, to a first approximation (an oversimplification, since the weights are actually complex numbers subject to a normalization constraint), a probability distribution over the possible (classical) values that you get when you measure those N qubits.
E.g. if you measured this N-qubit state many times, which of the 2^N possible classical outcomes would you get? Would it be the uniform distribution? Biased towards particular classical bit patterns?
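To make that concrete, here's a toy sketch (my own illustration, not anything from the thread): an N-qubit state is just 2^N complex amplitudes, and measuring it yields bit pattern k with probability |amplitude[k]|^2.

```python
import math

def measurement_distribution(amplitudes):
    """Probability of each classical bit pattern for a 2^N-entry state vector."""
    probs = [abs(a) ** 2 for a in amplitudes]
    # The 'linear constraint' mentioned above: probabilities must sum to 1.
    assert math.isclose(sum(probs), 1.0), "amplitudes must be normalized"
    return probs

# A 2-qubit Bell-like state: equal superposition of |00> and |11>.
bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
print(measurement_distribution(bell))  # roughly [0.5, 0.0, 0.0, 0.5]
```

Measuring that state many times would give you 00 about half the time and 11 the other half, and never 01 or 10 — a biased, correlated distribution rather than a uniform one.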
Note that there are 2^N possible classical bit patterns, so an arbitrary collection of these patterns takes O(2^N) operations to define.
> I'm definitely not saying you get the answer to every problem in a single evaluation step
Yes, but my point is that if your input is an arbitrary collection of classical N-bit strings, then in general you will need an exponential number of quantum operations to load it into an N-qubit initial state, which means you get zero speedup.
Quantum computers only see a speedup on inputs that require a relatively small number of quantum operations to set up. 'Small' could mean polynomial. But it's true that the space of 'easy to set up' initial N-qubit states is much smaller than the space of all possible N-qubit states, which is why a quantum computer cannot simply be considered 'a magical computer that computes on 2^N bits at once' without considering how you get those bits into or out of the damn thing.
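As a rough illustration of what 'easy to set up' means (again my own sketch, simulating gates classically): the uniform superposition over all 2^N patterns is reachable from |0...0> with just N Hadamard gates, i.e. a linear-size circuit, whereas an arbitrary target state is a list of 2^N amplitudes with no such shortcut in general.

```python
import math

def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to one qubit of a 2^N-entry state vector."""
    h = 1 / math.sqrt(2)
    new = state[:]
    step = 1 << qubit
    for k in range(len(state)):
        if k & step == 0:  # pair up basis states differing only in this qubit
            a, b = state[k], state[k | step]
            new[k] = h * (a + b)
            new[k | step] = h * (a - b)
    return new

n = 3
state = [0.0] * (1 << n)
state[0] = 1.0  # start in |000>
for q in range(n):  # only n gates, not 2^n operations
    state = apply_hadamard(state, q)
# state is now the uniform superposition: all 8 amplitudes equal 1/sqrt(8)
```

The classical simulation still touches all 2^N entries, of course; the point is that the *quantum circuit* itself is only N gates deep, which is exactly the kind of cheap state preparation a speedup depends on.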