I just looked at the Wikipedia page for zero-shot learning, and it wasn't very clear about why it is called "zero". The GPT-3 paper, however, is quite clear:
"(c) “zero-shot” learning, where no demonstrations are allowed and only
an instruction in natural language is given to the model" [page 5 of pdf]
https://arxiv.org/pdf/2005.14165.pdf
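
So the "zero" counts the demonstrations (solved examples) placed in the prompt, not the amount of training the model has had. Here is a minimal sketch of the difference. The `build_prompt` helper and the `=>` formatting are my own illustration, not an API from the paper, and the translation task is just an example:

```python
def build_prompt(instruction, demonstrations, query):
    """Assemble a prompt: the instruction, then k solved examples, then the unsolved query.

    Hypothetical helper for illustration only; k = len(demonstrations).
    """
    lines = [instruction]
    for source, target in demonstrations:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")
    return "\n".join(lines)


instruction = "Translate English to French:"
demos = [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")]

zero_shot = build_prompt(instruction, [], "cheese")        # k = 0: instruction only
one_shot = build_prompt(instruction, demos[:1], "cheese")  # k = 1: one demonstration
few_shot = build_prompt(instruction, demos, "cheese")      # k = 2: several demonstrations

print(zero_shot)
```

In the zero-shot case the model sees only the instruction line and the query, which matches the quoted definition: no demonstrations, just a natural-language instruction.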