RF: text (wireless telegraphy), audio, still images (fax), moving images
The web followed the same progression: text, still images (swapped with audio relative to RF), audio (MP3s, Napster), video (Netflix, YouTube)
AI: text, images, audio (realtime API), ...?
Vision is the obvious next medium.