I think you would want to use something like CLIP embeddings for image search.
Really enjoyed using this app for iOS: https://github.com/mazzzystar/Queryable
HN discussion: https://news.ycombinator.com/item?id=34686947
Or explore the dataset Stable Diffusion was trained on: https://news.ycombinator.com/item?id=32655497
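
For anyone wondering what "CLIP embeddings for image search" looks like in practice, here is a rough sketch of the ranking step only. CLIP maps text and images into the same vector space, so search comes down to comparing a query vector against stored image vectors by cosine similarity. The 3-dimensional vectors below are made-up stand-ins, not real CLIP output, and the `normalize` and `top_k` helpers are hypothetical names I picked for illustration; in a real setup you would get the embeddings from a CLIP model.

```python
import numpy as np


def normalize(v):
    """Scale vectors to unit length along the last axis."""
    v = np.asarray(v, dtype=np.float64)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def top_k(query_emb, image_embs, k=3):
    """Return (indices, scores) of the k images closest to the query, best first."""
    # With unit-length vectors, the dot product equals cosine similarity.
    scores = normalize(image_embs) @ normalize(query_emb)
    order = np.argsort(-scores)[:k]
    return order.tolist(), scores[order].tolist()


# Toy stand-ins for precomputed image embeddings (assumption: real ones
# would come from a CLIP image encoder and have far more dimensions).
image_embs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])

# Toy stand-in for the embedding of a text query (assumption: a real one
# would come from the matching CLIP text encoder).
query_emb = np.array([0.9, 0.1, 0.0])

indices, scores = top_k(query_emb, image_embs, k=2)
print(indices, scores)
```

For a large photo library you would probably precompute and store the image embeddings once, then only embed the query at search time.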