Create a dozen generative models, each trained on a different category: street signs, cats, houses, cars, etc. Then show the user a random selection of images generated by the different models and say "select all the cats". They get it right if they choose the images that were generated by the cat model.
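Here's a rough sketch of how building and grading one of these challenges might look. Everything in it is made up for illustration: `CATEGORIES`, `generate_image`, `make_challenge`, and `is_correct` are hypothetical names, the "models" are stubbed out with a function that returns a placeholder string, and requiring an exact match on the selection is my assumption about what "get it right" means.

```python
import random

# Hypothetical category list; each one would be backed by its own generative model.
CATEGORIES = ["street sign", "cat", "house", "car"]


def generate_image(category, rng):
    """Stand-in for sampling one image from the model trained on `category`.

    A real system would return image data; this stub returns a placeholder string.
    """
    return f"{category}-image-{rng.randrange(10**6)}"


def make_challenge(rng, grid_size=9, min_targets=2):
    """Build a grid of generated images and remember which ones are the target."""
    target = rng.choice(CATEGORIES)
    others = [c for c in CATEGORIES if c != target]

    # Keep at least one non-target image so "select everything" never passes.
    n_targets = rng.randint(min_targets, grid_size - 1)
    labels = [target] * n_targets
    labels += [rng.choice(others) for _ in range(grid_size - n_targets)]
    rng.shuffle(labels)

    images = [generate_image(label, rng) for label in labels]
    # The server knows the answer because it knows which model made each image.
    answer = {i for i, label in enumerate(labels) if label == target}
    return {"prompt": f"select all the {target}s", "images": images, "answer": answer}


def is_correct(challenge, selected):
    """Assumed grading rule: the selection must match the target images exactly."""
    return set(selected) == challenge["answer"]
```

The main point is that no one has to label anything by hand: the answer key comes for free from knowing which model produced each image.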