Crawlers are based on consuming text.
HTML is text. Sites that optimize for SEO also use JavaScript to provide SEO context. The specific standard is called JSON+LD; pretty much any site that you use where SEO matters has JSON+LD, RDF-a, or Microdata embedded in the HTML.
You can see these structures if you use the Schema.org validator: https://validator.schema.org/
Try plugging in a URL like Reddit.com and see for yourself. On e-commerce websites, it's a *must have*. For example, try this Amazon page: https://www.amazon.com/dp/B09V3GZD32.
TL;DR: crawlers are parsing RDF-a and Microdata in the HTML or JSON+LD embedded in `<script/>` tags.
You can learn more about it here: https://developers.google.com/search/docs/appearance/structu...