0 points · CharlieDigital · 3y ago · 0 comments

Short answer: yes.

Crawlers are based on consuming text.

HTML is text. Sites that optimize for SEO also use `<script>` tags to provide SEO context. The specific standard is called JSON-LD; pretty much any site you use where SEO matters has JSON-LD, RDFa, or Microdata embedded in the HTML.

You can see these structures if you use the Schema.org validator: https://validator.schema.org/

Try plugging in a URL like Reddit.com and see for yourself. On e-commerce websites, it's a *must have*. For example, try this Amazon page: https://www.amazon.com/dp/B09V3GZD32.

TL;DR: crawlers are parsing RDFa and Microdata in the HTML, or JSON-LD embedded in `<script type="application/ld+json">` tags.
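To make that concrete, here is a rough sketch of how a text-only crawler could pull JSON-LD out of a page without executing any JavaScript. The `extractJsonLd` helper and the sample HTML are made up for illustration, and the regex is a simplification; a real crawler would use a proper HTML parser.

```javascript
// Hypothetical helper: collect JSON-LD blocks from raw HTML without running any script.
function extractJsonLd(html) {
  // Match only <script> tags whose type is application/ld+json.
  const pattern =
    /<script\b[^>]*type\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const results = [];
  for (const match of html.matchAll(pattern)) {
    try {
      results.push(JSON.parse(match[1]));
    } catch (err) {
      // Malformed JSON-LD is skipped here; a real crawler might log it instead.
    }
  }
  return results;
}

// Made-up sample page: one JSON-LD block plus an ordinary script.
const sampleHtml = `
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"}}
</script>
<script>window.ue_ihb = 1;</script>
</head><body></body></html>`;

const items = extractJsonLd(sampleHtml);
console.log(items.map((item) => `${item["@type"]}: ${item.name}`));
```

The ordinary `<script>` is ignored entirely; only the declarative JSON block is read, which is why this works on plain text.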

You can learn more about it here: https://developers.google.com/search/docs/appearance/structu...

2 comments · 1 top-level

randomdata · 3y ago · 1 in thread

Here's an excerpt of some JavaScript found on the Amazon link:

    window.ue_ihb = (window.ue_ihb || window.ueinit || 0) + 1;
    if (window.ue_ihb === 1) {

        var ue_csm = window,
            ue_hob = +new Date();
        (function(d) {
            var e = d.ue = d.ue || {},
                f = Date.now || function() {
                    return +new Date
                };
            e.d = function(b) {
                return f() - (b ? 0 : d.ue_t0)
            };
            e.stub = function(b, a) {

Feel free to visit the page to see the entire script; it is much too large to post here. What is a crawler learning from that program that would be lost if the equivalent code were bundled as WASM instead? Why couldn't a WASM parser pull out the same information? The JS/WASM runtime in the browser has to produce the same result regardless of which encoding is chosen, so everything has to be encoded in there somehow.

CharlieDigital (OP) · 3y ago

> Why couldn't its WASM parser pull out the same information

There's currently no standard for carrying that kind of metadata in WASM. If there's a will, there's a way.

JSON-LD is the standard for JavaScript-based metadata.
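As a purely hypothetical illustration of "a way": WASM binaries can carry custom sections, so the same JSON-LD could in principle ride along in one. The `jsonld` section name below is invented for this sketch; no crawler or spec defines it. It hand-assembles a tiny module and reads the section back with `WebAssembly.Module.customSections`.

```javascript
// Sketch only: "jsonld" is a made-up custom section name, not an existing convention.
const encoder = new TextEncoder();
const sectionName = encoder.encode("jsonld");
const metadata = { "@context": "https://schema.org", "@type": "Product", name: "Example Widget" };
const payload = encoder.encode(JSON.stringify(metadata));

// Section body = name length byte + name bytes + payload bytes.
const sectionSize = 1 + sectionName.length + payload.length;
if (sectionSize > 127) {
  // Sizes are LEB128-encoded; this demo only handles the single-byte case.
  throw new Error("demo payload too large for single-byte LEB128 size");
}

const wasmBytes = Uint8Array.of(
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x00,                   // section id 0 = custom section
  sectionSize,            // section size in bytes
  sectionName.length,     // length of the section name
  ...sectionName,
  ...payload
);

// Compile the module and read the custom section back without executing anything.
const wasmModule = new WebAssembly.Module(wasmBytes);
const [sectionBuffer] = WebAssembly.Module.customSections(wasmModule, "jsonld");
const recovered = JSON.parse(new TextDecoder().decode(sectionBuffer));
console.log(recovered["@type"], recovered.name);
```

Nothing reads such a section today, which is the point: the mechanism exists, the agreed-upon convention does not.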
