It's funny you say that, because as soon as I saw "semantic web" I felt a wave of negative nostalgia. I can hardly remember all of the RDF/RSS/Atom stuff from way, way back, or what triggered it, but I just remember rancor swirling around the whole thing. I think there were some petty arguments about who deserved credit for creating the formats? Wasn't it between a bunch of bloggers? Then XHTML became a battleground, since some groups were trying to keep semantic tags out of it while other people wanted them in. I remember feeling exhausted every time the subject came up, since it was like the emacs vs. vim or spaces vs. tabs wars.
The funny thing is, I believe in the promise of the semantic web. I recall Tim Berners-Lee declaring the next frontier was not open source but open data and I agree. He even co-founded an institute around it: https://theodi.org/person/sir-tim-berners-lee/
You're mixing in some stuff that isn't really Semantic Web related.
RSS vs. Atom was less about the Semantic Web than a squabble between different XML formats, one very loosely specified, the other more ... well-formed. The Semantic Web did have a small foot in the RSS wars - the very first RSS (RSS 0.9 from Netscape) was RDF-based, and for a short time RSS 1.0 wanted to rebuild RSS on an RDF basis for the extensibility of the Semantic Web - but the later discussions were about the XML variants of RSS and then Atom: whether the spec was adequate, whether it was frozen, and how and whether it should be fixed, etc.
The XHTML discussions were less about elements, in my recollection, than about parsing models. XHTML reformulated HTML as XML, which meant an error model with no error correction but failure on the first error. And XHTML 2 tried to evolve structural elements by not being backward compatible, defining a somewhat different new dialect. The backlash against XHTML was against that; a group sponsored by the browser makers then formed which wanted to evolve the language backwards-compatibly and to standardize the parsing of tag soup → HTML5.
("Semantic elements" were often shorthand for "instead of a dumb div, use the appropriate HTML element". That was more the quest of the Web Standards Project than of the Semantic Web.)
(Slight overlap: How to embed Semantic Web statements has a small relationship with XHTML - RDFa started imho in an XHTML 2 module.)
I somewhat miss that time. All these bloggers with an interest in web standards and how to do them best had their own idealism and the cross blog and W3C discussions were always interesting. Today web standards don't have that publicity and idealism anymore, they seem more like an engineering collaboration of the 2½ big browser makers which get to decide among themselves. Maybe it was always so, but it seemed different at that time.
One of the worst isn't even technical, it's the community. There are some great people in the community but there are also a large number of extremely toxic people that drive people away.
Maybe it's just the subset of the community that I choose to deal with, but the folks on the Jena mailing lists (pre and post Apache) have always been very gracious and helpful in my experience. And Ralph Hodgson, one of the co-founders of Top Quadrant came to a Triangle Java User's Group talk that I once gave on Semantic Web technologies, along with a bunch of other Top Quadrant people... and despite the fact that my company competes with them in certain areas, they were perfectly cordial and pleasant to interact with. Likewise for the other times that I've had Top Quadrant folks show up at events where I was speaking.
Maybe it's just dumb luck on my part, or whatever, but I have found no major issues with toxic people in the SemWeb community. shrug
Databases that are run on a shoestring aren't stable, so we're going to make everything federated with linked data fragments? Fine, give it a go, but you don't need to go on and on about how databases are inadequate because someone isn't willing to foot the AWS bill so they can host DBpedia for ya.
Let's have a go at JSON-LD. RDF/XML is finally recognized as a mistake - a somewhat understandable mistake, because everyone was XML-crazy at the time. So what do we do? The exact same thing, except this time it's JSON. But it's even worse: we choose a serialization that is prized for its simplicity, and we foist the entire RDF stack onto it. Then they claim that JSON-LD isn't about the semantic web, so we're good, and Jedi mind trick it with, "This isn't the RDF you're looking for".
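To make the complaint concrete, here's a minimal hand-rolled sketch in plain Python (not a real JSON-LD processor, which also handles nesting, `@id`, `@type`, etc.) of what a `@context` does: it quietly maps innocent-looking JSON keys onto full IRIs, which is the RDF machinery hiding under the "simple" serialization. The schema.org terms are just illustrative.

```python
import json

# A JSON-LD-style document: plain JSON plus an "@context" that maps
# each short key to a full IRI (schema.org terms used as examples).
doc = {
    "@context": {
        "name": "http://schema.org/name",
        "homepage": "http://schema.org/url",
    },
    "name": "Jane Doe",
    "homepage": "http://example.org/jane",
}

def expand(doc):
    """Naive JSON-LD-style expansion: replace context-mapped keys
    with their full IRIs and drop the context itself."""
    ctx = doc.get("@context", {})
    return {ctx.get(k, k): v for k, v in doc.items() if k != "@context"}

print(json.dumps(expand(doc), indent=2))
```

The same document is simultaneously "just JSON" to one consumer and a graph of IRI-keyed statements to another, which is exactly the dual identity being complained about.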
Because we aren't done overcomplicating simple things, we take aim at CSV with CSVW. Granted, CSV has some subtle complexities, but it's easy and reasonably compact. So now we're going to add metadata to csv files with RDF and then serialize it into JSON as JSON-LD. Great. How do I find this metadata? Either a well-known location or a Link header. Whoops, I can't publish metadata that references your csv file. Let's convert your csv file to RDF instead. WTF, my 500MB csv file just became 1.5B triples and it's taking 8 hrs. to load into my triple store!
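For a sense of where the triple blow-up comes from, here's a toy Python sketch of the basic CSVW idea (a simplified, assumed descriptor shape, not the real CSVW vocabulary): every mapped cell in every row becomes its own triple, so a wide table multiplies fast.

```python
import csv
import io

# Toy CSVW-style descriptor (hypothetical IRIs): a URL template for
# the row subject, plus one RDF property per mapped column.
metadata = {
    "aboutUrl": "http://example.org/person/{id}",
    "columns": {
        "name": "http://schema.org/name",
        "city": "http://schema.org/homeLocation",
    },
}

data = "id,name,city\n1,Ada,London\n2,Alan,Wilmslow\n"

def rows_to_triples(text, meta):
    """Turn each mapped cell into one (subject, property, value) triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = meta["aboutUrl"].format(**row)
        for col, prop in meta["columns"].items():
            triples.append((subject, prop, row[col]))
    return triples

triples = rows_to_triples(data, metadata)
print(len(triples))  # 2 rows x 2 mapped columns = 4 triples
```

Scale the same arithmetic up to a 500MB file with a few dozen mapped columns and the "1.5B triples" figure stops sounding like an exaggeration.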
Don't get me started on people who call themselves ontologists. They're really zombies, but instead of eating brains they eat budgets. They should be dispatched the same way, with a shotgun blast to the face. They generally can't justify their decisions even though there is a framework for doing exactly that, OntoClean. I have yet to meet one who even knew what that was. They just convince management that what they're doing is intellectually unattainable by mere developers, although they'd be lost without Protégé, TopBraid, or Excel, and what they produce is generally an incomputable pile of garbage. It's always OWL Full. "Class or property? Class or property? Well, it is an 'is a' relationship."
I'm done writing, so I'll just include a list of the half-baked ideas that sound good but are a day late and a dollar short: LDP, R2RML, ShEx, SHACL, DCAT, RDF Data Cube, WebID.....
My wife always says to say something nice so I'm going to say SKOS. SKOS is ok.
Or maybe contact O'Reilly and write an intro book "Semantic Web: Just the Good Parts" for their series.
On the research side there are two kinds of research papers: the one that proposes an ontology for a domain, and the one that describes the conversion of an existing resource to RDF. I've never seen a paper where SW was used for something new and interesting that would have been impossible without SW.
That being said, there are also both technical and conceptual pain points plaguing RDF. Basically the tech is trying to address too many things: both metadata and data, and every kind of data. The "IRIs that can be URLs that can sometimes be dereferenced and sometimes not, but it's better if they are, and then it's Linked Data" kind of thing makes it hard to assume (and thus build) anything.
So, RDF has been a success in a few domains (biology), but in most cases it doesn't offer a real competitive advantage over simpler and more expressive technologies such as graph databases.
PS: @zcw100 if you were to really write a book about the semantic web, drop me a line please.
Whether you use semantic web tech or not, that's still a common problem that doesn't always have a good plug-and-play solution. There are still a lot of places using the JSON-LD format for metadata and cataloging information. You can google cooking recipes and get ratings and cook times; search for movies and see how highly rated a movie is and who made it, with a synopsis of the plot - all of these are product metadata powered by RDFa or JSON-LD, a relic of the semantic web. It would be incorrect to say the semantic web is dead. Any AI that could effectively use Wikidata as a fact table would be Jeopardy-grade. There are still new tools coming out, like RDFox, that apply first-order logic at multicore speed across huge datasets for reasoning. There is work being done to make it horizontally scalable. I think people will just go on an endless loop of hitting the same pain points and creating new tools using the trending tech of the day, but even in this day and age, sometimes something like Prolog or Picat is what you need.
Isn't that computationally infeasible? Semantic web standards are based on description logics, i.e. multi-modal logics chosen specifically for computational expediency.
Also, I wouldn't describe JSON-LD as a "relic" of anything. It's a fairly recent standard in the grand scheme of things, and many interesting projects these days implicitly rely on it.
[1] For example if you try to figure out if a formula is satisfiable. You can for sure do this using truth tables. The catch is that you're looking at 2^n complexity where n is the number of propositions in your formula.
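The 2^n blow-up is easy to see in code. A brute-force satisfiability check in plain Python simply enumerates every assignment, so each extra proposition doubles the work:

```python
from itertools import product

def satisfiable(formula, variables):
    """Brute-force SAT: try all 2**n truth assignments for the
    n named propositions and see if any of them makes the formula true."""
    return any(
        formula(**dict(zip(variables, bits)))
        for bits in product([False, True], repeat=len(variables))
    )

# (p or q) and (not p or not q): satisfiable, e.g. p=True, q=False.
print(satisfiable(lambda p, q: (p or q) and (not p or not q), ["p", "q"]))  # True

# p and not p: a contradiction, never satisfiable.
print(satisfiable(lambda p: p and not p, ["p"]))  # False
```

Description logics trade expressive power for avoiding exactly this kind of exhaustive search in the common case.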
You can hook in your own reconciliation end point which we do at work to expand internal knowledge graphs.
The basic capabilities work ok, but lots of the additional capabilities have atrophied away.
I really want to look into how this could ingest my own post-GDPR data exports, as well as data sanitization for ML projects.
However, the efforts I've seen seem to be missing some critical factors for longer-term success. I think we've got a lot of work to do with regards to knowledge representation in general.
One of the big things for me is that the context for any fact is critical for it to be true or not.
You can have a fact like "Tim Cook is the CEO of Apple", represented in a graph like you would expect. However, that is only true today. Ten years ago it was Steve Jobs. Without explicit context encoded in the information graph, this web of data isn't as useful as it could be.
Context is important for reasoning in all kinds of situations. "What if Steve Ballmer were CEO of Apple?" is a hypothetical context that it may be useful to reason about. The context of "Who is the most distinguished captain of the Enterprise?" could be the real-world US Navy, or a fictional Star Trek universe (of which there are multiple).
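One common workaround is to widen triples into quads, attaching a validity interval (or a named graph) to each statement. A minimal Python sketch of the temporal case, with invented data and not following any particular standard:

```python
# Facts stored as quintuples with a validity interval, so the same
# statement can be true in one time context and false in another.
facts = [
    # (subject, predicate, object, valid_from, valid_to)
    ("Apple", "ceo", "Steve Jobs", 1997, 2011),
    ("Apple", "ceo", "Tim Cook", 2011, None),  # None = still true
]

def query(subject, predicate, year):
    """Return the objects for which the fact holds in the given year."""
    return [
        o for s, p, o, start, end in facts
        if s == subject and p == predicate
        and start <= year and (end is None or year < end)
    ]

print(query("Apple", "ceo", 2005))  # ['Steve Jobs']
print(query("Apple", "ceo", 2020))  # ['Tim Cook']
```

Plain triples can't carry that extra column, which is why the "Tim Cook is CEO" fact silently rots without some such mechanism bolted on.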
Though you probably don't need Datomic, it would not be too complicated to model this in neo4j or some other graph database that supports arbitrary-sized tuples. Datomic just supports this as a first-class feature.
Almost all the engineering problems cited in the original post are still basically there, but graphical models are still the least painful way of doing this, particularly when trying to share data between institutions. Example: https://linked.art/model/assertion/
This is why much of the hubbub over property graphs puzzles me. If you need a relationship to have its own properties in an RDF graph, just turn it into a class. What's the big deal?
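A minimal sketch of that move in plain Python, with all IRIs hypothetical: the sponsorship relationship becomes a node of a Sponsorship class, and each would-be edge property becomes an ordinary triple hanging off that node.

```python
EX = "http://example.org/"

# Property-graph wish: (:froome) -[sponsoredBy {year: 2013}]-> (:sky)
# RDF triples have no edge properties, so the edge becomes a node.
triples = [
    (EX + "sponsorship1", EX + "type",    EX + "Sponsorship"),
    (EX + "sponsorship1", EX + "athlete", EX + "froome"),
    (EX + "sponsorship1", EX + "sponsor", EX + "sky"),
    (EX + "sponsorship1", EX + "year",    2013),
]

# Who sponsored froome, and when? Find the Sponsorship node first,
# then read its sponsor and year like any other triples.
for s, p, o in triples:
    if p == EX + "athlete" and o == EX + "froome":
        node = s
        sponsor = next(o2 for s2, p2, o2 in triples
                       if s2 == node and p2 == EX + "sponsor")
        year = next(o2 for s2, p2, o2 in triples
                    if s2 == node and p2 == EX + "year")
        print(sponsor, year)  # http://example.org/sky 2013
```

The cost is an extra hop per query and an extra node per relationship, which is the trade the property-graph camp objects to.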
I have also seen a great deal of consultant money, programmer time, sys-admin sweat, and the like focused on these toweringly-designed, completely-unused triple stores, layer upon layer of hot technologies (ever-moving, construction on the tower never ceased) fused together to create a resource-intense monstrosity that, at the end of the day, barely got used. But hey, let's look at that jazz semantic web example one more time.
The most painful part is that I understand the urge to build a gleaming repository for information, where the cool URIs never change; SPARQLing pinnacles, ready to broadcast the Library of Alexandria, glimmer; and the serene manifold of abstract information lies RESTful ... but I have come to understand that the web of today is an endlessly bulldozed mudscape where Someone Very Important has to have that URL top-level yesterday (never mind that they will forget about it tomorrow), of shoddy materials and wildly varying workmanship, and where nobody is listening to your eager endpoints because the commercials are just too loud. I too once labored for information architecture, to have the correct thing in the obvious place, with accurate links and current knowledge, to provide visitors with the knowledge they desired ... but PR preempted all of it to push yet more nice photographs in yet another place: the Web as a technology for distributing images that would once live on glossy pamphlets.
The vision is lovely, but we who have always lived in the castle have walked alone.
Remember when microformats were all the rage, and you could get hReview or hRecipe or XFN data everywhere?
Then every host in turn realized that actually, it's _better_ if people can't scrape your site, and it's even better if they can't even see it and it's behind a login wall.
The semantic data which has actually been implemented on a wide scale happened because someone could go to their boss and say “Spending time on x will mean better Google ranking” or “Facebook will use their new sharing display for our pages”, and it was orders of magnitude simpler to implement so the time and risk were far more palatable.
This is a different outcome than in the commercial setting where the W3C is still imagining people as users of their computer rather than consumers of the services their computers connect to. But it also means that in certain technical domains where e.g. publication results are scaled out to oblivion but the ontologies are regular or made easily negotiable, there can be benefits for researchers.
Same for the semantic web. Show the benefit for the publisher.
schema.org and Wikidata are great resources, and for large companies, using these as a foundation for their own internal knowledge graphs can make sense. This expense is (maybe?) too large for small and medium-sized companies; they would not get enough benefit for the cost.
I worked with Google’s Knowledge Graph as a contractor, and I am still a believer in the technology but I also respect other people’s well founded scepticism.
A couple of months ago I got interested in adding semantic information to my posts so I modified the generator to add some of the common semantic tags. It was an annoying job, since the semantic information pollutes the structure of the html.
Can anyone tell me what the semantic web does for me as a small-time publisher? Is it for search engines? Does it really matter that a book review (for instance, I have a few) is tagged properly?
Yes, in practice it is mostly for bigger fish in the pond to easily identify and steal your content as needed.
For example, Google was using reviews from small competitors' sites in Google Shopping.
In a lot of cases, the information was there to get eyeballs--so this is undesirable.
I guess if you don't really care about the eyeballs it can be "useful" for the big fish to pay most of the cost of serving the fraction of your server response that the end user was looking for...
The markup you added - it depends on what exactly you did. Did you add markup for schema.org? That's in practice solely for Google. The SEO promise there is that Google will make use of the information provided and format some of it nicely, which can lead to more clicks. https://moz.com/learn/seo/serp-features explains that reasonably well. For things like reviews I can imagine it being quite useful.
If the semantic web was better supported, you could have a semantic annotation precisely identifying the books you are reviewing (whether by ISBN edition or otherwise), and reusers of your content (users, search engines or others) would be able to programmatically associate your review with similar content.
In what way? Both the html and the metadata are intended to make your website machine-friendly. You may find the html structure polluted, but crawlers will find it more informative.
> Is it for search engines?
Yes. And Accessibility.
Using semantic HTML means using <article> rather than yet another <div>. What GP is referring to, however, is adding extra information to your HTML detailing what kind of data is in your tags, e.g.:
<p vocab="http://schema.org/" typeof="Person">
<span property="name">Christopher Froome</span> was sponsored by
<span property="sponsor" typeof="http://schema.org/Organization">
<a property="url" href="http://www.skysports.com/">Sky</a></span> in the Tour de France.
</p>
Here, the vocab, typeof and property attributes are used to add semantic information to the HTML. It might also give you an idea of why one might consider that a chore, especially if it doesn't appear to provide any benefit, like making your site accessible to users of screen readers.

I agree with a lot of the problems noted in other posts, and would add two other problems from the authoring side:
1. Identifying and employing sound semantics requires a level of thought and clarity that I don't think most people are habituated to working at. It raises the bar somewhat on who can be contributing (either they have to understand and take care with the semantics, or you need a separate person to handle them?)
2. I may be missing some good tools, but I haven't been able to find a good low-friction semantic authoring experience. Even if you are mentally prepared to write with explicit semantics, it still adds a lot of friction to the writing process (or requires subsequent semantic-edit passes).
As for semantic markup being confusing and usually wrong, I don't know where you get that.