FWIW, the document's own URI is terrible: 'https://www.w3.org/Provider/Style/URI' - who could have any idea what the page is about from that? And what if the meaning of the word 'Provider' or 'Style' changes in x years from now? :) You could argue that the meaning/usage of 'URI' has already changed, because practically no-one uses that term any more. Everyone knows about URLs, not URIs. Not many people could tell you what the difference was. So the article's URI has already failed by its own rules.
No, a URL doesn't necessarily have to give you the title of the article, even if having some related words in it might be good for SEO value. If you paste it in plain text or similar, add a description to it. Here's how:
Cool URIs Don't Change: https://www.w3.org/Provider/Style/URI
There, now the reader will know what this is about.
I'd still get far more value out of this:
I think you both stumbled upon a fundamental part of the discussion: the tension between finding a way to identify resources (or concepts, or physical things) in a unique and unambiguous fashion, and affordances provided by natural language that allow human minds to easily associate concepts and labels with the things they refer to.
The merit of UUIDs, hashes, or any other random string of symbols that falls outside the domain of existing natural languages is that it doesn't carry any prior meaning until an authority within a bounded context associates that string with a resource by way of accepted convention. In a way, you're constructing a new conceptual reference framework of (a part of) the world.
The downside is that random strings of symbols don't map to widely understood concepts in natural language, making URLs that rely on them utterly incomprehensible unless you dereference them and match your observation of the dereferenced resource with what you know about the world (e.g. "Oh! http://0x5235.org/5aH55d actually points to a review of "Citizen Kane").
By using natural language when you construct a URL, you're inevitably incorporating prior meaning and significance into the URI. The problem is that you then end up with the murkiness of linguistics and semantics, which leads to all kinds of weird word plays if you let your mind roam entirely free about the labels in the URI proper.
For instance, there's the famous painting by René Magritte, "The Treachery of Images", which correctly points out that the image is, in fact, not a pipe: it's a representation of a pipe. [1] By the same token, an alternate URI to this one [2] might read http://collections.lacma.org/ceci-nest-pas-une-pipe, which is incidentally correct as well: it's not a pipe, it's a URI pointing to a painting that represents a physical object - a pipe - with the phrase "this is not a pipe."
Another example would be that a generic machine doesn't know whether http://www.imdb.com/titanic references the movie Titanic or the actual cruise ship, unless it dereferences the URI, whereas we humans understand that it's the movie because we have a shared understanding that IMDB is a database about movies, not historic cruise ships. Of course, when you build a client that dereferences URIs from IMDB, you basically base your implementation on that assumption: that you're working with information about movies.
Incidentally, if you work with hashes and random strings, such as http://0x5235.org/5aH55d, your client still has to be founded on the fundamental assumption that you're dereferencing URIs minted by a movie review database. Without context, a generic machine would perceive it as a random string of characters that happens to be formatted as a URI, and dereferencing it would just give a random stream of characters that can't possibly be understood.
[1] https://en.wikipedia.org/wiki/The_Treachery_of_Images [2] https://collections.lacma.org/node/239578
Additionally, widespread use of web search engines has made URI stability less relevant for humans. Bookmarks are not the only way to find a leaf page by topic again. A dedicated person might find that archival websites have preserved content at its old URIs.
Some of this is allowed to happen because the content is ultimately disposable, expires, or possesses limited relevance outside of a limited audience. Some company websites are little more than brochures. Documents and applications that are relevant within organizations can be communicated out of band. Ordinary people and ordinary companies don't want to be consciously running identifier authorities forever.
The reason for the eventual demise of the URL will simply be the fact that the concept of "resource" will just not be sufficient enough to describe every future class of application or abstract behavior that the web will enable.
It depends on how you define a "resource" and which value you attribute to that resource. And this is exactly the crux: that is out of the scope of the specification. It's entirely left to those who implement URIs within a specific knowledge domain or problem domain to define what a resource is.
Far more important than "resource" is the "identifier" part. URIs are above all a convention which allows for minting globally unique identifiers that can be used to reference and dereference "resources", whatever those might be.
It's perfectly valid to use URIs that reference perishable resources that only have a limited use. The big difficulty is in appraising resources and deriving how much need there is to focus on persistence and longevity. Cool URIs are excellent for referencing research (papers, articles, ...), identifying core concepts in domain-specific taxonomies, naming natural/cultural objects, or endorsing information as an authority, ...
The fallacy, then, is reducing URIs to the general understanding of how the Web works: the simple URL you type in the address bar which allows you to retrieve and display a particular page. If Google et al. end up stripping URLs from user interfaces and making people believe that you don't need URIs, inevitably a different identifier scheme and a new conceptual framework will need to be developed just to be able to do what the Web is all about today: naming and referencing discrete pieces of information.
Ironically, you will find that such a framework and naming scheme will bear a big resemblance to the Web, and will solve the same basic problems the Web has been solving for the past 30 years. And down the line, you will discover the same basic problem Cool URIs are solving today: that names and identifiers can change or become deprecated as our understanding and appreciation of information changes.
In the late '90s and early 2000s, HTML started being pushed into fields that, in my opinion, were unrelated (remember Active Desktop?). Before you had time to react, HTML was being used to pass data between applications. At the time I was already doing embedded stuff, and I remember being astonished to learn that I had to code an HTML parser/server/stack on my small 16-bit micro because some jerk thought it was a good idea to pass an integer using HTML (SOAP, for example).
In the meantime, HTML was being dynamically generated, and then dynamically modified in the browser, and then modified back in the server using the same thing you use to modify it in the browser. It's a snowball that will implode, sooner or later.
My HN username may be a case in point, drawing from a selection of twice five[0] digits due to legacy code of Hox genes: https://pubmed.ncbi.nlm.nih.gov/1363084/
[0] "This new art is called the algorismus, in which / out of these twice five figures / 0 9 8 7 6 5 4 3 2 1, / of the Indians we derive such benefit"
https://upload.wikimedia.org/wikipedia/commons/thumb/3/35/Ca...
I'm happy that the world moved on to the point that json/yaml-like formats are strongly preferred.
(1) some operators only care about a handful of the URLs under their domain;
(2) hardly anyone uses link relations, so most links are devoid of semantic metadata and are essentially context-free, requiring a human to read the page and try to guess the purpose of the link;
(3) so many 'resources' are now entire applications, and the operators of these applications sometimes find it undesirable to encode application state into the URI, so for these you can only get to the entry point -- everything else is ephemeral state inside the browser's script context.
But I disagree with the statement that "the reason for the eventual demise of the URL will simply be the fact that the concept of 'resource' will just not be sufficient enough to describe every future class of application or abstract behavior that the web will enable."
URIs are a sufficient abstraction to accommodate any future use-case. It's a string where the part before the first colon tells you how to interpret the rest of it. It'd be hard to get more generic, yet more expressive.
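That split is visible in any URI parser. A quick illustration with Python's standard library (the example URIs are arbitrary):

```python
from urllib.parse import urlparse

# The scheme (everything before the first colon) selects the
# interpretation of the rest of the string.
for uri in ["https://example.com/foo",
            "mailto:alice@example.com",
            "urn:isbn:0451450523"]:
    parts = urlparse(uri)
    print(parts.scheme, "->", uri[len(parts.scheme) + 1:])
```

Three completely different kinds of identifiers, one uniform syntax.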
The demise of URLs, if it ever comes to pass, will be due to politics or fashion: e.g. browser vendors not implementing support for certain schemes, lack of interoperability around length limits, concerns about readability and gleanability, and vertical integration around content discovery.
The article states early on, “Except insolvency, nothing prevents the domain name owner from keeping the name.” As it turns out, insolvency is a pretty significant source of URL rot, but so is non-renewal of domains by choice or by apathy, whether for financial or mere personal-energy reasons (“Who is my registrar again? Where do I go to renew?”), especially by individuals. You start a project, and ten years later your interest has waned.
Domains are an increasingly abundant resource as TLDs proliferate. Why not default to a model where you pay once up front for the domain, and thereafter continued control is contingent on maintaining a certain percentage of previously published resources, and if you fail at that some revocable mechanism kicks in that serves mirrored versions of your old urls. Funding of these mirrors comes from the up front domain fees. Design of the mechanism is left as an exercise for the reader :-)
- The UK leaving the EU means British companies can't keep their .eu domains, unless they have a subsidiary in the EU.
- A trademark dispute can mean someone loses a domain.
If limited per customer it'd still be a similar situation, probably involving lots of 'fake' accounts and registrant details.
Years ago .info domains were being sold very cheaply. Their registrations skyrocketed and the quality of the average .info domain clearly went down.
Namespace pollution. What if my great-great grandson wants my user name on Google? I took it. Similarly, I took the .net domain with my last name.
Spam, squatting, maybe.
2016: https://news.ycombinator.com/item?id=11712449
2012: https://news.ycombinator.com/item?id=4154927
2011: https://news.ycombinator.com/item?id=2492566
2008 ("I just noticed that this classic piece of advice has never been directly posted to HN."): https://news.ycombinator.com/item?id=175199
also one comment from 7 months ago: https://news.ycombinator.com/item?id=21720496
http://www.pathfinder.com/money/moneydaily/1998/981212.moneyonline.html
This consists of:

0. Access protocol
1. Hostname/DNS name
2. Arbitrarily chosen path hierarchy
3. File extension
This is really a description where to find a document ("locator" not "identifier"). So, if you are:
- re-organizing / cleaning up your file structure
- changing or hiding the file extension
- enabling HTTPS
- migrating files to a different domain name
then the URL WILL change. What are you going to do? Not clean up your space anymore? Stick to HTTP? So URLs DO change. That's just the reality.
If you want something that does not change, don't link to a location but link to the content directly, e.g.:
- git hashes do not change
- torrent/magnet Links don't change
- IPFS links do not change.
Or use a central authority, that stewards the identifier:
- DOI numbers don't change
- ISBN numbers don't change
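Git is a concrete example of the content-addressing idea: a blob's identifier is just a hash over its bytes plus a small header, so identical content always gets the identical name, wherever the file lives. A sketch:

```python
import hashlib

def git_blob_hash(content: bytes) -> str:
    """Compute the hash `git hash-object` would produce for a blob."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The identifier is derived from the content alone; move or rename
# the file and the identifier stays valid.
print(git_blob_hash(b"hello world\n"))
```

This is why git hashes "don't change": there is no location in them to change.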
The article addresses this by reminding you that though URIs often look like paths, they can be arbitrarily mapped.
By all means move the resource, but put a redirect under the old URI. This means old links continue to work, which is the key point of the article.
I have tried to do it a few times, and eventually just gave up. Carrying forward bad naming decisions from the past is a tremendous effort. When cleaning up the house, I also don't leave sticky notes at the places where I removed documents from.
On top of this:
- When using static site generators, it's not even possible to do 301 redirects (you would have to use an ugly, slow JS version).
- It does not help if you don't own the old DNS name anymore.
That isn't always true; it depends on your choice of web server. You can use mod_rewrite rules in Apache's .htaccess files, so if your generator is aware of previous URLs for given content, it could generate these to 30x-redirect visitors and search bots to the right new place.
Off the top of my head I'm not aware of a tool that does this, but it is certainly possible. It would need to track the old content/layout so you'd need the content in a DB with history tracking (or a source control system) or the tool could leave information about previous generations on each run for later reading. Or it could simply rely on the user defining redirect links for it to include in the generated site.
Of course if you are using a static site generator for maximum efficiency you probably aren't using Apache with .htaccess processing enabled! I suppose a generator could also generate a config fragment for nginx/other similarly though that would not be useful if you are publishing via a web server where you do not have appropriately privileged access to make changes to that.
You can do 301s statically, by generating whatever your particular server's equivalent of an .htaccess file is. Or you can generate the HTML files with a meta redirect in the head.
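As a sketch of the second option, a generator could emit a tiny stub page at each old path (the paths and URLs here are made up):

```python
# Sketch: emit static HTML stubs that meta-refresh old URLs to new ones.
# The mapping below is hypothetical.
REDIRECTS = {
    "old/page.html": "/new/page/",
}

STUB = """<!DOCTYPE html>
<html>
<head>
  <meta http-equiv="refresh" content="0; url={new}">
  <link rel="canonical" href="{new}">
</head>
<body><a href="{new}">This page has moved.</a></body>
</html>
"""

def make_stub(new_url: str) -> str:
    """Render one redirect stub pointing at the new location."""
    return STUB.format(new=new_url)

for old, new in REDIRECTS.items():
    html = make_stub(new)  # a real generator would write this file at `old`
```

The canonical link hint also helps search engines consolidate the two URLs.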
The DNS is obviously an issue, but that's not really relevant. The article is advocating for URLs not changing. It's not saying that they mustn't change, just that it's really cool for everyone if they don't.
I know it's 2020 and all that, but sometimes you don't need 20 MB of minified JS to achieve something: https://en.wikipedia.org/wiki/Meta_refresh
Netlify allows dead simple redirects, and so do most other static hosting platforms.
When you change the structure of your urls, you can generally generate redirect rules to translate old urls to the new structure. Or run a script to individually map each old url to its new one.
Note: I've never done the latter for more than a few hundred URLs; I don't know if it scales well for a very large site.
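When the structural change is regular, the mapping can be a single rewrite rule rather than a hand-maintained list. A hypothetical sketch (the URL layout is invented):

```python
import re
from typing import Optional

# Hypothetical reorg: /2019/02/post-title.html -> /blog/2019/02/post-title/
OLD = re.compile(r"^/(\d{4})/(\d{2})/([\w-]+)\.html$")

def translate(old_path: str) -> Optional[str]:
    """Map an old-style path to its new location, or None if not applicable."""
    m = OLD.match(old_path)
    if not m:
        return None  # not an old-style URL; no redirect needed
    year, month, slug = m.groups()
    return f"/blog/{year}/{month}/{slug}/"

print(translate("/2019/02/10-cool-tips.html"))  # -> /blog/2019/02/10-cool-tips/
```

The same rule can be emitted as a server rewrite directive instead of run as a script.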
This is a poor analogy. Perhaps “I’m a librarian for a library with thousands or millions of users, and when I rearrange the books, I don’t leave sticky notes pointing to the new locations”
The arbitrary path hierarchy is not so bad. Better than every URI just being https://domainname.com/meaninglesshash. You can also stick a short prefix in front, like https://domainname.com/v1/money/1998/etc, so that all documents created after a reorg can use a different prefix. If your reorg is so severe that there's no way to keep access to old documents under their old URI, even if it has its own prefix, it seems unlikely they'll be made available in any other location. In that context you can imagine the article is imploring you "please don't delete access to old documents".
Your remaining objections, for host name and access, boil down to "don't use URIs at all, and don't bother to avoid changing them". As I type this comment I'm starting to realise that was your whole point, but it was a bit buried alongside minor objections to this particular example. It's also perhaps a bit of an extreme point of view. Referencing a git hash alongside a URI is sensible, but on its own it's pretty useless, and many web pages won't have anything analogous.
Hostname, well perhaps if a company has been merged/sold.
Path/query is really down to information architecture and planning that early on can go a long way, e.g. contact, faq belonging in a /site subdirectory.
File extension doesn't really matter nowadays
The main thing is that there's no technical reason for the change. I recently saw someone wanting to change the URLs of their entire site because they now use PHP instead of ASP. They could configure their web server to serve those pages with PHP, and save the outside world a redirect and twice as many URLs to think about.
I really wish HTTPS hadn't changed the URL scheme, so you could host both HTTPS and fallback HTTP under the same URL. However, most HTTPS sites will redirect http://domain/(.*) to https://domain/$1 (or at least they should), so this doesn't need to break URLs.
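That redirect is purely mechanical: swap the scheme, keep everything else. A sketch:

```python
from urllib.parse import urlsplit, urlunsplit

def to_https(url: str) -> str:
    """Return the HTTPS equivalent of an HTTP URL, path and query intact."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url  # already https (or some other scheme): leave it alone
    return urlunsplit(("https",) + tuple(parts)[1:])

print(to_https("http://domain/some/page?q=1"))  # -> https://domain/some/page?q=1
```

Since only the scheme changes, no link needs to die in the migration.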
This is excellent. I wish more people would make your distinction between URL and URI. URIs really are supposed to be IDs. When put in that parlance, it's hard to say that IDs should change willy-nilly on the web. That said, I think that does deprioritize a global hierarchy / taxonomy for a fundamentally graph-like data structure.
> If you want something that does not change, don't link to a location but link to content directly
I see the motivation for this, but I've personally found it to be equally problematic as blending the distinction between URIs and URLs. Most "depth" and hierarchy that's in URLs is stuff that ideally would be in the domain part of the URL. For instance:
http://company.com/blog/2019/02/10-cool-tips-you-wouldnt-bel...
would really map to:
http://blog.company.com/2019/02/10-cool-tips-you-wouldnt-bel...
and the "blog" subdomain would be owned by a team. You could imagine "payments", "orders", or whatever combo of relevant subdomains (or sub-subdomains). In my experience this hierarchical federation within an organization is not only natural, it's inevitable: Conway's Law.
So I do very much believe that the hierarchy of content and data is possible without needing a flat keyspace of ids. Just off the top of my head, issues with the flat keyspace are things like ownership of namespaces, authorization, resource assignment, different types of formats/content for the same underlying resources etc. Hierarchies really do scale and there's reason for them.
That said, most sites (the effective 'www' part of the domain) are really materialized _views_ of the underlying structure of the site/org. The web is fundamentally built to do this mashup of different views. Having your "location" be considered a reference "view" to the underlying "identity" "data" would go a long way to fixing stuff like this.
DOI and ISBN are as much locations as URLs.
Content-based URNs are the only option.
> Historical note: At the end of the 20th century when this was written, "cool" was an epithet of approval particularly among young, indicating trendiness, quality, or appropriateness. In the rush to stake out our DNS territory, the choice of domain name and URI path was sometimes directed more toward apparent "coolness" than toward usefulness or longevity. This note is an attempt to redirect the energy behind the quest for coolness.
It's 2020 and "cool" still has that same meaning, as an informal positive epithet. I believe "cool" is the longest surviving informal positive epithet in the English language.
"Cool" has been cool since the 1920s, and it's still cool today. "Cool" has outlived "hip," "happening," "groovy," "fresh," "dope," "swell," "funky," "bad," "clutch," "epic," "fat," "primo," "radical," "bodacious," "sweet," "ace," "bitchin'," "smooth," and "fly."
My daughter says things are "cool." I predict that her children will say "cool," too.
Isn't that cool?
> Slang meaning "superior, classy, clever" is attested from 1893. Sense of "stylish" is from 1922.
> A 1599 dictionary has smoothboots "a flatterer, a faire spoken man, a cunning tongued fellow."
It may be time to bring that one back. "Did you see Keith chatting up that girl at the bar? Total smoothboots."
(under 30 male, west coast USA perspective)
I enjoyed reading your list, it was like a trip down memory lane.
I don't have my hard copy here and Google is failing me, but this is addressed by Terry Pratchett in (I think) Only You Can Save Mankind.
The context is some teenagers talking about how it's not cool to say Yo, or Crucial, or Well Wicked, but Cool is always cool.
Would appreciate the full quote if somebody can find!
'It's not cool to say Yo any more,' said Wobbler.
'Is it rad to say cool?' said Johnny.
'Cool's always cool. And no-one says rad any more, either.'
Wobbler looked around conspiratorially and then fished a package from his bag.
'This is cool. Have a go at this.'
'What is it?' said Johnny.
...
'Yes. We call him Yo-less because he's not cool.'
'Anti-cool's quite cool too.'
'Is it? I didn't know that. Is it still cool to say "well wicked"?'
'Johnny! It was never cool to say "well wicked".'
'How about "vode"?'
'Vode's cool.'
'I just made it up.'
The capsule drifted onwards.
'No reason why it can't be cool, though.'
I have spoken.
Example from a book: https://books.google.com/books?id=yLj8m3K0kNoC&pg=PA224&dq=h...
A message from a W3C staff member on a W3C mailing list on 1999-06-21 mentions [1] that w3c.org should redirect to the corresponding page at w3.org, and the latter is considered the 'correct' domain.
[1] https://lists.w3.org/Archives/Public/www-rdf-comments/1999Ap...
One of my pet peeves with OneDrive is that moving a file changes its URI. So any time someone moves a file, it breaks all the links that point to it. Or if they rename it from foo-v1 to foo-v2. I wish they'd adopt the Google Docs approach.
[1] https://www.nayuki.io/page/designing-better-file-organizatio...
I think it's for reasons like this that many mac users strongly prefer native apps over Electron or web apps.
Users on every OS do.
Does make updating more awkward, and you still need some system of mapping the addresses to friendly names.
Within the context of digital preservation and online archives, where longevity and the ephemeral nature of digital resources are at odds, this problem is addressed through the OAI-ORE standard [1]. This standard models resources as "web aggregations", which are represented as "resource maps", which in turn are identified through Cool URIs.
It doesn't solve the issue entirely if you're not the publisher of the URIs you're trying to curate. That's where PURLs (Persistent URLs) [2] come into play. The idea is that an intermediate 'resolver' proxies requests for Cool URIs to destinations around the Web. The 'resolver' stores a key-value map which requires continual maintenance (yes, at its core, it's not solving the problem, it's moving the problem into a place where it becomes manageable). An example of a resolver system is the Handle System [3].
Finally, when it comes to caching and adding a 'time dimension' to documents identified through Cool URIs, the Memento protocol [4] reuses existing HTTP headers and defines one extra one.
Finding what you need via a Cool URI then becomes a matter of content negotiation. Of course, that doesn't solve everything. For one, context matters, and it's not possible to figure out a priori the intentions of a user when they dereference a discrete URI. It's up to specific implementations to provide mechanisms that capture that context in order to return a relevant result.
[1] https://en.wikipedia.org/wiki/Object_Reuse_and_Exchange [2] https://en.wikipedia.org/wiki/Persistent_uniform_resource_lo... [3] https://en.wikipedia.org/wiki/Handle_System [4] https://en.wikipedia.org/wiki/Memento_Project
The migration to TLS for the majority of sites would have won him the bet, but I see this one is still serving up non-TLS.
Most URN schemes I have seen look something like an authority ID followed by either a date and a string you choose, or just a string you choose. This looks very like an HTTP URI. In other words, if you think your organization will be capable of creating URNs which will last, then prove it by doing it now and using them for your HTTP URIs. There is nothing about HTTP which makes your URIs unstable. It is your organization. Make a database which maps document URN to current filename, and let the web server use that to actually retrieve files.
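The indirection the quote describes needs nothing exotic; a minimal sketch, with an invented URN namespace and invented file paths:

```python
# Sketch of the quote's proposal: stable URNs on the outside, a mutable
# URN -> filename map on the inside. All names here are made up.
URN_TO_FILE = {
    "urn:example:2020:annual-report": "/srv/docs/reports/2020/final-v3.pdf",
}

def resolve(urn: str) -> str:
    """Look up the current storage location for a stable identifier."""
    try:
        return URN_TO_FILE[urn]
    except KeyError:
        raise LookupError(f"unknown identifier: {urn}")

# Reorganizing the filesystem only means updating the map;
# every published URN keeps working.
URN_TO_FILE["urn:example:2020:annual-report"] = "/srv/archive/2020/report.pdf"
```

The web server consults `resolve()` instead of mapping the URI path directly onto the filesystem.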
Did this fail as a concept? Are there any active live examples of URNs?
One well-known example is the ISBN namespace [2], where the namespace-specific string is an ISBN [3].
The term 'URI' emerged as somewhat of an abstraction over URLs and URNs [4]. People were also catching onto the fact that URNs are conceptually useful, but you can't click on them in a mainstream browser, making its out-of-the-box usability poor.
DOI is an example of a newer scheme that considered these factors extensively [5] and ultimately chose locatable URIs (=URLs) as their identifiers.
[1] https://www.iana.org/assignments/urn-namespaces/urn-namespac... [2] https://www.iana.org/assignments/urn-formal/isbn [3] https://en.wikipedia.org/wiki/International_Standard_Book_Nu... [4] https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Hi... [5] https://www.doi.org/factsheets/DOIIdentifierSpecs.html
When a protocol ID is a URI it is common to use a URL rather than a URN so that the ID can serve as a link to its own documentation.
There is a bonkers DNS record called NAPTR https://en.wikipedia.org/wiki/NAPTR_record which was designed to be used to make the URN mapping database mentioned towards the end of your quote, using a combination of regex rewriting and chasing around the DNS. I get the impression NAPTR was never really used for resolving URNs but it has a second life for mapping phone numbers to network services.
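A NAPTR regexp field is essentially a sed-style substitution of the form `!pattern!replacement!`. A toy illustration of the rewriting idea, with an entirely invented record:

```python
import re
from typing import Optional

# This invented NAPTR-style rule maps an example URN namespace
# onto an HTTP resolver.
naptr_regexp = r"!^urn:example:(\d+)$!https://resolver.example.net/doc/\1!"

def apply_naptr(rule: str, urn: str) -> Optional[str]:
    """Apply one NAPTR-style '!pattern!replacement!' rewrite to a URN."""
    delim = rule[0]
    _, pattern, replacement, _ = rule.split(delim)
    if not re.match(pattern, urn):
        return None
    # NAPTR uses \1-style backreferences, same as re.sub
    return re.sub(pattern, replacement, urn)

print(apply_naptr(naptr_regexp, "urn:example:12345"))
```

Real NAPTR resolution also involves order/preference fields and chained lookups; this only shows the rewrite step.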
There are too many moving parts to trust that even domain names will stay the same. See GeoCities and Tumblr for recent examples. If you want a document, you should have archived it.
(Or maybe your point was deeper, that one not only can't trust that the resource location won't change but even that the resource itself will still be available somewhere? That is true, too! But saying that archive.org is the solution is just making one massively centralised point of failure. That doesn't mean that we shouldn't have or use archive.org, but that we should regard it as just the best solution we have now rather than the best solution, full stop.)
And then there are the URIs that aren't even made for human consumption: ridiculously long, impossible to parse or pass around. Another class is those that get destroyed on purpose. Your favorite search engine should just link to the content. Instead, they link to a script that then forwards you to the content. This has all kinds of privacy implications, as well as making it impossible to pass on, for instance, the link to a PDF document you have found to a colleague, because the link is unusable before you click it, and after you click it you end up in a viewer.
For Firefox, I recommend the extension https://addons.mozilla.org/en-US/firefox/addon/google-direct.... The extension’s source code: https://github.com/chocolateboy/google-direct.
I can copy Google link just fine.
Here is a sample:
https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&c...
Obtained by right clicking the link to the pdf and then 'copy link location'. What you see is not what is sent to your clipboard.
For instance, `https://example.com/foo` tells you that the resource can be accessed via the HTTPS protocol, at the server with the hostname example.com (on port 443), by asking it for the path `/foo`. It is hence a URL. On the other hand, `urn:isbn:0451450523` precisely identifies a specific book, but gives you no information about how to locate it. Thus, it is just a URI, not a URL. (Every URL is also a URI, though.)
At the end of the day, there is no clarity, so just use the term that will be best understood by the person you're talking to. URL is a good default, probably even for "about:".
Look for example at this link:
https://www.amazon.com/Fundamentals-Software-Architecture-Engineering-Approach-ebook/dp/B0849MPK73/ref=sr_1_1?dchild=1&keywords=software+architecture&qid=1594966348&sr=8-1
Maybe each part has a solid reason to exist, but the result is a monster. I would prefer something like this:
https://amazon.com/dp/B0849MPK73
And guess what, the above short link actually works! But Amazon doesn't use this kind of link as the standard. https://amazon.com/Fundamentals-Software-Architecture/dp/B0849MPK73/
This includes the main title of the book + ID (this variant also works).

The Amazon URL that includes the title should be fairly stable, but if you look at e.g. a Discourse forum URL, you see it contains the topic title, which can change at any time, and then the URL changes with it. The old URL still works, because Discourse redirects, but this can't be taken for granted.
So Discourse then has these URL's referring to the same topic:
- https://forum.example.com/t/my-title/12345
- https://forum.example.com/t/my-new-title/12345
- https://forum.example.com/t/12345
And using the last version may be best when linking to the topic from somewhere else.

But... this link works. Everything after /B0849MPK73/ is there because you reached that product page through search, and it stores the search term in the URL. You can remove it and the site works just fine.
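Since the numeric topic id is the stable part of those Discourse URLs, a client or link checker could normalize any of the three forms to the canonical short one. A sketch (forum.example.com is the placeholder domain from the example above):

```python
import re
from typing import Optional

# All three Discourse-style forms share the trailing numeric topic id;
# the slug segment before it is optional and mutable.
TOPIC = re.compile(r"^https://forum\.example\.com/t/(?:[\w-]+/)?(\d+)")

def canonical_topic_url(url: str) -> Optional[str]:
    """Reduce a topic URL to its slug-free canonical form."""
    m = TOPIC.match(url)
    if not m:
        return None
    return f"https://forum.example.com/t/{m.group(1)}"

for u in ["https://forum.example.com/t/my-title/12345",
          "https://forum.example.com/t/my-new-title/12345",
          "https://forum.example.com/t/12345"]:
    assert canonical_topic_url(u) == "https://forum.example.com/t/12345"
```

Linking to the canonical form means a later title edit can never break your reference.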
handle.net (technically it's like a URL shortener, but there's an escrow agreement you need to sign first to make sure that the URLs stay available). PURL and w3id.org (which allow for easy moving of whole sites to a new domain name). And of course https://robustlinks.mementoweb.org/spec/
* Simplicity: Short, mnemonic URIs will not break as easily when sent in emails and are in general easier to remember.
* Stability: Once you set up a URI to identify a certain resource, it should remain this way as long as possible ("the next 10/20 years"). Keep implementation-specific bits and pieces such as .php out, you may want to change technologies later.
* Manageability: Issue your URIs in a way that you can manage. One good practice is to include the current year in the URI path, so that you can change the URI-schema each year without breaking older URIs.
This is what the 301 HTTP status (permanent redirect) should be for... [1] So it seems to me that if you use 301 you should be good to go.
Also from a quick search it seems the recommended thing to do is remove the old URLs from your sitemap.
1: https://en.wikipedia.org/wiki/URL_redirection#HTTP_status_co...
e.g.: https://news.ycombinator.com/item?id=8454570 https://news.ycombinator.com/item?id=10086156 https://news.ycombinator.com/item?id=803901
In this one https://news.ycombinator.com/item?id=1472611 the URI is actually broken - not sure if it changed or if it just was a mistake of OP back then.
True. Yet this submission will have dramatically greater visibility than it otherwise would have because the HN facebook bot linked it 5 minutes ago[1]. As a web archivist, I've dealt a lot with the erosion of URI stability at the hands of platform-centric traffic behavior and I don't see it letting up any time soon.
Sidenote: The fb botpage with a far larger audience, @hnbot[2], stopped posting some months ago.
[1]: https://facebook.com/hn.hiren.news/posts/2716971055212806
Here are some selected quotes:
6.2.1 "(...) The definition of resource in REST is based on a simple premise: identifiers should change as infrequently as possible. Because the Web uses embedded identifiers rather than link servers, authors need an identifier that closely matches the semantics they intend by a hypermedia reference, allowing the reference to remain static even though the result of accessing that reference may change over time. REST accomplishes this by defining a resource to be the semantics of what the author intends to identify, rather than the value corresponding to those semantics at the time the reference is created. It is then left to the author to ensure that the identifier chosen for a reference does indeed identify the intended semantics."
6.2.2 "Defining resource such that a URI identifies a concept rather than a document leaves us with another question: how does a user access, manipulate, or transfer a concept such that they can get something useful when a hypertext link is selected? REST answers that question by defining the things that are manipulated to be representations of the identified resource, rather than the resource itself. An origin server maintains a mapping from resource identifiers to the set of representations corresponding to each resource. A resource is therefore manipulated by transferring representations through the generic interface defined by the resource identifier."
[1] https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding...
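In code terms, the origin-server mapping from 6.2.2 could be pictured like this; everything here (the URI, media types, and values) is invented for illustration:

```python
# Sketch of 6.2.2: the server maps a stable resource identifier to a set
# of representations, and one is chosen per request. The identifier names
# the concept; the representation is what actually gets transferred.
REPRESENTATIONS = {
    "/weather/today": {
        "text/html": lambda: "<p>Sunny, 22 C</p>",
        "application/json": lambda: '{"condition": "sunny", "temp_c": 22}',
    },
}

def get(uri: str, accept: str) -> str:
    """The identifier stays fixed; the representation varies over time."""
    return REPRESENTATIONS[uri][accept]()

print(get("/weather/today", "application/json"))
```

Tomorrow the lambdas return different values, but "/weather/today" still identifies the same concept.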
Is it a bias I've developed, or has anyone else noticed just how many dangling links there are on microsoft.com? Redistributables, small tools, patches, support pages, documentation pages. I've recently realized that when a link's domain is microsoft.com, I subconsciously expect it to be a 404 with about 50% probability.
Is there a benefit to this? I am mostly just frustrated.
I think archive.org is the better long-term plan. Not only does it preserve URLs forever, it also preserves the content behind them.