Getting around website paywalls with devtools alone

(bbarrows.com)

169 points · bebrws · 3y ago · 120 comments

98 comments · 38 top-level

samwillis · 3y ago · 31 in thread

Many sites don't contain the full content even if you do that.

I'm not sure how it works (does it subscribe to them all?) but https://archive.ph/ is a good way to see the content in those cases.

But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there.

DennisP · 3y ago

I subscribe to a major newspaper. But I'm not going to subscribe to all major newspapers. The individual subscription model doesn't really fit a world where you can go to one site, and click links to articles from lots of different publications.

If they had a common subscription, where you pay one reasonable fee and they divide it up according to whose articles you read, I'd subscribe to that. Since they don't, I subscribe to one paper and do workarounds on the others. I feel this is ethical because if everyone did it, with a decently random distribution, then the newspapers would survive just fine. They'd make the same overall revenue as when everyone had one newspaper, showing up at their doorstep each morning.

basch · 3y ago

There is a common subscription that is mostly overlooked: your local library.

The mistake most press subscription services make is trying to simplify the billing process for the individual. There's a ripe opportunity for someone to come along and build an all-in-one paid subscription service at the county level, and make it easy to log into all sites with your library membership.

The current per-site model costs way too much to be worth it.

samwillis · 3y ago

> If they had a common subscription, where you pay one reasonable fee and they divide it up according to whose articles you read

To some extent that's what Apple News+ (included in Apple One) is. But I don't think it does multi-region stuff, and it misses some major publications.

sowbug · 3y ago

Syndication used to be the way this idea worked. You'd buy your local paper that wrote about the cat stuck in the tree down the street, but it also republished national articles from AP. If you really cared about another locale, you might subscribe to two papers, which meant you might see the same national article twice.

Tagbert · 3y ago

That is basically the Apple News model. If you have Apple News (part of the Apple One bundle) you can get articles from a number of publishers that would normally require a separate subscription. I’m not sure if the publishers really like this model but so far they seem to be willing to play along.

latchkey · 3y ago

This is also quickly happening with social media now.

Twitter $8, FB $12 (web) / $15 (iOS)

You're asking for the cable tv model where they aggregate premium channels, imho that doesn't work either... you end up paying for a lot of stuff that you're not interested in.

k_sze · 3y ago

How about getting a paid membership for certain higher education institutions’ libraries, which will give you access to not only books, but also a plethora of periodicals. And that will be fully legal and ethical.

enumjorge · 3y ago

I’d be more willing to subscribe if publications didn’t pull the dark pattern of making signing up fast and easy, but then requiring a call to a rep to cancel.

I used to subscribe to the nytimes but a few years ago I needed a break from news. My plan was to come back in 6-12 months, but they made me wait on the phone for 25 mins for something that should have taken a couple of minutes on their site. I cancelled and never went back.

kvdveer · 3y ago

Move to the Netherlands. The medium used to sign up should also be available to cancel. Cancellation can't be harder than signing up, and (barring a few exceptions) you can't have an auto-renewal period longer than a month after the initial year.

These may actually be EU rules BTW, haven't checked.

gorkish · 3y ago

My secret weapon for cutting through the call-to-cancel CSR script with a machete is the following:

"May I ask why you want to cancel your account today?"

"No, you may not."

For some insane reason they ALL phrase the question this way, enabling this little grammatical gem of a response. Enjoy.

monksy · 3y ago

I was going to subscribe to NYTdigital .. but I saw the reviews talking about this experience.

nicwolff · 3y ago

You could suspend your subscription. (Shhh – suspending print delivery doesn't interrupt your access to the site or the Replica Edition ツ)

huy77 · 3y ago

the browser should have a wallet feature, instead of asking you to subscribe, just pay for one article you want to read

AdamJacobMuller · 3y ago

privacy.com

felixthehat · 3y ago

https://12ft.io is another good one but yes agreed, you should cough up for the subscription when possible

bin_bash · 3y ago

I've tried that on like 3 or 4 different sites and it didn't work for a single one

terrycody · 3y ago

No, with this website I've never succeeded even once...

terrycody · 3y ago

+1 for this.

This website almost always succeeds when I run out of my tricks, like:

1) ESC to interrupt the page load
2) quickly hit "view mode" before the wall appears
3) add a "." behind the .com, so like .com./
4) visit in an incognito window when the tokens run out (e.g. Medium)
5) check the Google cache of the page (you can quickly add cache: before the URL to visit the cached page)
6) check the archive.org cache for lost pages
7) maybe some extensions, but I seldom use them nowadays
8) there used to be some cool sites that could remove paywalls (sorry, I forgot the names), but they have all stopped working
9) console tricks, though I dunno.

phemartin · 3y ago

Is this even legal? How can sites host other ones' content, and that's ok?

- Is it fair use because it's "archiving" the web?

- Is it because it's on the open web and it's public domain?

- Or is it illegal, and people do it because they can ¯\_(ツ)_/¯

jrochkind1 · 3y ago

Is Firefox "reader" mode legal?

Is Google cache legal?

What about Internet Archive wayback machine archive?

Is deleting your cookies legal? How about spoofing your user-agent?

How about a browser plugin that automates what OP describes?

In the case of caches like Google, Internet Archive, or `archive.today` (same thing as `archive.ph`)... probably, in the USA? If it winds up in court, we will find out, eventually.

Simply reading anything on the web technically involves "making a copy" already, which is one reason it gets and has remained somewhat confusing and complex to determine what is or is not legal with regard to copying web content. You can't simply say "making a copy is not allowed".

reggegg · 3y ago

Archive.ph et al. is run by a Russian fellow, so probably the third one, especially now that Russia doesn't seem to care about something like this at all.

mik1998 · 3y ago

Is caching illegal?

vie00001 · 3y ago

> I'm not sure how it works (does it subscribe to them all?) but https://archive.ph/ is a good way to see the content in those cases.

I think for search engine crawlers there are versions without a paywall so these articles can get fully indexed. Archive.ph, and similar services, might get the full content this way somehow. But I am just guessing.

qingdao99 · 3y ago

Yes, it's called dynamic rendering when you serve different content based on the user agent, but it's apparently not recommended anymore.

https://searchengineland.com/google-no-longer-recommends-usi...
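
As a rough illustration of what qingdao99 describes, dynamic rendering boils down to branching on the `User-Agent` header. The sketch below is hypothetical: the crawler pattern, the variant names, and the `selectVariant` function are assumptions made for illustration, not any publisher's actual logic.

```javascript
// Hypothetical sketch of user-agent based "dynamic rendering".
// The crawler pattern and variant names are illustrative assumptions.
const CRAWLER_PATTERN = /Googlebot|bingbot/i;

// Decide which version of a page to serve for a given User-Agent header.
function selectVariant(userAgent) {
  const ua = userAgent ?? ""; // treat a missing header as a regular visitor
  return CRAWLER_PATTERN.test(ua) ? "full-article" : "teaser-with-paywall";
}

// A crawler-style header and a browser-style header take different branches.
console.log(selectVariant("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"));
console.log(selectVariant("Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"));
```

Since the header is just a string the client sends, a check like this trusts whatever it is told, which may be one reason sites have moved away from relying on it alone.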

maxboone · 3y ago

archive.ph also uses (donated) logins to archive (paywalled) content, however those accounts do get blocked from time to time.

https://blog.archive.today/post/678202832257794048/why-cant-...

While pretending to be GoogleBot used to get you full articles (or grabbing them from cache) this doesn't seem to be the case for some sites anymore.

They just give the first part of the article without the paywall, as that's usually enough for SEO purposes.

crakenzak · 3y ago

You're spot on, this is exactly how sites like archive.org and archive.ph, or even Google's "view cached version" link, get the non-paywalled versions.

grishka · 3y ago

I stumbled upon a link to an article with an interesting headline. I would like to read it, so I click the link, but there's a paywall. I have no idea what site that even is. This is the first and probably last time I'm seeing it. No way it gets a single cent from me. This just can't work when there are so many news websites competing for subscriptions.

Yes, archive.ph works most of the time, can't recommend it enough.

izietto · 3y ago

> But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there

I'd go further than your statement: I try not to read paywalled content. Actually I don't get all these workarounds for paywalls. I'm like "they don't want me to read it? I'm not going to read it then".

donohoe · 3y ago

But they do want you to read. They also want to be financially viable.

dkdbejwi383 · 3y ago

> But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there.

While there are some paywalled websites that allow you to read _n_ articles per period for "free", there are many that don't. How do I know in this case whether it's worth the cost?

There are also times where I'll see a link to something behind a paywall with an interesting headline (frequently on HN), but from a publication I don't regularly read, so have no intention of a subscription. It would be nice in this case to be able to pay a one-time, small contribution.

Worth stating I don't disagree necessarily with the sentiment, there are just a few "edge cases" that make it impractical.

jelangkung · 3y ago

this.

bloomberg.com, for instance, hides paywalled lines in empty <div>s.

the other method is to disable javascript and cookies (works on nytimes.com), or press ESC key to stop page loading before paywall kicks in (works on telegraph.co.uk) :)

retox · 3y ago · 6 in thread

A handful of sites will present the subscribers view of the page if you put a dot after the tld part of the url, i.e.

https://site.com./1235/article

Those behind Cloudflare don't seem to be vulnerable to this though.

I've emailed the sites I've found where this works and none of them have fixed it after a year.

neoromantique · 3y ago

>I've emailed the sites I've found where this works

Why?

bilekas · 3y ago

Can't speak for OP but I think I would be annoyed if I was paying a subscription and I found out I could get around it with a simple `.` because of developer incompetence.

creamyhorror · 3y ago

Why would this work? I know FQDNs sometimes require specifying a period at the end, and browsers seem to accept it as well, but I wonder why this would affect some frameworks' displayed content. Normally a site would rely on a cookie to maintain logged-in status, and that shouldn't be affected by the request URL.

tyingq · 3y ago

Maybe some backend concession to allow viewing the article on hostnames/uris that are meant for internal use, partner sharing sites, etc? Like an if/then that checks the origin hostname?

bilekas · 3y ago

> I've emailed the sites I've found where this works and none of them have fixed it after a year.

Guess it's a feature then and not a bug.

mock-possum · 3y ago

Maybe some kind of common mistake when writing a pattern match
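
Nobody in the thread confirmed the real cause, but one way a pattern-match mistake like this could arise is a hostname comparison that does not account for the trailing dot of a fully qualified name. The sketch below is purely hypothetical: the host list and function names are assumptions, and it is shown from the site's side, including a normalized check that treats both spellings the same.

```javascript
// Hypothetical illustration only: the host list and function names are
// assumptions, not taken from any real site.
const PAYWALLED_HOSTS = new Set(["site.com", "www.site.com"]);

// Naive check: compares the raw Host header, so "site.com." does not match
// and the request is treated as if it came in on a non-paywalled hostname.
function isPaywalledNaive(hostHeader) {
  return PAYWALLED_HOSTS.has(hostHeader.toLowerCase());
}

// Normalize first: drop an optional port, then a single trailing dot
// (the DNS root label), so both spellings compare equal.
function normalizeHost(hostHeader) {
  return hostHeader.toLowerCase().replace(/:\d+$/, "").replace(/\.$/, "");
}

function isPaywalledNormalized(hostHeader) {
  return PAYWALLED_HOSTS.has(normalizeHost(hostHeader));
}

console.log(isPaywalledNaive("site.com."));      // the naive check misses it
console.log(isPaywalledNormalized("site.com.")); // the normalized check does not
```

If something like this were the cause, it might also fit retox's observation about Cloudflare, assuming the proxy normalizes or rejects such hostnames before the origin sees them. That part is speculation as well.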

jwr · 3y ago · 5 in thread

What I find annoying about paywalled sites is that they provide the full content to Google. And Google is OK with indexing the full content, even though it is not available on the internet, and even though they explicitly forbid the practice of showing different content to a search engine from what is available publicly.

Paywalled sites are just fine, but they are not part of the open Internet, and should not pretend to be.

CM30 · 3y ago

Yeah, 100% agree with this. It's like these sites want to have their cake and eat it, and both get the traffic the 'open web' provides while not having to actually share any of their work there.

It's like if you needed an app to view a page, yet Google had all its content indexed. Why is that (rightly) seen as unreasonable while charging users for content you provide to bots for free isn't?

iicc · 3y ago

Sometimes the full content is available - but only if you navigate to it via a Google results page, and you don't have existing cookies implementing a "free article limit".

donohoe · 3y ago

> they explicitly forbid the practice of showing different content to a search engine from what is available publicly

This isn’t true. This paywall treatment is something they do allow and have worked to accommodate.

tyingq · 3y ago

It's also true that they say "cloaking" isn't allowed, "any type of cloaking"

https://youtu.be/QHtnfOgp65Q

mrobins · 3y ago

That’s a Google problem not a publisher problem. Content should be findable and available to people who want to pay for it.

Google could easily add an option to filter out paywalled content but that would reduce clicks.

1vuio0pswjnm7 · 3y ago · 4 in thread

Another option no one has mentioned is AMP. Many sites that try to use "paywalls" have AMP URLs which point to pages that have the full text of the article in <p> tags. These AMP sites generally look great in a text-only browser that does not run Javascript. Popular example is WSJ. In the URL, add /amp before /article/.

Paywalls are insidious because they target non-subscribers. Why let non-subscribers view articles? Why not password-protect all subscriber content? Paywalls are a way to make money from (the attention of) non-subscribers, targeting them with ads and tracking. The strategy is apparently to annoy people to the point of subscribing. Yet even if they subscribe they will still be subjected to advertising. One potential advantage is that a paying subscriber has an enforceable contract. In theory the contract could contain enforceable privacy protections. "Tech" companies would never agree to give people enforceable privacy protections; it would destroy their "business".

The way to save journalism, especially local news, is to regulate "Big Tech" middlemen, who generally do not employ journalists and produce zero content.[1] The quality of journalism in general has taken a nosedive, but placing the blame for that on web users not purchasing subscriptions is conveniently ignoring the true culprit.

[1] Arguably that's a prerequisite to maintaining their Section 230 protection. In the recent Supreme Court oral arguments, Google's counsel argued Google is not a publisher. Then minutes later she argued Google has to make design decisions "like any publisher", therefore Google gets a free pass to reorganise information in annoying and perhaps harmful ways to maximise ad services revenue, like inserting "popular" videos into YouTube search results that have nothing to do with the query string.

1vuio0pswjnm7 · 3y ago

"Other proposals would have more broadly exposed providers to liability for hosting unlawful content, if the provider is aware of that content. For example, the Platform Accountability and Consumer Transparency Act (PACT Act) would have amended Section 230 so that certain providers lose immunity under subsection (c)(1) if the provider is notified about illegal content or activity occurring on its service and does not remove the illegal content or stop the illegal activity within 24 hours."

"Proposals like the PACT Act would state that if a provider is aware of an unlawful post, it will lose immunity for a lawsuit premised on that specific post."

"The CASE-IT Act would have taken a similar approach, providing that service providers and users lose Section 230(c)(1) immunity for a year if they engage in certain activities, including permitting harmful content to be distributed to minors, if the harmful content is made readily accessible to minors by the failure of such provider or user to implement a system designed to effectively screen users who are minors from accessing such content."

"In the same vein, other bills would have caused providers to lose Section 230 immunity if they use algorithms to distribute content to users or display behavioral advertising."

https://crsreports.congress.gov/product/pdf/R/R46751

basch · 3y ago

> maintaining their Section 230 protection

There’s no such thing as maintaining protection. It’s not something you “lose.”

donohoe · 3y ago

But the number of AMP versions is dwindling as Google is no longer forcing it.

1vuio0pswjnm7 · 3y ago

Yep. And the browser vendor can make changes to DevTools, archive.ph is already privacy-unfriendly and blocked in some countries, it could disappear, 12ft.io could disappear, and so on.

The fact that seems least likely to change is that most "paywalls" rely on Javascript, CSS or some other "feature" of so-called "modern" web browsers.

I have lost count of how many times an HN comment alleges "paywalled" in a thread and I am reading the article just like any other because I am not using a graphical web browser. I would not even know there was a "paywall". Paywalls have dependencies.

drewtato · 3y ago · 3 in thread

Usually on these sites, there'll be an `overflow: hidden` element that's holding all the content. If you can find and disable that CSS line, it'll work as normal. Or just save it to the Wayback Machine and read it through that.

bebrws (OP) · 3y ago

I am going to add this to the post if that is alright. Please let me know if I shouldn't cite your HN username on the post. You can reply here and I'll see it. Thank you, this is super helpful.

drewtato · 3y ago

Kinda late, but that's okay with me!

prettyStandard · 3y ago

Not sure if this is exactly right, but something like this should obviate the need to find the element holding all the content.

* { overflow: visible !important; }

_boffin_ · 3y ago · 3 in thread

or... just remove the `overflow: hidden` that's most likely placed on the `<body>` or some high level `<div>`.

lol768 · 3y ago

This is the correct answer - and then scrolling will work properly!

Not sure why this 'hack' is on the front-page.

spiritplumber · 3y ago

I learned a thing today that I would not have if it wasn't :)

courgette · 3y ago

My experience is that it stopped working a few years ago, e.g. on NYT or FT. The content is nowhere to be found in the raw HTML itself.

No idea how it works but it looks like actual content is loaded separately once the gates are open?

karrotwaltz · 3y ago · 2 in thread

I use this JS bookmarklet to remove fixed elements and restore scrolling, it works most of the time:

https://pastebin.com/qBjJHkMv

I also have one to kill all running javascript and remove all event listeners, it works wonders when you are redirected to a paywall / login page after a few seconds.
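
The pastebin script is not reproduced here, but the first idea (remove fixed elements, restore scrolling) can be sketched roughly as follows. This is a hypothetical version, not karrotwaltz's code: the function name `removeFixedOverlays` and its two-argument shape are assumptions, chosen so the logic can be exercised outside a browser.

```javascript
// Hypothetical sketch, NOT the linked pastebin script.
// doc:      a Document (or a test double with the same members)
// getStyle: a function returning the computed style of an element,
//           e.g. (el) => window.getComputedStyle(el) in a browser
function removeFixedOverlays(doc, getStyle) {
  let removed = 0;

  // Drop every element that is pinned to the viewport.
  for (const el of Array.from(doc.querySelectorAll("body *"))) {
    if (getStyle(el).position === "fixed") {
      el.remove();
      removed += 1;
    }
  }

  // Undo the usual scroll locks on the root elements.
  for (const root of [doc.documentElement, doc.body]) {
    root.style.setProperty("overflow", "visible", "important");
    root.style.setProperty("position", "static", "important");
  }

  return removed;
}

// In a browser console or bookmarklet one would call:
//   removeFixedOverlays(document, (el) => window.getComputedStyle(el));
```

This also removes legitimate fixed elements such as navigation bars, and it does nothing if the site re-applies its styles from a script or never sent the article text in the first place.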

albert_e · 3y ago

Would you mind sharing the second script as well? Thanks

This is supposed to be saved as a Javascript Bookmarklet?

karrotwaltz · 3y ago

Javascript killer: https://pastebin.com/utE3275J

Yes, I'm using it as a bookmarklet. I'm using firefox but I think it should work the same for other browsers.

PufPufPuf · 3y ago · 2 in thread

I recommend the browser extension "Bypass Paywalls Clean". I sometimes think about the morality of using it, but I just don't find it viable to pay all the websites where I read just a single article.

denton-scratch · 3y ago

> but I just don't find it viable to pay all the websites where I read just a single article.

This.

In the print days, you'd buy a newspaper; you'd have access to all the articles in that edition. I used to read a daily paper.

In the modern world, these papers expect you to pay for a newspaper just to read a single article. I dunno, perhaps they could form a "Paywall Consortium", so that I could pay a one-day fee to the consortium, and have access to Washpo, Telegraph, NYT etc. for 24 hours. Let the consortium figure out how to distribute the fees - it's not my concern.

But if you want me to buy the whole paper to read a single article, well, ain't gonna happen.

Kerrick · 3y ago

> But if you want me to buy the whole paper to read a single article, well, ain't gonna happen.

This was common for non-subscribers in the print days. Newspapers would print a number of enticing headlines and images on the front page above the fold, and display those folded newspapers for sale at dispensers, newsstands, and stores. Many people who bought a one-off paper would buy for a single article that interested them.

start123 · 3y ago · 1 in thread

I just use Firefox's reader view. Does the same with just a click. If it doesn't work, just refresh in reader view and it should load properly.

bebrws (OP) · 3y ago

Added this comment to the post if that is ok. Let me know here if it isn't. Thank you!

alkonaut · 3y ago · 1 in thread

10 years ago paywalled sites contained the content just hidden. Today I haven't seen a site in a long time that renders the content hidden (why would it do that? There is no reason to do it based on indexing/SEO as far as I'm aware).

Even cached/archived versions these days tend to not include the whole text. Basically: they figured out how to make a paywall, which frankly isn't that surprising.

donohoe · 3y ago

Not sure I agree with that assessment.

There are so many ways to do a paywall and you’ll still see all sorts of flavors across the web today.

pentagrama · 3y ago · 1 in thread

This extension removes paywalls on many news sites (maintained, 33k stars) https://github.com/iamadamdev/bypass-paywalls-chrome

latchkey · 3y ago

There was some previous drama and some people switched to this one.

https://gitlab.com/magnolia1234/bypass-paywalls-chrome-clean...

https://gitlab.com/magnolia1234/bypass-paywalls-firefox-clea...

encryptluks2 · 3y ago · 1 in thread

Some sites now literally don't load the actual paywalled content until after you sign in, so no matter what you do you aren't going to be able to access it unless someone with a paid subscription shares that content and it is then uploaded to a third-party paywall bypasser.

colesantiago · 3y ago

The Information does this a lot.

https://www.theinformation.com

batperson · 3y ago

In the case of Washington Post it just has a "position: fixed" style on the <body> element. That's usually the case with most of these scroll locking sites, one of the root parent elements will have some CSS style that you can click off.

_the_inflator · 3y ago

A Clickbait Transformator would have opted for a headline along the lines: "This tip saves you thousands of Dollars!". ;)

arbol · 3y ago

Good tip. Sites turning off scroll is one of my pet annoyances.

kokanee · 3y ago

If you're unable to scroll the page, it generally means there is a "position: fixed" css rule on the body or a wrapper element. Turn off that css rule and you can scroll through the article normally.

Publishers are slowly wising up, though. Most don't load the full article for unpaid viewers anymore.

porbelm · 3y ago

Yeah on the system most of the local papers use over here, the full content is not even loaded for non-subscribers. It used to be, so removing the blur div worked, but now only the headline, byline and lead text are visible :(

Guess they caught on to the "cheaters."

Am4TIfIsER0ppos · 3y ago

Just like with pointless loading spinners. You just delete the element which is overlaid on the complete content. Sometimes there is an overflow, opacity, or visibility attribute that needs changing. Fucking webdevs!

bubblebaker · 3y ago

My method is to use the uBlock Origin extension to block third-party scripts.

phtrivier · 3y ago

Title should be : "getting around very poorly implemented paywall (eg WaPo) with devtools alone".

As soon as your site sends the whole content of the article to the browser, you're not even trying seriously. (And Firefox "reading mode" is just much better ux than the devtools.)
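
To make the contrast concrete, here is a minimal sketch of the server-side alternative several commenters describe, where non-subscribers are never sent the full text. Everything in it is hypothetical: the `buildArticlePayload` name, the `isSubscriber` flag, and the two-paragraph teaser length are assumptions for illustration.

```javascript
// Hypothetical server-side sketch: names and the teaser length are assumptions.
const TEASER_PARAGRAPHS = 2;

// Build the payload for one viewer. Non-subscribers never receive the
// paragraphs beyond the teaser, so there is nothing hidden in the page
// for devtools or reader mode to reveal.
function buildArticlePayload(article, viewer) {
  const isSubscriber = Boolean(viewer && viewer.isSubscriber);
  const paragraphs = isSubscriber
    ? article.paragraphs
    : article.paragraphs.slice(0, TEASER_PARAGRAPHS);
  return {
    headline: article.headline,
    paragraphs,
    truncated: paragraphs.length < article.paragraphs.length,
  };
}

const demo = { headline: "Example", paragraphs: ["p1", "p2", "p3", "p4"] };
console.log(buildArticlePayload(demo, { isSubscriber: false }).paragraphs.length);
console.log(buildArticlePayload(demo, { isSubscriber: true }).paragraphs.length);
```

The trade-off is that whatever a site wants indexed or previewed still has to be sent to someone, which is where the crawler and archive discussions elsewhere in the thread come in.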

71a54xd · 3y ago

Your site's feature of "cool comments from HN" is awesome! Automating this with some kind of LLM and selling it would be killer.

vixen99 · 3y ago

Does not work with https://www.spectator.co.uk/

vixen99 · 3y ago

Did not work with https://www.spectator.co.uk/.

l3mure · 3y ago

Another devtools trick I've used is network throttling to allow me to copy out an article's content before JS loads.

dspillett · 3y ago

> But I didn't know how to scroll down on the page until today.

That is usually due to an "overflow: hidden" somewhere near the top of the DOM tree. Remove that and your normal scrollbar usually returns. You see this a lot with "accept being stalked to read this" pop-overs as well as paywall related shenanigans.

I've seen some sites put it back in via JS. There is probably a workaround for that too though the easiest one is to not worry about it and DNS blacklist the site so you don't waste time visiting it again in future.

Eupraxias · 3y ago

Anyone else just read content in source?

CodeHz · 3y ago

Maybe it's just a transparent div that blocks all the pointer events?

oh_sigh · 3y ago

Generally just disabling JavaScript for the site works perfectly.

snaix · 3y ago

The shortcut on macOS, in most browsers, to get into "div selection mode" is "shift + command + c"...

Just press that, select the paywall (and any other junk backdrops/opaque divs), and press delete.

Sometimes the site also sets an `overflow: hidden` in the css, and you need to remove that to see the content..

kozinc · 3y ago

Sometimes, however, you don't even get a full article text when the paywalled site loads. In those cases, no amount of Scroll Into View will help.

For those cases try something like Google Cache or Wayback Machine! It still won't always work, but it's nevertheless got a pretty good success rate.
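
For the Wayback Machine part, a lookup can be scripted against the Internet Archive's availability endpoint. The sketch below is a rough illustration under assumptions: the helper names are made up, `findSnapshot` assumes a runtime with a global `fetch`, and the response shape it reads (`archived_snapshots.closest.url`) is what that endpoint is documented to return.

```javascript
// Build the Wayback Machine availability query for a page URL.
// Helper names are hypothetical; the endpoint is the Internet Archive's
// public availability API.
function waybackAvailabilityUrl(pageUrl) {
  const api = new URL("https://archive.org/wayback/available");
  api.searchParams.set("url", pageUrl);
  return api.toString();
}

// Ask the API for the closest snapshot. Resolves to the snapshot URL,
// or null when the request fails or nothing has been archived.
async function findSnapshot(pageUrl, fetchImpl = fetch) {
  const res = await fetchImpl(waybackAvailabilityUrl(pageUrl));
  if (!res.ok) return null;
  const data = await res.json();
  return data.archived_snapshots?.closest?.url ?? null;
}

console.log(waybackAvailabilityUrl("https://example.com/some/article"));
```

As kozinc says, a snapshot existing does not mean it contains the full article text; that depends on what the site served to the archiver.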

iKlsR · 3y ago

The simplest method is just to refresh the page and hit Esc as quickly as you can to cancel the page load and prevent further scripts from running, that works for me 80% of the time. Sites need to be indexed by google so any paywall is often client side js.

thallosaurus · 3y ago

Oh yeah, used this method to bypass a regional paywalled news site until they fixed it by sending out scrambled text that seems random enough to not be a cipher ("Zc xjixc Axiäclxiqcil jcqlxi ljx Zxcxqjxiqxi cc")

suhaybh · 3y ago

I just use https://github.com/iamadamdev/bypass-paywalls-chrome and it works well for me.

jrberendt · 3y ago

It's insane that news sites with thousands or even millions of monthly users have paywalls like this. Sure it's great for anyone who wants to read for free, but some individuals make their livings now through paywalled articles. If the big guys can't protect from exploits, what about everyone else?

I created https://turbolink.io to attempt to solve this problem.

kleinmatic · 3y ago

Pretty amazing how many people's ethics are less valuable to them than their product design opinions.

throwaway29303 · 3y ago

It seems the cat's out of the bag now. There's also another interesting way of getting around paywalls: a simple race condition. While loading the page hit stop as soon as you load enough of the content you're interested in et voilà. It does not work 100% of the time and I'm not using Chrome, however. ;)

kjeksfjes · 3y ago

Hush?

bebrws (OP) · 3y ago

A code-free way to view a site with a paywall and navigate or scroll your way through the content.

j / k navigate · click thread line to collapse

120 comments

98 comments · 38 top-level

samwillis3y ago· 31 in thread

Many sites don't contains the full content even if you do that.

I'm not sure of how it works (does it subscribed to them all?) but https://archive.ph/ is a good way to see the content in those cases.

But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there.

DennisP3y ago

basch3y ago

There is a common subscription mostly overlooked. Your local library.

The current one is per site model is way too much from too not be worth it.

samwillis3y ago

> If they had a common subscription, where you pay one reasonable fee and they divide it up according to whose articles you read

So some extent that's what Apple News+ (included in Apple One) is. But it doesn't do multi region stuff I don't think, and misses some major publications.

sowbug3y ago

Tagbert3y ago

latchkey3y ago

This is also quickly happening with social media now.

Twitter $8, FB $12 (web) / $15 ios)

You're asking for the cable tv model where they aggregate premium channels, imho that doesn't work either... you end up paying for a lot of stuff that you're not interested in.

k_sze3y ago

2 more replies

enumjorge3y ago

I’d be more willing to subscribe if publications didn’t pull the dark pattern of making signing up fast and easy, but then requiring a call to a rep to cancel.

kvdveer3y ago

These may actually be EU rules BTW, haven't checked.

gorkish3y ago

My secret weapon for cutting through the call-to-cancel CSR script with a machete is the following:

"May I ask why you want to cancel your account today?"

"No, you may not."

For some insane reason they ALL phrase the question this way, enabling this little grammatical gem of a response. Enjoy.

1 more reply

monksy3y ago

I was going to subscribe to NYTdigital .. but I saw the reviews talking about this experience.

nicwolff3y ago

You could suspend your subscription. (Shhh – suspending print delivery doesn't interrupt your access to the site or the Replica Edition ツ)

huy773y ago

the browser should have a wallet feature, instead of asking you to subscribe, just pay for one article you want to read

AdamJacobMuller3y ago

privacy.com

felixthehat3y ago

https://12ft.io is another good one but yes agreed, you should cough up for the subscription when possible

bin_bash3y ago

I've tried that on like 3 or 4 different sites and it didn't work for a single one

1 more reply

terrycody3y ago

No, this website, I never succeed even one time...

terrycody3y ago

+1 for this.

This website almost succeed every time I run out of my tricks, like:

9) console tricks though I dunno.

phemartin3y ago

Is this even legal? How can sites host other ones' content, and that's ok?

- Is it fair use because it's "archiving" the web?

- Is it because it's on the open web and it's public domain?

- Or is it illegal, and people do it because they can ¯\_(ツ)_/¯

jrochkind13y ago

Is Firefox "reader" mode legal?

Is Google cache legal?

What about Internet Archive wayback machine archive?

Is deleting your cookies legal? How about spoofing your user-agent?

How about a browser plugin that automates what OP describes?

In the case of caches like Google, Internet Archive, or `archive.today` (same thing as `archive.ph`)... probably, in the USA? If it winds up in court, we will find out, eventually.

reggegg3y ago

Archive.ph et al is run by a russian fellow so probably the third one, especially now that Russia doesn't seem to care about something like this at all

mik19983y ago

Is caching illegal?

2 more replies

vie000013y ago

> I'm not sure of how it works (does it subscribed to them all?) but https://archive.ph/ is a good way to see the content in those cases.

qingdao993y ago

Yes, it's called dynamic rendering when you serve different content based on the user agent, but it's apparently not recommended anymore.

https://searchengineland.com/google-no-longer-recommends-usi...

maxboone3y ago

archive.ph also uses (donated) logins to archive (paywalled) content, however those accounts do get blocked from time to time.

https://blog.archive.today/post/678202832257794048/why-cant-...

While pretending to be GoogleBot used to get you full articles (or grabbing them from cache) this doesn't seem to be the case for some sites anymore.

They just give the first part of the article without the paywall, as that's usually enough for SEO purposes.

1 more reply

crakenzak3y ago

you're spot on, this is exactly how sites like archive.org, archive.ph, or even if you click on "view cached version" on Google get the non-paywalled versions.

grishka3y ago

Yes, archive.ph works most of the time, can't recommend it enough.

izietto3y ago

> But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there

donohoe3y ago

But they do want you to read. They also want to be financially viable.

dkdbejwi3833y ago

> But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there.

While there are some paywalled websites that allow you to read _n_ articles per period for "free", there are many that don't. How do I know in this case whether it's worth the cost?

Worth stating I don't disagree necessarily with the sentiment, there are just a few "edge cases" that make it impractical.

jelangkung3y ago

this.

bloomberg.com, for instance, hides paywalled lines in empty <div>s.

The other method is to disable JavaScript and cookies (works on nytimes.com), or press the ESC key to stop page loading before the paywall kicks in (works on telegraph.co.uk) :)

retox3y ago· 6 in thread

A handful of sites will present the subscriber's view of the page if you put a dot after the TLD part of the URL, e.g.

https://site.com./1235/article

Those behind Cloudflare don't seem to be vulnerable to this though.

I've emailed the sites I've found where this works and none of them have fixed it after a year.
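
For anyone who wants to try the trailing-dot form without editing URLs by hand, here is a rough sketch. `withTrailingDot` is a made-up helper name; it only rewrites the hostname using the standard `URL` API and makes no claim about which sites respond differently.

```javascript
// Return the same URL with a trailing dot appended to the hostname,
// i.e. the fully qualified form of the domain name.
function withTrailingDot(href) {
  const url = new URL(href);
  if (!url.hostname.endsWith(".")) {
    url.hostname = url.hostname + ".";
  }
  return url.href;
}
```

In a devtools console, something like `location.href = withTrailingDot(location.href)` would reload the current page on the dotted hostname.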

neoromantique3y ago

>I've emailed the sites I've found where this works

Why?

bilekas3y ago

Can't speak for OP but I think I would be annoyed if I was paying a subscription and I found out I could get around it with a simple `.` because of developer incompetence.

creamyhorror3y ago

tyingq3y ago

Maybe some backend concession to allow viewing the article on hostnames/uris that are meant for internal use, partner sharing sites, etc? Like an if/then that checks the origin hostname?

bilekas3y ago

> I've emailed the sites I've found where this works and none of them have fixed it after a year.

Guess it's a feature then and not a bug.

mock-possum3y ago

Maybe some kind of common mistake when writing a pattern match
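
Nobody in the thread confirmed the actual cause, so the following is purely a hypothetical illustration of how a hostname check could miss the trailing-dot form. The host list and both function names are invented for the example.

```javascript
// Hypothetical server-side check, for illustration only: apply the paywall
// when the request's Host header exactly matches a known public hostname.
const PAYWALLED_HOSTS = new Set(["site.com", "www.site.com"]);

// Naive version: "site.com." is not in the set, so no paywall is applied.
function isPaywalledHostNaive(host) {
  return PAYWALLED_HOSTS.has(host.toLowerCase());
}

// One possible fix: strip a single trailing dot before comparing.
function isPaywalledHost(host) {
  return PAYWALLED_HOSTS.has(host.toLowerCase().replace(/\.$/, ""));
}
```

With the naive check, `site.com` is paywalled while `site.com.` falls through to whatever the default branch serves.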

jwr3y ago· 5 in thread

Paywalled sites are just fine, but they are not part of the open Internet, and should not pretend to be.

CM303y ago

Yeah, 100% agree with this. It's like these sites want to have their cake and eat it, and both get the traffic the 'open web' provides while not having to actually share any of their work there.

It's like if you needed an app to view a page, yet Google had all its content indexed. Why is that (rightly) seen as unreasonable while charging users for content you provide to bots for free isn't?

iicc3y ago

Sometimes the full content is available - but only if you navigate to it via a Google results page, and you don't have existing cookies implementing a "free article limit".

donohoe3y ago

> they explicitly forbid the practice of showing different content to a search engine from what is available publicly

This isn’t true. This paywall treatment is something they do allow and have worked to accommodate.

tyingq3y ago

It's also true that they say "cloaking" isn't allowed, "any type of cloaking"

https://youtu.be/QHtnfOgp65Q

mrobins3y ago

That’s a Google problem, not a publisher problem. Content should be findable and available to people who want to pay for it.

Google could easily add an option to filter out paywalled content but that would reduce clicks.

1vuio0pswjnm73y ago· 4 in thread

1vuio0pswjnm73y ago

"Proposals like the PACT Act would state that if a provider is aware of an unlawful post, it will lose immunity for a lawsuit premised on that specific post."

"In the same vein, other bills would have caused providers to lose Section 230 immunity if they use algorithms to distribute content to users or display behavioral advertising."

https://crsreports.congress.gov/product/pdf/R/R46751

basch3y ago

> maintaining their Section 230 protection

There’s no such thing as maintaining protection. It’s not something you “lose.”

donohoe3y ago

But the number of AMP versions is dwindling as Google is no longer forcing it.

1vuio0pswjnm73y ago

Yep. And the browser vendor can make changes to DevTools; archive.ph is already privacy-unfriendly and blocked in some countries, and it could disappear; 12ft.io could disappear; and so on.

The fact that seems least likely to change is that most "paywalls" rely on JavaScript, CSS or some other "feature" of so-called "modern" web browsers.

drewtato3y ago· 3 in thread

bebrwsOP3y ago

I am going to add this to the post if that is alright. Please let me know if I shouldn't cite your HN username on the post. You can reply here and I'll see it. Thank you, this is super helpful.

drewtato3y ago

Kinda late, but that's okay with me!

prettyStandard3y ago

Not sure if this is exactly right, but something like this should obviate the need to find the element holding all the content.

* { overflow: visible !important; }
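
As a sketch, the rule above could be injected from the devtools console along these lines. `forceOverflowVisible` is a hypothetical helper name; it takes the document as a parameter only so it can be exercised outside a browser. Whether the rule is enough on a given site is untested.

```javascript
// Append a <style> element carrying the rule to the given document's <head>.
function forceOverflowVisible(doc) {
  const style = doc.createElement("style");
  style.textContent = "* { overflow: visible !important; }";
  doc.head.appendChild(style);
  return style; // keep a handle so the rule can be removed again
}

// In a browser console: forceOverflowVisible(document);
```

Calling `.remove()` on the returned element undoes the change.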

_boffin_3y ago· 3 in thread

or... just remove the `overflow: hidden` that's most likely placed on the `<body>` or some high level `<div>`.

lol7683y ago

This is the correct answer - and then scrolling will work properly!

Not sure why this 'hack' is on the front-page.

spiritplumber3y ago

I learned a thing today that I would not have if it wasn't :)

courgette3y ago

My experience is that it stopped working a few years ago, e.g. on NYT or FT. The content is nowhere to be found in the raw HTML itself.

No idea how it works but it looks like actual content is loaded separately once the gates are open?

karrotwaltz3y ago· 2 in thread

I use this JS bookmarklet to remove fixed elements and restore scrolling; it works most of the time:

https://pastebin.com/qBjJHkMv

I also have one to kill all running JavaScript and remove all event listeners; it works wonders when you are redirected to a paywall / login page after a few seconds.
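
For readers who can't open the pastebin links: a bookmarklet of the first kind (remove fixed elements, restore scrolling) might look roughly like the sketch below. This is not the linked script, just one assumed way to do it; `removeFixedElements` and its parameters are made-up names.

```javascript
// Remove every element whose computed position is "fixed", then force
// scrolling back on for the root elements. Returns the number removed.
function removeFixedElements(doc, getStyle) {
  let removed = 0;
  for (const el of Array.from(doc.querySelectorAll("body *"))) {
    if (getStyle(el).position === "fixed") {
      el.remove();
      removed += 1;
    }
  }
  for (const root of [doc.documentElement, doc.body]) {
    root.style.setProperty("overflow", "auto", "important");
  }
  return removed;
}

// In a browser: removeFixedElements(document, (el) => getComputedStyle(el));
```

To use it as a bookmarklet, the function and the call would need to be wrapped in a single `javascript:(() => { ... })()` URL.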

albert_e3y ago

Would you mind sharing the second script as well? Thanks

Is this supposed to be saved as a JavaScript bookmarklet?

karrotwaltz3y ago

Javascript killer: https://pastebin.com/utE3275J

Yes, I'm using it as a bookmarklet. I'm using Firefox, but I think it should work the same in other browsers.

PufPufPuf3y ago· 2 in thread

denton-scratch3y ago

> but I just don't find it viable to pay all the websites where I read just a single article.

This.

In the print days, you'd buy a newspaper; you'd have access to all the articles in that edition. I used to read a daily paper.

But if you want me to buy the whole paper to read a single article, well, ain't gonna happen.

Kerrick3y ago

> But if you want me to buy the whole paper to read a single article, well, ain't gonna happen.

start1233y ago· 1 in thread

I just use Firefox's reader view. Does the same with just a click. If it doesn't work, just refresh in reader view and it should load properly.

bebrwsOP3y ago

Added this comment to the post if that is ok. Let me know here if it isn't. Thank you!

alkonaut3y ago· 1 in thread

Even cached/archived versions these days tend to not include the whole text. Basically: they figured out how to make a paywall, which frankly isn't that surprising.

donohoe3y ago

Not sure I agree with that assessment.

There are so many ways to do a paywall and you’ll still see all sorts of flavors across the web today.

pentagrama3y ago· 1 in thread

This extension removes paywalls on many news sites (maintained, 33k stars) https://github.com/iamadamdev/bypass-paywalls-chrome

latchkey3y ago

There was some previous drama and some people switched to this one.

https://gitlab.com/magnolia1234/bypass-paywalls-chrome-clean...

https://gitlab.com/magnolia1234/bypass-paywalls-firefox-clea...

encryptluks23y ago· 1 in thread

colesantiago3y ago

The Information does this a lot.

https://www.theinformation.com

batperson3y ago

_the_inflator3y ago

A Clickbait Transformator would have opted for a headline along the lines of: "This tip saves you thousands of Dollars!". ;)

arbol3y ago

Good tip. Sites turning off scroll is one of my pet annoyances.

kokanee3y ago

If you're unable to scroll the page, it generally means there is a `position: fixed` CSS rule on the body or a wrapper element. Turn off that CSS rule and you can scroll through the article normally.

Publishers are slowly wising up, though. Most don't load the full article for unpaid viewers anymore.

porbelm3y ago

Guess they caught on to the "cheaters."

Am4TIfIsER0ppos3y ago

bubblebaker3y ago

My method is to use the uBlock Origin extension to block third-party scripts.

phtrivier3y ago

The title should be: "getting around very poorly implemented paywalls (e.g. WaPo) with devtools alone".

As soon as your site sends the whole content of the article to the browser, you're not even trying seriously. (And Firefox "reading mode" is just much better UX than the devtools.)

71a54xd3y ago

Your site's feature of "cool comments from HN" is awesome! Automating this with some kind of LLM and selling it would be killer.

vixen993y ago

Does not work with https://www.spectator.co.uk/

vixen993y ago

Did not work with https://www.spectator.co.uk/.

l3mure3y ago

Another devtools trick I've used is network throttling to allow me to copy out an article's content before JS loads.

dspillett3y ago

> But I didn't know how to scroll down on the page until today.

Eupraxias3y ago

Anyone else just read content in source?

CodeHz3y ago

Maybe it's just a transparent div that blocks all the pointer events?
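
If that guess is right, a rough heuristic for spotting such an overlay could look like this. `isBlockingOverlay`, its parameters, and the 90% coverage threshold are assumptions for the sake of the example, not a tested detector.

```javascript
// Heuristic: an element "blocks" the page if it is fixed or absolutely
// positioned, accepts pointer events, and its box area is at least
// `minCoverage` of the viewport area.
function isBlockingOverlay(style, rect, viewport, minCoverage = 0.9) {
  const positioned = style.position === "fixed" || style.position === "absolute";
  const interactive = style.pointerEvents !== "none";
  const coverage = (rect.width * rect.height) / (viewport.width * viewport.height);
  return positioned && interactive && coverage >= minCoverage;
}

// In a browser console:
// [...document.querySelectorAll("body *")].filter((el) => isBlockingOverlay(
//   getComputedStyle(el), el.getBoundingClientRect(),
//   { width: innerWidth, height: innerHeight }));
```

Anything the filter returns is a candidate to delete or to give `pointer-events: none` in the Elements panel.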

oh_sigh3y ago

Generally just disabling JavaScript for the site works perfectly.

snaix3y ago

The shortcut on macOS, in most browsers, to get into "div selection mode" is Shift + Command + C.

Just press that, select the paywall (and any other junk backdrops/opaque divs), and press Delete.

Sometimes the site also sets `overflow: hidden` in the CSS, and you need to remove that to see the content.

kozinc3y ago

Sometimes, however, you don't even get a full article text when the paywalled site loads. In those cases, no amount of Scroll Into View will help.

For those cases try something like Google Cache or Wayback Machine! It still won't always work, but it's nevertheless got a pretty good success rate.

iKlsR3y ago

thallosaurus3y ago

suhaybh3y ago

I just use https://github.com/iamadamdev/bypass-paywalls-chrome and it works well for me.

jrberendt3y ago

I created https://turbolink.io to attempt to solve this problem.

kleinmatic3y ago

Pretty amazing how many people's ethics are less valuable to them than their product design opinions.

throwaway293033y ago

kjeksfjes3y ago

Hush?

bebrwsOP3y ago

A code-free way to view a site with a paywall and navigate or scroll your way through the content.
