GreenTunnel: anti-censorship utility designed to bypass deep packet inspection

(github.com)

280 points · hieudang9 · 6y ago · 77 comments

57 comments · 17 top-level

snvzz6y ago· 10 in thread

This is a nice workaround for those stuck under censorship regimes such as the UK, South Korea, Turkey, India or China.

Now, Encrypted DNS (thanks to DNS over TLS/HTTPS) and HTTPS (thanks to Let's Encrypt and HSTS) are getting deployed somewhat widely.

The next step is encrypted SNI[0], after which it will be that much harder to do any meaningful DPI, for censorship or anything else.

[0]: https://en.wikipedia.org/wiki/Server_Name_Indication#Securit...

nimbius6y ago

there are two edges to this sword.

DoH also means breaking stuff like pihole and other ad filtering. It means you trust companies like google who base their revenue off ads, or cloudflare who have censored content numerous times in the past, to serve you DNS.

its also kind of pointless if the state knows youre using it outside of a tunnel...they can just watch your next packets to see where you decided to go.

jchw6y ago

Quick thought. If software wanted to, could they not, today, bypass your DNS resolvers anyways? Choosing to use DoH on software where you control the DNS resolution seems like an unambiguous win. FWIW, the Chromium implementation of DoH upgrading only upgrades you to DoH if your configured DNS provider is known to support it via a hardcoded list.

In theory, you could have Pihole resolve using a DoH resolver and your devices resolve using Pihole and have the best of everything.

(Disclaimer: Google employee, not working on ads or Chromium or DNS.)

recursive6y ago

This is a fundamental flaw of content blocking based on host name. It often happens to work, but there's no rule that says that it has to, and really no good reason why it should be guaranteed to.

m4636y ago

Isn't there a way to use pihole as your DNS server and let it use DoH?

That way you could do DNS to pihole, do the filtering and let it use DoH to the outside world.
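The arrangement suggested in this subthread (filter locally, then forward whatever survives over an encrypted channel) can be sketched roughly as below. This is not Pi-hole's code: `FilteringResolver`, the blocklist entries and `fake_doh_lookup` are hypothetical stand-ins, and the upstream is a stub rather than a real DoH client.

```python
from typing import Callable, Optional, Set

# Hypothetical blocklist; a real deployment would load a much larger list.
BLOCKLIST = {"ads.example.com", "tracker.example.net"}


class FilteringResolver:
    """Answer blocked names locally and forward everything else upstream."""

    def __init__(self, upstream: Callable[[str], str], blocklist: Set[str]):
        self.upstream = upstream      # assumed to be an encrypted (DoH/DoT) lookup
        self.blocklist = blocklist

    def is_blocked(self, name: str) -> bool:
        # Block a listed name and any subdomain of it.
        labels = name.lower().rstrip(".").split(".")
        return any(".".join(labels[i:]) in self.blocklist for i in range(len(labels)))

    def resolve(self, name: str) -> Optional[str]:
        if self.is_blocked(name):
            return None               # filtered before any query leaves the LAN
        return self.upstream(name)


def fake_doh_lookup(name: str) -> str:
    # Stand-in for a real DoH client; returns a documentation-range address.
    return "192.0.2.10"


resolver = FilteringResolver(fake_doh_lookup, BLOCKLIST)
```

The point of the design is that the devices on the LAN still speak plain DNS to the filter, so filtering keeps working, while only the filter's upstream leg is encrypted.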

snvzz6y ago

>DoH also means breaking stuff like pihole and other ad filtering.

No, it doesn't.

e.g. I run DoH behind my home's dns cache server.

>its also kind of pointless if the state knows youre using it outside of a tunnel...they can just watch your next packets to see where you decided to go.

This is where HTTPS and eSNI further help.

godelski6y ago

> cloudflare who have censored content numerous times in the past

Besides Stormfront[0], what else did they censor?

[0] https://en.wikipedia.org/wiki/Stormfront_%28website%29

ukd16y ago

pihole is a short-term solution and the wrong long-term one: it only works as long as these holes exist. Blocking needs to be done in the browser, or on your computer, to be done more securely.

rubyfan6y ago

does this just become an arms race where root certificates or some sort of device management tools are forced onto citizen devices?

aasasd6y ago

Kazakhstan just recently conducted an experiment with sending people an sms telling them to install a root cert. And Russia is making it mandatory to install not-yet-determined Russian software on all newly sold machines, which will quite probably soon include an FSB cert.

mirimir6y ago

Yes, it's a clever workaround. And requires no remote server.

I still prefer VPNs and Tor, but hey.

segfaultbuserr6y ago· 7 in thread

> GET / HTTP/1.0

> Host: www.youtube.com

> We send it in 2 parts: first comes GET / HTTP/1.0 \n Host: www.you and second sends as tube.com \n .... In this example, ISP cannot find blocked word YouTube in packets and you can bypass it!
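As a rough illustration of the split described in the quoted README text, here is a minimal sketch. `split_inside_host` is a hypothetical helper, not GreenTunnel's actual code, and it uses CRLF line endings where the quote writes `\n`. Writing the two chunks with separate `send()` calls does not by itself guarantee two TCP segments; that depends on the OS and socket options.

```python
from typing import Tuple


def split_inside_host(request: bytes, keyword: bytes) -> Tuple[bytes, bytes]:
    """Split a request so that `keyword` straddles the two chunks."""
    start = request.lower().find(keyword.lower())
    if start == -1:
        raise ValueError("keyword not present in request")
    cut = start + len(keyword) // 2   # cut in the middle of the keyword
    return request[:cut], request[cut:]


request = b"GET / HTTP/1.0\r\nHost: www.youtube.com\r\n\r\n"
first, second = split_inside_host(request, b"youtube")
# Neither chunk contains the keyword, yet first + second is the original request.
```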

If you told anyone from China that this is how you bypass (HTTP) "deep packet inspection", it would sound incredibly naive. I'm not criticizing here, and thanks for developing an anti-censorship tool, but my point is that any DPI that can be bypassed this way is simply outdated; it's far from the state-of-the-art threats we are facing worldwide.

What China does today is what your ISP/government is going to do tomorrow, when they upgrade the equipment. A history lesson from China can give developers in other countries some insight into where this cat-and-mouse game is heading...

> paulryanrogers: So basically it just does two things: carefully chunking HTTP header packets and encrypted DNS? Not sure this will work for very long.

Of course it will not. I'll explain why.

---

Literally the same technique was used in China during the early days of the Great Firewall, around 2007. At that time, the "censorship stack" was simple. Basically, it had...

* A brute-force IP blocking list

This is a constantly updated list of IP addresses of "unwanted" web servers, such as YouTube or Facebook. The entries are distributed via BGP, just as normal routing information is. Once your server enters this blacklist, nothing can be done. Not all unwanted websites enter the list, due to its computational/storage costs.

* A DNS poisoning infrastructure

A list of "unwanted" domain names is maintained. These domain names are entered into the national DNS root server as records with bogus IP addresses. It was used more widely than the IP blocklist, since it has zero cost to operate, but it can only block websites in the list, and it takes time for the censor to become aware of a target's existence.

* A naive keyword filtering system.

All outgoing international traffic is mirrored for inspection. A keyword inspection system attempts to match the URLs in HTTP requests against a blacklist of unwanted keywords. Rumors said the string matching was performed by hardware in ASIC/FPGA, allowing enormous throughput.

* A TCP reset attack system

Once an unwanted TCP connection is identified by the keyword inspection system, the TCP reset attack system fires a bogus RST packet at your computer. Fooled by the packet, your operating system will voluntarily work against you and terminate the connection, saving the censors' CPU time. The keyword filtering system paired with the reset attack was the preferred way to carry out censorship.

That's all. The principle of operation was simple and easy to understand. So what were the options for bypassing it? There were a lot. To begin with, nothing could be done about the blocked IP addresses directly. But in the earliest days, reaching them was as simple as finding a random HTTP proxy server. Later, the inspection system was upgraded to match HTTP proxy requests. Then you could simply play some magic tricks with your HTTP requests, like the example at the beginning, so that your request wouldn't trigger a match. Around the same time, in-browser web proxy tools became popular; these were PHP scripts running on a web server that fetched pages for you. However, they became useless when the keyword matching system was upgraded to match the content of the entire page, not just the requests (remember, few sites had HTTPS). At this point, all plaintext proxy techniques and HTTP request "massaging" techniques were officially dead.

Some naive rot13-like techniques were later implemented in some web proxies, and HTTPS web proxies were also a thing, but they saw limited use.

* New: A complete keyword filtering system - Inspect all HTML pages (Was: A naive keyword filtering system)

Another target to attack was the DNS poisoning system. Sometimes all you needed was a correct IP address, since not all IPs were included in the blocklist due to the costs. Initially, all one needed to do was change one's nameserver to 8.8.8.8. However, countermeasures were quickly deployed. A simple one was rerouting 8.8.8.8 to the ISP's nameserver, which continued feeding you the same bogus records. Nevertheless, there were always alternative resolvers to use. So the system was upgraded with a DNS spoofing infrastructure: the instant an outgoing DNS packet was detected, the spoofing system would answer with a bogus packet. The real packet would arrive a hundred milliseconds later, but by then it was too late; your OS had already accepted the bogus result.
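The race described above can be modelled in a few lines. This is only a toy simulation under the assumption that the client accepts the first response whose transaction ID matches its query; `DnsResponse`, the addresses (taken from documentation ranges) and the timings are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DnsResponse:
    txid: int          # DNS transaction ID copied from the query
    address: str       # answer carried by this response
    arrival_ms: float  # when the response reaches the client


def accepted_answer(query_txid: int, responses: List[DnsResponse]) -> Optional[str]:
    """Return the answer a first-response-wins stub resolver would accept."""
    matching = [r for r in responses if r.txid == query_txid]
    if not matching:
        return None
    return min(matching, key=lambda r: r.arrival_ms).address


responses = [
    DnsResponse(txid=0x1A2B, address="198.51.100.7", arrival_ms=100.0),  # genuine, slow
    DnsResponse(txid=0x1A2B, address="203.0.113.1", arrival_ms=5.0),     # injected, fast
]
winner = accepted_answer(0x1A2B, responses)  # the injected answer wins the race
```

Because the injector sees the query on the wire, it can copy the transaction ID, so the only thing separating the two answers is arrival time.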

And ironically, even if DNSSEC had been widely supported (it was not), it couldn't have done anything but return a SERVFAIL: DNSSEC can only check whether a result is authentic, and dropping the bogus packet while waiting for the true one was outside the capabilities of a standard DNSSEC implementation.

* New: A Real-time DNS Spoofing System

Better tools were developed later that acted as a transparent resolver between the upstream resolver and your computer, identifying the bogus results and dropping them, but their use was limited. Also, by this point the IP blocklist had been greatly expanded. Even if a correct IP could be obtained, it was still inaccessible. Around 2008 or so, developers in China launched a special open source project, a shared /etc/hosts list. Whenever someone found a Facebook IP address that was not yet in the blocklist, they sent a patch to the project. There were also shell scripts to keep your list up to date.

However, the usefulness of an /etc/hosts list was limited. First, it was only a matter of time before a new IP address was blocked. Also, a working IP address was still subject to the same keyword filtering system.

* New: Expanded IP Blocklist.

Some people also realized that the firewall could only terminate a connection by fooling the operating system. Soon, iptables rules for blocking RST packets appeared on technical blogs. By ignoring all RST packets, one essentially gained immunity at the expense of network stability, as legitimate RSTs were also ignored. The censors soon responded by upgrading the reset attack system so that RST packets were sent in both directions: even if you ignored the RST, the server on the other side would still terminate the connection. Also, the RST was now "latched on" for a limited time: once the first RST was triggered, the target remained inaccessible for several minutes.

* New: Bidirectional TCP Reset Attack

* New: "Latched-On" Reset Attack

Once HTTPS was enabled, it was impossible to perform keyword inspection on HTML pages. At this time, the censors sometimes still wished to allow partial access, triggering the block only when a match was detected. This strategy could not be applied to HTTPS, since the content was all encrypted. Some people realized that some popular websites, such as Wikipedia, supported HTTPS but did not enable it by default. The Great Firewall responded by implementing an HTTPS certificate matching subsystem in the keyword matching system: when a particular certificate was matched, you were greeted by a TCP RST packet (this system was removed later, when HTTPS saw widespread use).

* New: Certificate-Based HTTPS Blocking System

At this point, around 2010, the only reliable way to browse the web was using a fully encrypted proxy, such as SSH dynamic port forwarding or a VPN, which required purchasing a VPS from a hosting provider. SSH was more popular due to its ease of use: all one needed was to find an SSH server and run "ssh -D 1337", so that port 1337 would be a SOCKS5 proxy provided by OpenSSH. OpenVPN was reserved for heavy web users, since it was more difficult to set up but had better performance.

From the beginning to the 2010s, anyone using a VPN or SSH could enjoy reliable web browsing (disturbed only from time to time by overloaded international bandwidth). However, the good days came to an end when the Great Firewall implemented a real-time traffic classifier, first applied to SSH. It observed SSH packets in real time and attempted to identify whether overlay proxy traffic was being carried on top. The blocking mechanism was enhanced as well: it was now able to dynamically insert null route entries when it decided that communication with a server was unwanted. The IP blocking system was also improved and could now collect unwanted IP addresses at a faster rate with the help of the traffic classifier. If you used SSH as a proxy, after a while the connection would be identified and all packets dropped; repeated offenses would earn you a permanent IP block. For VPNs, the firewall implemented a real-time classifier to detect OpenVPN's TLS handshakes. When a handshake was detected, a RST packet was sent (or, if you used UDP, all packets were dropped). Repeated offenses would earn you a permanent IP block as well.

* New: Real-Time Traffic Classifier

* New: Real-Time IP Blocking

* New: Actively Updated IP Blocklist using Classifiers as Feedback

Traffic classifiers were later expanded to cover HTTPS-in-HTTPS as well, so a naive HTTPS proxy wouldn't work. They possibly have other features too, but that's a mystery.

BTW, after Google exited China, the HTTPS version was immediately blocked, and for HTTP a ridiculous keyword blocklist was enforced that generated a huge number of false-positive RSTs for harmless words, apparently a deliberate decision to prefer false positives over false negatives. Eventually, all Google services were permanently blocked. The IP blocking became extensive: major websites were completely blocked, and unblocked sites were the exception. For most people, the arrival of widely used HTTPS was too late to be useful, since the IPs were blocked. And as mentioned, SSH and VPNs were classified and blocked as well.

This was when a new generation of proxy tools started to gain popularity,

segfaultbuserr6y ago

Shadowsocks being the most well-known example. From a cryptographic perspective, it was a big step backwards. Since Diffie-Hellman handshakes were subject to traffic classifiers, these tools used only symmetric encryption with fixed keys. Their encryption protocols were ad hoc and not cryptographically robust. While it was true that nobody could break a simple AES-CBC encryption, nobody would trust these tools with confidential data either (for example, AEAD was unsupported for many years). But since the goal was bypassing censorship, not secrecy, they became extremely popular. This was not seen as a major issue, since the widespread use of HTTPS offered robust secrecy. DNS encryption was still essential (these tools usually provided a SOCKS5 interface, and SOCKS5 can be configured to pass the original domain name to the proxy, which can then resolve the name inside its encrypted connection), but it became less useful on its own, since the IP blocklist was huge by then.
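The SOCKS5 detail mentioned in passing above is what keeps the DNS lookup off the local network: the client sends the hostname itself (address type 0x03 in RFC 1928) instead of resolving it first. A minimal sketch of building such a CONNECT request follows; `socks5_connect_request` is a hypothetical helper name, and the greeting/authentication exchange that precedes this message is omitted.

```python
import struct


def socks5_connect_request(hostname: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request that carries the name, not an IP."""
    name = hostname.encode("idna")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1-255 bytes")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name), then the name length
    header = bytes([0x05, 0x01, 0x00, 0x03, len(name)])
    # Port is a 16-bit big-endian integer
    return header + name + struct.pack("!H", port)


req = socks5_connect_request("www.youtube.com", 443)
```

With curl, for comparison, this is the difference between the `socks5://` and `socks5h://` proxy schemes: the latter leaves name resolution to the proxy.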

The landscape of the Internet has changed dramatically since 2013 as well. The universal adoption of HTTPS eventually rendered all keyword-based inspection useless. A few sites were considered too large to block, including Amazon AWS and GitHub. One side of the battle started to become a game of mutually assured destruction: either allow people to exploit a large platform to publish uncensored material, or block the platform altogether and cause economic damage. I am confident that the MAD game will continue to play out. However, Russia's response to AWS domain fronting showed that this strategy can fail if major platforms don't want to cooperate, which is a bit worrying, at least. But anyway, encrypting SNI should be the next step.

But I digress. Back to Shadowsocks et al.: since the state was eliminated (pun intended), all one could see was encrypted raw TCP packets, and for many years there was no reliable way for the firewall to classify Shadowsocks-like tools (until recently, possibly by exploiting cryptography-related issues, but we are not sure how successful that is). But the censorship system started getting weirder and weirder: sometimes connections broke without any apparent reason, sometimes the data rate was extremely low, sometimes a few IPs were mysteriously blocked, and so on, but life went on. There were several hypotheses. One was that the traffic classifiers were gaining more and more functionality and occasionally hit something. Another was that TCP RSTs were sent probabilistically to suspected endpoints to degrade reliability. The only thing that could be confirmed was the significantly increased use of QoS by the ISPs, so that all unknown protocols were classified as "low priority", degrading the reliability of all anti-censorship tools. At this point, bad connectivity and censorship were indistinguishable.

It's safe to say that, at this point, nobody really understands how the Great Firewall of China works anymore. This is the end of our story.

For simplicity, I skipped many less used techniques, such as Tor's domain fronting, or CDN-based circumvention, or obfsproxy4 that featured Diffie-Hellman keys indistinguishable from random strings, and possibly others. I'm well-aware of them. But it's expected that, unless everything is encrypted and all infoleak is plugged (then, we will start playing the mutual assured destruction game), all these tools are doing is an endless cat-and-mouse game.

Developers of anti-censorship tools need to consider countermeasures based on what China is currently doing, so that when the same techniques are implemented by their own ISPs in the future, they are prepared to act.

d4mi3n6y ago

Fantastic breakdown on the recent history of censorship in China, thanks for sharing it.

You mentioned that for many of these efforts bypassing censorship trumped secrecy concerns. Is this still the case?

If I were a citizen regularly bypassing censorship of an authoritarian government, I’d be concerned for my safety if it was well documented that I regularly accessed censored material.

ackbar036y ago

Thanks for this summary. The firewall has been a lot stricter recently and it's been a real pain in the ass, even for legitimate things. I can only speculate they are using deep learning type tools now to do their blocking

friendlybus6y ago

Great post, thank you.

peter_d_sherman6y ago

I agree with the other posters -- fascinating, detailed info. These posts should be promoted to their own HN article/discussion...

kingosticks6y ago

Were they doing that full page text matching in an ASIC too?! Doesn't that basically involve writing a simple parser also? Else what prevents things like usage of Google analytics/fonts etc from triggering a match and blocking?

segfaultbuserr6y ago

> Else what prevents things like usage of Google analytics/fonts etc from triggering a match and blocking?

The blocking was/is complementary. Usually, domain names themselves were blocked by DNS poisoning (or IP blocking if it escalated); the domains themselves (or the names of the websites) did not appear in the keyword blocklist. A link to Google Analytics or a Facebook button could stall the webpage until a timeout, but merely mentioning or linking to a domain name would not trigger a keyword match on the page itself.

The intention of keyword matching was to allow partial access while still blocking unwanted content. Usually, only the most politically unwanted keywords entered the keyword list. For example, Wikipedia could be accessed normally, but as soon as "a word that should not be named" appeared in the webpage, the connection would be reset immediately. An interesting phenomenon was that sometimes the page would partially load and stop exactly before the forbidden word. And since the censorship system worked on mirrored traffic, sometimes a slight processing delay allowed the full page to load before the RST was received; it would be an "I'm feeling lucky" moment.

Anyway, that was how the system worked before 2010. The extensive use of HTTPS rendered it useless, and it appears that some forms of keyword filtering have already been lifted, since it's now a pointless exercise.

For quite some time after keyword matching became ineffective, DNS poisoning remained the only form of censorship for many unwanted but not significant websites, for example, Hacker News. But recently, SNI matching was implemented.

paulryanrogers6y ago· 5 in thread

So basically it just does two things: carefully chunking HTTP header packets and encrypted DNS?

Not sure this will work for very long.

gruez6y ago

>Not sure this will work for very long.

Maybe if it gets popular. Defeating chunking would require additional memory + compute power on the DPI boxes, which I suspect ISPs don't want to bear.

jaimex26y ago

I work in the DPI field and have maintained a few DPI firewalls.

Most DPI that I know of will defeat this bypass technique, I'm not sure the author has even tested if it works.

DPI firewalls already have to support aggregating packets. It's pretty common to need more information beyond the initial packet. It's not really any more memory intensive either, you're just reading byte by byte and keeping what you need.

Heck, most DPI firewalls support checking that something in the outbound packets also appears in the inbound packets, e.g. checking whether a connection is performing IKE.

colanderman6y ago

The DPI doesn't need to buffer packets. Searches like this are performed using a regex compiled as a DFA or similar state machine. The state maintained per flow is a few machine words at most.
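To make the "few machine words of state per flow" point concrete, here is a small sketch of a streaming matcher. It uses a Knuth-Morris-Pratt automaton for a single keyword rather than a compiled regex DFA, so it is a simplified stand-in for what a real DPI box does; `FlowMatcher` is a hypothetical name and case folding is ignored.

```python
from typing import List


def build_failure_table(pattern: bytes) -> List[int]:
    """KMP failure function: longest proper prefix that is also a suffix."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


class FlowMatcher:
    """Per-flow matcher; the mutable state is one integer and one flag."""

    def __init__(self, pattern: bytes):
        self.pattern = pattern                     # shareable across all flows
        self.table = build_failure_table(pattern)  # shareable across all flows
        self.state = 0        # number of pattern bytes matched so far
        self.matched = False

    def feed(self, segment: bytes) -> bool:
        """Consume one TCP segment's payload; no earlier bytes are kept."""
        for byte in segment:
            while self.state and byte != self.pattern[self.state]:
                self.state = self.table[self.state - 1]
            if byte == self.pattern[self.state]:
                self.state += 1
            if self.state == len(self.pattern):
                self.matched = True
                self.state = self.table[self.state - 1]
        return self.matched


flow = FlowMatcher(b"youtube")
after_first = flow.feed(b"GET / HTTP/1.0\r\nHost: www.you")   # no match yet
state_between_segments = flow.state                            # remembers "you"
after_second = flow.feed(b"tube.com\r\n\r\n")                 # match completes
```

Splitting the hostname across segments changes nothing here, because the matcher carries its position in the keyword from one segment to the next.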

You'd have better luck sending the TCP packets out-of-order. But some DPI boxes will buffer these to a small degree to catch such shenanigans.

Source: in a previous life I worked on the layer-7 inspection subsystem (among others) of a DPI box.

EDIT: Also what @cpitman said. DPI boxes will often err on the side of caution. The DPI will happily kill your goofy-but-standards-compliant flow if it can't figure out that it's safe.

cpitman6y ago

Does it? I don't think you need to correlate packets, you could probably just block small packets that look like they have only part of the hostname. If they wanted to be slightly more selective, they could block small packets that have a partial hostname and have a prefix that is blocked.
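The stateless heuristic suggested here could look something like the following sketch. Everything in it is an assumption made for illustration: the blocklist, the 128-byte "small packet" threshold, and the minimum prefix length of 5 are arbitrary values, not taken from any real DPI product.

```python
BLOCKED_HOSTS = [b"www.youtube.com"]  # hypothetical blocklist
MAX_SMALL_PAYLOAD = 128               # hypothetical "small packet" threshold


def looks_like_split_host(payload: bytes, min_prefix: int = 5) -> bool:
    """Flag a small payload that ends with a proper prefix of a blocked host."""
    if len(payload) > MAX_SMALL_PAYLOAD:
        return False
    for host in BLOCKED_HOSTS:
        # Try the longest proper prefix first, down to min_prefix bytes.
        for n in range(len(host) - 1, min_prefix - 1, -1):
            if payload.endswith(host[:n]):
                return True
    return False


suspicious = looks_like_split_host(b"GET / HTTP/1.0\r\nHost: www.you")
ordinary = looks_like_split_host(b"GET / HTTP/1.0\r\nHost: www.example.org\r\n\r\n")
```

A check like this needs no per-flow memory at all, at the cost of false positives on hosts that merely share a prefix with a blocked one.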

In order for traffic to be open for any substantial time, the technique either has to stay hidden/unpopular or the traffic has to be hard to distinguish from normal traffic.

kburman6y ago

I believe this is more of a cat and mouse game.

benbristow6y ago· 4 in thread

Doesn't work against Virgin Media UK

Nextgrid6y ago

Virgin is censoring things now?

buckminster6y ago

All the big UK ISPs do. This is targeted at CP but who knows what else gets covered.

benbristow6y ago

Has been for a while. https://www.virginmedia.com/help/list-of-court-orders

Also (rightly so) anything on the Internet Watch Foundation list - https://www.iwf.org.uk/become-a-member/services-for-members/...

DanBC6y ago

ISPs in the UK have to obey court orders, and there are some court orders in place for the "big 5" ISPs around certain piracy websites.

If I try to go to thepiratebay.org I get this

--- Access to this website has been blocked under an Order of the Higher Court.

Any TalkTalk customer affected by the Court Order has a right under the Court Order to apply to vary or discharge it. Any such application must: (i) clearly indicate the identity and status of the applicant; (ii) be supported by evidence setting out and justifying the grounds of the application; and (iii) be made on 10 days notice to all of the parties to the Court Order.

For further details click here. ---

On top of that there is a voluntary scheme run by the Internet Watch Foundation to provide filtering for sites that share images of child sexual abuse. Some ISPs don't participate in this, but the vast majority do. One famous example of the things that get blocked (but also reasonably quickly unblocked) is the Wikipedia page for a band. NSFW link https://en.wikipedia.org/wiki/Internet_Watch_Foundation_and_...

travisgriggs6y ago· 3 in thread

Why hasn't this become the modern Right to Bear Arms? The root of the second amendment was trying to ensure that one class of citizenry did not have tools at their hands to force another class of citizenry to comply. It maintained a balance. The right to encrypt and keep your data private should be a modern equivalent of the right to bear arms.

dclusin6y ago

The second amendment doesn't guarantee us the right to pack heat at work. There's lots of use cases where it would be considered reasonable to block the use of certain services.

ploika6y ago

Bear in mind that as a non American, invoking a right to bear arms actively turns me off whatever you're trying to sell me. Too much damage has been done to my home country by violent groups who took up arms against the state and each other.

1014046y ago

Just look at the past 100 years of human history and the damage that has been done because citizens had NO arms to defend their rights against a despotic government. The deaths are counted in millions.

Insurgent groups will buy 5 dollar AK-47 regardless of legality. This is about lawful citizens being able to defend their rights against unlawful governments. Or at least to raise the costs an armed group has to pay for an attempt to take power.

As another non American, I am still undecided on the issue, but I tend to be in favor of that 2nd amendment.

__sy__6y ago· 3 in thread

This is great if there isn't a blanket ban on VPNs, but unfortunately, it won't work in China. I've had next to zero luck keeping my own VPN tunnels open for more than a couple of days at a time when behind the Great Firewall.

dastx6y ago

This isn't a VPN though, you simply run it on your computer, and it does all the magic for you.

hrdwdmrbl6y ago

Check out v2ray. You'll need to have your own domain, server, and a cloudflare account, but in terms of speed it is unmatched. Unfortunately many of the best tutorials are in Chinese.

bscphil6y ago

Looks interesting. From https://www.v2ray.com/en/index.html it seems that it's "just" a VPN protocol / software that can tunnel over TLS. I assume the point of using your own server + Cloudflare is that it breaks IP based blocking of most VPN providers. I guess just your own server without Cloudflare would work fine for a while, but they probably have heuristics for a lot of encrypted traffic sent to a single unknown server?

The remaining question for me is about the TLS part of all this. Does China not have agreements with most external services about stripping TLS such that a lot of TLS traffic would be suspect? Or do they not mandate their citizens to use a Government provided root cert that would allow them to "securely" MITM connections? That would be how I'd do it if I were an authoritarian government.

If not, then what's their plan for the future? I could see a Firewall kind of mostly working for now on a combination of DNS, IP, and SNI filtering, but all three are going away in the near term: DNS with DNS-over-HTTPS, SNI with eSNI, and IP blocking has already become less plausible through routine use of proxies like Cloudflare.

19966y ago· 2 in thread

A simple countermeasure at the ISP level: a buffer to merge 'www.you' to 'tube.com'

A far greater danger is DPI that use statistical analysis to detect possible tunnels. You want your traffic to be as close as possible to normal traffic. There is no perfect solution there. The current best is generating valid images with a hidden data payload (to download), and generating pseudo text posted on public forums or email (to upload) while limiting the download/upload ratio, by downloading random content if necessary as most people download far more than they upload.

It works best when using "known" websites like gmail (draft folder) or facebook (messenger), as all the traffic goes to a whitelisted host and look like regular usage.

gruez6y ago

>A simple countermeasure at the ISP level: a buffer to merge 'www.you' to 'tube.com'

addressed here: https://news.ycombinator.com/item?id=22656122

>A far greater danger is DPI that use statistical analysis to detect possible tunnels.

That's only an issue if there's a blanket ban on tunneling/proxies. While it's a problem in authoritarian regimes (eg. china, kazakhstan), it's not an issue in most western countries. I haven't heard of any western countries banning VPNs (yet).

>The current best is [...]

The timing information would still be suspicious. Most people aren't constantly checking their gmail/facebook multiple times a second, but normal browsing would generate packets with that frequency. It's really only undetectable if you're sending/receiving messages (eg. IM or email). A better candidate might be multiplayer game traffic. They provide a consistent stream of bits[1] to hide data in. If you're willing to set your tunnel's bandwidth to a few kilobytes a second (throttling if there's too much data, sending decoy packets if there's too little), it'd be very hard to detect any anomalies.
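The constant-rate idea at the end of this comment (throttle when there is too much data, send decoys when there is too little) can be sketched as a shaper that emits exactly one fixed-size frame per tick. `ConstantRateShaper` and the frame size are hypothetical; a real tunnel would also encrypt each frame so that padding and payload look alike on the wire, and would drive `next_frame()` from a timer.

```python
from typing import Tuple


class ConstantRateShaper:
    """Emit one fixed-size frame per tick, carrying real data or padding."""

    def __init__(self, frame_size: int = 64):
        self.frame_size = frame_size
        self.pending = bytearray()

    def submit(self, data: bytes) -> None:
        # Excess data simply waits in the queue, which is the throttling.
        self.pending += data

    def next_frame(self) -> Tuple[int, bytes]:
        """Return (real_length, frame); the frame is always frame_size bytes."""
        chunk = bytes(self.pending[: self.frame_size])
        del self.pending[: len(chunk)]
        padding = bytes(self.frame_size - len(chunk))  # zero bytes as decoy filler
        return len(chunk), chunk + padding


shaper = ConstantRateShaper(frame_size=8)
shaper.submit(b"hello world!")
frames = [shaper.next_frame() for _ in range(3)]  # full, partial, pure decoy
```

An observer who sees only sizes and timing gets the same picture whether the tunnel is busy or idle, which is the property the comment is after.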

[1] random search: https://youtu.be/8Kvj5TZNNJ4?t=1080

19966y ago

> Defeating chunking would require additional memory + compute power on the DPI boxes, which I suspect ISPs don't want to bear.

It depends. ISPs may be willing to spend more if they gain more or are forced by governments to do so.

Even as is, the proposed method is still too easy to defeat, especially with IP bans: if the ISP really doesn't want to let youtube.com work, all the A and AAAA records will be blacklisted

> The timing information would still be suspicious.

> It's really only undetectable if you're sending/receiving messages

Indeed, so the suggestion was to use the draft folder and FB messenger.

A better method would rotate the whitelisted websites- like using mostly gmail for 20 minutes, then facebook for 1h, etc. and of course only "on demand" so that traffic does not occur 24/7

For multiplayer games, the audio channel already provides a very simple method to stream more than a few kB per second.

Thorentis6y ago· 2 in thread

> We send it in 2 parts: first comes GET / HTTP/1.0 \n Host: www.you and second sends as tube.com

How does this even work? "www.you" won't return a valid HTTP response and "tube.com" won't either. How can you fetch the content at "youtube.com" by splitting the domain name in half? Won't you get two completely wrong responses that don't fit together?

detaro6y ago

It's split across two network packets. It's still one request for the web server.

est6y ago

might add another fake packet that confuses DPI.

david_draco6y ago· 2 in thread

Why not use Tor?

hrdwdmrbl6y ago

Really depends on your use-case. Tor is great but easily detectable and thus blockable, and though its speed has gotten much better, it isn't as fast as other options. But again, it really depends on what your goal is.

jaimex26y ago

Tor has obfuscation options, they work well.

The default endpoint list is usually blocked, though, as it's published to the clients, so you have to request an off-list endpoint.

m_a_g6y ago· 1 in thread

It is working perfectly for Turkcell Superonline, Turkey. Unfortunately, anti-censorship tools are very crucial for us these days. Thank you for your work.

degski6y ago

Going voting is crucial.

oedmarap6y ago· 1 in thread

It's really good to see more tools in the privacy space, resulting in more options and fallbacks for the end-user.

I think the author should also try to market this as much as possible outside of the HN crowd, since this seems targeted at non-tech users — I could be wrong but my reasoning is that HN users who care about privacy would prefer a combination of a VPN and DoH to defend against traffic & DNS inspection, respectively.

yjftsjthsd-h6y ago

It really depends on your threat model, but if you have a full tunnel VPN, then encrypted DNS is a lot less important.

kburman6y ago

Works with You Broadband, India. Thanks a lot man!

jogundas6y ago

Is that a SOCKS proxy, or just an HTTP proxy? The GitHub readme does not make that clear.

dontdieych6y ago

It's working very well against South Korea, KT(ISP)

the_resistence6y ago

Need some more insights into proper use. I couldn't get it to work in mainland China.

hrdwdmrbl6y ago

How does this differ from Jigsaw's Outline, Shadowsocks or V2Ray?

terrycody6y ago

Has anyone tested whether this tool works in China or not?

snvzz6y ago

>DoH also means breaking stuff like pihole and other ad filtering.

No, it doesn't.

e.g. I run DoH behind my home's dns cache server.

>its also kind of pointless if the state knows youre using it outside of a tunnel...they can just watch your next packets to see where you decided to go.

This is where HTTPS and eSNI further help.

godelski6y ago

> cloudflare who have censored content numerous times in the past

Besides Stormfront[0], what else did they censor?

[0] https://en.wikipedia.org/wiki/Stormfront_%28website%29

ukd16y ago

pihole is a short-term solution, and the wrong long-term one: it only works as long as these holes exist. Blocking needs to be done in the browser, or on your computer, to be done more securely.

rubyfan6y ago

does this just become an arms race where root certificates or some sort of device management tools are forced onto citizen devices?

aasasd6y ago

mirimir6y ago

Yes, it's a clever workaround. And requires no remote server.

I still prefer VPNs and Tor, but hey.

segfaultbuserr6y ago· 7 in thread

> GET / HTTP/1.0

> Host: www.youtube.com

> We send it in 2 parts: first comes GET / HTTP/1.0 \n Host: www.you and second sends as tube.com \n .... In this example, ISP cannot find blocked word YouTube in packets and you can bypass it!

> paulryanrogers: So basically it just does two things: carefully chunking HTTP header packets and encrypted DNS? Not sure this will work for very long.

Of course it will not. I'll explain why.

---

Literally the same technique was used in China during the early days of the Great Firewall, around 2007. At that time, the "censorship stack" was simple. Basically, it had...

* A brute-force IP blocking list

* A DNS poisoning infrastructure

* A naive keyword filtering system.

* A TCP reset attack system

Some naive rot13-like techniques were later implemented in some web proxies. HTTPS web proxies were also a thing, but they saw limited use.

* New: A complete keyword filtering system - Inspect all HTML pages (Was: A naive keyword filtering system)

* New: A Real-time DNS Spoofing System

* New: Expanded IP Blocklist.

* New: Bidirectional TCP Reset Attack

* New: "Latched-On" Reset Attack

* New: Certificate-Based HTTPS Blocking System

* New: Real-Time Traffic Classifier

* New: Real-Time IP Blocking

* New: Actively Updated IP Blocklist using Classifiers as Feedback

Traffic classifiers would later be expanded to cover HTTPS-in-HTTPS as well, so a naive HTTPS proxy wouldn't work. They possibly have other features too; it's a mystery.

This was when a new generation of proxy tools started to gain popularity,

segfaultbuserr6y ago

It's safe to say that, at this point, nobody really understands how the Great Firewall of China works anymore. This is the end of our story.

d4mi3n6y ago

Fantastic breakdown on the recent history of censorship in China, thanks for sharing it.

You mentioned that for many of these efforts bypassing censorship trumped secrecy concerns. Is this still the case?

If I were a citizen regularly bypassing censorship of an authoritarian government, I’d be concerned for my safety if it was well documented that I regularly accessed censored material.

ackbar036y ago

friendlybus6y ago

Great post, thank you.

peter_d_sherman6y ago

I agree with the other posters -- fascinating, detailed info. These posts should be promoted to their own HN article/discussion...

kingosticks6y ago

segfaultbuserr6y ago

> Else what prevents things like usage of Google analytics/fonts etc from triggering a match and blocking?

paulryanrogers6y ago· 5 in thread

So basically it just does two things: carefully chunking HTTP header packets and encrypted DNS?

Not sure this will work for very long.

gruez6y ago

>Not sure this will work for very long.

Maybe if it gets popular. Defeating chunking would require additional memory + compute power on the DPI boxes, which I suspect ISPs don't want to bear.

jaimex26y ago

I work in the DPI field and have maintained a few DPI firewalls.

Most DPI that I know of will defeat this bypass technique. I'm not sure the author has even tested whether it works.

Heck, most DPI firewalls support checking that something in the outbound packets also shows up in the inbound packets, e.g. checking whether a connection is performing IKE.

colanderman6y ago

The DPI doesn't need to buffer packets. Searches like this are performed using a regex compiled as a DFA or similar state machine. The state maintained per flow is a few machine words at most.
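A minimal sketch of that idea, assuming bytes arrive in order and using a Knuth-Morris-Pratt automaton as a stand-in for a compiled regex DFA (the names `build_failure` and `FlowMatcher` are illustrative, not taken from any real DPI product). The only thing kept per flow between packets is one small integer, yet a keyword split across two packets is still found:

```python
def build_failure(pattern: bytes) -> list[int]:
    """KMP failure table: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


class FlowMatcher:
    """Streaming keyword matcher; `state` is the only per-flow memory."""

    def __init__(self, pattern: bytes, fail: list[int]) -> None:
        self.pattern = pattern  # shared across flows, must be non-empty
        self.fail = fail        # shared across flows
        self.state = 0          # number of pattern bytes currently matched
        self.matched = False

    def feed(self, chunk: bytes) -> bool:
        """Consume one packet payload; return True once the keyword was seen."""
        for byte in chunk:
            while self.state and byte != self.pattern[self.state]:
                self.state = self.fail[self.state - 1]
            if byte == self.pattern[self.state]:
                self.state += 1
            if self.state == len(self.pattern):
                self.matched = True
                self.state = self.fail[self.state - 1]
        return self.matched


KEYWORD = b"youtube.com"
matcher = FlowMatcher(KEYWORD, build_failure(KEYWORD))

after_first = matcher.feed(b"GET / HTTP/1.0\r\nHost: www.you")
state_between_packets = matcher.state  # "you" is remembered as a partial match
after_second = matcher.feed(b"tube.com\r\n\r\n")

print(after_first, state_between_packets, after_second)
```

No payload bytes are stored between the two `feed` calls, which is the point being made about memory cost.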

You'd have better luck sending the TCP packets out-of-order. But some DPI boxes will buffer these to a small degree to catch such shenanigans.
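A rough sketch of what "buffer these to a small degree" could look like, assuming segments are identified by a byte offset that starts at 0 and that the box holds at most a handful of early segments (the class name `ReorderBuffer` and the `max_held` limit are hypothetical). Once the gap is filled, the bytes come out in order and can be handed to whatever matcher the box runs:

```python
class ReorderBuffer:
    """Release segment payloads in sequence order, holding a few early ones."""

    def __init__(self, max_held: int = 4) -> None:
        self.next_seq = 0               # offset of the next byte we expect
        self.held: dict[int, bytes] = {}  # early segments keyed by offset
        self.max_held = max_held

    def push(self, seq: int, payload: bytes) -> bytes:
        """Accept one segment; return the bytes that are now in order."""
        if seq != self.next_seq:
            # Early segment: keep it only if there is room. Old or
            # overflowing segments are simply ignored in this sketch.
            if seq > self.next_seq and len(self.held) < self.max_held:
                self.held[seq] = payload
            return b""
        out = payload
        self.next_seq += len(payload)
        # Drain any held segments that have become contiguous.
        while self.next_seq in self.held:
            nxt = self.held.pop(self.next_seq)
            out += nxt
            self.next_seq += len(nxt)
        return out


first = b"GET / HTTP/1.0\r\nHost: www.you"
second = b"tube.com\r\n\r\n"

buf = ReorderBuffer()
early = buf.push(len(first), second)  # second half arrives first
in_order = buf.push(0, first)         # gap filled, everything is released

print(early, in_order)
```

A flow that sends more out-of-order segments than `max_held` loses data in this sketch; a real box might instead drop the flow.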

Source: in a previous life I worked on the layer-7 inspection subsystem (among others) of a DPI box.

EDIT: Also what @cpitman said. DPI boxes will often err on the side of caution. The DPI will happily kill your goofy-but-standards-compliant flow if it can't figure out that it's safe.

cpitman6y ago

In order for traffic to be open for any substantial time, the technique either has to stay hidden/unpopular or the traffic has to be hard to distinguish from normal traffic.

kburman6y ago

I believe this is more of a cat and mouse game.

benbristow6y ago· 4 in thread

Doesn't work against Virgin Media UK

Nextgrid6y ago

Virgin is censoring things now?

buckminster6y ago

All the big UK ISPs do. This is targeted at CP but who knows what else gets covered.

benbristow6y ago

Has been for a while. https://www.virginmedia.com/help/list-of-court-orders

Also (rightly so) anything on the Internet Watch Foundation list - https://www.iwf.org.uk/become-a-member/services-for-members/...

DanBC6y ago

ISPs in the UK have to obey court orders, and there are some court orders in place for the "big 5" ISPs around certain piracy websites.

If I try to go to thepiratebay.org I get this

--- Access to this website has been blocked under an Order of the Higher Court.

For further details click here. ---

travisgriggs6y ago· 3 in thread

dclusin6y ago

The second amendment doesn't guarantee us the right to pack heat at work. There's lots of use cases where it would be considered reasonable to block the use of certain services.

ploika6y ago

1014046y ago

As another non American, I am still undecided on the issue, but I tend to be in favor of that 2nd amendment.

__sy__6y ago· 3 in thread

dastx6y ago

This isn't a VPN though, you simply run it on your computer, and it does all the magic for you.

hrdwdmrbl6y ago

Check out v2ray. You'll need to have your own domain, server, and a cloudflare account, but in terms of speed it is unmatched. Unfortunately many of the best tutorials are in Chinese.

bscphil6y ago

19966y ago· 2 in thread

A simple countermeasure at the ISP level: a buffer to merge 'www.you' to 'tube.com'

It works best when using "known" websites like gmail (draft folder) or facebook (messenger), as all the traffic goes to a whitelisted host and look like regular usage.

gruez6y ago

>A simple countermeasure at the ISP level: a buffer to merge 'www.you' to 'tube.com'

addressed here: https://news.ycombinator.com/item?id=22656122

>A far greater danger is DPI that use statistical analysis to detect possible tunnels.

>The current best is [...]

[1] random search: https://youtu.be/8Kvj5TZNNJ4?t=1080

19966y ago

> Defeating chunking would require additional memory + compute power on the DPI boxes, which I suspect ISPs don't want to bear.

It depends. ISPs may be willing to spend more if they stand to gain more or are forced by governments to do so.

Even as is, the proposed method is still too easy to defeat, especially with IP bans: if the ISP really doesn't want to let youtube.com work, all the A and AAAA records will be blacklisted.

> The timing information would still be suspicious.

> It's really only undetectable if you're sending/receiving messages

Indeed, so the suggestion was to use the draft folder and FB messenger.

A better method would rotate the whitelisted websites, like using mostly gmail for 20 minutes, then facebook for 1h, etc., and of course only "on demand" so that traffic does not occur 24/7.

For multiplayer games, the audio channel already provides a very simple method to stream more than a few kB per second.
