Curl-Impersonate

(github.com)

425 points · jakeogh · 1y ago · 111 comments

cle · 1y ago · 17 in thread

The same author also makes a Python binding of this which exposes a requests-like API in Python, very helpful for making HTTP reqs without the overhead of running an entire browser stack: https://github.com/lexiforest/curl_cffi
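For readers who have not used it, a minimal sketch of that requests-like API. The `impersonate` target name and the URL are assumptions; pick a target that your installed curl_cffi version actually lists.

```python
def fetch_as_browser(url: str, target: str = "chrome110"):
    """Fetch `url` while presenting the TLS/HTTP2 fingerprint of `target`."""
    # Imported lazily so this file loads even where curl_cffi is not installed.
    from curl_cffi import requests  # third-party: pip install curl_cffi

    # `impersonate` selects which browser's handshake to mimic. The default
    # target used here is an assumption, not a recommendation.
    return requests.get(url, impersonate=target, timeout=30)


if __name__ == "__main__":
    resp = fetch_as_browser("https://example.com/")  # placeholder URL
    print(resp.status_code)
```

No browser process is started; the call is an ordinary HTTP request whose handshake is shaped to look like the chosen browser.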

I can't help but feel like these are the dying breaths of the open Internet though. All the megacorps (Google, Microsoft, Apple, CloudFlare, et al) are doing their damndest to make sure everyone is only using software approved by them, and to ensure that they can identify you. From multiple angles too (security, bots, DDoS, etc.), and it's not just limited to browsers either.

End goal seems to be: prove your identity to the megacorps so they can track everything you do and also ensure you are only doing things they approve of. I think the security arguments are just convenient rationalizations in service of this goal.

throwaway99210 · 1y ago

> I can't help but feel like these are the dying breaths of the open Internet though

I agree about the overzealous tracking by the megacorps, but this is also due to bad actors. I work for a financial company, and the amount of API abuse, ATO, DDoS, nefarious bot traffic, etc. we see on a daily basis is absolutely insane.

berkes · 1y ago

But how much of this "bad actor" interaction is countered with tracking? And how many of these attempts are even close to successful with even the simplest out-of-the-box security practices set up?

And when it does get more dangerous, is overzealous tracking the best counter for this?

I've dealt with a lot of these threats as well, and a lot are countered with rather common tools, from simple fail2ban rules to application firewalls and private subnets and whatnot. E.g. a large fail2ban rule to just ban anything that attempts to HTTP GET /admin.php or /phpmyadmin etc., even just once, gets rid of almost all nefarious bot traffic.

So, I think the number of attacks can indeed be insane. But the number that needs overzealous tracking to be countered is, AFAICS, rather small.
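The "ban on first probe" idea above can be sketched in a few lines. This is plain Python over access-log lines, not fail2ban's own filter syntax; the probe paths and the log format are assumptions.

```python
import re

# Paths that no legitimate visitor of this (hypothetical) site would request.
PROBE_PATHS = ("/admin.php", "/phpmyadmin")

# Matches the client address and request path of a common/combined log line.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*?"GET (?P<path>\S+)')


def ips_to_ban(log_lines):
    """Return the set of client IPs that requested a probe path even once."""
    banned = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").lower().startswith(PROBE_PATHS):
            banned.add(match.group("ip"))
    return banned


sample = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /phpmyadmin/index.php HTTP/1.1" 404 153',
    '198.51.100.4 - - [01/Jan/2025:00:00:02 +0000] "GET /index.html HTTP/1.1" 200 1024',
]
print(ips_to_ban(sample))
```

A real deployment would hand the resulting addresses to the firewall; that step is left out here.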

code51 · 1y ago

Much of this "bad actor" activity is actually customer needs left hanging - for either the customer to automate herself or other companies to fill the gap to create value that's not envisioned by the original company.

I'm guessing investors actually like a healthy dose of open access and a healthy dose of defence. We see them (YC, as an example) betting on multiple teams addressing the same problem. The difference is their execution, the angle they attack.

If, say, the financial company you work for is capable in both the product and technical aspects, I assume it leaves no gap. It's the main place to access the service and all the side benefits.

cle · 1y ago

Yep totally agree these are problems. I don't have a good alternative proposal either, I'm just disappointed with what we're converging on.

deadbabe · 1y ago

Even if the internet was wide open it’s of little use these days.

AI will replace any search you would want to do to find information, the only reason to scour the internet now is for social purposes: finding comments and forums or content from other users, and you don’t really need to be untracked to do all that.

A megacorp’s main motivation for tracking your identity is to sell you shit or sell your data to other people who want to sell you things. But if you’re using AI the amount of ads and SEO spam that you have to sift through will dramatically reduce, rendering most of those efforts pointless.

And most people aren’t using the internet like in the old days: stumbling across quaint cozy boutique websites made by hobbyists about some favorite topic. People just jump on social platforms and consume content until satisfied.

There is no money to be made anymore in mass web scraping at scale with impersonated clients, it’s all been consumed.

matheusmoreira · 1y ago

You are on point. There is no open internet without computing freedom.

Computers used to be empowering. Cryptography used to be empowering. Then these corporations started using both against us. They own the computers now. Hardware cryptography ensures the computers only run their software now, software that does the corporation's bidding and enforces their controls. And if we somehow gain control of the computer we are denied every service and essentially ostracized. I don't think it will be long before we are banned from the internet proper for using "unauthorized" devices.

It's an incredibly depressing state of affairs. Everything the word "hacker" ever stood for is pretty much dying. It feels like there's no way out.

choeger · 1y ago

> ensure you are only doing things they approve of

Absolutely. They might not care about individuals, though. It's their approach to shape "markets". The Apple, Google, Amazon, and Microsoft tax is not inevitable and that's their problem. They will fight tooth and nail to keep you locked in, call it "innovation", and even cooperate with governments (which otherwise are their natural enemy in the fight for digital control). It's the people that a) don't care much and b) don't have any options.

In the end, a large share of our wealth is just pulled from us to these ever more ridiculous rent seeking schemes.

octocop · 1y ago

"I have nothing to hide" will eventually spread to everyone. Very unfortunate.

cle · 1y ago

I'm in a similar boat but it's more like "I have nothing I can hide".

These days I just tell friends & family to assume that nothing they do is private.

Habgdnv · 1y ago

The answer is simple: I have something to hide. I have many things to hide, actually. None of these things is currently illegal, but I still have many things to hide. And if I have something to hide, I can be worried about many things.

schnable · 1y ago

A lot of the motivation comes from government regulations too. Right now this is mostly in banking, but social media and porn regs are coming too.

lelandfe · 1y ago

PornHub and all of its affiliate sites now block all residents of Alabama, Arkansas, Idaho, Indiana, Kansas, Kentucky, Mississippi, Montana, Nebraska, North Carolina, Texas, Utah, and Virginia (and Florida on Jan 1): https://www.pcmag.com/news/pornhub-blocked-florida-alabama-t...

Child safety, as always, was the sugar that made the medicine go down in freedom-loving USA. I imagine these states' approaches will try to move to the federal level after Section 230 dies an ignominious death.

Keep an eye out for Free Speech Coalition v. Paxton to hit SCOTUS in January: https://www.oyez.org/cases/2024/23-1122

1vuio0pswjnm7 · 1y ago

https://github.com/lexiforest/curl_cffi/releases/expanded_as...

jagged-chisel · 1y ago

> … helpful for making HTTP reqs without the overhead of running an entire browser stack

For those less informed, add “to impersonate the fingerprints of a browser.”

One can, obviously, make requests without a browser stack.

userbinator · 1y ago

They've been planning this stuff for a long time...

https://en.wikipedia.org/wiki/Next-Generation_Secure_Computi...

...and we're seeing the puzzle pieces fall into place. Mandated driver signing, TPMs, and more recently remote attestation. "Security" has always been the excuse --- securing their control over you.

dwattttt · 1y ago

Another trending thread right now is Pegasus/Predator; as much as it may be a facade, to say MS (or any OS vendor) has no business working on security/secure computing is demonstrably false.

zouhair · 1y ago

The disappearance of the third space is killing us.

oefrha · 1y ago · 8 in thread

What are some example sites where this is both necessary and sufficient? In my experience sites with serious anti-bot protection basically always have JavaScript-based browser detection, and some are capable of defeating puppeteer-extra-plugin-stealth even in headful mode. I doubt sites without serious anti-bot detection will do TLS fingerprinting. I guess it is useful for the narrower use case of getting a short-lived token/cookie with a headless browser on a heavily defended site, then performing requests using said tokens with this lightweight client for a while?

Retr0id · 1y ago

A lot of WAFs make it a simple thing to set up. Since it doesn't require any application-level changes, it's an easy "first move" in the anti-bot arms race.

At the time I wrote this up, r1-api.rabbit.tech required TLS client fingerprints to match an expected value, and not much else: https://gist.github.com/DavidBuchanan314/aafce6ba7fc49b19206...

(I haven't paid attention to what they've done since so it might no longer be the case)

oefrha · 1y ago

Makes sense, thanks.

jonatron · 1y ago

There are sites that will block curl and python-requests completely, but will allow curl-impersonate. IIRC, Amazon is an example that has some bot protection but it isn't "serious".

ekimekim · 1y ago

In most cases this is just based on user agent. It's widespread enough that I just habitually tell requests not to set a User Agent at all (these aren't blocked, but if the UA contains "python" it is).
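The same habit can be shown with the standard library's `urllib` instead of `requests` (a swapped-in client, chosen so the sketch has no dependencies). A fresh opener advertises a "Python-urllib" User-Agent by default, which is exactly the kind of string such filters look for.

```python
import urllib.request

# A freshly built opener carries urllib's default "Python-urllib/x.y" value.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
print(default_headers)

# Clearing the opener's default headers stops urllib from adding that value,
# so requests made through opener.open(...) should carry no User-Agent
# unless the caller sets one explicitly.
opener.addheaders = []
```

Whether a missing User-Agent is accepted is up to the site; the comment above reports that it usually is.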

thrdbndndn · 1y ago

Lots of sites, actually.

> I doubt sites without serious anti-bot detection will do TLS fingerprinting

They don't set it up themselves. Cloudflare offers such a thing by default (?).

oefrha · 1y ago

Pretty sure it’s not default, and Cloudflare's browser check and/or captcha is a way bigger problem than TLS fingerprinting; at least that was the case the last time I scraped a site behind Cloudflare.

Avamander · 1y ago

CloudFlare offers it. Even if it's not used for blocking it might be used for analytics or threat calculations, so you might get hit later.

remram · 1y ago

Those JavaScript scripts often get data from some API, and it's that API that will usually be behind some fingerprinting wall.

jandrese · 1y ago · 5 in thread

The build scripts on this repo seem a bit cursed. It uses autotools but has you build them in a subdirectory. The default build target is a help text instead of just building the project. When you do use the listed build target, it doesn't have the dependencies set up correctly, so you have to run it like 6 times to get to the point where it is building the application.

Ultimately I was not able to get it to build because the BoringSSL distro it downloaded failed to build, even though I made sure all of the dependencies that INSTALL.md lists are installed. This might be because the machine I was trying to build it on is an older Ubuntu 20 release.

Edit: Tried it on Ubuntu 22, but BoringSSL again failed to build. The make script did work better however, only requiring a single invocation of make chrome-build before blowing up.

Looks like a classic case of "don't ship -Werror because compiler warnings are unpredictable".

Died on:

/extensions.cc:3416:16: error: ‘ext_index’ may be used uninitialized in this function [-Werror=maybe-uninitialized]

The good news is that removing -Werror from the CMakeLists.txt in BoringSSL got around that issue. Bad news is that the dependency list is incomplete. You will also need libc++-XX-dev and libc++abi-XX-dev where the XX is the major version number of GCC on your machine. Once you fix that it will successfully build, but the install process is slightly incomplete. It doesn't run ldconfig for you, you have to do it yourself.

On a final note, despite the name, BoringSSL is a huge library that takes a surprisingly long time to build. I thought it would be like LibreSSL, where they trim it down to the core to keep the attack surface small, but apparently Google went in the opposite direction.
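The `-Werror` workaround described above can be scripted rather than done by hand. A rough sketch, assuming the flag appears as a bare `-Werror` token somewhere in the CMake source; the sample line is made up for illustration and is not copied from BoringSSL.

```python
import re

# Bare -Werror only; forms such as -Werror=return-type are left untouched.
WERROR = re.compile(r" ?(?<![\w-])-Werror(?![\w=-])")


def strip_werror(cmake_text: str) -> str:
    """Return the CMake source with standalone -Werror flags removed."""
    return WERROR.sub("", cmake_text)


# Hypothetical flags line, only to show the effect.
line = 'set(C_CXX_FLAGS "-Wall -Werror -Wformat=2")'
print(strip_werror(line))
```

To apply it, read the downloaded CMakeLists.txt with `pathlib.Path.read_text()`, pass the text through `strip_werror`, and write it back before running the build.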

ospider · 1y ago

Hi, maintainer here, the whole project is a hack, actually :P

The original repo was already full of hacks, and on top of that, I added more hacks to keep up with the latest browsers. The main purpose of my fork is to serve as a foundation of the python binding, which I think is easier to use. So I haven't tried to make the whole process more streamlined as long as it works on the CI. You can use the prebuilt binaries on the release page, though. I guess I should find some time to clean up the whole thing.

userbinator · 1y ago

Look on the bright side: the harder it is to build and use correctly, the harder it is for the enemy to analyse and react.

jakeogh (OP) · 1y ago

I hit that too, there is an open bug: https://github.com/lexiforest/curl-impersonate/issues/81

Worked around it by modifying the patch: https://github.com/jakeogh/jakeogh/blob/master/net-misc/curl...

Considering the complexity, this project and its upstream parent and grandparent (curl proper) are downright amazing.

at0mic22 · 1y ago

Played this game and switched to prebuilt libraries. Think builder docker images have also been broken for a while.

38 · 1y ago

That's exactly why I stopped using C/C++. Building is often a nightmare, and the language teams seem to have no interest in improving the situation.

zlagen · 1y ago · 4 in thread

In case anyone is interested, I created something similar but for Python (using Chromium's network stack): https://github.com/lagenar/python-cronet I'm looking for help to create the build for Windows.

Klonoar · 1y ago

Similar projects exist for C# (https://github.com/sleeyax/CronetSharp), Go (https://github.com/sleeyax/cronet-go) and Rust (https://github.com/sleeyax/cronet-rs).

These can work well in some cases but it's always a tradeoff.

hk__2 · 1y ago

Any reason you didn’t use https://github.com/lexiforest/curl_cffi?

zlagen · 1y ago

I wanted to try a different approach, which is to use Chromium's network stack directly instead of patching curl to impersonate it. In this case you're using the real thing, so it's a bit easier to maintain when there are changes in the fingerprint.

thrdbndndn · 1y ago

Any plan to offer a sync API?

TekMol · 1y ago · 4 in thread

What is the use case? If you have to read data from one specific website which uses handshake info to avoid being read by software?

When I have to do HTTP requests these days, I default to a headless browser right away, because that seems to be the best bet. Even then, some websites are not readable because they use captchas and whatnot.

adastral · 1y ago

> I default to a headless browser

Headless browsers consume orders of magnitude more resources, and execute far more requests (e.g. fetching images) than a common webscraping job would require. Having run webscraping at scale myself, the cost of operating headless browsers made us only use them as a last resort.

at0mic22 · 1y ago

Blocking all image/video/CSS requests is the rule of thumb when working with headless browsers via CDP
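One way to apply that rule of thumb, sketched with Playwright's request interception rather than raw CDP calls (Playwright is a swapped-in convenience here, not something the thread names). The set of blocked resource types is an assumption mirroring "image/video/CSS".

```python
# Resource types a text-only scrape usually does not need.
BLOCKED_TYPES = frozenset({"image", "media", "stylesheet"})


def should_block(resource_type: str) -> bool:
    """Decide from the browser-reported resource type whether to abort."""
    return resource_type in BLOCKED_TYPES


def fetch_html(url: str) -> str:
    """Load `url` in headless Chromium, aborting the blocked resource types."""
    # Imported lazily; Playwright is third-party (pip install playwright).
    from playwright.sync_api import sync_playwright

    def handle(route):
        if should_block(route.request.resource_type):
            route.abort()
        else:
            route.continue_()

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.route("**/*", handle)  # intercept every request the page makes
        page.goto(url)
        html = page.content()
        browser.close()
    return html
```

Scripts and XHR calls still run, so pages that build their content in JavaScript keep working while the heavy assets are skipped.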

TekMol · 1y ago

So you maintain a table of domains and how to access them?

How do you build that table and keep it up to date? Manually?

mschuster91 · 1y ago

> What is the use case? If you have to read data from one specific website which uses handshake info to avoid being read by software?

Evade captchas. curl user agent / heuristics are blocked by many sites these days - I'd guess many popular CDNs have pre-defined "block bots" stuff that blocks everything automated that is not a well-known search engine indexer.

londons_explore · 1y ago · 4 in thread

> The resulting curl looks, from a network perspective, identical to a real browser.

How close is it? If I ran wireshark, would the bytes be exactly the same in the exact same packets?

jsnell · 1y ago

The packets from Chrome wouldn't be exactly the same as packets sent by Chrome at a different time either. "The exact same packets" is not a viable benchmark, since both the client and the server randomize the payloads in various ways. (E.g. key exchange, GREASE).

peetistaken · 1y ago

You can check your fingerprint on https://tls.peet.ws

dchest · 1y ago

What else could "identical" mean?

londons_explore · 1y ago

It could be that the TCP streams are the same, but packetization is different.

It could mean that the packets are the same, but timing is off by a few milliseconds.

It could mean a single HTTP request exactly matches, but when doing two requests the real browser uses a connection pool but curl doesn't. Or uses HTTP/3's fast-open abilities, etc.

etc.

jollyllama · 1y ago · 3 in thread

>The Client Hello message that most HTTP clients and libraries produce differs drastically from that of a real browser.

Why is this?

throwaway99210 · 1y ago

Based on what I've seen, most command-line clients and basic HTTP libraries typically ship with leaner, more static configurations (e.g., no GREASE extensions in the Client Hello, limited protocols in the ALPN extension header, smaller number of Signature Algorithms). Mirroring real browser TLS fingerprints is also more difficult due to the randomization of the Client Hello parameters (e.g., current versions of Chrome)
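To make the GREASE and randomization points concrete, here is a small sketch of a JA3-style fingerprint: an MD5 over the TLS version, cipher suites, extensions, curves and point formats, with GREASE values dropped. The Client Hello values below are made up for illustration, not captured from a real browser.

```python
import hashlib


def is_grease(value: int) -> bool:
    """GREASE values (RFC 8701) look like 0x0A0A, 0x1A1A, ... 0xFAFA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the comma/dash separated JA3 string, skipping GREASE values."""
    fields = [ciphers, extensions, curves, point_formats]
    parts = ["-".join(str(v) for v in f if not is_grease(v)) for f in fields]
    return ",".join([str(version)] + parts)


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


# Illustrative hellos: same content, different GREASE / extension order.
hello = (771, [0x0A0A, 4865, 4866], [0x1A1A, 0, 10, 11], [0x2A2A, 29, 23], [0])
regreased = (771, [0x4A4A, 4865, 4866], [0x5A5A, 0, 10, 11], [0x6A6A, 29, 23], [0])
shuffled = (771, [0x0A0A, 4865, 4866], [0x1A1A, 11, 0, 10], [0x2A2A, 29, 23], [0])

print(ja3_string(*hello))
print(ja3_hash(*hello) == ja3_hash(*regreased))  # GREASE values are ignored
print(ja3_hash(*hello) == ja3_hash(*shuffled))   # extension order is not
```

Different GREASE values leave the hash unchanged, while a reordered extension list changes it. That is why a client that shuffles its extensions is hard to pin down with an order-sensitive fingerprint.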

zlagen · 1y ago

They use different SSL libraries/configuration. Chrome uses BoringSSL and other libraries may use OpenSSL or some other library. Besides that the SSL library may be configured with different cipher suites and extensions. The solution these impersonators provide is to use the same SSL library and configuration as a real browser.

Retr0id · 1y ago

The protocols are flexible and most browsers bring their own HTTP+TLS clients

userbinator · 1y ago · 2 in thread

I can't help but think that projects like these shouldn't be posted here, since the enemy is among us. Prodding the bear even more might lead to an acceleration towards the dystopia that others here have already prophesied.

> The following browsers can be impersonated.

...unfortunately no Firefox to be seen.

I've had to fight this too, since I use a filtering proxy. User-agent discrimination should be illegal. One may think the EU could have some power to change things, but then again, they're also hugely into the whole "digital identity" thing.

ospider · 1y ago

Maintainer here. Curl dropped support for NSS, the SSL engine Firefox uses, about a year ago. Without NSS, two special extensions cannot be added, and that's why only webkit-based browsers are left.

You can find support for old firefox versions in the original repo.

crtasm · 1y ago

It says "Firefox(In progress)", and the original project this was forked from has it: https://github.com/lwthiker/curl-impersonate

aninteger · 1y ago · 2 in thread

I think we should list the sites where this fingerprinting is done. I have a suspicion that Microsoft does it for conditional access policies but I am not sure of other services.

Galanwe · 1y ago

We cannot really list them, as 90% of the time it's not the websites themselves, it's their WAF. And there is a trend toward most company websites being behind a WAF nowadays to avoid 1) annoying regulations (US companies putting geoloc on their websites to avoid EU cookie regulations) and 2) DDoS.

It's now pretty common to have Cloudflare, AWS, etc. WAFs as main endpoints, and these do anti-bot checks (TLS fingerprinting, header fingerprinting, JavaScript checks, captchas, etc.).

pixelesque · 1y ago

Cloudflare (which seems to be fronting half the web these days, based on the number of cf-ray headers that I see being sent back) does this with bot protection on, and Akamai has something similar, I think.

ape4 · 1y ago · 2 in thread

I like this project!

Is there a way to request impersonation of the current version of Chrome (or whatever)?

jakeogh (OP) · 1y ago

The latest version is a moving target, currently you get the following chrome versions:

  $ curl_chrome <TAB><TAB>
  curl_chrome100
  curl_chrome101
  curl_chrome104
  curl_chrome107
  curl_chrome110
  curl_chrome116
  curl_chrome119
  curl_chrome120
  curl_chrome123
  curl_chrome124
  curl_chrome131
  curl_chrome131_android
  curl_chrome99
  curl_chrome99_android

ape4 · 1y ago

Perhaps plain `curl_chrome` could use the latest available `curl_chromeNNN`
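A sketch of how such a `curl_chrome` alias could pick its target from the wrapper names listed above. The version has to be compared numerically, because `curl_chrome99` sorts after `curl_chrome131` as a string. Skipping the `_android` variants is an assumption about what "latest" should mean.

```python
import re

WRAPPER = re.compile(r"^curl_chrome(\d+)$")  # skips variants such as *_android


def latest_chrome_wrapper(names):
    """Return the desktop curl_chromeNNN name with the highest version, or None."""
    versions = {}
    for name in names:
        match = WRAPPER.match(name)
        if match:
            versions[int(match.group(1))] = name
    return versions[max(versions)] if versions else None


available = [
    "curl_chrome100", "curl_chrome101", "curl_chrome104", "curl_chrome107",
    "curl_chrome110", "curl_chrome116", "curl_chrome119", "curl_chrome120",
    "curl_chrome123", "curl_chrome124", "curl_chrome131",
    "curl_chrome131_android", "curl_chrome99", "curl_chrome99_android",
]
print(latest_chrome_wrapper(available))
```

In practice the name list would come from scanning the install directory instead of being hard-coded.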

Sytten · 1y ago

Thankfully, only a small fraction of websites do JA3/JA4 fingerprinting. Some do more advanced stuff like correlating headers to the fingerprint. We have been able to get away without doing much in Caido for a long time, but I am working on an OSS Rust-based equivalent. Neat trick: you can use the fingerprint of our competitor (Burp Suite), since it is whitelisted so the security folks can do their job. It's the only time you will not hear me complain about checkbox security.
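"Correlating headers to the fingerprint" roughly means checking that the TLS fingerprint matches what the User-Agent claims to be. A toy sketch of that check follows; the allow-list, its placeholder hash values and the crude User-Agent parsing are all hypothetical.

```python
# Hypothetical allow-list: browser family -> JA3 hashes observed for it.
# The hash values are placeholders, not real fingerprints.
KNOWN_JA3 = {
    "chrome": {"0" * 32},
    "firefox": {"1" * 32},
}


def claimed_family(user_agent: str):
    """Very rough User-Agent classification, for illustration only."""
    ua = user_agent.lower()
    if "firefox/" in ua:
        return "firefox"
    if "chrome/" in ua:
        return "chrome"
    return None


def is_consistent(user_agent: str, ja3: str) -> bool:
    """True when the TLS fingerprint matches what the User-Agent claims."""
    family = claimed_family(user_agent)
    return family is not None and ja3 in KNOWN_JA3[family]


chrome_ua = "Mozilla/5.0 (X11; Linux x86_64) Chrome/131.0.0.0 Safari/537.36"
print(is_consistent(chrome_ua, "0" * 32))  # fingerprint agrees with the UA
print(is_consistent(chrome_ua, "1" * 32))  # UA says Chrome, handshake does not
```

This is why changing only the User-Agent string is not enough against such checks: the handshake has to agree with it.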

Retr0id · 1y ago

I recently used ja3proxy, which uses utls for the impersonation. It exposes an HTTP proxy that you can use with any regular HTTP client (unmodified curl, python, etc.) and wraps it in a TLS client fingerprint of your choice. Although I don't think it does anything special for http/2, which curl-impersonate does advertise support for.

https://github.com/LyleMi/ja3proxy

https://github.com/refraction-networking/utls

peetistaken · 1y ago

https://github.com/bogdanfinn/tls-client is the go-to package for the go world, it does the same thing

kerblang · 1y ago

Interesting in light of another much-discussed story about AI scraper farms swamping/DDOSing sites https://news.ycombinator.com/item?id=42549624

0x676e67 · 1y ago

I think someone might need this. It is based on BoringSSL and, similar to utls, fakes some extensions to support imitating the Firefox TLS fingerprint.

repo: https://github.com/penumbra-x/rquest

jakeogh (OP) · 1y ago

(very rough) ebuild: https://github.com/jakeogh/jakeogh/blob/master/net-misc/curl...
