I feel like it's counterproductive in situations like this to mention forking. It will come across like a threat, when there isn't really anything intrinsically aggressive about it. So just do it; and when you have a decent amount of separate development, you can decide whether to make PRs back, advertise your fork, etc.
I am trying to come to terms with what you've described, after years of hard work.
I think it owes its success to being the first "port" of Python requests to support async, which answered a strong need at the time.
But otherwise it is bad: the API is not that great, the performance is not that great, the tweakability is not that great, and the maintainer mindset is not that great either. On that last point, a few examples were referenced in the article; it can easily make your production project suddenly break in a bad way without a valid reason.
While not perfect either, aiohttp is what I would advise everyone to switch to.
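For reference, a minimal aiohttp sketch of a simple async GET (the URL is just a placeholder):

import asyncio
import aiohttp

async def main() -> None:
    # One session per application; it pools connections under the hood.
    async with aiohttp.ClientSession() as session:
        async with session.get("https://example.com/") as resp:
            print(resp.status)
            print(await resp.text())

asyncio.run(main())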
Basically treating HTTP requests as an orthogonal, or cross-cutting, concern.
It is sometimes hard to tell if these upstream packages are stable or abandoned.
I should probably document my methodology so it can help others, or at least give them a chance to spot whatever mistakes or limitations it might have.
fwiw, HTTP/2 (RFC 7540, 2015) is ten years old, just saying.
(Also the sponsorship subscription thing in the readme gives me vague rugpull vibes. Maybe I’ve just been burned too much—I don’t mean to discourage selling support in general.)
I've started seeing these emoji-prefixed commits lately too, peculiar
I like it, I like it
It's been a pleasure to use, has an httpx compatibility layer for gradually migrating to its API, and it's a lot more performant (right now, I think it's the most performant Python HTTP client out there: https://github.com/MarkusSintonen/pyreqwest/blob/main/docs/b...)
>They also point out that not opening up the source code goes against the principles of Open Source software development
I will never stop being amused when people have feelings like this and also choose licenses like BSD (this project). If you wanted a culture that discouraged those behaviors, why would you choose a license that explicitly allows them? Whether you can enforce it or not, the license is basically a type of CoC that states the type of community you want to have.
This usually doesn't work, and in the end all they can do is complain about behaviours that their license choice explicitly allowed.
I still think that hijacking the mkdocs package was the wrong way to go though.
The FOSS landscape has become way too fork-phobic.
Just fork mkdocs and go on your merry way.
I don’t think that’s the case. It’s more of a marketing/market-incentive thing. It’s great PR to be associated with the most famous project, and way less so to be associated with a fork, at least until the fork becomes widespread and well recognised.
GitHub does make it fairly easy to fork a project, I wouldn’t blame the situation on github.
[1] https://github.com/orgs/encode/discussions/11#discussioncomm...
On the other hand, the comments the MkDocs author is making about perceived gender grievances feel so unhinged that I wouldn't be touching anything made by them with a barge pole.
Oleh was basically the sole maintainer for many years, and the development basically stopped when he left.
const https = require("https");

const response = await new Promise((resolve, reject) => {
  const req = https.request(url, res => {
    let body = "";
    res.on("data", data => {
      body += data;
    });
    res.on("end", () => {
      resolve(body);
    });
  });
  req.on("error", reject);
  req.end();
});

> Undici is an HTTP client library that powers the fetch API in Node.js. It was written from scratch and does not rely on the built-in HTTP client in Node.js. It includes a number of features that make it a good choice for high-performance applications.
This seems like a pretty good reason to fork to me.
> Sending HTTP requests is a basic capability in the modern world, the standard library should include a friendly, fully-featured, battle-tested, async-ready client. But not in Python,
Or Javascript (well, node), or golang (net/http is _worse_ than urllib IMO), Rust, Java (HttpURLConnection is about the same as Python's), even dotnet's HttpClient is... fine.
Honestly the thing that consistently surprises me is that requests hasn't been standardised and brought into the standard library
HttpClient client = HttpClient.newBuilder()
    .version(Version.HTTP_1_1)
    .followRedirects(Redirect.NORMAL)
    .connectTimeout(Duration.ofSeconds(20))
    .proxy(ProxySelector.of(
        new InetSocketAddress("proxy.example.com", 80)))
    .authenticator(Authenticator.getDefault())
    .build();

// build the request the send() call below expects
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create("https://example.com/"))
    .build();

HttpResponse<String> response = client.send(request, BodyHandlers.ofString());
System.out.println(response.statusCode());
System.out.println(response.body());
For the record, you're most likely not even interacting with that API directly if you're using any current framework, because most just provide automagically generated clients and you only define the interface with some annotations.

It is not built for convenience. It has no methods for simply posting JSON, or marshaling a JSON response from a body automatically, no "fluent" interface, no automatic method for dealing with querystring parameters in a URL, no direct integration with any particular authentication/authorization scheme (other than Basic Authentication, which is part of the protocol). It only accepts streams for request bodies and only yields streams for response bodies, and while this is absolutely correct for a low-level library (any "request" library that mandates strings with no ability to stream in either direction is objectively wrong), it is a rather nice feature to have available when you know the request or response is going to be small. And so on and so on.
There are a lot of libraries you can grab that will fix this if you care, everything from clones of the requests library to libraries designed explicitly to handle scraping cases, and so on. And that is in some sense also exactly why the net/http client is designed the way it is. It's designed to be in the standard library, where it can be indefinitely supported because it just reflects the protocol as directly as possible; whatever whims of fate or fashion may roll through the developer community, now or in the future, as to the best way to make web requests, those things can build on the solid foundation of net/http's Request and Response values.
Python is in fact a pretty good demonstration of the risks of trying to go too "high level" in such a client in the standard library.
The stdlib may not be the best, but the fact all HTTP libs that matter are compatible with net/http is great for DX and the ecosystem at large.
Instead, official documentation seems comfortable with recommending a third party package: https://docs.python.org/3/library/urllib.request.html#module...
>The Requests package is recommended for a higher-level HTTP client interface.
Which was fine when requests was the de facto standard and the only player in town, but at some point modern problems (async, HTTP/2) required modern solutions (httpx), and thus ecosystem fragmentation began.
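For anyone who hasn't compared them recently, the gap the docs acknowledge looks like this for a GET that parses JSON (the URL is a placeholder, and the stdlib version assumes UTF-8):

import json
from urllib.request import urlopen

# stdlib: open, read bytes, decode, parse, all by hand
with urlopen("https://api.example.com/data") as resp:
    payload = json.loads(resp.read().decode("utf-8"))

# requests: the "higher-level interface" the docs recommend
import requests
payload = requests.get("https://api.example.com/data").json()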
The h11, h2, httpcore stack is probably the closest thing to what the Python stdlib should look like to end the fragmentation but it would be a huge undertaking for the core devs.
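For reference, the sans-io style that stack is built on looks roughly like this with h11 alone (host is a placeholder): the protocol object turns typed events into bytes and back, and the caller owns the socket.

import socket
import h11

# The connection object is a pure state machine: it never touches I/O.
conn = h11.Connection(our_role=h11.CLIENT)
sock = socket.create_connection(("example.com", 80))

sock.sendall(conn.send(h11.Request(
    method="GET", target="/",
    headers=[("Host", "example.com"), ("Connection", "close")],
)))
sock.sendall(conn.send(h11.EndOfMessage()))

# Feed raw bytes in, pull typed events (Response, Data, EndOfMessage) out.
while True:
    event = conn.next_event()
    if event is h11.NEED_DATA:
        conn.receive_data(sock.recv(4096))
    elif isinstance(event, h11.EndOfMessage):
        break
    else:
        print(type(event).__name__)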
Yes, and it's in the standard library (System namespace). Being Microsoft they've if anything over-featured it.
And while this article [1] says "It's been around for a while", it was only added in .NET Framework 4.5, which shows it took a while for the API to stabilise. There were other ways to make web requests before that of course, and also part of the standard library, and it's never been "difficult" to do so, but there is a history prior to HttpClient of changing ways to do requests.
For modern dotnet however it's all pretty much a solved problem, and there's only ever been HttpClient and a fairly consistent story of how to use it.
[1] https://learn.microsoft.com/en-us/dotnet/core/extensions/htt...
it's called the STD lib for a reason...
I've noticed that many languages struggle with HTTP in the standard library, even if the rest of the stdlib is great. I think it's just difficult to strike the right balance between "easy to use" and "covers every use case", with most erring (justifiably) toward the latter.
I've often ended up reimplementing what I need because the APIs of the famous libraries aren't efficient. In general I'd love to send a million requests all at once and then collect the replies: no need to wait for the first reply before sending the second request, and so on. They could all share the same TCP packets, but I have never met a library that lets me do that.
So for example, while HTTP/3 should be more efficient and faster, since no library I've tried lets me do this, I ended up using HTTP/1.1 as usual and being faster as a result.
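For what it's worth, that behaviour is reachable today by speaking HTTP/1.1 pipelining over a raw socket; a rough sketch (host and paths are placeholders, and a real client would parse the response framing instead of buffering blindly):

import socket

HOST = "example.com"
paths = [f"/item/{i}" for i in range(100)]

# Build every request up front; all but the last keep the connection
# alive, the last asks the server to close so the read loop terminates.
reqs = [f"GET {p} HTTP/1.1\r\nHost: {HOST}\r\n\r\n" for p in paths[:-1]]
reqs.append(f"GET {paths[-1]} HTTP/1.1\r\nHost: {HOST}\r\nConnection: close\r\n\r\n")

with socket.create_connection((HOST, 80)) as sock:
    sock.sendall("".join(reqs).encode())  # one write, shared TCP segments
    data = b""
    while chunk := sock.recv(65536):      # responses arrive in order
        data += chunk
# Splitting `data` into individual responses needs real framing
# (Content-Length / chunked), omitted here for brevity.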
Python makes everything so easy.
I realized this the other day, and dub it Bram's Law -- Bram
Bram's Law
The easier a piece of software is to write, the worse it's implemented in practice. Why? Easy software projects can be done by almost any random person, so they are. It's possible to try to nudge your way into being the standard for an easy thing based on technical merit, but that's rather like trying to become a Hollywood star based on talent and hard work. You're much better off trading it all in for a good dose of luck.
This is why HTTP is a mess while transaction engines are rock solid. Almost any programmer can do a mediocre but workable job of extending HTTP (and boy, have they), but most people can't write a transaction engine which even functions. The result is that very few transaction engines are written, almost all of them by very good programmers, and the few which aren't up to par tend to be really bad and hardly get used. HTTP, on the other hand, has all kinds of random people hacking on it, as a result of which Python has a "fully HTTP 1.1 compliant" http library which raises assertion failures during normal operation.
Remember this next time you're cursing some ubiquitous but awful third party library and thinking of writing a replacement. With enough coal, even a large diamond is unlikely to be the first thing picked up. Save your efforts for more difficult problems where you can make a difference. The simple problems will continue to be dealt with incompetently. It sucks, but we'll waste a lot less time if we learn to accept this fact.
The notable exception is Go, which has a fantastic one. But Go is pretty notable for having an incredible standard library in general.
(As an outsider I had the impression that Go's net/http was good, but a lot of people in this thread are complaining about it as well. So it may be 0-4 instead of 1-3).
Node.js got its production version in 2023.
Rust doesn't include an HTTP client at all.
Even for stdlib that have a client, virtually none support HTTP/3, which is used for 30% of web traffic. [1]
HTTP (particularly 2+) is a complex protocol, with no single correct answers for high-level and low-level needs.
I still hear people complain about how some removal between "minor versions" of Python 3 (you really should think of them as major versions nowadays; "Python 3 is the brand", as the saying goes), where they were warned about the individual functions something like two years in advance, supposedly caused a huge problem for them. It's hard for me to reconcile with the rhetoric I've heard in internal discussions; people there are so worried in general about possible theoretical compatibility breaks that it seems impossible to change anything.
Always remember that open-source is an author’s gift to the world, and the author doesn’t owe anything to anyone. Thus, if you need a feature that for whatever reason can’t or won’t go upstream, forking is just about the only viable option. Fingers crossed!
Put your side project on your personal homepage and walk away - fine.
Make it central infrastructure - respond to participants or extend or cede maintainership.
FOSS means the right to use and fork. That's all it means. That's all it ever meant. Any social expectations beyond that live entirely in your imagination.
There is simply no responsibility an OSS maintainer has. They can choose to be responsible, but no one can force them. Ultimately, OSS licensing is at heart THE solution to this problem. Maintainers go rogue? Fork and move on. But surprise: who is going to fork AND maintain, fulfilling all the demands from the community, for potentially no benefit?
No one can force him to take the responsibility, just like no one can force anyone else to.
This doesn't happen overnight, and it is a spectrum and a choice: from a purely personal side project, to an exotic Debian package, to friggin' httpx with 15k GitHub stars, 100 million downloads a week, and a rank as the 46th most downloaded PyPI package!
If this is to work reasonably at all, you have to step up. Take money (as they do: https://github.com/sponsors/encode), seek fellow maintainers, or cede involvement, even if only temporarily.
An example of a recent, successful transition is UniGetUI https://github.com/Devolutions/UniGetUI/discussions/4444
I feel there should be support from the ecosystem to help with that. The OpenJS Foundation seems to be doing great: https://openjsf.org/projects. The Python Software Foundation could not only host PyPI but also offer assistance for the most important packages.
If you need stronger guarantees, pay someone to deliver them.
One of its intended use cases is bridging contribution gaps: while contributing upstream is ideal, maintainers may be slow to merge contributions for various reasons. Forking in response creates a permanent schism and a significant maintenance burden for what might be a small change. Modshim would allow you to create a new Python package containing only the fixes for your bugbears, while automatically inheriting the rest from upstream httpx.
There are a couple of examples of this in the readme: (1) modifying the TextWrapper object but then using it through the textwrap library's wrap() function, and (2) modifying the requests Session object but then just using the standard requests.get(). Without modshim (i.e., with standard monkey-patching) you would have to re-implement the wrap and get functions in order to bind the new TextWrapper / Session classes.
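To make that concrete: with plain monkey-patching, the alternative to re-implementing those entry points is rebinding the module attribute, which leaks process-wide. A sketch with a hypothetical Session subclass:

import requests

class PinnedSession(requests.Session):
    """Hypothetical subclass, e.g. one that pins retry/TLS settings."""

# requests.get() internally instantiates sessions.Session, so routing it
# through the subclass means rebinding the module attribute:
requests.sessions.Session = PinnedSession

# ...which is a process-wide side effect: every other importer of
# requests now silently gets PinnedSession as well.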
- what's the performance like for big packages (say, pytorch)? Have you done some benchmarking?
- is typing kept for the shims? My immediate guess, again, without even trying it out, is not. If yes, how?
edit: formatting
"So what is the plan now?" - "Move a little faster and not break things"
Loved that little detail, reminds me of the old interwebs :)
- httpx
- curl_cffi
- httpmorph
- httpcloak
- stealth crawler
I wrote a framework, link below, which uses them all. You can compare each to verify crawling speed. Some sites can only be cleanly crawled with one particular framework.
Having read the article, I am in pain. I do break things during development. I rewrite stuff. Maybe some day I will find a way to develop things "stably". One thing I try to keep in good shape is the Docker image; I update it once everything seems to be quite stable.
What uv does is parallelize the final download of packages after resolution, and batch pre-fetch metadata during resolution. I don't think these benefit from async; due to their batch nature, classic multi-threaded download pools are probably the better solution, but I could be wrong!
Experiments have been done on the former in pip and didn't find much (or any) improvement in CPython; this may change in free-threaded CPython. For the latter, we currently don't have the information from the resolver to extract a range of possible metadata versions we could pre-fetch. I am working on this, but it requires new APIs in packaging (the Python library) and changes to the resolver, and again we will need to benchmark to see whether adding pre-fetching actually improves things.
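For context, the classic multi-threaded download pool shape is roughly this (the URL list is a placeholder for resolved package URLs):

from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

urls = ["https://files.example.com/pkg-1.0-py3-none-any.whl"]  # placeholder

def fetch(url: str) -> bytes:
    with urlopen(url) as resp:
        return resp.read()

# Each worker blocks on network I/O, releasing the GIL, so plain
# threads saturate bandwidth without needing an event loop.
with ThreadPoolExecutor(max_workers=8) as pool:
    blobs = list(pool.map(fetch, urls))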
This certainly wouldn't be the first time the author of a popular library got so distracted by the sequel to their library that the current users were left to languish a bit.
Just a small heads-up: clicking on the Leiden Python link on your About Me page does not give the expected results.
And a small nitpick: it's "Michiel's" in English (where it's "Michiels" in Dutch).
Thanks for devoting time to opensource... <3
Or maybe it is that your brain is cooked already, or is on the brink, and your condition attracts you to HTTP and Python, after which it basically has you.
The only way to not go bonkers is to design a library by committee, so that the disease spreads evenly and doesn't hit any one individual with full force. The result will be ugly, but hopefully free of drama.
Whether actively defending your trademark is actually required is a bit of a nuanced topic. Generally, trademarks can be lost through genericide (the mark becomes a generic term for the type of product) or abandonment. Abandonment happens when either the mark owner stops using the mark itself, or takes an action that weakens the mark. The question, then, is whether failing to defend infringing use constitutes a weakening action. Courts differ on this, and there is a large gray area between "we didn't immediately sue a local mom-and-pop shop" and "we allowed a rival company to use the mark erroneously across several states for years without taking action."
I think if he had named it HTTPX2 or HTTPY, that would have been much worse, because it asserts superiority without earning it. But he didn't.