autocfg, bitflags, bytes, cfg-if, fnv, fuchsia-zircon, fuchsia-zircon-sys, futures-channel, futures-core, futures-sink, futures-task, futures-util, h2, hashbrown, http, http-body, httparse, httpdate, indexmap, iovec, itoa, kernel32-sys, lazy_static, libc, log, memchr, mio, miow, net2, pin-project, pin-project-internal, pin-project-lite, pin-utils, proc-macro2, quote, redox_syscall, slab, socket2, syn, tokio, tokio-util, tower-service, tracing, tracing-core, try-lock, unicode-xid, want, winapi, winapi-build, winapi-i686-pc-windows-gnu, winapi-x86_64-pc-windows-gnu, ws2_32-sys
Platform integration: libc, winapi, winapi-build, winapi-i686-pc-windows-gnu, winapi-x86_64-pc-windows-gnu, ws2_32-sys, fuchsia-zircon, fuchsia-zircon-sys, kernel32-sys, redox_syscall
Primitive algorithms: itoa, memchr, unicode-xid
Proc macro / pinning utilities: proc-macro2, autocfg, cfg-if, lazy_static, quote, syn, pin-project, pin-project-internal, pin-project-lite, pin-utils
Data structures: bitflags, bytes, fnv, hashbrown, indexmap, slab
Core Rust async I/O crates: mio, miow, iovec, tokio, tokio-util, futures-channel, futures-core, futures-sink, futures-task, futures-util
Logging: log, tracing, tracing-core
The following are effectively sub-crates of the project: http, http-body, httparse, httpdate, tower-service, h2
Not sure what these are for: net2, socket2, try-lock, want
If those platform crates are just backends for libc (or similar to libc), why aren't they all folded into a single project? Having them as separate crates opens the gate for supply-chain attacks without really allowing greater control or expressiveness. You are always going to pull all of them in, you are unlikely to ever use them directly, and they have no more impact on compilation than features would.
Having to use proc-macro2+quote+syn for macros feels wrong, considering it's a language feature.
Having to pull in 4 crates for pinning also seems wrong. This could easily be a single utility crate, and if you really need all this cruft to use Pin (a language feature) most of this should probably be in std.
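For context, here is roughly the boilerplate those crates exist to remove: manual "pin projection" requires unsafe code that is easy to get wrong. Everything below (the `Timed` wrapper, the hand-rolled no-op waker) is an illustrative std-only sketch, not code from any of those crates:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// What pin-project generates for you, written by hand: a future wrapper
// whose field is "structurally pinned". The unsafe block is exactly the
// boilerplate the pin-project crates eliminate (and prove sound).
struct Timed<F> {
    inner: F, // structurally pinned: must never be moved once polled
}

impl<F: Future> Future for Timed<F> {
    type Output = F::Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        // SAFETY: `inner` is never moved out of `self` after pinning.
        let inner = unsafe { self.map_unchecked_mut(|t| &mut t.inner) };
        inner.poll(cx)
    }
}

// Minimal no-op waker so we can poll on stable Rust without an executor.
fn raw_clone(p: *const ()) -> RawWaker { RawWaker::new(p, &VTABLE) }
fn raw_noop(_: *const ()) {}
static VTABLE: RawWakerVTable = RawWakerVTable::new(raw_clone, raw_noop, raw_noop, raw_noop);

fn noop_waker() -> Waker {
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    // `ready(7)` is Unpin, so pinning on the stack with Pin::new is safe here.
    let mut fut = Timed { inner: std::future::ready(7) };
    assert_eq!(Pin::new(&mut fut).poll(&mut cx), Poll::Ready(7));
}
```

The `unsafe` here is sound only because `inner` is treated consistently as pinned; pin-project's whole job is to enforce that invariant mechanically.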
The async I/O feels like definite bloat. Not only is that a lot of futures-* crates, but I know from first-hand experience that those tend to implement multiple versions of some primitives (like streams) that are incompatible.
They're not; they're bindings to each specific platform's APIs. The libc crate is the package that targets libc on every platform.
> Having to use proc-macro2+quote+syn for macros feels wrong, considering it's a language feature.
In order to ship procedural macros as a language feature at all, the built-in support is pretty minimal. There's a tradeoff here; other designs could have been chosen, but weren't, for good reasons.
> Having to pull in 4 crates for pinning also seems wrong.
This is covered by https://news.ycombinator.com/item?id=24730280; that is, it's a transitive dependency issue.
> The async I/O feels like definite bloat.
Really depends, network calls are a textbook use-case for async I/O. And there's so many futures crates specifically so that you can only depend on the bits you need.
> Since Rust 1.36, this is now the HashMap implementation for the Rust standard library. However you may still want to use this crate instead since it works in environments without std, such as embedded systems and kernels.
I don't know if that use case is important to Hyper or not.
I wish it relied only on OpenSSL.
Looking at the authors and publishers numbers from https://github.com/rust-secure-code/cargo-supply-chain it's clear a lot of these are maintained by the same set of trusted folks.
My main worry about Rust dependencies is not so much the number, it's that it's still a fairly young ecosystem that hasn't stabilized yet, packages come and go fairly quickly even for relatively basic features. For instance for a long time lazy_static (which is one of the dependencies listed here) was the de-facto standard way of dealing with global data that needed an initializer. Apparently things are changing though, I've seen many people recommend once_cell over it (I haven't had the opportunity to try it yet).
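The churn does seem to be settling in places: once_cell's core pattern later landed in std as `std::sync::OnceLock` (stable since Rust 1.70), covering the common lazy-global case with no external crate. A sketch:

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// A lazily initialized global: no lazy_static or once_cell needed.
fn defaults() -> &'static HashMap<&'static str, u32> {
    static DEFAULTS: OnceLock<HashMap<&'static str, u32>> = OnceLock::new();
    DEFAULTS.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert("retries", 3);
        m.insert("timeout_secs", 30);
        m
    })
}

fn main() {
    assert_eq!(defaults()["retries"], 3);
    // Initialization ran exactly once; later calls return the same map.
    assert!(std::ptr::eq(defaults(), defaults()));
}
```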
Things like tokio are also moving pretty fast, I wouldn't be surprised if something else took over in the not-so-far future.
It's like that even for basic things: a couple of years ago, for command line parsing in an app, I used "argparse". It did the job. Last week I had to implement argument parsing for a new app; at first I thought about copy/pasting my previous code, but I went on crates.io and noticed that argparse hadn't been updated in 2 years and apparently the go-to argument parsing lib was now "clap". So I used clap instead. Will it still be used and maintained two years from now? Who knows.
The STL contains lots of things, but most of them are pretty ugly to use, and quite a few are complicated to use because they are generalized for as many use cases as possible. There are so many authors out there who decide they can do memory management manually, write their own thread or process pools, implement sorting in a different way, or rewrite basic algorithms for list operations like mapping, filtering, zipping, reducing...
Simply counting the number of dependencies isn't a great indicator of dependency bloat. There are extremes on both ends: no deps --> I know everything better and reimplemented the world, and thousands of deps --> I put a one-liner in a package, ma! One should not judge too quickly.
> These complaints are valid, but my argument is that they’re also not NEW, and they’re certainly not unique to Rust ... The only thing new about it is that programmers are exposed to more of the costs of it up-front.
While that's true, it completely misses the point. The point is not that dependencies exist, or even that a package might have many dependencies.
The point is, I have found many times that Rust (and NPM) projects don't care about, or even consider, the impact of a large number of dependencies, and often take no steps to mitigate or reduce that number.
As others said, some features could be split off into other crates. Maybe someone only needs HTTP, or maybe they need HTTPS but no async. Or maybe they don't need logging. With Hyper and others you just have to build everything whether you want it or not.
I agree with this, but that's not what your original post says. Or at least, it’s not what I understood from reading it. :)
> The point is, Rust (and NPM)
and C, in many real-world cases, which is why the above post matters.
This is the primary reason I try to avoid projects built with npm. Fucking dependency hell. If the project hasn't been actively maintained in the last 3 months your chances of getting it to work drop precipitously.
That's funny, because it's only "new" if your experiences primarily lie in newer languages and communities. There's a lot of criticism of C, but one thing it does is make dependencies pretty explicit. Some say that's good, some say that's bad, I guess it can be both at different times.
Let's say you open up a new C codebase: What are its dependencies? You'll have to hunt through its README (hopefully it's up to date!), other build instructions, maybe CMake, maybe some custom build system, etc.
What version of the dependencies does it use? If the code has been vendored, you at least know what code it's using - but where do you look for updates to that code? Do you manually go out to wherever it was copied from now and then and look for updates? If the code isn't vendored, then how was it installed? From the package manager? If so, what operating system and version were used during development? If it's not from a package, it might have been downloaded and installed manually. Again, where was it downloaded from? Where was it installed? What options did it use when it was compiled?
How do you handle transitive dependencies? Probably by hand. How well documented are they?
C suffers plenty of dependency issues.
I am all about improving the world, don't get me wrong, but saying "hey this software works just like all the other software" isn't really the insult that you seem to think that it is.
They could have said it straight: "Folks, highly optimized, safe compilation of a medium-sized project will be in the range of 20-30 minutes." That would be a great and honest way to deal with it. Instead we get: oh, we have reduced compile times from 26 minutes to 21 minutes, so that's a 19% improvement in just one year, and there is more to come. Now, that is hard work and a great result, but I am sure Rust committers understand that when people say "fast compile times" they are most likely comparing to Go etc., which would be under a minute for most mid-size projects. And that is very likely not going to happen.
The same now goes for the Cargo dependency situation. Cargo and NPM are ideologically in agreement that people in general should just pull from the package manager instead of rewriting even a little bit of code. And again, instead of owning it, there will be a list of shallow reasoning: others do it too, reusing is better than rewriting, cargo is so awesome that pulling a crate is much easier, and so on.
Plus some of the dependencies are also dependencies of the standard library (hashbrown, cfg-if, libc).
- Secure. Does this contain malicious code, exploits, or otherwise known bugs that could be fixed but aren't? This of course is hard, and will never be perfect. There are static security scans though, especially in a language like Rust, that should be able to verify what the code does before it's published for consumption via cargo. NPM is trying to do something similar. This isn't foolproof and more sophisticated exploits will always exist, but getting the low-hanging and mid-level fruit should be within reach, which is a net win
- Is it quality? This is beyond just secure: does it provide real utility value? One thing we always hear is DRY, which may mean sometimes you consume a lot of dependencies, since the problem space you work in involves a lot of things, so why re-invent every single fix if something exists that you can glue together to start making an impact in your problem domain? I don't think this is an issue, especially if #1 is true
So I don't know, I think it's fine to have a lot of dependencies; I think the justification for those dependencies is often related to the complexity of the work involved. I'd expect a cURL replacement to have quite a few dependencies, since it's complicated software with lots of edge cases, so for instance not re-inventing an HTTP parser is a great idea.
Now if only we all shared as developers this sentiment in terms of upstreaming contributions back too. The more we contribute and share with each other the more productive we can be.
Of course, sometimes package managers do a terrible job at ensuring some sort of base quality, around security or otherwise, and that's never great. So it's important to be aware of the trade-offs you make when you source your dependencies.
By no means does this excuse developers from not understanding their dependency tree either. It's really the opposite.
Either you depend on other's work for that, or you roll your own. Choose your poison.
* tracing is only present for logging purposes. does curl need it? It should be configurable.
* itoa is only present for performance purposes, and only seems used by the server.
* It seems that a bunch of projects in the dependency tree use pin-project, which is heavyweight, and instead could use pin-project-lite. Some already do, which results in both being used, so you are worse off than with pin-project alone...
* hyper contains code for both http servers and clients. Even if the dependencies are the same (and as seen above they are not), having to compile the server code for the client means an increase in compile time. It would be cleaner to provide separate server and client flags to make it possible to turn one off.
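A sketch of the kind of gating being suggested, as a hypothetical Cargo.toml fragment (feature names are illustrative, not hyper's actual manifest):

```toml
# Hypothetical: let consumers compile out the halves they don't use, e.g.
# `cargo build --no-default-features --features client` for a client-only build.
[features]
default = ["client", "server"]
client = []
server = []
logging = ["tracing"]            # make the tracing dependency opt-in

[dependencies]
tracing = { version = "0.1", optional = true }
```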
Technically it’s still a dependency but the standard library is maintained with a standard that is rarely matched by third party libraries, and can dramatically simplify the ecosystems’ dependency graph.
E.g. the async ecosystem lets you pick between runtimes with different complexity and tradeoffs. Std only picked up the essential traits that allow other crates to interoperate with each other, and that the language itself needs to define async functions.
I mean, you can go both ways with this. Standard libraries are significantly more difficult to work on than third party libraries, and I've seen a lot of code in standard libraries that is objectively worse than ecosystem equivalents because of it.
[0] https://internals.rust-lang.org/t/scaling-back-my-involvemen...
1. CURL without HTTPS seems insufficient nowadays. 2. CURL could be improved by running multiple downloads at once. I'm not sure the curl command line utility can do it, but libcurl.so certainly has this ability; it allows client code to work with multiple connections. 3. Any application with a UI could benefit from async: input/output and the main task are async by nature. For example, curl might want to show progress/status in the terminal despite a stalled connection.
So maybe Hyper is too much code for curl, but it is arguable that it is not.
2. I'm not sure that CURL should necessarily support multiple concurrent downloads. It could also be argued it's more UNIX-y to make it just do one thing and allow the caller to run multiple CURL processes at the same time
3. You wouldn't necessarily need async to achieve these goals. You could easily have a synchronous http implementation which allows for printing to the console between receiving chunks of data from the network. And if you really didn't want blocking to have a "spinning" activity indicator or something, you could still achieve it with threads.
I think ultimately you'd have to decide based on the relative cost of including an entire runtime vs. just launching a second thread.
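For what it's worth, the thread/sync approach needs nothing beyond std. A rough illustration, using an in-memory buffer in place of a real network stream:

```rust
use std::io::Read;

// Sketch: progress reporting without async. We read synchronously from any
// `Read` source and print progress between chunks; a real client would pass
// a TcpStream here instead of a byte slice.
fn download<R: Read>(mut src: R) -> std::io::Result<u64> {
    let mut buf = [0u8; 4096];
    let mut total = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // end of stream
        }
        total += n as u64;
        eprint!("\rdownloaded {} bytes", total); // progress between chunks
    }
    eprintln!();
    Ok(total)
}

fn main() {
    let fake_body = vec![0u8; 10_000]; // stand-in for a response body
    assert_eq!(download(&fake_body[..]).unwrap(), 10_000);
}
```

A spinner during a stalled read would need the second thread mentioned above, but no runtime.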
The latter is a better question because:
* It's directly connected to your security posture.
* It's a stable metric across languages with different norms about module size.
bytes, futures-core, futures-channel, futures-util, http, http-body, httpdate, httparse, h2, itoa, tracing, pin-project, tower-service, tokio, want
https://gist.github.com/seg-lol/0d22cf5002f890305cfd094f9ed1...
*edit, passed in -e no-dev to `cargo tree`, thanks @est31 for the suggestion
[1] https://daniel.haxx.se/blog/2020/10/09/rust-in-curl-with-hyp...
So Rust aborts on invalid memory accesses, unwrap on None, etc. It does not abort on memory leaks. I don’t see Rust aborting in that context as much different from a segfault, and it guards against more situations than a segfault is able to do. Additionally, when stack unwinding is enabled (default) aborts can be caught during runtime and handled specially, if that’s necessary.
Edit: I said “aborts” above, I should have said “panics”. The option in Rust is to disable unwinding and instead abort immediately: https://doc.rust-lang.org/edition-guide/rust-2018/error-hand...
That can’t be caught at runtime, to be clear.
Rust will panic on these things, and panics can abort, or unwind. Unwind is the default.
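A minimal sketch of the default unwind behaviour, standard library only:

```rust
use std::panic;

fn main() {
    // Under the default panic=unwind, a panic can be caught at a boundary:
    let result = panic::catch_unwind(|| {
        let v: Vec<i32> = Vec::new();
        v[3] // out-of-bounds index panics rather than reading wild memory
    });
    assert!(result.is_err());
    // Under `panic = "abort"` in Cargo.toml, the same code would terminate
    // the process instead, and catch_unwind could not intercept it.
}
```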
That's not what's being talked about here, I don't think. This is about alloc::alloc::handle_alloc_error, which was not allowed to unwind at the time the linked comment was made. But in the last few hours, https://github.com/rust-lang/rust/pull/76448 was linked to, which shows how that has since changed.
handle_alloc_error seems to do well enough as a workaround, but (from my superficial reading of the GitHub thread) it feels like just a very specific "poor-man's try/catch" for this one particular case. It feels like a workaround.
In general I'm a big fan of Result/panic in place of traditional exceptions, but this use case makes it far from ideal
The overall result is that everyone pays the cognitive cost of exception (spelled "unwind" in Rust) safety, pays the syntactic and runtime costs of error code checking, pays the runtime cost of unwind tables, and still can't actually rely on unwinding to actually work, because anyone can just turn panics into aborts.
I hope for a language with Rust's focus on memory safety but without Rust's weird fashionable-in-the-2010s language design warts.
Very, very few people have to think about unwind safety, because it really only comes into play when you're writing unsafe code and relying on the ability for panics to be caught. Many folks aren't writing any unsafe, and many who are, are doing it explicitly in a panic=abort environment. And most don't rely on panics being catchable in the first place.
So, some subset of library authors have to pay attention to unwind safety in some cases. This is hardly "everyone."
> pays the syntactic
It's one character.
> and runtime costs of error code checking,
Exceptions also have runtime costs. I'm not aware of anything demonstrating that there is a really major difference between the two in real systems. I would be interested in reading more about this! Most of the discussion I've seen focuses on microbenchmarks, which can be very different than real code.
> pays the runtime cost of unwind tables,
Only if you want them, as you yourself mention, you can turn this off. And many do.
The Rust way of dealing with errors is to return a Result<> and in my experience it's the best system I've used. It doesn't have the code flow breaking aspect of exceptions while being a lot nicer and less error-prone than C or Go style error handling.
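For readers unfamiliar with the pattern, a minimal sketch (the `parse_port` helper is hypothetical):

```rust
use std::num::ParseIntError;

// Errors are ordinary values: the signature says this can fail, and the
// caller must handle that (or forward it with `?`).
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    let n: u16 = s.trim().parse()?; // `?` forwards the error to the caller
    Ok(n)
}

fn main() {
    assert_eq!(parse_port("8080"), Ok(8080));
    assert!(parse_port("not-a-port").is_err());
}
```

Unlike exceptions, nothing jumps over intervening code; unlike C-style codes, the compiler won't let you use the `u16` without first acknowledging the error case.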
I happen to like exceptions in principle, but for a lot of low-level embedded code they're a deal breaker.
Looks like they've figured out a good way to allow bringing in safety while avoiding the risks any change will bring
Glad to see it seems to be going well!
I see that Stenberg (bagder) is receiving funding for the work from the ISRG, but I wonder if McArthur (seanmonstar) is, too? It seems like a sizable amount of work on their part, too.
- The borrow checker enforces mutable XOR shared references.
- The compiler does not allow use of local variables before they're assigned, requires structs to be completely initialized, etc.
- All the built-in data structures perform bounds checks.
- The compiler disallows dereferencing raw pointers except in unsafe blocks.
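A couple of these guarantees in a few lines of std-only code (the borrow-checker rejection is left in comments, since by design it doesn't compile):

```rust
fn main() {
    let mut v = vec![1, 2, 3];

    // Mutable XOR shared: uncommenting these lines fails to compile,
    // because `push` may reallocate and invalidate `first`.
    // let first = &v[0];      // shared borrow...
    // v.push(4);              // ...rejected: cannot mutate while borrowed
    // println!("{}", first);

    // Bounds checks on built-in data structures:
    assert_eq!(v.get(10), None); // checked access returns Option
    let r = std::panic::catch_unwind(move || v[10]); // indexing panics,
    assert!(r.is_err());                             // never reads OOB memory
}
```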
There's a lot of good things to be said about modern C++, particularly smart pointers. However, it's significantly less resilient to common mistakes than Rust is: https://alexgaynor.net/2019/apr/21/modern-c++-wont-save-us/
> Dereferencing a nullptr gives a segfault (which is not a security issue, except in older kernels). Dereferencing a nullopt however, gives you an uninitialized value as a pointer, which can be a serious security issue.
...betrays a complete lack of understanding what Undefined Behavior is/implies. That's not something you want to see in an article discussing memory safety.
> But what are they?
The language's semantics are such that by default, you get memory safe code. This is checked at compile time. While many languages are memory safe, they often require a significant amount of runtime checking, with things like a garbage collector. Rust moves the vast majority of these kinds of checks to compile time, and so has the performance profile of C or C++, while still retaining memory safety.
> How does it compare with modern C++?
One way to look at Rust is "modern C++, but enforced, and by default." But that ignores some significant differences. For example, Rust's Box<T> and std::unique_ptr are similar, but the latter can still be null, whereas Rust's can't. C++ cannot be checked statically for memory safety; even if modern C++ helps improve things, it doesn't go as far as Rust does.
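A tiny illustration of the Box<T> point: in Rust, nullability is opt-in and visible in the type.

```rust
fn main() {
    // A Box<i32> always points at a live value; there is no null state.
    let b: Box<i32> = Box::new(5);
    assert_eq!(*b, 5);

    // If you want "maybe absent", you say so in the type, and the compiler
    // forces you to handle the None case before you can touch the value.
    let maybe: Option<Box<i32>> = None;
    assert!(maybe.is_none());
}
```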
For example, the type system includes a piece called the borrow checker, which is able to guarantee that pointers are still valid when you use them, eliminating use-after-free bugs (buffer overflows are caught by the bounds checks mentioned elsewhere).
In a similar vein, the type system includes information about in which ways types may be shared across threads, and by using this information, the compiler can guarantee that there are no data races whatsoever in multi-threaded programs.
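A small sketch of what that buys you in practice: `Send`/`Sync` let this compile, whereas swapping `Arc<Mutex<_>>` for, say, `Rc<RefCell<_>>` would be rejected at compile time because `Rc` is not `Send`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Shared, mutable state across threads: the types prove it's safe.
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&counter);
            // Moving `c` into the thread requires it to be Send.
            thread::spawn(move || {
                *c.lock().unwrap() += 1; // mutation only through the lock
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(*counter.lock().unwrap(), 4);
}
```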
> We’d like to thank Daniel for his willingness to be a leader on this issue. It’s not easy to make such significant changes to how wildly successful software is built, but we’ve come up with a great plan and together we’re going to make one of the most critical pieces of networking software in the world significantly more secure. We think this project can serve as a template for how we might secure more critical software, and we’re excited to learn along the way.
cURL is widely available and widely used, obviously, but I'm surprised to see it described this way. I've always seen it mainly as a way for people and scripts to conveniently try out endpoints and download files. But this makes it sound like more than that; does it get widely used in an infrastructural capacity?
> At an estimated six billion installations world wide, we can safely say that curl is the most widely used internet transfer library in the world. [...] curl runs in billions of mobile phones, a billion Windows 10 installations, in a half a billion games and several hundred million TVs - and more.
https://blogs.windows.com/windowsdeveloper/2020/04/30/rust-w...
As a user of software, it makes me happy to know that folks are investing in making the nuts and bolts safer and more secure.
Heck, even Checked C would do, if it ever gets fully done.
In any case, looking forward to the results.
It is not an either/or proposition. Certain, select modules could be recoded in Rust by particularly motivated Rust coders, leaving the huge amount of other code, for which there are too few Rust enthusiasts to work on, to be modernized in C++, and still able to call into the Rust code.
https://github.com/google/tarpc
https://github.com/actix/actix-web/tree/master/actix-http/sr...
https://github.com/hyperium/h2
https://github.com/djc/quinn/tree/main/quinn-h3
https://github.com/speakeasy-engine/torchbear/blob/master/sr...
There's beauty in this with the fluency in which complex applications like the coming secure social network, Radiojade, are built. See this example:
https://github.com/foundpatternscellar/ping-pong/blob/master...
Curl users, do you really want to stick with Bash's syntax instead of this??
- support HTTP/HTTPS
- support proxy (for by-passing firewall, censorship, etc, http/https/socks5)
- download one large file in parallel (configurable temporary directory)
- download many small files in parallel (seems too high-level to put in a library, not sure this is a good feature)
- configurable retry (maybe too high-level to put in a library)
- resume download
- good error semantics
- an interface with defined behaviour
- progress report (useful for downloading large files)
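Sketching what such a library's surface might look like (all names are hypothetical, just to make the wishlist concrete):

```rust
use std::path::PathBuf;
use std::time::Duration;

// Hypothetical options struct covering the wishlist; not a real crate's API.
#[derive(Debug)]
pub struct DownloadOptions {
    pub url: String,
    pub dest: PathBuf,
    pub temp_dir: Option<PathBuf>, // configurable temporary directory
    pub parallel_chunks: u32,      // range-request one large file in parallel
    pub retries: u32,              // configurable retry
    pub retry_backoff: Duration,
    pub resume: bool,              // resume a partial download
}

// Defined, enumerable failure modes instead of a bag of exit codes.
#[derive(Debug)]
#[allow(dead_code)]
pub enum DownloadError {
    Network(String),
    Http { status: u16 },
    Io(String),
    RetriesExhausted,
}

fn main() {
    let opts = DownloadOptions {
        url: "https://example.com/big.iso".to_string(),
        dest: PathBuf::from("big.iso"),
        temp_dir: None,
        parallel_chunks: 4,
        retries: 3,
        retry_backoff: Duration::from_secs(1),
        resume: true,
    };
    assert_eq!(opts.parallel_chunks, 4);
    assert!(opts.resume);
}
```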
I tried using a wrapped (in Rust) version of libcurl, and in the end I decided to just use the curl CLI: read through the man page and pass about 13 arguments to make its behaviour defined (to me, to a certain confidence level). I also pinned the curl executable to a specific version to avoid unknown changes.
The end result works, but the process is unnecessarily complicated (invoke the CLI binary, know what arguments to pass, know the meaning of the many error codes), and resuming is not pleasant to use. I guess libcurl is designed that way so that a curl master can tune all the knobs to do what he wants, but for an average library user who just wants to download things, it requires more attention than I'm willing to give.
Used in an interactive context, the issue of defined behaviour is usually overlooked, but when used as a library in a program that runs unattended and is expensive to upgrade/repair, achievable defined behaviour is a must. Testing is not an alternative to it, and even experience is not an alternative (experience is time-consuming to get, and not transferable to others).
All package managers need to download packages from the internet, often via HTTP, so it's good to have an easy-to-use, well-defined, capable download library. Many of them use curl (Arch Linux's pacman, the Rust installation script), and many use others with varying levels of capability. I think it would be beneficial if we could have a good library (in Rust) for downloading things.
The --libcurl command line argument can help translate curl to libcurl.
Rust is dual APL2/MIT, and Curl's modified MIT is compatible with MIT, so no such issue would exist for Rust.
(Standard disclaimer, I'm not your lawyer, no citations offered, seek legal counsel.)
I do not know much about Wuffs, but it seems to be completely safe. No arithmetic overflows, no bound checking failures, no None unwrapping panics, no memory allocation failure panics.
makes me wonder about other interactive tooling. would be interesting if there were malicious binaries that were benign at runtime but triggered bugs in debuggers and profilers.
For instance libcurl can use any of 13 different TLS backends (one of which is already in Rust), or 3 different HTTP/3 backend (one of which is in rust).
Piping it to "sudo bash" is perfectly acceptable in the eyes of the system. It's following the instructions the user asked it to; they've explicitly been configured as sudoers, and usually have been prompted to enter their password.
What this looks like it does, and indeed does today (modulo bugs some of which could be prevented using Rust) is:
Ask totally-not-evil.example.com for this install.sh resource and then run that as root as a Bash script. This is no worse than if you were to have totally-not-evil.example.com give you the bash script on a floppy disk or something. If you suspect they might actually be evil, or just incompetent, that's on you either way.
But for some years curl didn't make any effort to confirm it was getting this file from totally-not-evil.example.com. Connect over SSL, ignore all this security stuff, fetch the file. So then it's like you just accepted a floppy disk you got in the mail which says it's "from totally-not-evil.example.com" but might really be from anybody. That's definitely worse. Today you have to specify the --insecure flag to do this if you want to (Hint: You do not want to)
curl has verified the server certificates by default since version 7.10, shipped in October 2002.
In practice, perhaps. But detecting whether the file is being piped to bash (and not cat/less/grep/etc) is pretty straightforward.
https://www.idontplaydarts.com/2016/04/detecting-curl-pipe-b...
Glad you feel safer though.
How dare you.
I love the idea of a safer cURL, but I don't think you should take this as a magical answer to all of cURL's problems.
[1]https://web.archive.org/web/20200506212152/https://medium.co... [2] I ran `grep -oR unsafe . | wc -l` after cloning the repo
Is anyone actually suggesting this?