I don't doubt this in the least. I've been a professional Python developer for 15 years, and I can't believe Python ever had the reputation for "high dev velocity" beyond toy examples. In every real world code base I've worked in, Python has been a strict liability and the promise that you can "just rewrite the slow parts in C/C++/Numpy/etc" has never held (you very often spend at least as much time marshaling the data to the target language format as you gain by processing in the faster language, and you have all of the maintainability problems of C/C++/Numpy). Python trades developer velocity for poor runtime performance.
I don't think Rust is the paragon of high dev velocity languages either, but it seems to be more predictable. You don't have something that works as a prototype, but then you run into a bunch of performance problems (which virtually cannot be worked around) when you go to productionize, nor do you get all of the emergent quality and maintainability issues that come about from a dynamic type system (and last I checked, Python's optional static type system was still pretty alpha-grade).
I strongly recommend avoiding Python for new projects. Use Go if you're looking for an easy GC language with max productivity and a decent performance ceiling. Use Rust if you're writing really high performance or correctness-is-paramount software. I'm sure there are other decent options as well (I've heard generally good things about TypeScript, but I'd be concerned about its performance even if it is better than Python).
This is arguably true for small scripts, but for anything non-trivial I find that static typing means I can be more productive, because type errors are caught straight away (rather than potentially in production), and I can be much more confident in refactoring code because I know certain kinds of errors will be caught during type checking. This doesn't remove the need for automated testing, but does reduce the number of test cases you have to write.
Things have improved a lot in the Python ecosystem of late with the introduction of type annotations and mypy, so to the extent possible I treat Python as if it's a statically typed language, putting type annotations everywhere and always making sure the codebase passes type checking before pushing a commit.
However, the fact that this is optional sadly means that not everyone does it, and if you're working with parts of a codebase written without this level of discipline, you're basically back to square one. I've come to the conclusion that for medium to large projects, dynamic typing encourages sloppy programming, because it doesn't force people to think as carefully about what exactly their data types are, and whether they're being used correctly.
I agree completely. I feel like there’s a pretty common arc among programmers:
1. learn to program using verbose statically typed languages
2. discover fun dynamically typed languages, eschew statically typed languages
3. discover dynamically typed languages are a shitshow for large real-world projects
4. re-discover fun/less verbose statically typed languages that make working on large projects tolerable
(Obviously, I am talking about CS students here, not about the general public who just want to learn what writing a program means)
    auto x { get_something_complicated() };
or
    auto foo(int x) {
        return get_something_complicated(bar(x));
    }
so it's less "not-fun" these days.
It is still more fun and efficient to program in a language that is garbage collected and statically typed.
I'll add to that: static typing also guarantees that your annotations are correct and complete. Prior to Mypy, in the Python world, we would document types in docstrings, which meant that docstrings gradually came to be incorrect--a function that returns a string might sometimes return `None`, but you called it assuming it always returned a string. In other cases, people (including the Python maintainers via the stdlib) will add annotations like "argument 'foo' is a file-like object", which is ... insufficient (do I just need a "read()" method? or do I need to implement "seek()" and "close()" as well?). Ultimately, in the absence of static analysis, a lot of time is wasted reading source code trying to understand the actual type signatures.
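To make the `None` and "file-like object" points concrete, here is a rough Rust sketch of the same two situations. Everything in it (`find_nickname`, `read_all`, `read_tail`) is a made-up name for illustration, not anything from the article or from Python's stdlib. The point is only that "might be absent" and "needs `read` only" versus "needs `read` and `seek`" end up in the signature, where the compiler checks them, instead of in a docstring.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

// Hypothetical lookup. The `Option` in the return type forces every caller
// to handle the "no value" case that a stale docstring would hide.
fn find_nickname(user: &str) -> Option<String> {
    match user {
        "alice" => Some("Al".to_string()),
        _ => None,
    }
}

// "File-like" spelled out: this function only needs `read`.
fn read_all<R: Read>(mut input: R) -> std::io::Result<String> {
    let mut text = String::new();
    input.read_to_string(&mut text)?;
    Ok(text)
}

// This one also needs `seek`, and the trait bound says so.
fn read_tail<R: Read + Seek>(mut input: R, n: u64) -> std::io::Result<String> {
    // Jump to `n` bytes before the end, then read the rest.
    input.seek(SeekFrom::End(-(n as i64)))?;
    let mut text = String::new();
    input.read_to_string(&mut text)?;
    Ok(text)
}

fn main() -> std::io::Result<()> {
    // The caller has to decide what happens when there is no nickname.
    let nickname = find_nickname("bob").unwrap_or_else(|| "unknown".to_string());
    println!("{nickname}");
    println!("{}", read_all(Cursor::new("hello world"))?);
    println!("{}", read_tail(Cursor::new("hello world"), 5)?);
    Ok(())
}
```

Passing something that implements `Read` but not `Seek` to `read_tail` is rejected at compile time, which is the question the "file-like object" docstring leaves open.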
> Things have improved a lot in the Python ecosystem of late with the introduction of type annotations and mypy, so to the extent possible I treat Python as if it's a statically typed language, putting type annotations everywhere and always making sure the codebase passes type checking before pushing a commit.
This hasn't been true in my experience. Last I checked, Mypy still couldn't express things like recursive types (think "JSON" or "recursive linked list"), and a lot of things still required jumping through obscure hoops (defining a callback with kwargs). I also couldn't figure out how to export type annotations in my packages, and I had a lot of trouble getting mypy to pull in type stubs for packages that didn't provide their own annotations. In general, Python's typing story still feels very shoe-horned, although it's been a minute since I last fumbled with it, so maybe some of this has improved.
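For comparison, this is roughly what the "JSON" recursive type looks like in a language where recursive types are a first-class thing. It's a minimal sketch in Rust; the `Json` enum and the `depth` helper are illustrative names I made up, not a real library's API.

```rust
// A minimal recursive JSON-like value. Variant names are illustrative.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Text(String),
    Array(Vec<Json>),            // recursion: an array holds more Json values
    Object(Vec<(String, Json)>), // recursion: field values are Json too
}

// Nesting depth of a value: scalars are 0, each container adds 1.
fn depth(value: &Json) -> usize {
    match value {
        Json::Null | Json::Bool(_) | Json::Number(_) | Json::Text(_) => 0,
        Json::Array(items) => 1 + items.iter().map(depth).max().unwrap_or(0),
        Json::Object(fields) => {
            1 + fields.iter().map(|(_, v)| depth(v)).max().unwrap_or(0)
        }
    }
}

fn main() {
    let doc = Json::Object(vec![
        ("name".to_string(), Json::Text("post".to_string())),
        ("draft".to_string(), Json::Bool(false)),
        ("tags".to_string(), Json::Array(vec![Json::Number(1.0), Json::Null])),
    ]);
    println!("depth = {}", depth(&doc));
}
```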
Fully agreed. It's not quite the same as Rust vs. Python, but I've pulled off several sweeping refactors and rewrites in Swift codebases that I wouldn't dream of even attempting with Objective-C.
Story -- I was a very early adopter of Python, back in the mid-90s. When other people wrote their CGI scripts in Perl, I always reached for Python. The first paid gig I ever had was a CGI script ("resume builder") I wrote in Python in 1996. But almost nobody was using it back then, and I'd get quizzical stares from people in job interviews etc. when it came up.
So for many years back then I really really wanted to get a job working in Python. And I was super excited to get a job around the 2001 time frame in it on a pretty cool embedded Linux project.
But my first experience working in a large codebase was disappointing. I could see right away that once many developers started working in that codebase together, best practices were really hard to enforce, and things started to fall apart into a bit of a mess, and lots of problems just showed up at runtime in bad explosions. That was the beginning of my disenchantment with dynamically typed, late-bound languages. They only give you the illusion of fast prototyping. You just shift your time to fixing type and binding errors later in your dev cycle.
Python really improved a lot with the 2->3 transition, and the community has much better best practices now. But I think the core problem with late bound languages remains.
All that said, I'm currently in another window working for my current employer wiring an embedded CPython interpreter into a Rust-based runtime...
Also, I think people misread 'Python speed'. A few C++ masters said it was more about prototyping to converge on a good overall design, rather than suffering through the early iterations in C++ (90s C++).
I still use it for quick one-off scripts to "do things" with bits of data and so on. And as a calculator.
That is the perfect way to describe it! And sometimes you’re out of the dev cycle and into production if your test suite didn’t hit all cases.
I'd add one more reason to use Rust (speaking as an ex-Go dev): use Rust if you want a robust std lib. Go is good, and I used it for ~5 years, but man, Rust was a breath of fresh air with the amount of tooling that helped me solve problems. Iterators are a great example.
I also preferred crates over GitHub repos, but I think that's just personal preference, not objective.
Here's an example where the iterator-chain version is way more complex:
    let posts = read_dir(source_directory)?
        .filter_map(|result| {
            (|| {
                let entry = result?;
                let os_file_name = entry.file_name();
                let file_name = os_file_name.to_string_lossy();
                if Self::is_bundle(&entry)? {
                    self.parse_post_bundle(&file_name, &entry.path())
                        .map(Some)
                } else if file_name.ends_with(MARKDOWN_EXTENSION) {
                    self.parse_post(
                        file_name.trim_end_matches(MARKDOWN_EXTENSION),
                        &mut File::open(&entry.path())?,
                    )
                    .map(Some)
                } else {
                    Ok(None)
                }
            })()
            .transpose()
        })
        .collect::<Result<Vec<_>>>()?;
Here's the imperative version:
    let mut posts = Vec::new();
    for result in read_dir(source_directory)? {
        let entry = result?;
        let os_file_name = entry.file_name();
        let file_name = os_file_name.to_string_lossy();
        if Self::is_bundle(&entry)? {
            posts.push(self.parse_post_bundle(&file_name, &entry.path())?);
        } else if file_name.ends_with(MARKDOWN_EXTENSION) {
            posts.push(self.parse_post(
                file_name.trim_end_matches(MARKDOWN_EXTENSION),
                &mut File::open(&entry.path())?,
            )?);
        }
    }
If you're going to write an iterator chain in an imperative way, then you might as well write it imperatively.
A few notes:
- you can compare `&OsStr` with `&str` without `to_string_lossy` or any explicit conversion (`&str -> &OsStr` done in `PartialEq` is basically free)
- you can get extension from `Path` by using `Path::extension`
- you can get filename without extension by using `Path::file_stem` or `Path::file_prefix`
- making your parse functions take the same arguments makes it a lot cleaner
Something like this:
    read_dir("")?
        .map(|e| e.map(|e| e.path()))
        .map(|path| {
            path.and_then(|path| {
                // `file_name`/`file_stem` return `Option<&OsStr>`; fall back to "" when absent
                if Self::is_bundle(&path) {
                    self.parse_post_bundle(&path.file_name().unwrap_or_default().to_string_lossy(), &path).map(Some)
                } else if Self::is_markdown(&path) {
                    self.parse_post(&path.file_stem().unwrap_or_default().to_string_lossy(), &path).map(Some)
                } else {
                    Ok(None)
                }
            })
        })
        .filter_map(Result::transpose)
        .collect::<Result<Vec<_>, std::io::Error>>()
    // Ignore this, just an example
    fn is_bundle(path: &Path) -> bool {
        path.extension().map(|ext| ext == "bundle").unwrap_or(false)
    }
    fn is_markdown(path: &Path) -> bool {
        path.extension().map(|ext| ext == "md").unwrap_or(false)
    }
I personally use iterator-style for 1-3 simple ops, but go into a loop for anything more complicated. The functional part of me wants to force iterator style but I fight it for complex use cases because it is irrational on my part, and I find loops easier to read for stuff like that.
I however will often have a dozen maps/filters/etc and that style in imperative is what, nesting repeatedly or using `continue`. Meh.
If your Iterator consists of a single `filter_map`, I can see why you don't get much value out of them in that case. It's because.. well, there's not much value to get out of it.
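For what it's worth, here is a rough sketch of the "several maps/filters" case next to the loop-with-`continue` version, so people can judge for themselves. The input format (lines of `name,score` with `#` comments) and both function names are invented for this example.

```rust
// Chain version: each step is one small, independent transformation.
fn top_scores_chain(lines: &[&str], min: u32) -> Vec<(String, u32)> {
    lines
        .iter()
        .map(|line| line.trim())
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once(','))
        .filter_map(|(name, score)| {
            // Skip rows whose score does not parse.
            let score = score.trim().parse::<u32>().ok()?;
            Some((name.trim().to_string(), score))
        })
        .filter(|(_, score)| *score >= min)
        .collect()
}

// Loop version: the same steps, with `continue` for every rejected row.
fn top_scores_loop(lines: &[&str], min: u32) -> Vec<(String, u32)> {
    let mut out = Vec::new();
    for line in lines {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let Some((name, score)) = line.split_once(',') else {
            continue;
        };
        let Ok(score) = score.trim().parse::<u32>() else {
            continue;
        };
        if score < min {
            continue;
        }
        out.push((name.trim().to_string(), score));
    }
    out
}

fn main() {
    let lines = ["# header", "ann, 90", "", "bob,x", "cy,40", "dee , 75"];
    println!("{:?}", top_scores_chain(&lines, 50));
    println!("{:?}", top_scores_loop(&lines, 50));
}
```

Both functions return the same rows. Note that neither has to propagate an error out of the middle of the pipeline, which is the part that made the `read_dir` chain above awkward.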
The tooling is the #1 reason I'd like to learn rust. I have not kept up with C++ and I'm not sure I ever will... sometimes plain C seems more straightforward. But those languages (C++ especially) have suffered from lack of standard tooling, in my opinion.
Even C# being a few years younger than Java seems to have made a huge difference in tooling availability (I'm sure having a single corporate driver during its lifetime made a difference, too).
I'm excited by the prospect of a compiled language with modern tooling.
Rust has been very rewarding in that it largely fixes all of C++’s tooling and language problems. It very much is a significantly improved C++.
Overall tooling and the std lib are better in Go; actually, there are not many languages that are on par with Go in that regard.
When you see what you can do with the single go command, it's pretty powerful.
Go doesn't have a standard library problem, it has a userland language problem.
* No sum types / algebraic data types in 2022.
* No exhaustive pattern matching in 2022
* No move semantics / Uses GC
* No borrow checker
* Still suffers from nil problem / No type-safe nil / No type-safe Optionals in 2022 -- (I can show you all the nil panics in kubernetes logs if you like)
Current go users are already sold so they don't expect more, and that's fine, but further evangelism will require a better host language.
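For anyone who hasn't used them, the first few items on that list look roughly like this in Rust. This is only a sketch: `Payment`, `find_payment`, and the sample values are hypothetical.

```rust
// A sum type: a payment is exactly one of these variants, each with its own data.
#[derive(Debug, Clone, PartialEq)]
enum Payment {
    Cash,
    Card { last4: String },
    Transfer { iban: String },
}

// Exhaustive matching: adding a new variant to `Payment` makes this function
// fail to compile until the new case is handled.
fn describe(payment: &Payment) -> String {
    match payment {
        Payment::Cash => "cash".to_string(),
        Payment::Card { last4 } => format!("card ending in {last4}"),
        Payment::Transfer { iban } => format!("transfer from {iban}"),
    }
}

// Hypothetical lookup. "Might be missing" is an `Option`, not a nil pointer.
fn find_payment(id: u32) -> Option<Payment> {
    match id {
        1 => Some(Payment::Cash),
        2 => Some(Payment::Card { last4: "4242".to_string() }),
        3 => Some(Payment::Transfer { iban: "EXAMPLE-IBAN".to_string() }),
        _ => None,
    }
}

// The caller cannot reach the payment without going through the `None` case.
fn receipt_line(id: u32) -> String {
    match find_payment(id) {
        Some(payment) => describe(&payment),
        None => "no payment on record".to_string(),
    }
}

fn main() {
    for id in 1..=4 {
        println!("{id}: {}", receipt_line(id));
    }
}
```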
[0]: https://tokio.rs/
I'd argue it's almost certainly something that should NOT be in the standard library. If anything, the feature was added to Rust too quickly and has way too many rough edges.
I've noticed however that there has been an uptick in great libraries over the last 2 years, with examples like pola.rs, rust-bert, tokenizers etc. starting to build momentum in the ecosystem.
As I understand it, the article talks about switching from python to rust in a non-trivial database service that is required to be fast and robust. I can't imagine python to do well in this regard, especially when C/C++ extensions are required to enhance the performance.
I also agree that 'Python has been a strict liability and the promise that you can "just rewrite the slow parts in C/C++/Numpy/etc" has never held'.
I just don't agree those points necessarily imply "avoiding Python for new projects". Even in the case of the original article, was it, in hindsight, a bad decision to start with python? I'd argue not necessarily. Python is still great for quickly hacking together something that works, not necessarily fast or maintainable, but if you're under time pressure or you're still trying to feel out the market fit for your product, it's not a great idea to waste brain cycles to deal with borrow checks for example.
Python's weaknesses only show when you need to scale, both on the performance and on the lines of code / number of collaborators metric, which is a problem you have only when the project succeeds. Avoiding python because of these problems reeks of premature optimization, unless you know it's how the project will end up in advance.
As other sibling comments point out, python is still great for small scripts and weekend projects. Perhaps you do mean 'avoiding Python for new projects that are expected to grow non-trivially', which is fine advice. I'm just a bit tired of all those over-reaching blanket statements people make regarding software design that only really make sense in a limited context (here's a funny one I've heard recently: never write nested-for-loops because it's sloooww). It would be unfortunate to have the message "don't use python for large projects" be misinterpreted as "python sucks no matter what".
The article specifically talked about following the conventional Python advice to use C/C++ for the fast parts and Python for the "glue". This is precisely where Python people say Python shines.
> I just don't agree those points necessarily imply "avoiding Python for new projects".
You should avoid Python for new projects because you never know at the outset whether you will run into a performance bottleneck that can't be resolved by "just rewrite in C/numpy/etc" and there are other languages that are better than Python at both its strengths and weaknesses. The odds of painting yourself into a corner with Python are very, very high, and there are other languages that are better than Python at just about everything (the only area where Python still dominates is data science exploration libraries). Even if you don't paint yourself into a corner, you'll be dealing with lots of surprising papercuts: if you want reproducible builds, you'll have to wade through an ecosystem of package managers each with their own pitfalls including long build times (e.g., I saw pipenv take 30-45 minutes just to `pipenv lock` on a small-medium sized project); developing with docker basically doesn't work on mac (and windows?) because the VM uses all available CPU marshaling filesystem events over the guest/host boundary; your developers accidentally pulled in an API SDK that makes sync I/O calls (or else does some CPU-intensive task) which seizes up the event loop bringing the process down (and the failure cascades to remaining instances bringing the whole app down); and then there's the "regular" dynamic type system errors like forgetting to "await" things or missing/incorrect type documentation.
> Even in the case of the original article, was it, in hindsight, a bad decision to start with python? I'd argue not necessarily. I'd argue not necessarily. Python is still great for quickly hacking together something that works, not necessarily fast or maintainable, but if you're under time pressure or you're still trying to feel out the market fit for your product, it's not a great idea to waste brain cycles to deal with borrow checks for example.
My point isn't that Python is bad for literally every project--I actually think this project was probably a best-case scenario for Python (they didn't describe many of the issues I described in my previous paragraph) and they knew ahead of time where their performance issues would be so they were able to start with an architecture that was amenable to "write the slow parts in C/C++". My point is that there are languages that are better at fast iteration than Python (namely Go or TypeScript) while also excelling at Python's weaknesses (performance, tooling, distribution, type system, package management, etc etc etc). I think it's incredibly reckless to start a new commercial project in Python in almost all cases (with a possible exception for data science, but even then I would look hard at everything else first).
> Python's weaknesses only show when you need to scale, both on the performance and on the lines of code / number of collaborators metric, which is a problem you have only when the project succeeds.
This might be true for certain Django CRUD apps, but it's patently untrue in the general case. I've seen Python work fine for a prototype data science app, but fall over when real customer data was introduced. Similarly, I'm currently an SRE attached to a Python project which falls over regularly because of some event loop issue or another (often someone pulls in an SDK that makes sync calls under the hood, or else some CPU-intensive task blocks the event loop). I see stuff like this regularly, and it never happens in our Go apps.
> As other sibling comments point out, python is still great for small scripts and weekend projects.
Granted. Do what you want in your free time, but it's risky to use Python for new commercial apps.
> Perhaps you do mean 'avoiding Python for new projects that are expected to grow non-trivially', which is fine advice.
Yes, this is what I meant. I didn't mean "hobby projects".
> I'm just a bit tired of all those over-reaching blanket statements people make regarding software design
I don't usually make blanket statements, but Python really is so bad that this is a blanket statement I'm pretty comfortable with. I've scarcely ever heard of someone getting burned for picking Go over Python (there was some Python2 veteran who wrote an angry blog post once about Go back in the early days, but never since). Python just refuses to adequately progress because it wants to maintain compatibility--that's fine, but as professional engineers we should understand that the rest of the ecosystem is moving beyond where Python is willing/able to go, and it's no longer an appropriate tool for the overwhelming majority of new projects.
> I don't usually make blanket statements, but Python really is so bad that this is a blanket statement I'm pretty comfortable with.
Speaking as someone who agrees with you that something static is very likely better than python for a large project, how do you square this with the existence of a lot of Python projects, even "very commercial", that seem to go on just fine?
It's a fine theory and something I instinctively support, but I don't see measurable evidence in support of it.
The cognitive dissonance embedded in this statement is rather striking, considering Django itself is written in Python and is known for its ruggedness and code quality. Django manages all of this without even taking advantage of somewhat newer features in Python like typing.
Most startups that YC has funded that became successful (Series B or higher) were written in Python or Ruby. Now you can say that this is a tradeoff for post Series B, and for that I don't know. I've never worked on a massive Python mono-repo for a company that size. But I know what I've done in Python and, yes, it includes performance optimization in Numpy / Scipy / Cython, and other than Ruby, no other language comes close to the developer performance I see with Python and the ecosystem around it.
That said, there is a good and a bad way of writing Python. You absolutely need tests and lots of them since there is no compiler for vanilla Python. Mypy helps a ton too and so do the linters and even Black. Also, get pdb++ and customize it. The tooling helps a lot.
Now I haven't ever written production Rust, but I have for Go and many other languages, and so Rust may be faster, but other than Ruby I haven't seen anything close.
> Most startups that YC has funded that became successful (Series B or higher) were written in Python or Ruby.
It depends on what your goal is. If you want to get rich off of VC money, Python and Ruby might be a good fit. If you, on the other hand, want to write good, performant, maintainable and (relatively) bug-free software, then there are better choices.
So, what was the percentage of YC startups that started with Python or ruby and what was the share that became successful?
Is there data on startup lifespan ranked by tech stack? Genuinely been looking for this for a while now.
But it is indeed quite fast when you are just trying to get your first file of code written.
There are more rungs on the ladder.
I worry about the more ardent Rust enthusiasts because it seems popular in those circles to embrace an attitude that we’ve somehow arrived at the right conceptual framework.
Rust is a competent, pragmatic embedding of a few of the big ideas from serious PLT into a well-optimized C++ compiler toolchain, and it’s cool and useful as a result.
But I hope to God it’s not the endgame on fast ML/Haskell/Lisp.
Naturally one can argue they weren't popular, with exception of Object Pascal.
https://github.com/dotnet/Docker.DotNet#example-create-a-con...
That seems like a bold claim? I've also been writing Python for a decade and it takes no time at all to whip up a small script. I have no doubts the Rust version will be much faster, but stepping into a new language, it would take me longer to write the script than an equivalent version in Python. Plus there's all the other stuff to learn.
Once you get past learning the basic syntax (creating variables, loop structure, assignment/etc.), you still have to pick up new things, like executing your work with parallelism, downloading files, parsing and dumping JSON, connecting to databases... Each new topic introduces a bunch of stuff that takes a little bit to figure out. I guess you could get through most of it in a few hours though if you had the right problem to work on.
But still, there's the muscle memory aspect. Can a new language really sink in so fast that it's faster to use than Python after 10 years of writing Python?
I'm not sure I understand. Was the typo in something like a dict key? Or did you add two of the wrong variables together? How did another language protect you from this? I fail to see how you couldn't also add two of the wrong variables (with the correct type) together in a different language.
For full projects outside websites, Rust for almost every case.
The important factor here is "with all strict checks in place". Whenever I hear fellow FE devs complain about how they dislike Typescript, the complaints often seem to center around how it doesn't really offer that many guarantees... only to find out that their typechecker is set to the least strict settings!
I'm not gonna lie, though: the transition from JS to TS absolutely comes with growing pains. This obviously includes actually updating the codebase and adding types, as well as the developer onboarding/learning process. Personally speaking, I found it very jarring to go from a permissive, lightning-fast JS dev cycle to a more methodical Typescript one.
And don't get me started on how frustrating it can be to diagnose weird type errors related to generic types :D
I often work on a codebase that involves typed async Python plus a collection of Rust CLI tools that do most of the heavy lifting. And it's really not bad at all.
1. Tooling. `go build` just needs a `go.mod` file with a list of dependencies. There's no MSBUILD stuff (I think dotnet had a similar JSON file format, but then went back to MSBUILD?)
2. Runtime. I love that Go's default is static, native binaries, and that the ecosystem revolves around that. I especially love that this enables ~2mb Docker images. (I'm of the impression that you can do this in dotnet, but it's not clear to me how easy it is in practice or if there are performance tradeoffs)
3. Value types. Go doesn't have classes; everything is a struct, and polymorphism and so on work on structs.
4. No inheritance. I'm pretty militantly opposed to inheritance. I know you don't have to use it in C#, but in Go I also don't have to deal with APIs or coworkers that assume inheritance.
That said, these aren't major things; I'm sure C# is a great language in general, and I'd definitely reach for it if I wanted to make a Windows GUI app.
That said, Go is a fine language. It doesn't have exceptions last I checked, so that could be a deciding factor. Exceptions are handy in a lot of cases.
Their performance is impressive, to say the least.
Go is also simpler because of the GC, given that Rust can get a bit tricky with lifetimes. Go routines and channels are a delight to use together with contexts.
Rust allows for a lot more ways to solve one problem, e.g. you can iterate over an array with iterator methods, or use a loop. Go only provides a for loop, and that is pretty much the only way to iterate over something.
Some people might find this limiting but I find it refreshing given that I can focus on what I need to do instead of how I should do it.
I wouldn't recommend Typescript for CPU-bound tasks, but for I/O-bound tasks with complicated business logic its type system is an order of magnitude better than Go's, and it lets you get really close to the "make invalid states of the system a type error" ideal of functional languages like Haskell, while still being a lot more approachable.
I've had experience with large codebases in Python, Java, Go and Javascript before.
Python - as everyone already said, just sucks. Refactoring is a nightmare, you have to test every single line of code for any sort of dependability and that makes refactoring even harder. It allows for too much meta-programming and within 3-6 months your codebase is legacy.
Java is just not fun to write for me, but Kotlin is kinda fun and I could see myself choosing that. The problem is that my product deals with a lot of deeply nested, user-defined objects, and converting JSON (or YAML, or XML) to POJOs is a verbosity nightmare.
I just fucking really hate Go. I can't explain it, but I find the language infuriating. It somehow manages to be less expressive, more verbose and more straight up boring than all the other options.
As a Go fanboy, this made me chuckle. :)
> I can't explain it, but I find the language infuriating. It somehow manages to be less expressive, more verbose and more straight up boring than all the other options.
I sort of get it. I think people fixate too much on the for loops and the `if err != nil` boilerplate, but there's definitely some validity with respect to "how to properly annotate errors" and emulating enums (in the Rust sense of the word) is pretty error prone and it still doesn't get you exhaustive pattern matching.
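To illustrate what "annotating errors" with enums in the Rust sense can look like, here is a minimal sketch using only the standard library. `ConfigError` and `parse_port` are hypothetical names; real projects often use a helper crate for this, which I'm deliberately leaving out.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Each failure mode is a variant, and a variant can carry its own context.
#[derive(Debug)]
enum ConfigError {
    MissingKey(&'static str),
    BadPort { raw: String, source: ParseIntError },
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(key) => write!(f, "missing config key {key:?}"),
            ConfigError::BadPort { raw, .. } => write!(f, "invalid port {raw:?}"),
        }
    }
}

impl Error for ConfigError {
    // Expose the underlying cause so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::MissingKey(_) => None,
            ConfigError::BadPort { source, .. } => Some(source),
        }
    }
}

// Hypothetical parser: wraps the low-level parse error with the offending input.
fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let raw = raw.ok_or(ConfigError::MissingKey("port"))?;
    raw.parse::<u16>().map_err(|source| ConfigError::BadPort {
        raw: raw.to_string(),
        source,
    })
}

fn main() {
    for input in [Some("8080"), Some("80x"), None] {
        match parse_port(input) {
            Ok(port) => println!("port {port}"),
            // Callers match on variants instead of comparing error strings.
            Err(err @ ConfigError::MissingKey(_)) => println!("config problem: {err}"),
            Err(err @ ConfigError::BadPort { .. }) => {
                println!("{err} (caused by: {:?})", err.source())
            }
        }
    }
}
```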
The stuff I like about Go:
* Single static, native binaries - being able to just send someone an executable is pretty nice, not needing to make sure people have the right libs and runtime installed is phenomenal.
* Great tooling. I love that the Go tool takes a list of dependencies. Unlike Gradle/CMake/etc I don't have to script my own build tool in a crummy/slow imperative DSL.
* Compilation speed. Go compiles really fast.
* Small language, easy learning curve. Any programmer can read just about any Go project with very little experience. Any Go programmer can jump into any other Go project and be productive immediately (no extensive configuration of IDE or build tools or anything like that). You can also write surprisingly abstract code without reaching for generics, but you will likely have to change your programming habits.
* Value types done right. Most other mainstream languages don't have them, and the ones that do very often implement them poorly (e.g., IIRC in C# you can define a type as a struct or a class and only structs can be value types, and classes are more idiomatic).
* Go eschews inheritance and prefers data-oriented programming over object-oriented programming (although OOP is poorly defined and some people think it means "anything with dot-method syntax").
* I'm just more productive in Go than any other language
Stuff I don't like about Go:
* No enums
* I wish there were fewer tradeoffs between abstraction and performance
* Needs a better system for annotating errors
* Doesn't run on embedded systems (although I've heard good things about TinyGo)
* Doesn't make me feel as clever (this is a nice property for professional software development, but for hobbies not so much)
On balance, I think Go is the better language, but I can also understand why it chafes people. Different strokes. :)
This is how code is supposed to be written. Sort out the logic first, don't write the actual implementation details until later (models, http handlers, file parsers).
The support for this type of SOLID design is much more common in Go than in Typescript projects which often seem to inherit bad practices from Javascript developers.
I don't hate it like you, I just find it amazingly backwards, and I think I can explain why:
The community has the same persistence as certain teachers when it comes to locking things down and making things harder than necessary for no good reason, all for our own good.
Prime example: announcing a supposedly cutting edge language without generics in 2009, launching version 1.0 without it in 2012, 8 years after notoriously conservative Java got it, then waiting until 2022 before listening, that is some hardcore teacher mentality IMO.
I’ve been looking at Nim or Scala for the backend. It’s either that, or just dive into Rust.
It really seems if you don’t want to use Node, Java, or Go the available choices for a statically typed backend get quite slim.
Specifically, I posit that returns on type system features diminish--there's a ton of value in going from a dynamic type system to Go's pre-generics type system, but every addition thereafter offers decreasing value.
It is, but does it really matter when you write backend? I've written and operated Node-based backends at scale, and I can't honestly remember any bugs or outages that would have been prevented by this.
On the other hand, there are a lot of potential bugs and problems that didn't happen because proper usage of TS's type system prevented me from committing them.
But I write software not to enjoy the process, but to solve business needs. And using Rescript, Scheme or Haskell drastically limits the talent pool and raises the cost. Which, BTW, can be very good for some businesses that want to hire only the best programmers anyway and are ready to pay for it. But that's some, not all.
TypeScript's type-system soundness holes kill the fun with that language very quickly, imho. That was a really big disappointment for me.
A "static" language that compiles fine and then crashes at runtime with complete WTF-bugs is not much better than a dynamic language. You just can't trust the compiler and need to double check every line of code anyway. So there is not much additional value in this kind of "type safety".
I would prefer Rescript or Scala.js anytime to TypeScript. Got really disillusioned by TS.
If a package does not supply its own types (increasingly rare now), you can generate stubs and put them in typings/$package.
It's picked up a bunch of bugs for me already. And if runtime performance is not a priority, I'd rather write Python than Go or Rust any day. It's more fun and more expressive.
https://numpy.org/devdocs/reference/typing.html
Which is cool, even strongly typed languages don't usually support that.
Why not D?
Nothing you said is verifiable.
Right, I was pretty explicitly speaking from experience. See my first paragraph.
> If Go replaces your Python this easily, you have always been writing Go
Go is only a 10 year old language, and I've been writing Python professionally for 15 years. I definitely have more hours on Python by a wide margin.
> The abstractions available in Python are in a totally different class than that of Go.
Python definitely is more abstract than Go, but I don't think that's a feather in its hat. Moreover, most of my criticisms of Python had nothing to do with its abstract nature.
Still, when I migrated my Python codebase to Rust I got rid of whole classes of bugs and honestly code faster in Rust on a "per debugged line of code" basis. In Python, every line MUST be run or could trigger an exception (and so many do and it is hard to see). This now requires a test harness that tests every single line which is a huge pain. In Rust, the things that can obviously panic are pretty easy to see and are limited (unwrap, expect, indexing, etc.) making it much easier to write robust code, even with a more limited test suite.
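To make the "panic sites are easy to see" point concrete, here is a minimal sketch. The function names are made up for illustration; the first version contains the kinds of calls listed above (indexing, `unwrap`), the second replaces each with a non-panicking counterpart.

```rust
/// Can panic in two visible places: `args[0]` (out-of-bounds index)
/// and `unwrap()` (the string is not a valid u16).
fn first_port_panicking(args: &[&str]) -> u16 {
    args[0].parse::<u16>().unwrap()
}

/// Cannot panic: a missing argument or a bad number both become `None`,
/// and the return type forces the caller to deal with that.
fn first_port_checked(args: &[&str]) -> Option<u16> {
    args.first()?.parse::<u16>().ok()
}
```

Grepping for `unwrap`, `expect` and `[` is a rough first pass at a review for panics, which is not something you can do for exceptions in Python.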
Rust gets safety with its ownership system. The ownership system brings strong guarantees around who is mutating / reading any given object.
Python has uncontrolled ownership. Any function can generally mutate / store any variable you pass in. This can definitely cause bugs if a rogue function starts modifying input.
Rust makes this impossible without the mutation “permission” being part of the method signature. Strong ownership controls force functions to be extremely explicit with their intentions.
Rust keeps track of this, no matter how deep you twist your logic and it does so for every variable, no context manager blocks required.
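A small sketch of what "permission in the signature" looks like. The function names and the pricing example are invented for illustration; the point is only that `&`, `&mut` and by-value parameters tell the caller what can happen to their data.

```rust
/// Shared borrow: the signature promises the slice is only read.
fn total(prices: &[u32]) -> u32 {
    prices.iter().sum()
}

/// Mutable borrow: the `&mut` is the explicit permission to modify.
/// Assumes `percent <= 100`, otherwise the subtraction would underflow.
fn apply_discount(prices: &mut [u32], percent: u32) {
    for p in prices.iter_mut() {
        *p -= *p * percent / 100;
    }
}

/// By value: ownership moves in, so the caller cannot use the
/// original vector after this call.
fn into_sorted(mut prices: Vec<u32>) -> Vec<u32> {
    prices.sort();
    prices
}
```

A call site has to spell out the same thing (`total(&prices)` vs `apply_discount(&mut prices, 10)`), so a function that mutates its input cannot hide it.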
Rust has a very expressive type system[0], which you won't otherwise find before hitting the more functional and research-y side of things. Modelling data in terms of enums (sum types), move types (affine types), etc... makes it a lot more reliable and a lot less fallible as it just removes entire swathes of edge cases. See concepts like "making invalid states unrepresentable".
[0] though it's still lacking in many ways, the more you get the more you want after all
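As a rough illustration of "making invalid states unrepresentable", here is a sketch with a made-up `Connection` type. Each variant carries only the data that makes sense in that state, so there is no way to build, say, a disconnected connection that still has a session id.

```rust
/// A connection is in exactly one state at a time (sum type).
#[derive(Debug, PartialEq)]
enum Connection {
    Disconnected,
    Connecting { attempt: u32 },
    Connected { session_id: String },
}

fn describe(conn: &Connection) -> String {
    // The match must cover every variant; adding a new variant later
    // turns this into a compile error until it is handled.
    match conn {
        Connection::Disconnected => "offline".to_string(),
        Connection::Connecting { attempt } => format!("connecting (attempt {attempt})"),
        Connection::Connected { session_id } => format!("online as {session_id}"),
    }
}

/// Takes the old state by value (a move), so a stale `Connecting`
/// value cannot be reused after the transition.
fn on_handshake(conn: Connection, session_id: String) -> Connection {
    match conn {
        Connection::Connecting { .. } => Connection::Connected { session_id },
        other => other,
    }
}
```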
That's not entirely true. As an easy example, Go has a type system but it's relatively anaemic and recreates the "billion dollar mistake" of nil. I have personal experience with Go applications that have broken in production due to nil dereferences, to bugs related to type-switching off `interface{}`, and to the language's inability to enforce exhaustive case statements in general.
While the unique safety features in Rust are generally about memory safety, it still takes a much more safety-conscious approach in general than most languages that are statically typed.
Unlike those exotic options, Rust also comes out of the box with a very powerful free linter (clippy) which you definitely want to use, and has a wide community, library, cargo ecosystem, is very stable and tested etc.
When you total it up, you get a fast, safe program that exhibits 60-70% fewer runtime bugs than something written in either Python or C++ for a similar development effort. You still need to test of course, but you can focus on major integration and logic bugs instead.
And of course you don't have nullability which is another big one.
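For anyone who hasn't seen what "no nullability" means in practice, a minimal sketch (the `email_for` and `greeting` helpers are hypothetical). A lookup that may find nothing returns `Option`, and the value inside is not reachable until the `None` case is handled.

```rust
use std::collections::HashMap;

/// Absence is part of the return type rather than a null that
/// may or may not be checked later.
fn email_for<'a>(users: &'a HashMap<u32, String>, id: u32) -> Option<&'a str> {
    users.get(&id).map(|s| s.as_str())
}

fn greeting(users: &HashMap<u32, String>, id: u32) -> String {
    // Leaving out the `None` arm is a compile error, not a runtime crash.
    match email_for(users, id) {
        Some(email) => format!("hello, {email}"),
        None => "unknown user".to_string(),
    }
}
```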
I was very surprised to read the article. I think in the code I have in mind the borrow checker would only add overhead.
The most frequent errors I see in the Python webapp code are: 1. logic errors, especially around concurrent DB transactions, 2. type errors (missing values, improperly modeled data), 3. performance problems.
Usually I end up looking at source of libraries and making a list of all exceptions that can be thrown. But of course this is also terribly error prone.
However, I think unchecked exceptions are a mistake in any language for writing _serious_ code. Interestingly, everyone hated checked exceptions in java so now everyone makes new exceptions unchecked. I would say this is not because people disliked the idea of ensuring errors were handled, but the verbosity required to do so meaningfully was annoying, so people took shortcuts to avoid pain by wrapping in unchecked exceptions and in the process introduced "exceptions gone wild" (and now everyone who runs a known java app is familiar with the exception stack trace in logs and in the console).
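For comparison, this is roughly how Rust handles the same trade-off: errors are part of the return type (so they are "checked"), and the `?` operator keeps the propagation terse. A minimal sketch; `ConfigError` and `parse_port` are made-up names, not from any library.

```rust
use std::num::ParseIntError;

/// Every way this code can fail, visible to the caller.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing(&'static str),
    BadNumber(ParseIntError),
}

/// Lets `?` convert a parse failure into our error type automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    // `?` returns early with the error; no try/catch boilerplate.
    let text = raw.ok_or(ConfigError::Missing("port"))?;
    let port = text.trim().parse::<u16>()?;
    Ok(port)
}
```

The caller cannot get at the `u16` without deciding what to do about the `Err` case, which is the guarantee checked exceptions were after, with much less ceremony.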
It's just that in practice, the trade-offs the designers made for easy development made writing large applications next to impossible, unchecked exceptions being one of the major sources of instability. You write very fast buggy code and get excellent productivity initially, but as the program grows you face a factorial growth of interactions between components, and, because the interface points are not well defined and checked at compile time, what you gained during development is lost under an uncontrolled torrent of subtle runtime bugs.
this is one big exception, because logical bugs are most common and most problematic.
[0]: https://msrc-blog.microsoft.com/2019/07/22/why-rust-for-safe...
On a language level, though (I still hate Java), the type safety was nice and did save you a lot of worry. In retrospect I realize most of the issue was the frameworks available and all the terrible configuration stuff.
I love that today you have so many more options, because a lot of the time you do save time on a per-line level.
Also that Java’s type system is… not great.
So you have to put in a lot of effort (especially if it was before java 8, to say nothing of Java 5) and you just… don’t get much out of it.
Even more so given the syntactic overhead of defining java types before records (or without Lombok), to say nothing of the performance cost.
Rust is great, but modern C++ is great too.
My guess is that even with initial discipline in the form of code review and style enforcement by one or two people, the C++ descends into a riot of different opinions about, as usual, which are the good bits.
A similar piece of software could be written in C++ but I doubt it gets written successfully, in similar time, by this team. I reckon the post mortem of such an attempt would be written off as "Don't do rewrites, duh"
Maybe they're doing a Rust rewrite too? I'd be interested to read about that.
The main difference would be there wouldn't be a blog article on the front page. :)
It sounds to me they already had a good chunk of it written in C++ with python glueing it all together.
From a ‘dev velocity’ standpoint one would think having the senior devs learn a new language and rewrite an entire database wouldn’t make sense when they could just systematically replace the python parts with C++, which they already knew.
Or, who knows, maybe all their initial issues stemmed from the locations of semicolons around the else keyword?
Based on my experiences, I would agree. There seems to be a very important difference between my experiences, which probably align with most of the industry, and the Pinecone team. My career has been working with average developers doing ordinary business and internet things. Stuff that takes organization and teamwork, but not necessarily advanced or esoteric knowledge. So the COBOL programmers that went to the "Java in 21 Days" training that I consulted with, or the C# programmers who went with Go, they did poor-to-OK, but that was Good Enough.
The Pinecone team, though. They were already writing C/C++ to solve Hard Problems. As experienced programmers, they probably already understood the landscape of systems languages with pointers and having to manage their own memory. Rust, while imposing, would not be a huge leap for them.
Another aspect worth noting: they did their rewrite incrementally, never completely abandoning the existing system. At least it seems so. I'll be interested in seeing the presentation from the upcoming talk about the rewrite that's mentioned in the story.
In other words, the team wasn't moved, the team moved itself.
Great in the way that Alexander the Great was great. Achieved big things and killed a lot of people.
Though in the case of C++ they're just dead inside.
I think this is pure and memory safe BS
No, modern C++ is still cobbled together from tools built in a different era. It doesn't even come with a package manager/build system.
Just on the fact that cargo/crates.io exist and it has modules - it would have to be really bad to make me go back to C++ if I ever need to write something at that level.
I like cargo better than any c++ tooling but those tools don't really make a ton of sense for c++ where every platform has a different compiler.
Ultimately, I agree. I don't think I'll be going back to c++ any time soon, but for me it's more about the tooling fragmentation than the tooling itself. I actually like the language itself.
I wince when I think about how much time I spent dealing with that crap back when I was doing C++.
Coding in a manifestly immature language signals that you are not really serious about being used where long-term support across a variety of platforms may be important. The language is still more likely than not to fizzle. If it does, it makes your flagship project a niche player.
What makes you think that?
In any given week, more people pick up C++ or Javascript to use professionally than the total number now employed to code Rust. Not to fizzle, it has to increase its adoption rate by orders of magnitude, but its fans are almost uniformly hostile toward any measure that could make it easier to adopt.
> That’s when internal murmurs about a complete rewrite started brewing…
> We decided to move our entire codebase to Rust
> there was still one minor problem - no one on the team knew Rust
"We have runtime issues so we will fix them by a complete rewrite in a language that no one on the team knows."
I would not call that a "hard decision", rather "rolling the dice". It's awesome that they managed to make it work.
Rust fit the need they had pretty well (on paper anyway since they didn’t have experience with it), and the industry is showing it to be a good choice for solving these types of problems. That’s not rolling the dice, it’s taking a chance based on research.
Rolling the dice just means relying on getting lucky (instead of being skillful or controlling the outcome), usually on an unlikely outcome. When you literally roll the dice the odds are completely transparent (1/6 chance of each d6 value).
In this case (and in general with big rewrites), the thing you can never know about up-front - and therefore are relying on luck for - is whether you can build the new version within the time/resource budget that you allocate to it. If you get the estimation wrong, or hit unexpected problems/delays, then the rewrite can easily balloon from an ROI-positive project to a massively ROI-negative one. For a startup, this is particularly important because you could run out of runway before your rewrite pays off. This is why most people regard it to be a bad idea (or at least a risky one) to do a full rewrite of a product instead of incrementally evolving it. (Spolsky's essay on the subject I think is the canonical list of concerns with a rewrite: https://www.joelonsoftware.com/2000/04/06/things-you-should-...)
I think of hoping a Rust rewrite would fix "complex runtime issues which were almost impossible to reproduce or isolate" as rolling the dice. If you cannot reproduce or isolate an issue, how can you know a rewrite in another language will fix the issue?
I say this because I'm often _AMAZED_ at just how undisciplined the majority of developers I work with are wrt stability and error propagation.
Rust is one of those languages it takes time to master... if you write a lot of code in it while you're learning, you're likely to make all sorts of mistakes that later will need to be cleaned up... at least, looking at Rust code I wrote when I was early in the learning process, I can't even imagine any of that code going to production... so I would say it worked for them either because they are highly skilled and can learn difficult things like Rust quickly and efficiently... or their previous code was so completely broken that anything else would be better :D...
What it doesn't tell you is that this is frequently a result of having a design that doesn't naturally fit into the kind of more restrictive paradigm that Rust wants you to write programs in. This is a natural consequence of moving from very forgiving languages where almost anything goes (e.g., garbage-collected languages). You'll try to write your software using paradigms that are familiar to you, and many of those don't fit well into the set of restrictions Rust forces upon you.
There are escape hatches that let you get around these limitations, and people become very accustomed to using them early on. This happens because the Rust compiler tells you what's wrong and that you can use one of these escape hatches to fix it. But as you continue to build, you have to use more and more of these to overcome the mismatch in your design and the kinds of designs that fit well into Rust's ownership model.
I believe it's very important to—early in your Rust career—try and understand if and when the compiler is trying to tell you about a deeper problem with your design than simply throwing surface-level ownership nits at you. I think for some people who are very used to RAII in C++ or very strict C styles, this can come somewhat easily as a lot of the patterns are common (though not enforced at compile-time as in Rust). It is much more difficult for people who are used to garbage-collected languages that tolerate program designs that are entirely unsuitable for languages without a GC.
You may try to avoid using things like lifetimes for a while, but those are completely indispensable to writing Rust... same with macros. They are really important and widely used in Rust.
One very good way to avoid simple mistakes is to configure Clippy to run on your IDE (or just run it manually): https://github.com/rust-lang/rust-clippy
1. Allowed them to evaluate if it would actually work. 2. Allowed them to train the team on it in a scalable way.
The result is that the team learned Rust and Rust fixed a bunch of problems. They were appropriately pragmatic in their approach and Rust was appropriately pragmatic in its solutions to their problems. If you are ever considering migrating your codebase for some reason this is how you should go about it.
TLDR: You are heavily mis-characterizing their team and this article.
Rust cannot make a bad programmer a good programmer, but the fact it points out several classes of mistakes (sometimes too aggressively; hopefully someday we get Polonius and other improvements to the borrow checker to remove some things that are hard only because of the current implementation of various checking systems) allows far more confidence. I don't care how good someone is, having an automated way to verify that large classes of issues aren't there is a big deal, and unlike unit/integration tests you don't have to write these.
This mirrors my experience: getting buggy software out is easier and faster in JS (or whatever you're familiar with that lacks compile-time guarantees), getting something out to production is easier with Rust.
Sometimes you have enough senior developers and discipline to reproduce and catch obscure bugs (which happen more or less based on your software complexity) and you can fix your mess but that's not always feasible.
FAISS is also great, and has been for years. I don’t doubt that someone has done a meaningful rev on it.
But as someone who likes Rust and wants to like it more: don’t lead with that. The software speaks for itself. Or should.
The original title appears to be “Inside the Pinecone”, but seeing as they submitted it themselves I’m guessing they wanted the uptick on the Rust.
The article itself doesn’t discuss their transition to Rust until the end. So I wouldn’t say they led with it originally, but perhaps didn’t get traction on the original title.
I use a lot of great software written in Rust, it’s demonstrably a good vehicle for great software.
But too many of its fans are advocates, this can start to seem like an agenda. Don’t take engineering advice from people with an agenda.
I have been using C++, Python and Rust professionally, the developer experience is simply one order of magnitude better with Rust. The tooling is excellent, a sane compilation model brings you a lot, the type system is very helpful. Compared with C++, I measured x3 productivity on writing initial code on a project (sometimes C++ was the rewrite, sometimes Rust was, so not this kind of bias). Compounding maintenance, I expect it to tend to 1 order of magnitude in time.
Meanwhile, using Python can sometimes feel magical, but I have to maintain a library written in the language, and I can feel that it hasn't been optimized for this use case. It is terribly hard to keep backward compatibility, adding typing annotations is somewhat useful, but has terrible ergonomics, ensuring no regression is a pain that translates to endless suites of unit tests even for trivial matters ("be your own compiler" will never be a great idea IMO, although tooling like pylance does help a bit).
Meanwhile the rearguard is defending for new projects some kind of "No true Scotsman"ish modern C++ delusion, that I am still looking for and that moves like goalposts in every discussion I have with its proponents. My only explanation for the phenomenon is that it is fueled by hubris and sunk cost biases, because my experience with the modern idioms is that while they bring significant improvements, they are still very subpar compared to the expectations brought forth by a modern language, and memory safety related CVEs are still being written everyday in C++.
So, what should I do? It is true that, in more than 15 years of coding, Rust represents a revolution in programming in my eyes. Should I sit idly while others are missing on it and someone is wrong on the Internet?
I believe this is off-putting primarily due to the mental antibodies people must develop in an advertising-saturated culture. The death of enthusiasm.
I'm grateful to muizelaar but I don't know who that is, so we can't take credit for this one.
1) We want to write software that is fast and inexpensive to run. Our rust services are quickly growing, but cumulatively, they remain in the sub-1000 core count, while processing many millions of requests per second across our service with excellent latency.
2) We want to write software that won't break and page us in the middle of the night, or blow up when we deploy it. We operate some of the core persistence primitives at our company, so the consequence for error in the worst case could be corruption or loss of user data. Our experience thus far with Rust is that we're able to confidently iterate and deploy our services. The developer velocity is indeed higher, and continues to grow with your experience in the language.
Our team of 10 (with 3 engineers having started in the last 2 weeks) owns all of the database clusters, and all of the services surrounding them. Our on-call is surprisingly quiet, with maybe 0-3 low sev incidents a week, with most of them being due to the underlying database/hardware, and not the rust services. We are actually writing a Rust control plane for our databases soon, which seeks to automate most of the response and remediation of most of the incidents we experience in a given week, and also to automate a bunch of manual work that we currently do around operations of the databases.
The learning curve can be a bit rough, but our team of engineers is motivated to learn, and we have a great culture of mentorship and teaching from the engineers who are familiar with Rust.
And no, we do not use an ORM like diesel, since we primarily use ScyllaDB. We wrote our own - it's very nice. We have a fully type-safe scylla/cassandra client that does static validation of the queries (which we define as annotated structs), and also, when the service starts up, it validates that the schema that the service has agrees with the schema the database has. As in making sure that if you say `WHERE id = ?`, and you pass an i32 but it wanted a String, the service will not start.
It also does request coalescing and tracing/telemetry, and will soon do rate limiting, circuit breaking and speculative retries as well.
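For readers wondering what a startup schema check could look like, here is a very rough sketch. To be clear, this is not the client described above: every type and function name here is hypothetical, and the database schema is stood in for by a plain `HashMap` rather than anything fetched from ScyllaDB.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnType {
    Int,
    Text,
}

/// What a query intends to bind, e.g. `WHERE id = ?` with an i32 (hypothetical).
struct BoundParam {
    column: &'static str,
    query_type: ColumnType,
}

#[derive(Debug, PartialEq)]
enum SchemaError {
    UnknownColumn(String),
    TypeMismatch { column: String, query_type: ColumnType, db_type: ColumnType },
}

/// Compare what the service's queries expect against what the database
/// reports. A service would call this once at startup and exit on `Err`.
fn validate_at_startup(
    params: &[BoundParam],
    db_schema: &HashMap<String, ColumnType>,
) -> Result<(), SchemaError> {
    for p in params {
        match db_schema.get(p.column) {
            None => return Err(SchemaError::UnknownColumn(p.column.to_string())),
            Some(&db_type) if db_type != p.query_type => {
                return Err(SchemaError::TypeMismatch {
                    column: p.column.to_string(),
                    query_type: p.query_type,
                    db_type,
                })
            }
            Some(_) => {}
        }
    }
    Ok(())
}
```

The idea is simply to fail fast: a mismatch is found when the process boots, not when the first request hits that query.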
https://www.scylladb.com/2022/02/15/introducing-catalytic-an...
Is this on a crate by any chance? Thanks
> there was still one minor problem - no one on the team knew Rust.
Is this real or satire? I can't tell.
Maybe no one in that specific company really knew the language, but there's loads of people actually using the language, it's not like it dropped from the clear blue sky. It's not like there's a Rust salesperson that knocked on their door with a leaflet. They're not even the first persons to write a database engine in Rust.
I wonder if other engineering domains have this same silliness. Do the architects have intense debates over steel H beams vs wooden LVL beams, and at some point newbie architects believe you should only switch from one to the other as an architecture firm if your architects have experience with it, because the only way to know if it would work is reading the marketing claims.
A compiler is just a tool. If you're doing a rewrite you better pick your tools correctly, and it seems like they did. You could theorise they were lucky somehow, or you could recognise that they're experienced, educated and skilled and made the right choice based on their research.
Hopefully what they meant is that Rust was only being considered when they started having the team learn it, or that the team was not extremely proficient at it when they picked (but had sufficient knowledge of the language).
Not saying that this part didn't surprise me as well. And if they did commit based only on the marketing material... what a gamble.
I think Rust is really shaping up to be in a sweet spot.
One thing that is not talked about enough with Rust, is that you do not need to start with fighting the borrow checker from day one.
Making copies and using Rc can bypass a lot of the trickier parts of the borrow checker and still allow you to access much of the benefits of Rust in terms of performance and correctness, especially compared to a dynamic language like Python.
As you get more experience, you can then use references and other techniques to eliminate unnecessary copies, etc.
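A minimal sketch of the two stages, with made-up `Config` and worker types. The first version shares data with `Rc` and clones freely; the second does the same job with a plain reference and an explicit lifetime.

```rust
use std::rc::Rc;

struct Config {
    name: String,
}

/// First pass: shared ownership via `Rc`, no lifetimes to think about.
struct Worker {
    config: Rc<Config>,
}

impl Worker {
    fn label(&self) -> String {
        // Cloning the String allocates, but it always compiles.
        self.config.name.clone()
    }
}

/// Later pass: borrow the config instead of sharing or copying it.
struct BorrowingWorker<'a> {
    config: &'a Config,
}

impl<'a> BorrowingWorker<'a> {
    fn label(&self) -> &'a str {
        &self.config.name
    }
}
```

How cheap it is to move from the first shape to the second depends a lot on the surrounding code, since the `'a` parameter tends to spread to everything that holds a `BorrowingWorker`.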
> As you get more experience, you can then use references and other techniques to eliminate unnecessary copies, etc.
I've always found this advice to be infeasible in practice, despite sounding like a reasonable pitch.
Arc allows you to avoid thinking about ownership, which is largely what dictates the architecture when writing idiomatic Rust. Plastering on ownership later is going to change the code so much you might as well rewrite it.
Moreover, if you want to prototype (with Arcs, clones) a new part of the code, while integrating with a more mature part of the code base, you still need to adhere to the ownership structure.
In fact, this is one of my main issues with Rust, that it's a terrible language for prototyping. I don't have a solution, and I'm not blaming Rust – it's still a great language. But you have to have a very solid design/mental model before you start coding.
There is obviously always some nuance, but I find that there is a wide useful space between Rcs that can be avoided quite simply and making dramatic changes to your code to avoid an Rc.
Python "velocity" is, to me, more about the ease with which you can make changes without breaking other code. You can still break other code and if you don't have a proper test suite then you're wasting your time - which is something that smells slightly in the story.
Why not try leveldb. We're doing random reads of 170,000 vectors (/130M) a second. No startup needed.
The performance imperatives have changed with the hardware, but NAND flash at that time had an asymmetry between reads and writes in terms of the amount of data: once you were writing even one byte you had effectively paid for writing a whole block and were therefore incentivized to get your money’s worth and write to a “log”, which then would be “merged”, in some “structured” way, hence LSM.
This is old news these days but it was quite the novelty at the time!
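For anyone who hasn't seen the log / merge / structured idea spelled out, here is a toy in-memory sketch. It is not how leveldb is implemented: real stores write runs to disk, keep a write-ahead log, handle deletes with tombstones and organise runs into levels. `TinyLsm` and its methods are invented names.

```rust
use std::collections::BTreeMap;

/// Toy LSM: writes land in a sorted in-memory table, which is flushed
/// as one immutable sorted run; runs are merged later.
struct TinyLsm {
    memtable: BTreeMap<String, String>,
    runs: Vec<Vec<(String, String)>>, // oldest first, each sorted by key
    memtable_limit: usize,
}

impl TinyLsm {
    fn new(memtable_limit: usize) -> Self {
        TinyLsm { memtable: BTreeMap::new(), runs: Vec::new(), memtable_limit }
    }

    fn put(&mut self, key: &str, value: &str) {
        self.memtable.insert(key.to_string(), value.to_string());
        if self.memtable.len() >= self.memtable_limit {
            self.flush();
        }
    }

    /// Emit the whole memtable as one sequential, sorted run (the "log" part).
    fn flush(&mut self) {
        if self.memtable.is_empty() {
            return;
        }
        let run: Vec<(String, String)> =
            std::mem::take(&mut self.memtable).into_iter().collect();
        self.runs.push(run);
    }

    /// Newest data wins: check the memtable, then runs from newest to oldest.
    fn get(&self, key: &str) -> Option<&str> {
        if let Some(v) = self.memtable.get(key) {
            return Some(v.as_str());
        }
        for run in self.runs.iter().rev() {
            if let Ok(i) = run.binary_search_by(|(k, _)| k.as_str().cmp(key)) {
                return Some(run[i].1.as_str());
            }
        }
        None
    }

    /// Merge all runs into one, keeping the newest value per key (the "merge" part).
    fn compact(&mut self) {
        let mut merged = BTreeMap::new();
        for run in self.runs.drain(..) {
            // Runs are visited oldest first, so later inserts overwrite older values.
            for (k, v) in run {
                merged.insert(k, v);
            }
        }
        if !merged.is_empty() {
            self.runs.push(merged.into_iter().collect());
        }
    }
}
```

Every write is batched into a large sequential flush, which is exactly the trade the comment describes: pay for a whole block once instead of once per small update.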
Although our graph-based vector index does not use HNSW, the concept is similar.
If you think about it, you can try to optimize a Rust program through trying brute force stuff and trying weird things that you are not even sure will break something or not. But you can try ~without 100% knowledge or confidence~ and boom the compiler will tell you if you are shooting yourself in the foot. Now try to do that in C++. Even with 95% knowledge and confidence I will be very worried as a Lead or CTO.
In my own case I rewrote a decent-size Python app in C++ and it took me about the same time as it took the original developers. And no, I was not porting code but rather working from the specs, so I did my own design.
How did they verify that the new code was conformant with the old app's behavior/logic?
> To make matters worse, we would discover issues only after deploying (or in production!) due to Python’s run time nature.
I don't want to infer too much, but this makes it sound like perhaps they didn't have a very robust set of E2E/Acceptance tests, which would make a full-cutover migration scary to me. If you're finding Python bugs only after deploy, how can you find the inevitable rewritten-Rust-code incompatibilities before deploying/production?
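One general way to approach that question (I have no idea whether the team did this) is differential testing: feed the same inputs to the old and new implementations and collect every disagreement. A minimal sketch, where `legacy_percent` and `rewritten_percent` are hypothetical stand-ins for "old behaviour" and "new behaviour":

```rust
/// Stand-in for the old system: integer division truncates.
/// Assumes the second element (the whole) is non-zero.
fn legacy_percent(input: &(u32, u32)) -> u32 {
    input.0 * 100 / input.1
}

/// Stand-in for the rewrite: rounds to the nearest integer instead.
fn rewritten_percent(input: &(u32, u32)) -> u32 {
    ((input.0 as f64 / input.1 as f64) * 100.0).round() as u32
}

/// Run both implementations on every input and return the inputs
/// on which they disagree.
fn find_mismatches<I, O, F, G>(inputs: &[I], old: F, new: G) -> Vec<&I>
where
    O: PartialEq,
    F: Fn(&I) -> O,
    G: Fn(&I) -> O,
{
    inputs.iter().filter(|x| old(*x) != new(*x)).collect()
}
```

With inputs `(1, 2)`, `(1, 3)`, `(2, 3)` and `(1, 4)`, only `(2, 3)` comes back: the old code says 66 and the new code says 67. In a real migration the inputs would ideally be recorded production traffic rather than hand-picked cases.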
I've been digging into Rust/Python interop recently using https://github.com/PyO3/pyo3 and maturin, and this points to an interesting migration strategy; PyO3 makes it quite easy to write a Python module in Rust, or even call back and forth between them, so you could gradually move code from Python to Rust. A migration path to full-Rust might be:
1. Incrementally replace all your fast-path C/C++ code with Rust, still having Python call these compiled modules. End-state: You now have a Python/Rust project instead of Python/C/C++.
2. Gradually grow the surface area of your Rust packages, moving logic from Python into Rust. Your existing Python tests can still run, and your existing entrypoints are the same Python.
3. At some point, you presumably need to cut over the entrypoint layer (the API?) to use pure Rust, instead of Python. This should be a much less scary migration since both versions are calling into the same underlying Rust library code. Depending on your architecture, if you have an API Gateway you can split your service backends to migrate one endpoint at a time to the new Rust API service, while keeping the old Python API service around to fail back to. (They are using k8s so you can do this with your Ingress for example).
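The per-endpoint cutover in step 3 boils down to a routing table. A minimal in-process sketch of the idea, assuming made-up endpoint paths and type names; in practice this decision would more likely live in the API gateway or Ingress configuration than in application code.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    LegacyPython,
    NewRust,
}

/// Tracks which endpoints have been moved to the new service.
struct CutoverRouter {
    migrated: HashSet<String>,
}

impl CutoverRouter {
    fn new() -> Self {
        CutoverRouter { migrated: HashSet::new() }
    }

    /// Send this endpoint to the Rust service from now on.
    fn migrate(&mut self, path: &str) {
        self.migrated.insert(path.to_string());
    }

    /// Fail back to the Python service if the new one misbehaves.
    fn roll_back(&mut self, path: &str) {
        self.migrated.remove(path);
    }

    /// Anything not explicitly migrated keeps going to the old backend.
    fn route(&self, path: &str) -> Backend {
        if self.migrated.contains(path) {
            Backend::NewRust
        } else {
            Backend::LegacyPython
        }
    }
}
```

Defaulting to the legacy backend means a forgotten endpoint keeps working, and rolling back is a one-line change per endpoint.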
I'm interested in others' experiences with Rust/Python interop, are there any rough edges worth knowing about?
With that said. I learned Rust recently. I think it is a viable choice for a business core. Rust is not at all that difficult to learn as some claim. The tuts are great. Some things are very different though.
I love Rust but seems like they just needed to write some tests before shipping to prod?
> Built-in test, CI/CD
Does it have anything to do with the language?
IMHO - a db is all core. Everything should be fast. Even the CLI tools - Rust is pretty good at cli tools. Why is he building a database out of python? How much python is in this repo anyway?
Also this is a Python + C++ app - not python. And the article critiques python, not c++. Which is weird, because I would have thought the source of problems would have been C++ not python.
Don't get me wrong - Rust is great. Cargo is great too. Compiled programs are great to deploy rather than python (pip hell) and C++ ({CMAKE_HELL}). But this is apples to oranges comparison.
The things that make Rust better than C++ / Python are not the things this article talks about.
Rust deserves mention in the title only if actually getting it done in Rust was particularly difficult.
Reality: It's insanely hard to find talented developers, but Rust makes people excited, so they are just using it as a marketing tool.