1. Scripts should be maintained and tested, and
2. The language used for scripts necessarily is, and should be treated as, a project language. Choosing it should weigh all the factors that go into choosing a language for any other purpose: the impact on complexity if it isn't the main language, fitness for purpose, and any additional dev platform, tooling, etc. constraints it imposes, but...
This doesn't imply scripts should be in the main project language, any more than it is generally the case that projects must be monolingual.
In fact my "scripts" are actually part of the main executable. I use cmd-line args to invoke the needed functionality.
For example, in the past I would have written a Python script to deploy my Go binary to a server, possibly using tools like Fabric that provide functionality to make it easier.
Today I add a `-deploy-hetzner` cmd-line flag to my Go binary and it does the work. It builds itself, copies the binary to the server, kills the old instances, configures caddy if needed, starts the newly uploaded instance, etc.
For example my deploy.go is 409 lines of code, which is not that bad. You can see exactly how this works: https://github.com/kjk/edna/blob/main/server/deploy.go
I standardized on how I deploy things so deploy.go is mostly re-used among several projects.
Writing this code isn't much more difficult than what I used to write in Python.
This kind of code can be shorter because I don't have to handle errors, I just panic if something goes wrong.
I like that I don't have to switch between different languages and that I have full control and understanding over what happens. Fabric used to be a bit of a black box.
I even wrote an article about this idea: https://blog.kowalczyk.info/article/4b1f9201181340099b698246...
Candidly, I think that it's much easier to do this in Go (or Rust) than in, say, Python/Ruby/Node, as I can use compiled binaries without needing a runtime.
Edit: //go:build ignore is idiomatic
No per-platform install instructions, no “ok but you have to run this in WSL2… you don’t have that? Oh god, ok, let’s find those instructions…”, no “oh fuck the flags for that are different on BSD and Linux, so this breaks on macOS”. “Wait the command’s python3 not python, on your system? And also it’s erroring even after you fix that? Shit you’re still on 3.8, we use features that weren’t included until 3.10…”
“Unzip this single binary and run it. Tell your OS to trust it if it hassles you. That’s it.”
C# with mono might be close to as good? Not sure. Rust’s probably OK but a bit of an investment when you’re just trying to dash off some quick tool or script. C++ and C can do it but you definitely don’t want to. Java’s obviously out (ok, now update your JRE… wait, your env vars are all fucked up, hold on…).
Go’s also highly likely to be something another developer with or without Go experience can read and tweak after the person who wrote it leaves, without much trouble.
Well they'd run into the same problem if the script was Python, wouldn't they?
And Go is ultimately far far easier to read, modify[1] and debug for someone who doesn't know it than Python is.
[1] For example, adding a third-party library to handle new cases. I've never had a good time doing it with Python's various package management, but Go seems pretty good at it.
For a project where multiple people are involved, I think it's better to split concerns. Deployment should be handled transparently and reproducibly in a CI pipeline that the team members who are allowed to deploy have access to.
I'm OK with python for scripts, but go "just works".
I think Go scripts have much better odds of “just working” with only the Go toolchain installed.
go run cmd/server/main.go --task db --args migrate
go run cmd/server/main.go --task cron --args reset-trial-accounts
etc.
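A rough sketch of how a `--task`/`--args` entry point like the one above might dispatch. The task names mirror the commands shown, but the handlers are placeholders and the `tasks` table and `runTask` helper are assumptions, not the commenter's code.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"sort"
	"strings"
)

// taskFunc is one maintenance task; args come from the --args flag.
type taskFunc func(args []string) error

// tasks maps --task names to handlers. The handlers are placeholders.
var tasks = map[string]taskFunc{
	"db": func(args []string) error {
		fmt.Println("db task:", args)
		return nil
	},
	"cron": func(args []string) error {
		fmt.Println("cron task:", args)
		return nil
	},
}

// runTask looks up a task by name and runs it. Unknown names produce
// an error that lists the known tasks.
func runTask(name string, args []string) error {
	fn, ok := tasks[name]
	if !ok {
		names := make([]string, 0, len(tasks))
		for n := range tasks {
			names = append(names, n)
		}
		sort.Strings(names)
		return fmt.Errorf("unknown task %q (known: %s)", name, strings.Join(names, ", "))
	}
	return fn(args)
}

func main() {
	task := flag.String("task", "", "task to run instead of starting the server")
	args := flag.String("args", "", "space-separated arguments for the task")
	flag.Parse()
	if *task == "" {
		fmt.Println("no task given; normal server startup would go here")
		return
	}
	if err := runTask(*task, strings.Fields(*args)); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Go's standard `flag` package accepts both `-task` and `--task`, so the double-dash invocations above work unchanged.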
It can go the other way as well. tcl was originally conceived as an embeddable scripting language that can drive GUI applications, and has been very successful in that use-case. It has a few flaws, but I would have preferred it to have been embedded in the browser instead of making up Javascript back in the day.
That and often making the scripts overly complex and a software project in themselves. Chances are that if your build/deployment scripts are this complex, you are not leveraging modern tooling and platforms efficiently and have re-invented the wheel in script form.
A recent example I encountered was an Ansible environment. While looking how other teams had set up their playbooks for something, I came across an extremely complex one. It basically pulled a bunch of java apps from artifactory and wrapped them in complex bash logic. All of this to do some conditional checks and send out a mail through a custom java mail client.
This amounted to over a hundred lines of code that I could replace with a handful of lines in Ansible.
yes, this. Engineers frequently try to converge everything into the one-true-way and go too far. (I blame this thinking for phones without buttons or touchscreen cars)
Makes no sense to write scripts in C++. I've worked on projects where people have tried to do this, and they end up being more fragile, cumbersome and not that useful.
I think python is ok for cross-platform.
a shell is good for command line/os logic.
seriously a shell script with 'set -e' at the top, and a lot of command invocations and not much else is pretty easy for a group to maintain.
Pretending it is not, OTOH, is.
Scripts should be testable.
Scripts should use functions with appropriately scoped variables. (This really helps with testability)
Scripts should list assumptions.
Scripts should CHECK assumptions.
Scripts should call commands with --long-style-options instead of -L, especially uncommon options.
---
As someone who migrated a couple hundred shell scripts over the past year, I'd rather have these done before I ask someone to write a script in C.
edit: and for ${deity} sake, use shellcheck.
Too bad `getopts` only supports single-char options. :p
And to be more specific - I meant when the script calls someone else, not how it handles options.
> The learning curve is minimal since you already know the corners of the language.
Learning a new language shouldn't be difficult. Programmers are expected to familiarize themselves with new tech.
> Internal language APIs can be leveraged, which drastically changes the mental model to write the script (for the better).
This is true. I myself have encountered situations where I needed to call into my C API's from a higher-level language, but since most languages can interface with C this hasn't been an issue for me. For example I've interop'd Go+C, Python+C, and Lua+C.
> Scripts feel more natural and eventually maintainability increases. Team members are familiarized with the language!
This sounds like a subjective rehash of the first point.
> Development machines compatibility increases. Windows users can finally run all scripts.
This is true if you're talking about shell scripting, but if you're scripting with a general purpose programming language then it shouldn't be an issue. What language (besides shell) isn't portable these days? And even then, you can install a *nix environment on Windows.
I wish any large company agreed with this. I've worked for a company that onboarded every single new engineer to a very niche language (F#) in a few days. Also, everybody I worked with there was amazing. Probably because of that kind of mindset.
Meanwhile google tiptoes around teams adopting kotlin because "oh no, what if other teams touching the code might not be able to read it". Google is supposed to be hiring the brightest but internally is worried the brightest can't review slightly-different-java.
It's shocking how everybody acts like senior engineers might need months to learn a new language. Sure, maybe for some esoteric edge cases, but 5 mins on https://learnxinyminutes.com/ should get you 80% of the way there, and an afternoon looking at big projects or guidelines/examples should get you another 18% of the way.
Not for C++, and even for other languages, it's not the language that's hard, it's the idioms.
Python written by experts can be well-nigh incomprehensible (you can save typing out exactly one line if you use list-comprehensions everywhere!).
Someone who knows Javascript well still needs to know all the nooks and crannies of the popular frameworks.
Java with the most popular frameworks (Spring/Boot/etc) can be impossible for a non-Java programmer to reason about (where's all this fucking magic coming from? Where is it documented? What are the other magic words I can put into comments?)
C# is turning into a C++ wannabe as far as comprehension complexity goes.
Right now, the quickest onboarding I've seen by far are Go codebases.
The knowledge tree required to contribute to a codebase can exist on a Deep axis and a Wide axis. C++ goes Deep and Wide. Go and C are the only languages I've seen that go neither deep nor wide.
I've seen instances where people were worried it would take someone a month longer to fully onboard. Completely ignoring the fact that *fully* onboarding in any complex environment is going to take several months anyway.
I'd also argue that in the case of setting up your scripts, it matters even less. Automation scripts shouldn't be so complex that you fully need to know the ins and outs of the language they are written in. If they are, then maybe it is time to re-evaluate your building/deployment process.
Furthermore, I'd say that historically, both bash and python are languages any semi-competent developer learns to work with to some degree at some point. I say historically, because it has always been difficult not to encounter them when doing software development in the past... 20 years or so. But with modern environments and deployments it is more feasible, as much more is abstracted away in pipeline yaml syntax.
It's true for some languages, like C++, and some might wrongly extrapolate from that. I agree with your general point though. If your senior engineers can't learn Python/Ruby/F# etc. in a few days to a level where they can contribute, you might want to ask yourself what senior means in your org.
That is highly ironic, seeing how often Google tries randomly shoving Dart down people's throats.
A dev that switched from language X to Y "in a few days" will just write X using Y syntax. Here's a good example of going from Java to Python: https://youtu.be/wf-BqAjZb8M?t=1327.
> Learning a new language shouldn't be difficult. Programmers are expected to familiarize themselves with new tech.
But in practice, it is. Maybe you're on a team of elite 10x programmers that can quickly become experts in anything, but that's rare. A lot of programmers don't want to bother coming up to speed with the quirky choice of some past developer. And a lot of places have programmers that aren't even that good with the "project main language," and just lack the ability to become productive in that quirky choice in a reasonable amount of time.
Defensive coding against organizational problems is not a bad thing.
Indeed. There's also 'learning' and learning: really knowing all the nooks and crannies of a language, learning the standard library and learning popular third party libraries all takes time and makes a big difference to code quality.
I have written a few projects in Haskell, but I freely admit that when I read any of Simon Peyton Jones's papers (e.g. the one on build systems referenced on HN recently) I am in awe of the way he can map concepts into Haskell code.
Pray that you aren't, because that is a recipe for misery.
Learning it is fine, but will you still know it 2 years from now when you need to modify the script? Me, definitely not.
I think that expectation is a problem.
I mean, sure, you can't use one single technology for any non-trivial project. But, on the other hand, is it really faster to read the spec and shortcomings of 20k different `left-pad` type "tech"?
I think there's a line to be drawn: each new $TECH added as a dependency is a liability. The expectation should not be "throw every single tech we can think of into there, because programmers are expected to learn new tech".
The calculus really should be "Each new $TECH we add increases our hiring burden, our ramp-up times, our diagnostic burden for when things go pear-shaped, our cognitive load when actually adding features, our tests, and eats into our training-time budget."
Those are a lot of downsides, so before padding their CV the responsible developer should be balancing the trade-offs.
Unfortunately that is rarely how it actually works.
Have windows users use WSL (the VSCode integration is great!), and mac users should install GNU tools since the system tools are obnoxiously incompatible.
The only times I've found that scripts should be in another language are:
1. You need to call libs to do something fancy and it would be too troublesome to make a small Unix-style executable to do the thing.
2. The developers on your team lack Unix/bash experience, and you don't trust them to learn in a timely manner (sad).
At that point you might as well target Python 3.6. Seems like the same hassle for the developer to install and you don't have to worry about wonky differences for users who haven't installed GNU tools, but still think they can run your script because it says `.sh`
Unless you're doing some extremely niche work, Bash >= 3.2 (because Mac) is nearly always going to be available. Even if it _isn't_, there will still be sh or dash, and it's not _that_ hard to stick with pure POSIX for most small uses.
The last time I (by which I mean my team) rewrote a script from Bash into Python was because it had gotten unwieldy over time, I was the sole maintainer, and very few other people at the company knew Bash well enough to understand some of it. The upside was testing frameworks in Python are way better than Bash.
Maybe this is what Perl is for, but I never learned it.
I think this point is especially important for C++ projects. It is my gut feeling that C++ and Python cluster very closely in terms of developer familiarity. That is, a C++ developer very likely is also a passable Python developer.
Given that it tends to take more time to write a C++ program than the equivalent Python program, the stable result is that many C++ projects 1) expose C++ to Python (via e.g. pybind11) and 2) write all scripts in Python.
And you get almost all of the benefits that the article suggests, because almost all C++ developers are also Python developers.
Discrete programs are superior in many ways, as you do not immediately incur maintenance costs to write them. More, they typically force you to have a discrete API that they work with, and then you can lean on that.
Yes, you can do all of this with modular programming techniques. Indeed, "unit tests" are easy to see as similar to what I'm advocating here. Such that I think my assertion is softer than many folks are probably seeing. If you are "scripting" something to add data to the system, it should emphatically not hit the database directly.
I don't know where this lands me on the infrastructure as code (IaC) debate. I'm sympathetic to the desire. I start to think of it as navel gazing when I see some of the very engineered testing practices some people take those to.
That hasn't been true since at least Java 11. You can execute any .java file using `java foo.java`. No compilation required. You can reference dependencies using the usual classpath options etc.
Startup time is minimal.
Been using such scripts in exactly the way the author suggests for years. Much more pleasant than messing around with maven or gradle plugins.
I don't know anyone who thinks that a java script (being the main project language) is going to surpass:
#!/bin/sh
dep-start.sh
./gradlew clean bootRun
Maybe there are outliers with this "hot take". The result of years of projects (even changing hands), are a lot more instructive than someone posing theoretical value of reusability.
It always bugged me that build scripts are hardly ever tested or engineered. They just grow into giant balls of mud.
This is a mess and added a lot of extra complexity to the build pipeline since now we had to manage an additional xtask container. Python is very well suited for CI scripting, xtask should only be used for things directly run by the developer, and even then Python may be a better choice for most things.
Use the language that best fits the job. For me that's TypeScript on a serverless platform, SwiftUI front-end and Java for business logic.
I used to feel like I was winning using only Java for UI, Web and business logic, but the complexity became immense. It's too easy to create yet-another-internal-API that gets forgotten until it needs to be refactored.
Also learning a language gives you a new perspective on software engineering, a bit like learning a foreign language gives a new view on human culture.
I don't think it's a requirement, but it's an advantage
But that does not imply you know how to get the creation date of a file or how to zip a directory.
> Internal language APIs can be leveraged, which drastically changes the mental model to write the script (for the better)
That sounds like a rather empty statement.
> Scripts feel more natural and eventually maintainability increases. Team members are familiarized with the language!
Don't you think familiarity with the OS (or OSes) comes first? And that knowledge usually comes with the knowledge of a shell or batch language.
> Development machines compatibility increases. Windows users can finally run all scripts.
Script development time increases, too. And Windows has WSL nowadays.
I think of scripts as the middleware between the operating system and the shipped code. The code is controlled by the operating system, so the operating system's tools should be used to manage it. In many cases this means bash or make.
Plus, I don't want modern Javascript to do things on the filesystem that would require importing dozens of projects that I need to vet before using. Golang or Python perhaps, but the buildchain for modern Javascript is hell as-is; it doesn't need another layer of Javascript.
I have never understood the appeal of LinqPad whatsoever.
It's much faster to just hit the + in LINQPad.
It's quite useful when I'm reviewing someone else's code, or if I'm "in the zone" in a large change. I can verify some syntax in a few seconds, as opposed to the minutes it takes to make a throwaway console app project / solution.
The package.json scripts "work", but it's quite clunky, and relying on shell scripts that run node.js scripts causes issues. (cross-env solving a problem that really shouldn't exist.)
I’m sure there’s something like rake that exists for Node, but the community won’t standardise on it because it’s not enough of a problem.
* If performance matters, use something other than shell.
* If you are writing a script that is more than 100 lines long, or that uses non-straightforward control flow logic, you should rewrite it in a more structured language now. Bear in mind that scripts grow. Rewrite your script early to avoid a more time-consuming rewrite at a later date.
* When assessing the complexity of your code (e.g. to decide whether to switch languages) consider whether the code is easily maintainable by people other than its author.
In Ruby/Rails, the concept of ad-hoc script execution is baked in via rake.
Scala is a lot less exotic than Clojure or JRuby to most Java devs, as expressive as Groovy, yet fully type-checked.
Instead I found a concise, measured and sane opinion that I honestly can’t disagree with. Like, sure, if your language broadly supports the secondary use case (scripting) why not? If it doesn’t, then the juice almost certainly won’t be worth the squeeze.
IMO some languages like C/C++ cry out for an embedded scripting language so that you don't write the basic, one off, performance insensitive parts of your code in a "hard" language. This is taking it the other way around - suggesting that as much as possible of your "main project" should be in a scripting language so that you're not wasting "hard" development cycles on areas that don't need it.
Code reuse.
The benefits of hard language: 1) I now can do interesting things like text parsing, 2) my scripts are cross platform, 3) I don't need to figure out how to deploy python everywhere in advance or on demand, 4) if the user has python, I don't need to tell him I don't like his python version and he must install a different OS, 5) I don't have python as an extra dependency, 6) I can reuse my main code in my scripts, 7) scripts are written in a language with a decent type system.
If you want type systems everywhere then I think that's another debate. That is really a total rejection of almost all scripting languages for all purposes.
From another point of view, yes, it is not good when, for example, the front end uses a strictly typed language and the back end a dynamic one, so you constantly need to do conversions.
So you need some reasonable combination. Traditional Java+JS looks reasonable. JS (front) + Python (back) is also reasonable. A C backend is weird from my point of view, so there should be some intermediate language, for example Lua.
If your project uses C or C++, you also use the Zig build system.
Advocacy - Maintain it With Zig:
https://kristoff.it/blog/maintain-it-with-zig/
HN discussion:
https://news.ycombinator.com/item?id=35566791
Reference:
These days I'm much more comfortable both writing and maintaining code in languages that aren't my daily driver (like Bash or jq or AppleScript or even Go) because I can get an LLM to do most of the work for me, and help me understand the bits that don't make sense to me.
Granted, this probably takes a fair amount of experience to even know what you're looking at well enough to search for it.
I trust LLMs with tools like Bash and jq and ffmpeg which have been around for years. I wouldn't trust them with anything released within the past 12-24 months.
An example from just the other day: https://til.simonwillison.net/go/installing-tools
I wanted to understand this:
go install github.com/icholy/semgrepx@latest
How does that @latest reference work? There's no branch or tag on that repo called "latest". I tried and failed to find documentation. I gave up and asked GPT-4, which said:
> @latest: This specifies the version of the package you want to install. In this case, latest means that the Go tool will install the latest version of the package available. The Go tool uses the versioning information from the repository's tags to determine the latest version. If the repository follows semantic versioning, the latest version is the one with the highest version number. If there are no version tags, latest will refer to the most recent commit on the default branch of the repository.
Is that correct? I have no idea! But it still gave me more to go on than my failed attempts with the real documentation.
The rest of the code base started in Java, then Clojure, now it’s Go. The scripts are there still in their very-not-modern Perl style though. They have a self-evaluating behavior consisting of data blocks that are interpolated. To be honest, I’m not sure exactly how it works. Very discouraging for the casual passerby looking for some cleanup to do.
I always start with a simple shell script which does most of the work as simple as possible, even simpler is using `make` then introducing shell scripts when needed.
Here's a hotter take: your team colleagues are more likely to wear your clothes than to use your scripts.
/Serious!
... as long as that main language is a scripting language. Otherwise it's just dumb.
Also, tunnel vision by the author: "Almost all projects I’ve worked on have scripts we wrote to automate a repetitive process. "
Well, almost all projects I’ve worked on have scripts we wrote to automate a ONE-TIME process. Like collecting some data from the logs to figure out a bug, fixing it, and forgetting about both the bug and the script. You automate it since you can't manually process 30 GB of data and grep can only do so much. Sure as funk won't write the "script" in C++, but in Python or Perl or something.
Personally, I prefer writing shell scripts regardless of what the main language is in a given project. They're portable (more or less), and can e.g. detect missing dependencies and install them if necessary, which isn't possible with the main language if you're missing its compiler or interpreter.
Sometimes these bash scripts are used in lots of contexts, local dev, CI pipelines, Docker build steps... good luck running Java or Rust in the last two. Even if you get it to work, good luck debugging if there are any issues.
I'd hate to do with typescript what we're doing in scripts with bash.. just like I'd hate doing in bash what we do in typescript..
In C#, I could see doing this.
https://hshell.hydraulic.dev/14.0/
Advantages:
• It's as concise as bash but far more readable, logical and less bug-prone thanks to the static type system with plenty of type inference.
• IntelliJ can provide a lot of assistance.
• You can easily import and use any JVM library, and there are lots of them.
• Ditto for internal project modules if you work on the JVM.
• If you need to, you can easily mix in and evaluate Python/Java/etc using Graal. It has an integrated disk cache that can be used for storing file trees that manages free disk space automatically, and there's a command to set up a Python virtualenv in the disk cache transparently.
• We have a high level shell API that makes console, file and network operations as easy as in bash, and sometimes easier because the commands aren't afraid to deviate from POSIX when that would be more convenient. For example most operations are recursive by default.
• A smart progress tracking framework is integrated into every operation including things like wget.
• It's fully portable to Windows including edge cases like being able to set POSIX permissions on a file, add it to a tar, and the resulting tar will retain the correct permissions.
• SSH is deeply integrated, and so commands "do the right thing" automatically. For example all the commands take Path objects as well as strings, and path objects track which machine they refer to, so if you open up an SSH sub-shell you can easily copy to/from the remote machine using regular copy/move commands. Other commands are "smart" for example the wget() function given a path that's on a remote machine will execute curl or wget remotely rather than download locally then reupload.
Although this sounds like it was all a lot of work to build, in reality our main product is a kind of build system (it makes deploying desktop apps easy, see bio for link). So all that functionality is in actuality functionality we built for the product and just kept nicely factored out into modules. The work invested into the scripting-specific parts is probably a couple of weeks over a period of a couple of years, and it was well worth it given the number of scripts we have for things like QA, deployment, server management and so on.