> we calculate the swear factor as the number of swearwords divided by the lines of code
That's what I suspected. Assuming that most swear words will be contained in comments, what this is actually measuring is the ratio of comments to code. In other words, code that is more heavily commented is better.
I think we already knew this.
That said I would like to see a more critical analysis. First control for comment density. Then compare code quality to swearing in comments and also variable names.
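For what it's worth, here is a rough Python sketch of what "controlling for comment density" could look like. It computes the swear factor as quoted above (swear words per line), plus comment words per line and swear words per comment word. The stem list, the regex-based comment extraction, the choice to count non-blank lines, and all the names are my own assumptions, not anything taken from the study.

```python
import re

# Hypothetical stem list; the study's actual word list is not quoted in this thread.
SWEAR_STEMS = ("shit", "fuck", "damn")

# C-style block and line comments. Assumption: comment markers inside
# string literals are rare enough to ignore in a rough estimate.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)
WORD_RE = re.compile(r"[a-z]+")


def words(text):
    """Lowercased alphabetic words; underscores and digits act as separators."""
    return WORD_RE.findall(text.lower())


def is_swear(word):
    """True if the word contains any of the (assumed) swear stems."""
    return any(stem in word for stem in SWEAR_STEMS)


def swear_stats(source):
    """Return the swear factor plus two comment-density-aware ratios."""
    # Assumption: "lines of code" means non-blank lines.
    n_lines = len([ln for ln in source.splitlines() if ln.strip()])
    comment_words = [w for c in COMMENT_RE.findall(source) for w in words(c)]
    all_swears = sum(is_swear(w) for w in words(source))
    comment_swears = sum(is_swear(w) for w in comment_words)
    return {
        "swear_factor": all_swears / n_lines if n_lines else 0.0,
        "comment_words_per_line": len(comment_words) / n_lines if n_lines else 0.0,
        "swears_per_comment_word": (
            comment_swears / len(comment_words) if comment_words else 0.0
        ),
    }
```

If two repos have the same swear factor but very different comment words per line, the swear factor alone is mostly telling you about comment volume, which is the confound I'd want separated out.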
I tend to focus more on documenting the surprising code paths, not the mundane. And when my code needs to do something special because some other component (library, hardware, API) has issues, there's usually some colourful language describing the sad state of the world outside my control.
Who's "we"?
In my many years of software development, I've found a very large fraction of developers use very few, or even zero comments, and it's getting worse. Just look at the posts below here: there's a bunch of people arguing that comments are useless or harmful. It's no wonder that software sucks so much these days, since apparently no one believes in documentation or code maintenance any more.
I think this comment explains why software gets worse in many cases:
> Of course nowadays, this is legacy nonsense. Everything uses UTF-8 for "char", and what doesn't is broken and terrible anyway. But the old ways stayed with us, and the stupidity of it as well.
The problem is the "legacy nonsense" tends to accumulate over time & as people depend on it, takes a long time to finally remove.
> They are so hilariously misdesigned and insufficient, I can't even fathom how this shit was _standardized_.
They did their best given their circumstances & abilities. Now we must forever pay the price.
> Several decades later, the moronic standard committees noticed that this was (still is) kind of a bad situation. Instead of fixing the situation, they added more garbage on top of it. (Probably for the sake of "compatibility").
At least they tried...
> All in all, I believe this proves that software developers as a whole and as a culture produce worse results than drug addicted butt fucked monkeys randomly hacking on typewriters while inhaling the fumes of a radioactive dumpster fire fueled by chinese platsic toys for children and Elton John/Justin Bieber crossover CDs for all eternity.
Yeah! Time to get back to work...
Credit to https://news.ycombinator.com/item?id=36626018 for pointing this out.
It could also be that understanding code in any non-trivial project is likely to back the developer into a corner where they become frustrated and swear at the computer.
More importantly, the lack of swearing might be a sign that the devs lack the competence to know when they are cornered.
I wouldn’t be surprised if code quality goes up with comment curses and down again with commit message curses.
You can really feel the author's rage at the state of the world.
Saying "control for comment density" presumes one knows how to even do that or how to even define it.
How do you decide that a given line of code or comment should weigh more or less than another?
If a codebase has both a lot of swear words and a lot of all other words, so what?
I used to work for a bank and the policy is no comments unless absolutely necessary, because comments become out of date. Doxygen is the only real comments allowed.
Code gets out of date as well, so let's just stop writing it altogether..
1. An odd business requirement (share the origin story)
2. It took research (summarize with links)
3. Multiple options were considered (justify decision)
4. Question in a code review (answer in a comment)
And the article on how/what/why in code: https://max.engineer/maintainable-code
// Set foo to true
foo = true;
Somebody saw one too many comments like that and overreacted. As long as you a) don't write comments describing what is self evident from the code, and b) try to make the code as descriptive as possible, then it's fine. Comment away.
Inline comments are a reflection of the author's ability to write good comments. They can be kinda useless, actually-bad, or really helpful.
One canonical example of a "good comment" is explaining why a strange or not-the-least-complex approach was taken to implementing a certain solution. The code is like chesterton's fence, and the comment is a post explaining why it's there. That way, future readers can better assess for themselves whether it's worth their time trying to tear down the fence.
You can imagine a world where all the projects that aren't realistically going to spend the effort on high-quality maintained comments make the correct choice to skip comments unless absolutely necessary. And where projects that are realistically going to put effort into high-quality, maintained comments, do so.
In this world, comment density would correlate highly with code quality per line of code. Profanity might not, I'm not sure. I do think you'd still find profanity in high-effort, high-quality, maintained comments, but it might indicate lower quality surrounding code, not higher.
And it would still be unclear whether the existence of comments is a cause of higher quality code, or just a proxy for the amount of effort and care taken per line of code.
The result is that I don't read any code at all. The whole thing is compiled to the native format that is human language. The code is great for illustration.
If I keep it in separate files as documentation it takes too much effort to find and update. It takes needless extra effort and is less precise.
It is just a personal preference of course but if one had any experience writing code in any language it should be easy to grasp say at 4 am while drunk.
Too many comments might actually be a bad thing. It's more lines to maintain, and sometimes the comments just tell what the code is doing where there is no need to.
When I worked at SAP where VCS for ABAP is ancient and has no analogue for git blame we had a practice of putting a SAP Note next to every code change, since some of the things that we had to implement are dictated by business/legislation, so you need a proper explanation from time to time. Without it, the code becomes unmaintainable.
I think swearing in comments indicates you are unburdened by bureaucracy and pointy haired bosses (because they prohibit such things), which would certainly lead to better code.
I personally have very different commenting styles between my work and personal projects. Not that any of it's good.
"This is bullshit" is an important realization. If you can't say it, then things will stay miserable.
(But I concede that effort and productivity are not the same thing.)
The cognitive and time cost of compliance for language policing takes away from valuable programming and planning involved in developing solutions. (i.e. "banned words" [swear words] and politicalized words [whitelist/blacklist,etc])
Another possibility is the people who don't want to deal with that are gone and we're seeing a loss of their contributions.
Who's the narc on your team that would even point it out? It's not like HR has some commit hook on the repos filtering for this stuff...
Sounds likely to be a classic case of correlation != causation
I'll do mine: there's likely a correlation between needing to maintain a professional conduct which includes forgoing foul language (you're programming at work) and writing code under time pressure where getting a product ready for release is more important than strict adherence to clean programming practice (you're programming at work).
Everyone post your favourite conjecture!
Take almost any two things like this and you're actually virtually guaranteed to draw out some weak, but quite likely statistically significant, correlation.
What lies behind that correlation is probably an entropic mishmash of so many factors that it defies human explanation, and also defies any attempt to try to "harness" the forces that seem to appear. It could be that all the siblings to the comment are right all at once.
I'll cop to just glancing at the graphs, but they don't look out of line for this effect to me intuitively.
Also backing this is that more-or-less the same article/thesis could easily have been written for the opposite correlation.
Places uptight enough that developers never swear in comments are uptight in other ways that lead to poor team dynamics which hinders quality.
The other extreme: if you have no idea what you are doing, you might try to mimic "corp speak" in your code to hide the fact that you actually have no clue.
In other words: it needs some confidence in your ability to assess some aspect of the code in order to use swear words.
I thought this was cool, and was talking excitedly about it to my boss and some of the senior devs. They were less amused. Cut to 20 years later and I too am less impressed by this.
Not that I think it's *bad* per se, I'm not clutching pearls or anything. But I never find myself thinking what the code really needs are profanities in the comments. Whereas back then I thought it'd be funny/cool and went out of my way to do so when I could. Which wasn't often.
On the other hand, I'd like to write something like "this is a bit shit but will be replaced later" because that's how I naturally speak. Sanitising it to "crap" or "poor" just makes me feel like I'm teaching a youth club or something, and it is a minor pipeline stall in my train of thought while I do a mental synonym search
It was the most painful code review where I asked someone to remove a joke they wrote in the comments. It was a good joke, funny, short, in good taste, I loved it, but.. distracting and unnecessary.
if (*some_bullshit >= shit_tolerance){
fucks_given = 0;
exit(IM_DONE);
}
I would be curious to see the ratio of swearing in comments vs code identifiers. I'd also be curious to see if the repos with swearing in their comments just have more comments in total. Perhaps the correlation is, "code with more comments is more likely to be higher quality".
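A quick way to eyeball that split, as a Python sketch. Everything here is an assumption on my part (the stem list, the regexes, the function names), and it counts distinct identifiers rather than every occurrence. Words inside string literals get lumped in with identifiers, so treat the numbers as rough.

```python
import re

# Hypothetical stem list, chosen only for illustration.
SWEAR_STEMS = ("shit", "fuck")

# C-style block and line comments (string literals are not handled).
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
WORD_RE = re.compile(r"[a-z]+")


def swears_in(text):
    """Count alphabetic words in text that contain a swear stem."""
    return sum(
        1
        for word in WORD_RE.findall(text.lower())
        if any(stem in word for stem in SWEAR_STEMS)
    )


def split_swears(source):
    """Return (swears in comments, swears in distinct identifiers)."""
    comments = COMMENT_RE.findall(source)
    # Blank out comments so their words are not mistaken for identifiers.
    code_only = COMMENT_RE.sub(" ", source)
    identifiers = set(IDENT_RE.findall(code_only))
    in_comments = sum(swears_in(c) for c in comments)
    in_identifiers = sum(swears_in(name) for name in identifiers)
    return in_comments, in_identifiers
```

Run over a whole repo, the ratio of the two numbers would show whether the swearing lives in the prose or in the names.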
It makes me happy that it remained being called that for quite a while.
They were not great people, and I'd happily kick them in the face if I would encounter no legal or professional repercussions, but, there definitely does seem to be some correlation (in my experience) between being abrasive and being a skilled programmer.
I'm withholding my own judgement on that.
For anyone curious, the authors are coming up with a code quality score using an open-source tool called SoftWipe[0]. From the paper:
> SoftWipe is an open source tool and benchmark to assess, rate, and review scientific software written in C or C++ with respect to coding standard adherence. The coding standard adherence is assessed using a set of static and dynamic code analysers such as Lizard (https://github.com/terryyin/lizard) or the Clang address sanitiser (https://clang.llvm.org/). It returns a score between 0 (low adherence) and 10 (good adherence). In order to simplify our experimental setup, we excluded the compilation warnings, which require a difficult to automate compilation of the assessed software, from the analysis using the --exclude-compilation option.
This was standard practice, and the M&A policies knew that there was no way to actually understand all the code so there was a policy document to describe what to look for.
Of course the red flag things were unexpected 3rd party copyrights and/or license terms in case the code was encumbered.
But "swear words" were on the yellow flag list, in addition to "ToDo", "XXXX", and "Fix Me" types of things.
I remember thinking about places I have been in the past and that the people who used those style comments tended to be the better programmers.
I mentioned this to the person leading the evaluation, and was told that the point of noticing these kinds of comments was to look more closely at the nearby code and try to decide if major functionality was missing or being faked.
It all worked out for that acquisition, but I remember being curious about whatever deal had gone bad in the distant past that made them codify this specific practice.
I'm fond of pointing out, despite every time I get downvoted, that causation is the thing we have no knowledge of, and therefore correlation is all we have. As Feynman said about gravity, there is no how or why to gravity, as far as we know it's simply a property of matter. But of course, that means we only know that because of the perfect correlation between matter and gravity, including every time we conduct an experiment about it; but still we have no cause to point to.
> This means that swearing will not automatically improve the quality of your code.
When it’s clear someone was stuck, frustrated, banging their head against the wall etc while writing a particular bit of code, you can refactor a lot less defensively because you know the crappy parts weren’t secretly there for a reason.
I love real, honest, emotional comments. Pour all the frustration in there. Future you and your colleagues will thank you.
Everyone swears sometimes. If you never do it in front of others, it signals that you're always filtering yourself.
How long will it be before someone who doesn't understand causality starts encouraging developers to write profane comments? It wouldn't be any more absurd than lots of other non-causal behaviors I've seen pushed because somebody successful does them.
I'd need more evidence of that. My anecdotal experience is that saying "fuck" a lot is indicative of a lack of imagination. For example, Winston Churchill's legendary devastating insults, with no profanity necessary.
“What the f… heck is this, kiddo?”
Definitely gotta utilize those brain wrinkles in that case.
https://www.osnews.com/story/19266/wtfsm/
One wonders if profanity in the source code interferes with reviewers and skews this important metric ...
I would read that study as coding standards lead to profanity. (Not sure whether or not coding standards should be correlated with code quality, I just think it is obvious that the measure is correlated with the conclusion in an obvious way.)
[Post posting:] Also looking at the plots, it seems that the two distributions are different, first the swear word distribution seems to be wider and second it has a clear outlier at "software quality" 8, so if anything it is an indication that something much more complex is going on.
1. Passionate developers often swear more often when they feel safe to do so
2. Developers work better in a "safe environment" where they are not judged / forced to follow other guidelines by social or employment pressure.
And another point : those places where it's unsafe (often due to managerial micromanagement) are miserable places to work. That can drive away skilled developers or suppress them if they remain.
All this is assuming the research metric is real, though I'm not sure it is. If the metric for "code quality" is actually "precision following a coding standard" you'd have thought that rigid adherence to procedure would lead to a higher score?
I'd hypothesize, that programmers, who actually care about quality, swear more.
Individuals with AD(H)D might have a lower tolerance to pain. This, coupled with wide open sensual channels and decreased impulse control, might be a contributing factor.
[Edit] added parenthesis and link
Not correlated to swearing, but AD(H)D:
Ding ding ding, I think we have a winner!
If you're not moved to profanity by most code-bases, you're either not paying attention or don't understand.
Swearing was more abundant in the earlier days and the code that survived until today is probably better than what got lost along the way.
In general the coding population has grown, we're more used to coding in corporate settings with code reviews, commit message processing etc. and the bulk of devs aren't just as emotional in writing their comments (some will still swear like sailors, but it's not the norm)
> The study relied solely on the source code written in C.
This in particular probably reduced the number of hobby and beginner projects in the study.
#define fuck if
#define shit else
#define ass return
http://www.art.net/~hopkins/Don/unix-haters/x-windows/motif....
Why not use the full range of one's vocabulary?
Or by "full range" did you mean "limit it to a few well-worn cliches"?
"Which idiot wrote this crap?
You did!
Which idiot hired me?"
I think this also points to the statistical significance. Code that has been worked over a couple of times and/or has been worked on by different people for all those hard and fringe problems will be better, but also accumulate more comments venting the trouble people had fixing them. It does not seem very interesting.
But one of my favorite projects to ctrl-f for "fuck" is in the jedi outcast source code. Since it is proprietary and was a good game: https://github.com/search?q=repo%3Agrayj%2FJedi-Outcast+fuck...
i = 0x5f3759df - ( i >> 1 );
in the results is one of those inverse-square-root floating point bit tricks.
lolwat
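For anyone wondering what the `0x5f3759df` line quoted above is doing: it looks like the widely known "fast inverse square root" trick. Here's a minimal Python sketch of the idea, assuming the usual shape of that routine (reinterpret the float bits as an integer, apply the magic constant, then one Newton-Raphson step). The function name is mine, and this is an illustration rather than a copy of that repo's code.

```python
import struct


def fast_inv_sqrt(x):
    """Approximate 1/sqrt(x) for positive x using the 0x5f3759df bit trick."""
    # Reinterpret the 32-bit float's bits as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # The magic line: a cheap first guess based on the float's bit layout.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits as a float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration to refine the guess.
    return y * (1.5 - 0.5 * x * y * y)
```

In Python this is of course slower than `x ** -0.5`; the point is only to show why that line ends up next to floating point code.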
Swearing in code, however, is much easier to quantify, and of course chosen to chuff up those who think swearing itself is a virtue.
It would be a mistake to draw the conclusion that allowing swearing in code will improve code quality.
Someone accidentally a verb.
Wow, over 3800 code? Thats so many code! And its my study? Even better!
It makes it quite.
Perhaps I could use this as an excuse for not reaching a deadline...
Schrodinger's chat (room).
And probably, increases the chance that the person is fed up with fixing someone else's code - hence the anger
https://opensource.apple.com/source/emacs/emacs-59.0.80/emac...
1990-08-26 Richard Stallman (rms@mole.ai.mit.edu)
* terminal.el: Move possibly offensive comments to term-nasty.el.
https://www.digiater.nl/openvms/freeware/v10/emacs/common/li...
[...]
;; disgusting unix-required shit
;; Are we living twenty years in the past yet?
(defun te-losing-unix ()
nil)
[...]
;; (A version of the following comment which might be distractingly offensive
;; to some readers has been moved to term-nasty.el.)
;; unix lacks ITS-style tty control...
(defun te-process-output (preemptable)
;;>> There seems no good reason to ever disallow preemption
(setq preemptable t)
[...]
;; I suppose if I split the guts of this out into a separate
;; function we could trivially emulate different terminals
;; Who cares in any case? (Apart from stupid losers using rlogin)
[...]
(?\C-b . te-backward-char)
;; should be C-d, but un*x
;; pty's won't send \004 through!
;; Can you believe this?
[...]
;; Did I ask to be sent these characters?
;; I don't remember doing so, either.
;; (Perhaps some operating system or
;; other is completely incompetent...)
[...]
;;-- Not-widely-known (ie nonstandard) flags, which mean
;; o writing in the last column of the last line
;; doesn't cause idiotic scrolling, and
;; o don't use idiotische c-s/c-q sogenannte
;; ``flow control'' auf keinen Fall.
"LP:NF:"
;;-- For stupid or obsolete programs
"ic=^p_!:dc=^pd!:al=^p^o!:dl=^p^k!:ho=^p= :"
;;-- For disgusting programs.
;; (VI? What losers need these, I wonder?)
"im=:ei=:dm=:ed=:mi:do=^p^j:nl=^p^j:bs:")))
[...]
(setq te-process
(start-process "terminal-emulator" (current-buffer)
"/bin/sh" "-c"
;; Yuck!!! Start a shell to set some terminal
;; control characteristics. Then start the
;; "env" program to setup the terminal type
;; Then finally start the program we wanted.
(format "%s; exec %s"
te-stty-string
(mapconcat 'te-quote-arg-for-sh
(cons program args) " ")))))
[...]
;;;; what a complete loss
[...]
https://www.digiater.nl/openvms/freeware/v10/emacs/common/li...
;;; term-nasty.el --- Damned Things from terminfo.el
;;; This file is in the public domain, and was written by Stallman and Mlynarik
;;; Commentary:
;; Some people used to be bothered by the following comments that were
;; found in terminal.el. We decided they were distracting, and that it
;; was better not to have them there. On the other hand, we didn't want
;; to appear to be giving in to the pressure to censor obscenity that
;; currently threatens freedom of speech and of the press in the US.
;; So we decided to put the comments here.
;;; Code:
These comments were removed from te-losing-unix.
;(what lossage)
;(message "fucking-unix: %d" char)
This was before te-process-output.
;; fucking unix has -such- braindamaged lack of tty control...
And about the need to handle output characters such as C-m, C-g, C-h
and C-i even though the termcap doesn't say they may be used:
;fuck me harder
;again and again!
;wa12id!!
;(spiked)
;;; term-nasty.el ends here
Note to the gentle readers: "wa12id" stands for "with a 12 inch dildo".
Jamie Zawinski kept Lucid Emacs nasty:
https://groups.google.com/g/gnu.misc.discuss/c/U5oXKOfWinQ/m...
Noah Friedman, Aug 3, 1992, 4:54:20 AM
In article <15i2n9...@hal.com> wood...@hal.com (Nathan Hess) writes:
>In article <FRIEDMAN.9...@nutrimat.gnu.ai.mit.edu>, friedman@gnu (Noah Friedman) writes:
>>It's by no means necessary, but it's funny.
>Along the same lines, look at lisp/terminal.el
Of course, terminal.el is actually useful, albeit not terribly powerful.
(and terminal.el is pretty mild compared to some of the other things I've seen written by mly. :-))
Incidentally, a lot of terminal.el has been rewritten in version 19.
Too bad... I liked all the variable names and comments in the original.
Jamie Zawinski, Aug 5, 1992, 12:40:38 AM
In the FSF-distributed Emacs 19, the obscenities (will) have been stripped from terminal.el, though they are preserved in a file called term-nasty.el, to avoid appearing to bow to the censors.
In Lucid GNU Emacs, terminal.el will remain as nasty as it ever was.
-- Jamie "Truth, Justice, and the Fucking First Amendment" Zawinski