Want cleaner code? Use the rule of six

(davidamos.dev)

319 points · da12 · 3y ago · 333 comments

219 comments · 80 top-level

kouteiheika3y ago· 34 in thread

I don't necessarily agree with the step of putting the code in a separate function; that often works, but just as often makes it so that the code can't be read top-to-bottom anymore which hurts readability.

In this case there's, I think, a better alternative; the equivalent-ish code in Ruby for the example code here would be something like this:

   values = s
      .partition('?')[-1]
      .split('&')
      .map { |key_value| key_value.partition('=')[-1] }

You can write these nice functional pipelines where you just read the code top-to-bottom and see step-by-step what is being done to the data on each line. You don't have to jump up-and-down around the code when reading it, and you don't have to keep too much context in your head when reading it.

This is one of the reasons why I vastly prefer Ruby over Python for most data processing tasks. I wish more languages would support this style of programming.

davnicwil3y ago

> the code can't be read top-to-bottom

The idea of the technique is to split out code at a different level of abstraction with a clear name communicating what it does, while hiding the details of the how, because you don't need to care about that detail at all to fully grok the code in the calling function.

Where this breaks down is when the code you're trying to split out is not at a different level of abstraction, and how it works is meaningful to the surrounding code in the calling function.

So I think the issue you are seeing isn't with the technique, it's with the technique being misapplied. I think this is likely the only difference between when it 'often works' and 'just as often doesn't' in the code you're working in :-)

lamontcg3y ago

Each function becomes something new that needs to stick in your brain.

Someone that applies "MORF" to their code winds up nearly inventing their own language in the file that they're writing. All that takes up more memory when you're reading their code because, due to leaky abstractions, the actual implementation behind whatever function name replaced the code is often important.

I have an actual track record of taking code that someone had MORF'd to hell and rewriting it, and making it about 40% shorter, with far fewer concepts to process.

Inventing a term like "MORF" is probably illustrative of the problem itself. Without looking at the blog post what exactly was that acronym again? That is just one more thing for you to try to memorize. The author is riffing on things like "DRY" and "YAGNI" that are well-known, but it isn't really helping with readability when you lift it out of that context.

wpietri3y ago

This is my experience too. I love hiding detail for readability, but you have to hide the right details!

It's also definitely related to testability for me. If I'm pulling out the right details, then I'll often get a nice cluster of tests that pin down the higher-level concept in a way where it's both the production and test code that gets more readable.

kristopolous3y ago

The problem is that those who create the messes that need that advice will probably take that advice and create even bigger messes.

In practice, pithy adages don't get us any closer to sanity.

Really the only news you can use is

1. Try to modify code you didn't write without simply throwing parts away.

2. If you think it's really difficult to deal with, figure out if other people agree with you.

3. Understand why everybody thinks this.

4. If you are doing that in your own code, then stop doing that.

You have to viscerally understand why a practice is bad and how doing it affects other people.

This is how you can intuitively avoid such practices in the future. Not through things that rhyme or acronyms that spell words, but through social intelligence. It's fundamentally behavior.

danmaz743y ago

I understand the idea, but it requires:

* great ability at naming methods - which isn't very common

* massive discipline at RE-naming methods when they get changed even slightly to do something more

Regarding the latter, I've seen so many times methods which originally described what they were doing, but now they don't any more, that I never just trust the name to tell me what a method really does.

shellback33y ago

I agree, my thoughts were along this line especially when I noticed the 'magic' number -3.

khendron3y ago

One of the advantages of functions is that a well-named function is self-documenting. If you can take a bunch of lines and wrap them in a function whose name summarizes exactly what it does, then you have improved readability in my opinion. In this example, I don't really need to know the details of how the query parameters are extracted. I just want to know I've got them.

marginalia_nu3y ago

Emphasis on well-named. Naming things is hard.

Maybe not relevant in simple toy examples, but you don't have to look far to find a function that isn't so easy to name.

rocqua3y ago

Those functions get more annoying when debugging though. Because now you have to jump to the body of the function to see if it really does what it says. And you view the body out of context so it is harder to see if there is a wrong assumption between the callee and caller.

On the whole I think such functions are valuable. But they do have downsides.

patrick4513y ago

Yeah, but these little 5 line functions with one caller whose only purpose is to avoid a 50 line function somewhere else are almost never named well. At best, I can infer about 20% of what is going on without stepping into all those little helpers.

layer83y ago

One issue with functional pipelines is that the reader has to keep track of what the types and the data are on each line. It’s fine for 2-3 lines, but it can get non-obvious quite quickly. Assigning intermediate points to named variables can be appropriate, or indeed factoring portions of the pipeline out into separate functions.
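A minimal Python sketch of the named-intermediates approach mentioned above. The function name `query_values`, the variable names and the sample URL are assumptions made for illustration; the type annotations are there only to show what the data is at each step.

```python
def query_values(url: str) -> list[str]:
    """Return the value part of each key=value pair in the URL's query string."""
    query_string: str = url.partition("?")[-1]   # text after the first '?'
    pairs: list[str] = query_string.split("&")   # e.g. ['a=1', 'b=2', 'c=3']
    values: list[str] = [pair.partition("=")[-1] for pair in pairs]
    return values


print(query_values("https://example.com/path?a=1&b=2&c=3"))
```

Each intermediate name documents the shape of the data at that point, at the cost of three extra names for the reader to track.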

mdaniel3y ago

That may be true for less capable editors, but IntelliJ (and therefore RubyMine for the cited code block) annotates the stream type variable when it can prove what it is. https://www.jetbrains.com/help/ruby/viewing-reference-inform... regrettably doesn't show an example of what I'm talking about:

    .map { | *String* key_value | key_value.partition... }

where String shows up in light grey text indicating that IJ knows `key_value` is a String

WastingMyTime893y ago

I think it is compounded in the article's example by the flow of the code being non-obvious. The map applies its function to its last argument: the lambda is the last action, while the URL parsing happens first but comes last in the line.

I have far less issue with the piping operator in OCaml (|>), which works exactly like a shell pipe, because the code is in sequential order and that makes a huge difference.

Python, like JS, wasn't designed as a functional language, and it shows in its syntax. Still grateful for the added functionality, however.
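For comparison, the sequential order described above can be approximated in Python with a small helper. This is only a sketch: `pipe` is a hypothetical function written for this example (it is not a builtin), and the sample URL is made up.

```python
from functools import reduce
from typing import Any, Callable


def pipe(value: Any, *steps: Callable[[Any], Any]) -> Any:
    """Feed value through each step in order, similar in spirit to `x |> f |> g`."""
    return reduce(lambda acc, step: step(acc), steps, value)


url = "https://example.com/path?a=1&b=2&c=3"
values = pipe(
    url,
    lambda s: s.partition("?")[-1],                       # keep the query string
    lambda q: q.split("&"),                               # split into key=value pairs
    lambda pairs: [p.partition("=")[-1] for p in pairs],  # keep only the values
)
print(values)
```

The steps now read top to bottom in the order they execute, although each one still needs a `lambda` wrapper.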

e_i_pi_23y ago

Definitely a good point, refactoring is great and has a bunch of benefits but can also be overdone and create problems. I've heard/read Sandi Metz talk about this as "The Wrong Abstraction"[1].

The basic argument is that any time you do an extract refactor you're creating a new layer of abstraction that the next reader will have to learn and understand. This can also get worse over time as the abstractions drift away from their original purpose.

The solution she provides is to be okay with a little bit of duplication, then as patterns naturally arise in the codebase you can refactor when you know a few use cases and can clearly define the concept.

[1]: https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction

chrisweekly3y ago

Yes! IME (24y and counting in the profession) devs reach too quickly for DRY while neglecting its counterbalancing principle: AHA (Avoid Hasty Abstractions).

xigoi3y ago

Nim, D, VimScript and other languages have “uniform function call syntax”, which allows you to chain arbitrary functions like this (not just the ones that the author decided to declare as methods, like in Ruby and other class-oriented languages).

irq-13y ago

I wonder how much of Go's simplicity comes from having "package.function" syntax that doesn't allow you to put too many functions in a single line? People complain about Java because of the length of the names. Do they complain about too many functions in a line? Maybe with ( ? : ). Lisp has )))))), which seems like a stupid complaint, but maybe that's tied to people having to remember too much?

WalterBright3y ago

D does too:

    import std.algorithm, std.array, std.stdio;
    // Print sorted lines of a file.
    void main()
    {
     auto sortedLines = File("file.txt")   // Open for reading
                        .byLineCopy()      // Read persistent lines
                        .array()           // into an array
                        .sort();           // then sort them
     foreach (line; sortedLines)
         writeln(line);
    }

somehnguy3y ago

Looks similar to Streams in Java. We try to use that style where appropriate as it is much more readable and compact than imperative style imo.

vbezhenar3y ago

For me that's questionable. Yes, there's code which fits finely with streams and looks very readable. Yet there's code which looks like it was shoehorned into a Procrustean bed and looks much better with ordinary loops.

jonnycomputer3y ago

I like this style too, though it can make debugging trickier.

wizofaus3y ago

It's been my feeling for a while that debugger technology hasn't really caught up with newer styles of coding (despite them having been around for over a decade). Only being able to set a breakpoint at a line-level or watch values that are assigned to a named variable is incredibly limiting.

atoav3y ago

I build most of my classes that way in python.

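    class Controller:
        """Placeholder; the real controller is not shown in this comment."""
        def start(self) -> None:
            pass

    class Motor:
        def __init__(self):
            self.max_speed = 10.0
            self.clockwise = True
            self.controller = Controller()

        def set_max_speed(self, max_speed: float = 10.0) -> 'Motor':
            self.max_speed = max(0.0, max_speed)
            return self  # returning self is what allows chaining

        def start(self) -> 'Motor':
            self.controller.start()
            return self

        # etc.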

Then you can use it as follows:

    motor = Motor().set_max_speed(5.0).start()

There is nothing stopping you from building iterators that work the same way
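A rough sketch of what such a chainable iterator could look like. The `Chain` class and the sample readings are hypothetical, invented for this example. Unlike the `Motor` methods above, each call here returns a new wrapper instead of `self`, because the underlying generator changes at every step.

```python
from typing import Callable, Iterable, Iterator


class Chain:
    """Hypothetical wrapper that makes any iterable chainable."""

    def __init__(self, items: Iterable):
        self._items = items

    def map(self, fn: Callable) -> "Chain":
        # Lazily apply fn to every item.
        return Chain(fn(item) for item in self._items)

    def filter(self, predicate: Callable) -> "Chain":
        # Lazily keep only the items for which predicate is true.
        return Chain(item for item in self._items if predicate(item))

    def __iter__(self) -> Iterator:
        return iter(self._items)


speeds = (
    Chain([2.5, -1.0, 12.0, 7.5])
    .filter(lambda speed: speed >= 0.0)    # drop negative readings
    .map(lambda speed: min(speed, 10.0))   # clamp to a maximum of 10.0
)
print(list(speeds))
```

Because the steps are generators, nothing is computed until the chain is iterated, and a chain built this way can only be consumed once.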

BiteCode_dev3y ago

I don't know, it doesn't seem very far from the Python version:

    values = (
        key_value.partition('=')[-1]
        for key_value in 
        s.partition('?')[-1].split('&')
    )

oriolid3y ago

This is already showing why the Python list/iterator comprehension syntax isn't great. The part "s.partition('?')[-1].split('&')" reads from left to right, and then the rest is read from end to beginning. It gets even more confusing with nested comprehensions. In my opinion, the dotted pipeline style and the Lisp style, where you always start from the deepest nesting level, are both more readable than Python's approach, which reads sometimes from left to right, sometimes from right to left and sometimes from the middle out.

leobg3y ago

Any way to do that in Python? Basically an anonymous function across multiple lines, which can be collapsed in the IDE view?

gilch3y ago

Python's lambdas can have as many lines as you want. Just wrap parens around it. Hissp uses this form as a compilation target. Its REPL shows the Python compilation. Play around with it til you get it: https://github.com/gilch/hissp
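A small sketch of the parenthesized form described above. The name `clamp` and its arguments are made up for this example. Note that the body of a lambda must still be a single expression, so statements such as loops are not available inside it.

```python
# The outer parentheses let the lambda continue across several lines.
clamp = (
    lambda value, low, high:
        low if value < low
        else high if value > high
        else value
)

print(clamp(12, 0, 10))
```

Whether this reads better than a small named `def` is a matter of taste; the sketch only shows that the syntax allows it.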

vorticalbox3y ago

Not exactly the same but in python you can use pipe to create pipelines

https://pypi.org/project/pipe/

watwut3y ago

No, this pipeline is not more readable. It is annoying to constantly have to read it. It makes it harder to figure out what the larger algorithm and design are.

KerrAvon3y ago

This style is increasingly common in Swift, especially for Combine pipelines and SwiftUI modifiers.

_8j503y ago

Sub routines are underused. Make it a function right where it is?

tomlin3y ago

Ruby maps are so ugly.

  .map { |key_value|
  key_value.partition('=')[-1] }

Reading this literally makes me sick to my stomach. Language design is much more important than language popularity, although it will be popularity that wins. (Yay downvotes for pointing out things everyone can see - highschool dynamics)

samtheprogram3y ago

Just because it’s a different syntax than you’re used to reading in another language doesn’t make it ugly. If you’re used to reading it and work in the language regularly, it actually looks quite clean.

This sounds like a Windows user who can’t stand macOS because they don’t know where anything is.

Your post downvote edit assumes your opinion here is objective. It isn’t.

xtracto3y ago

I'm curious, can you show an example of how to achieve the same in another language that you would consider pretty?

noncoml3y ago· 12 in thread

We break everything down and then we reach one of the most difficult problems in software engineering: Coming up with good and short names for all these extra intermediate variables and functions.

mike_hock3y ago

Yes, and a source file littered with tiny helper functions that do very specific things and don't make any sense except in the precise context in which they get called, isn't necessarily more readable.

Here, "query_params" means "extract the last three query parameters, raw (i.e. not unescaped and not broken into key-value pairs)." The transformation shown makes precisely nothing more readable or easy to understand. "The second argument to map()" is just as easy for your brain to group into a black box to be analyzed later as a call to an opaque "query_params" function that you need to read the implementation of to really understand what the code is actually doing.

Of course sometimes it's the best solution to just extract local helper functions, especially if the actual function just becomes too unwieldy and/or the helpers are called from more than one place, but in general I try to extract things that do something more general than the thing I'm extracting it from and have an interface / a purpose that's easy to understand and describe on its own.

To stay with the example, actually extracting the query parameters would be a generic, extractable utility. Half-extracting the last three parameters because the function I'm writing needs precisely that for some reason, is a local helper function, and I'd only extract it if there's a good reason, certainly not to make an already trivial function no easier to read.

lbriner3y ago

I wouldn't attack the example too much, it does seem a little contrived to make the point - the very thing I don't like about contriving examples!

It is hard to know whether the principle is valuable with such a weird example. In this example there are lots of other ways it could have been done more meaningfully but, again, don't know if the example is real.

overgard3y ago

Your mileage clearly varies, but I found the transformed example much easier to understand. While I suspected it was parsing a query string from the initial code, having that stated explicitly in the variable removed the guessing. I think the main problem is he just didn't go far enough, there was still more to deconstruct.

I suppose the function to parse the query string could have been better, its name isn't very descriptive, and the method with which it parsed wasn't very obvious either (I'd expect to get back a dict or a list of key/value tuples, not a list of strings)

I know a lot of programmers are against comments, but I also think this is exactly the kind of code where a comment is handy..., the purpose of the [-3:] part wasn't obvious to me at all.
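A sketch of what the points above might look like together: a more descriptive name, key/value tuples as the return type, and a comment at the slice. It uses `urlsplit` and `parse_qsl` from the standard library instead of the article's manual splitting; the function name `last_query_params`, the `count` parameter and the sample URL are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlsplit


def last_query_params(url: str, count: int = 3) -> list[tuple[str, str]]:
    """Return the last `count` (key, value) pairs of the query string.

    `count` must be positive.
    """
    query_string = urlsplit(url).query
    pairs = parse_qsl(query_string)  # also decodes '+' and percent-escapes
    # Keep only the trailing pairs. The thread does not say why exactly
    # three are wanted, so the real reason would belong in a comment here.
    return pairs[-count:]


print(last_query_params("https://example.com/search?q=rule+of+six&page=2&sort=new&lang=en"))
```

Unlike the original one-liner, this version decodes escaped characters, so it is a related alternative and not a drop-in replacement.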

onion2k3y ago

> short names

You don't really need short names. I wouldn't advocate going full Java naming but trying to compress names just to save a bit of typing is unnecessary. Your IDE will help you out. Just learn to press tab when you've entered enough of the name instead of typing the whole thing.

patrick4513y ago

Short names are easier to read, because they fit on fewer lines. Doubly so if the statement fits on one line.

jstimpfle3y ago

Typically, relatively unspecific names like "i" or "size" are good enough. It's better than not naming at all and producing a complicated expression tree instead. More specific names cost energy, both inventing and reading them (because they are typically longer). Err on the side of short and not too specific.

kevin_thibedeau3y ago

It depends. Go spelunking through old Unix and GNU code from the 80s and you'll see a lot of maddening usage of single- and double-letter variables all over the place where a descriptive name would make things much more readable.

leetcrew3y ago

in most situations, I would rather see a complicated statement split over several lines than several simple statements with vague/unhelpful variable names. if the variable name itself doesn't help me understand what it means, I have to remember the full expression anyway.

mixedCase3y ago

It takes some deliberate practice and being able to create decent contexts within your code.

It's not a trivial problem but it's not a hard one. It's just that most people don't even try to dedicate a sliver of active brain power to the task because they don't deem it worth it even if they claim to agree on the importance of readability.

nicwolff3y ago

Crisis == opportunity ツ

TFA missed the point of splitting complex expressions into separate lines: naming the single-use vars clearly makes the whole calculation easy to follow. In the example given, nothing about the one-line `map split over split of split` tells a reader that it's parsing a query string – just splitting it in two and naming the temp var `query_params` makes it clear.

Although `last_3_query_params` would be more precise, and something that explains why TF you'd want that would be better... ツ

quickthrower23y ago

React makes it funner. [score, setScore] = useState … means you gotta think of another name for any intermediate “score”-like sub calculations in all the scope levels of the functional component. (Or, yuk, keep track of the scope you are in mentally.) Then throw the Firebase API on top of that with its myriad intermediate-step API and fun fun fun.

zhte4153y ago

Just be consistent, whatever it is.

standardUser3y ago· 10 in thread

I see a troubling trend with some coworkers where they seem to stretch the limits of time and space to make every line as dense as possible, usually using lodash. I think it is a point of pride for them, but I think it's obvious that everyone's life would be easier if they just wrote their code out "long form" and, god willing, added some comments for various steps. Instead, I find myself having to re-write ultra-dense blobs of code in order to debug or even simply understand what's going on.

f1shy3y ago

Once I read something along the lines of “every programmer goes through that phase where he wants to show how clever he is by writing whole programs in one line. Until he understands how stupid that is”. I do not have the source, regrettably.

ethbr03y ago

I was taking my first multi-threaded resource allocation course when I first ran into the famous Kernighan quote.

> “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”

It clicked and instantly disabused me of the notion that smart people write code that's any smarter than the minimum required to solve the problem at hand.

whynotminot3y ago

I think there's a real smell with those long, dense lines of code. Tends to mean your data structures are out of control: objects with arrays that point to other objects that then also have arrays on them. My oh my.

Comments being required is another smell, a sign that the code doesn't explain itself. I know this is said so often it's a cliche, but it really is true.

I think both of these things point back to the same problem: out of control data structures.

guenthert3y ago

Sure, if the computer can figure out what a given fragment of code is supposed to do, so can you (a sufficiently clever programmer). The question rather is, do you spend 20s reading a comment or 15m to solve the riddle?

There's a real danger that comments aren't updated when code is, particularly if 3rd parties make those changes. This is one of the corners where there will never be a single answer which is right in all circumstances.

syntheticcdo3y ago

Do your co-workers use lodash's chain functionality? I've found it useful to achieve kouteiheika's ideal from the current top comment, with top to bottom readability without too much mental overhead.

As an example, given:

  const sales = [ {month: "Jan", day: 1, total: 120 }, ... ]

You could determine, say, the highest sales day of a given month as follows:

  const highestSalesDayByMonth = _.chain(sales)
    .groupBy("month")
    .mapValues((salesForMonth) => _.maxBy(salesForMonth, "total"))
    .mapValues("total")
    .value()

  // highestSalesDayByMonth = { Jan: 140, Feb: 90, ... }

Naturally, minimizing the complexity of the iteratee functions and carefully naming of their arguments is very important to ease debuggability.
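For readers who don't use lodash, roughly the same grouping can be sketched in Python with the standard library. The sample data below is invented for this example (only the field names come from the snippet above), and the variable names are assumptions.

```python
from itertools import groupby
from operator import itemgetter

# Made-up sample data in the same shape as the `sales` array above.
sales = [
    {"month": "Jan", "day": 1, "total": 120},
    {"month": "Jan", "day": 2, "total": 140},
    {"month": "Feb", "day": 1, "total": 90},
    {"month": "Feb", "day": 2, "total": 75},
]

# groupby only groups adjacent items, so sort by the grouping key first.
sales_by_month = sorted(sales, key=itemgetter("month"))
highest_total_by_month = {
    month: max(sale["total"] for sale in month_sales)
    for month, month_sales in groupby(sales_by_month, key=itemgetter("month"))
}
print(highest_total_by_month)
```

The comprehension reads less top-to-bottom than the chained version, which is the trade-off other comments in this thread point out.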

ParetoOptimal3y ago

> I see a troubling trend with some coworkers where they seem to stretch the limits of time and space to make every line as dense as possible, usually using lodash. I think it is a point of pride for them, but I think it's obvious that everyone's life would be easier if they just wrote their code out "long form"

Concision can be used for emphasis as verbosity can be used to obscure.

> Instead, I find myself having to re-write ultra-dense blobs of code in order to debug or even simply understand what's going on.

Is this problem because their method is inherently more complex or is it due to lack of familiarity?

Perhaps they debug that code differently than you do and it's incompatible with your previous mental model?

professorTuring3y ago

It is not new; for ages, some developers have tried to show off by bringing “cool” one-liners to solve problems: stretching operators, finding imaginative uses for lambda expressions, or abusing parts of the language to make some other teammate or reviewer ask, “What did you make there?”

I do believe they are quite intelligent people who know a lot about the language or maths, but they are usually not making the smartest choice, because you should use “languages” so that people understand you.

So you can call those: “50 cent expressions”

They help no one but their ego…

f1shy3y ago

Lately I’ve been seeing lots of “show off” with nitty-gritty features of C++… lambdas, templates and inheritance in a pattern that reminds me of the characters in an Agatha Christie book, where you need a graph to keep up with it… Hope this era ends soon.

ParetoOptimal3y ago

At least in Haskell, I feel like this doesn't hold true.

Typically more concise code takes advantage of core language abstractions.

It is actually simpler unless you are unfamiliar with core language abstractions, but I'd argue that's a you problem.

RajT883y ago

I had a professor who did that. He'd write a solution to the assignments he was giving us, and then spend another 5-8 hours on it trying to make it fit on a single overhead slide.

He was a much beloved professor.

jstimpfle3y ago· 9 in thread

I like how this article explains that "clean" must be "readable for humans". However, the concerns raised are only superficial. It's much more important to get the larger scale structure right. I recommend drawing diagrams and explaining the architecture to humans. Then again, I'm not saying overdo it, because some things are hard to draw, some are hard to explain. In the end, it's important to get a complete understanding of a certain module, and the code should then be relatively easy to write.

I have learned to take a step back when I find myself having a hard time to get the code "clean". Often I put that thing to rest if possible, maybe for days, months, or even years. There could be a simple solution that solves 80% of the problem, and that can ease the pressure coming from the stakeholders. If it kind-of-works and can be produced in a short time, that is much better than going down a rabbit hole for months, coming out at the other side (probably burnt out) with a solution you can't deploy because it's too complicated.

Karellen3y ago

> Show me your flowcharts (code) and conceal your tables (data structures), and I shall continue to be mystified. Show me your tables (data structures), and I won’t usually need your flowcharts (code); they’ll be obvious.

-- Fred Brooks, The Mythical Man-Month, 1975

> a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read and only incidentally for machines to execute.

-- Abelson & Sussman, Structure and Interpretation of Computer Programs, 1984

sidlls3y ago

The first quote is an over-simplification that often does not hold in practice: one usually needs both. The exceptions are mostly trivial programs.

Shorel3y ago

Well, dealing with graph algorithms, none of them are obvious from the representation. Which is usually a matrix.

In fact, Gauss Jordan is also not obvious, just seeing a matrix.

overgard3y ago

I tend to find that top-down designs usually end up clunky. In my experience, bottom-up designs (starting with specific things and creating new abstractions as they're needed) tend to create simpler and more obvious designs. You also don't waste time on hypotheticals, since every line you write has a purpose. This blog post really nails it: https://caseymuratori.com/blog_0015

In that context, I don't think this is superficial at all. If your code is hard to read, I suspect your design is hard to read too.

Sure, for super important architecture decisions you have to make a few top-down (spatial partitioning structures, database decisions, network architecture, etc.), but I think it's generally better to late-bind on those decisions if you can.

jstimpfle3y ago

It is superficial. No amount of trying to pretty up the code can fix underlying design deficiencies. That's what I said, so we're not even disagreeing here :)

I know the semantic compression post, and I don't think it makes a point for bottom-up design. I find top-down and bottom-up to be quite misleading anyway. Someone told me, they don't like to think of things at the "top" and the "bottom". It's data transformations, maybe more like "left to right".

If you design bottom-up, you end up with lots of artifacts you never needed (and likely still missing the ones you can make use of). If you design top-down, you end up with lots of code you don't need. (this is where semantic compression comes in, in my understanding).

I suspect that if you like to think bottom-up, maybe that's because you like it more at the bottom (you are a low-level type of guy, or like to make libraries). If you like to think top-down, maybe you like it more at the top.

I like the semantic compression term because it reduces the act of design to the essentials, without introducing fluff terms or opinions. I find myself doing this compression no matter what kind of code I'm writing.

29athrowaway3y ago

What is easier to read:

a) 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1

b) 20

If all the code in your project was written like a), how would you feel? Does it make your job easier or harder?

I'll tell you how most people feel when they read code that looks like a):

- The author didn't care about other maintainers.

- The author is selfish and does not have empathy for others.

- The author ruined my fucking day.

- Team members are competing by sabotaging each other's productivity.

- If I clean this, by the time I am done, the author would have pushed 10 more commits that look exactly like this and eventually become my boss.

- The author is wasting everyone's time.

- The author is forcing others to volunteer to clean up after them.

- Why does management tolerate code following the a) style? A simple intervention would make it go away and my job would be so much better.

- It's sad that everyone is too busy looking at Jira and nobody cares about the actual fucking product.

- This is slowing everyone down and I have stuff to do.

- The code is error prone, one day I'll break it.

- Why should I contribute quality code if low quality is acceptable?

As you can see, it objectively fucking sucks. It's draining, demoralizing to read, it's frustrating, it wastes people's time, it gives people the perception that nobody fucking cares and the code is everyone's toilet with no trip lever.

And while it's "superficial", it's the surface that all engineers interact with. If I spread superglue over the surface of your kitchen counter and every dish and utensil in your kitchen every day around lunch time, that problem will also be "superficial", but it will ruin your life.

So, the conclusion is: Just fucking write clean code. Shitty code ruins the morale of people who care, who are the people that want to build great things not the ones cashing a paycheck and resting and vesting.

You are not a full-time architect, you are not in business development/marketing/finance or whatever, you are in the fucking engineering department. Your contribution to the business are your deliverables. The "superficial" stuff you talk about is your job. Do it.

"Ah ah ah, you didn't say the magic word!! ah ah ah!" Don't be the fucking Dennis Nedry of the team. Format your code, make it readable by your team and your future self.

Do you want everyone to love you? Write code like this:

https://norvig.com/spell-correct.html

mod3y ago

If your comment were code, I think it's an example of a)

teo_zero3y ago

Your example is artificially designed to make your point, but imagine a case like this:

  # Add the left & right margins
  width = calculatedWidth + 1 + 1

The "+1+1" might indeed be more readable than "+2".
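One way to keep the intent of "+ 1 + 1" without relying on the comment is to name each margin. This is only a sketch: the constant names, the function name and the value 78 are assumptions made for this example.

```python
LEFT_MARGIN = 1
RIGHT_MARGIN = 1


def total_width(calculated_width: int) -> int:
    """Content width plus one margin on each side."""
    return calculated_width + LEFT_MARGIN + RIGHT_MARGIN


print(total_width(78))
```

If the two margins ever diverge, only one constant changes, which a folded "+ 2" would not make obvious.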

jstimpfle3y ago

What the f* is wrong with you?

upsideDownBlue3y ago· 5 in thread

I enjoyed the article and agreed that working memory places a fundamental limit on the intelligibility of otherwise equivalent pieces of code. As a former psychologist with experience of memory research (though not quite this area), it might be useful to others if I add that:

- The size of the short-term store is normally said to be 7 plus or minus 2 (the 'magic' number 7)

- The Working Memory model has somewhat overtaken the 'short term' memory model, and it is unusual to see them being presented alongside each other like this (though 'short term memory' remains a useful, good-enough metaphor for explaining certain key aspects of memory)

- Chunking is typically viewed as a memory-supported division of stimuli (what you're reading, hearing etc.) into meaningful units based on LTM memory representations. A good example is a chess expert 'chunking' the layout of a chess board with many pieces in perhaps one or two units (e.g. 'It's the mid game configuration of [famous players] in [famous game], except the king's position is different'). We would expect more expert programmers to 'chunk' increasingly large units, I think (e.g. 'Oh, this is just the [famous sorting algorithm]').

- A single chunk is usually considered to take up a 'slot' in short term memory

If anyone wants papers/sources for the above, let me know.

ethbr03y ago

Not high priority, but I'd love any references or names / key words I could look into it with.

I'm traditional wide comp sci by academic training, but spend my day job as a low-code enabler for non-programmers with varied backgrounds.

The working memory model explains and fits well with what I see them get and struggle with in day to day work, and I'd welcome references I could use to optimize my approach.

ooloncoloophid3y ago

An overview of the model and its history: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4207727/

The Wikipedia entry for WM is also very good: https://en.wikipedia.org/wiki/Working_memory

It's a bit tricky to tell what you're doing exactly - perhaps drop me an email if you have any queries (using this account; my original parent post was on a throwaway account because I had login problems).

pramodbiligiri3y ago

Check out Anders Ericsson’s book on Deliberate Practice or Barbara Oakley’s A Mind for Numbers.

gilch3y ago

This means it should be the rule of five. We can't count on everyone having full capacity all the time.

ooloncoloophid3y ago

I see what you mean - five will fit with more room left over. But much like a hand that has a capacity to hold 7 or so marbles, if you grasp only five at a time, your overall productivity will (almost certainly) slow.

3pm3y ago· 5 in thread

Reminded me of 'Object Calisthenics' by Jeff Bay. Basically an exercise for a toy project where you adhere to 9 rules:

1. Only One Level Of Indentation Per Method

2. Don’t Use The ELSE Keyword

3. Wrap All Primitives And Strings

4. First Class Collections

5. One Dot Per Line

6. Don’t Abbreviate

7. Keep All Entities Small

8. No Classes With More Than Two Instance Variables

9. No Getters/Setters/Properties

https://williamdurand.fr/2013/06/03/object-calisthenics/

BlargMcLarg3y ago

>Wrap All Primitives And Strings

Gah. I've seen the other side of this, a few people far too trigger happy to make FivePlusVeryLongNounVO/DTO for every little thing, and it gave me some new appreciation towards tuples and primitives. Sometimes you really don't want to go into another new file for an object type which is used in only one specific place. Especially with

>Don’t Abbreviate

Meaning the variable name will end up long anyway. With tuples, you get deconstruction without the hassle, too.

3pm3y ago

> Gah. I've seen the other side of this, a few people far too trigger happy to make FivePlusVeryLongNounVO/DTO for every little thing, and it gave me some new appreciation towards tuples and primitives. Sometimes you really don't want to go into another new file for an object type which is used in only one specific place.

The rules are an exercise for a toy project. Like all similar 'rules' they are just hints to make you think. When done with the exercise, and you see a string with a social security number in production code, you may consider creating a dedicated SocialSecurityNumber class. The class will guarantee a well formed social security number according to official rules. The class may even offer Area, Group and Serial parts of the social as separate fields. The class may decide to use a string or integers internally, but that would never be exposed to the class consumers. All the code that uses SocialSecurityNumber will not have to guess whether the string is valid, if it has dashes etc. The same reason you use built-in types like an Integer (as opposed to a tuple of 4 bytes or 32 bits).
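A minimal sketch of that idea in Python (the language the article uses). It assumes the commonly cited format rules (area is not 000, 666 or 900-999; group is not 00; serial is not 0000), which is a simplification rather than the full official spec. The class layout and the `parse` helper are illustrative, not taken from any real library:

```python
import re
from dataclasses import dataclass

# Accepts "123-45-6789" or "123456789"; dashes are optional (an assumption).
_SSN_INPUT = re.compile(r"([0-9]{3})-?([0-9]{2})-?([0-9]{4})")


@dataclass(frozen=True)
class SocialSecurityNumber:
    """Hypothetical value object: construction fails unless the parts are well formed."""

    area: str
    group: str
    serial: str

    def __post_init__(self) -> None:
        # Validation runs on every construction path, including direct calls.
        if re.fullmatch(r"[0-9]{3}-[0-9]{2}-[0-9]{4}", str(self)) is None:
            raise ValueError(f"malformed SSN parts: {str(self)!r}")
        if self.area in ("000", "666") or self.area.startswith("9"):
            raise ValueError(f"invalid area: {self.area}")
        if self.group == "00" or self.serial == "0000":
            raise ValueError(f"invalid group or serial: {str(self)!r}")

    @classmethod
    def parse(cls, text: str) -> "SocialSecurityNumber":
        match = _SSN_INPUT.fullmatch(text.strip())
        if match is None:
            raise ValueError(f"not an SSN: {text!r}")
        return cls(*match.groups())

    def __str__(self) -> str:
        return f"{self.area}-{self.group}-{self.serial}"


ssn = SocialSecurityNumber.parse("123456789")
print(ssn)       # 123-45-6789
print(ssn.area)  # 123
```

Callers that receive a SocialSecurityNumber never need to re-check dashes or digit counts; the check happened once, at construction.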

overgard3y ago

Oof, these all seem absurd to me.

> 1. Only One Level Of Indentation Per Method

One level of indentation just leads to an explosion of tiny one-use methods with weird names, and now you can't read the code linearly. You will almost certainly never reuse these tiny methods, especially since you're likely consigning them to an instance of a class instead of a free function, so all you've done is forced people to jump around a lot.

> 2. Don’t Use The ELSE Keyword

Not using the else statement just obscures the fact that there's a branch in the code. Obscuring something important seems to be the opposite of what you should do.

> 3. Wrap All Primitives And Strings

Ugh, that seems verbose and clunky, especially in a language like Java without operator overloading. I'm all for type aliases or typedef's, or, creating a class if the builtin primitives don't work (I think a Money class makes sense because you don't exactly want to use a float, for instance). But just putting wrappers all over the place sounds grotesque.
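The float problem behind the Money case is easy to demonstrate. A small sketch using Python's standard decimal module; the amounts are made up:

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_total = 0.1 + 0.2
print(float_total == 0.3)  # False

# Decimals built from strings keep exact base-10 values.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))  # True
print(decimal_total)                     # 0.30
```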

> 4. First Class Collections: Any class that contains a collection should contain no other member variables

Why even have a class then? Why not just have functions that operate on a collection? It's much more generic that way, since if you're using iterators or an abstract collection interface, you can potentially allow the user to choose the exact data structure, and you avoid the ceremony of creating a new type that's again just a wrapper.

> 5. One Dot Per Line... Basically, the rule says that you should not chain method calls.

This is the first one I roughly agree with, but I wouldn't consider it a hard rule. Chaining .map and .filter together for instance is a very common pattern.

> 6. Don’t Abbreviate

min/max is just as clear as minimum and maximum. I'm not using "index" in my for loop when "i" will do. "n" is perfectly well understood as a count of things. Abbreviations when used properly make code easier to read, not harder.

> 7. Keep All Entities Small... No class over 50 lines and no package over 10 files

Ok, assuming the problem can't be simplified, all you've done is now fractured all that functionality into tens/hundreds of files. How is that easier to follow? Sure, there's balance in all things, but I'd probably rather read a 1000 line class than 20 small files split over 2 packages.

> 8. No Classes With More Than Two Instance Variables... I thought people would yell at me while introducing this rule, but it didn’t happen

They were being polite. I'll do it for them. What the fuck?

The example he gives is also awful, where instead of using a string for name, he makes Name a type (ugh) with FirstName and LastName. Not only is that overly ceremonial, but it's wrong, there are plenty of names from various cultures that do not fit cleanly into FirstName and LastName. Also, what happens if he wants to store a MiddleName? That's three instance variables! Ohno! OR what if the person has like 10 middle names (this shit happens). Are we going to have 5 nested data types for that?

> 9. No Getters/Setters/Properties ... My favorite rule. It could be rephrased as Tell, don’t ask.

My brain feels like it's going to explode.

> It is okay to use accessors to get the state of an object, as long as you don’t use the result to make decisions outside the object.

Why else would you want to get the state of an object?

> Any decisions based entirely upon the state of one object should be made inside the object itself.

If your classes are 50 lines long, I guarantee you that other classes will be making decisions on other objects' behalf.

> Then again, they violate the Open/Closed Principle.

I think the industry is largely realizing that this is a bad principle, as it implies inheritance. I think most people outside the enterprise java world now realize that using interfaces or free functions is largely better.

3pm3y ago

Please keep in mind that these were rules that you apply to a _toy_ project. You apply the rules once, blindly, even if they don't make sense to you at the moment. Then, when you work on a real thing, you may remember, for example, to create a dedicated PhoneNumber class with strict rules (e.g. E164) instead of a string that gets shuffled around with no one really knowing what's inside. Or you just dismiss the rules as nonsense and move on.

You seem to be criticizing the 'rules' as if they are suggested for production code. You couldn't seriously be thinking someone suggests maximum-of-2-fields as some sort of guideline for the real world.

bruce3434343y ago

Draconic

didibus3y ago· 4 in thread

What the author is missing is that "easy to read/reason about/understand" only matters within the context of making a change to the code: fixing a bug, adding a feature or making some non-functional improvement to it.

This is what most of the "easy to read" articles forget.

Show me why it is easier to fix a bug, add a feature or make a non-functional improvement to the code with their style than without.

For example, if you've extracted something into its own function, are you then sharing this function and using it in other places as well? If you then change the body of that function, are you now possibly breaking other parts of the code that relied on its old behavior?

If you've introduced a local mutable variable in between two lines, are you then mutating that variable prior/later? Is the query_params different at the end of the function than in the middle? Can you safely use it again?

How easily can you now introduce new behavior before, in the middle, after, and anywhere in-between?

When you modify the behavior to fix a bug, add a feature or make a non-functional improvement, is it an isolated change? How many tests break? Did it require major refactoring to make or very few things had to change? How easy was it to add a test for your new behavior? Was it easy to find the most appropriate place in the code to make the change? Etc.

Sure sometimes maybe you just read code for the fun of understanding what it does, but almost always in practice when you're working on a code base, you only care to understand and reason about the code because you're looking to deliver that next sprint task that involves changing something about it.

I wish more people focused on "easy to change/modify" than simply on "easy to read/understand".

michaelcampbell3y ago

> For example, if you've extracted something into its own function, are you then sharing this function and using it in other places as well? If you then change the body of that function, are you now possibly breaking other parts of the code that relied on its old behavior?

You absolutely are. Which is why I think that DRY was pushed too hard, and for the wrong reasons. Everyone said "you only have to change it once!" rather than "consider how many callers depend on this". There's a reason the "rule of 3" came around; and those reasons I've found were mostly experience and empirically based.

edgyquant3y ago

If you are unit testing you should not have to worry about tweaking a function and it breaking everywhere else.

Supermancho3y ago

> If you are unit testing you should not have to worry about tweaking a function and it breaking everywhere else.

"should" is a word loaded with authority.

Why?

If you believe a unit test is for turning an impure function into a pure function (so you can just test what it's doing and no other effects), then in many cases tweaking will break existing unit tests. If the function exists, it's assumed it's used by more than the tests for it. Changing the signature or even the internal dependencies necessarily breaks the known contracts with other units.

sidlls3y ago

Unless the test cases are incomplete. Or the CI jobs are configured to run tests independently and off-cadence so that code that would break is tested after a merge. Or the tests have a bug in them. And so on.

aaronbrethorst3y ago· 4 in thread

My opinion is that maintainable code is written first for reading by humans and second for executing by computers.

Unless I'm writing throwaway prototype code (famous last words, lol), I try to write code such that I will be able to figure out what my intention was 6-18 months from now when I'm staring at a piece of code in a panic trying to debug a production issue.

That doesn't mean I'm going to get it right when I write this code. Instead, I'll be able to better ascertain what my assumptions were, how they fell apart in practice, and what a minimal, correct fix that doesn't make things worse might be.

Edit: Incidentally, this also applies to my commit messages. I’m writing them primarily for my future self so that I can figure out WHY I made a change, not WHAT the change was.

Shorel3y ago

My opinion is that you can write code that's easy to understand, and it is also good for the computer to run.

One thing is not in contradiction with the other.

It could lower the reusability of the code, by not having many abstractions, but it will be easy to understand, concise, and it will do what it was written for very well.

moffkalast3y ago

It should be written so it's easy for humans to understand first.

If that's too slow, then it can be optimized so it's fast, if a bit less readable.

9 times out of 10 it won't be too slow in the first place these days, unless you know beforehand that you need maximum speed for valid reasons.

Karellen3y ago

> My opinion is that...

You make it sound like you came up with that all by yourself

aaronbrethorst3y ago

Nah, I stand on the shoulders of generations of developers, just like everyone else here.

I didn’t claim my opinion was novel, just that it’s mine. I hope others share my opinion, because I’d find codebases that fit my criteria easier to maintain than many other types.

Also, do you agree or disagree with any of the ideas I put forth?

bloaf3y ago· 4 in thread

I have pretty mixed feelings about this. Personally I find it much easier to debug code that:

1) fits entirely on my screen and

2) doesn't involve much state modification

Every intermediate variable is a chance for me to miss some modification (e.g. it was passed to a func that modifies its arguments) and consequently misunderstand what is happening.

I've been experimenting in Python with the function chaining style of coding enabled by the toolz library. So while not at all idiomatic, the example in the original article would come out as something like this:

https://gist.github.com/ZeroBomb/8ac470b1d4b02c11f2873c5d4e0...

I would say that function-chaining this example would constitute over-engineering, but I have found that writing in this style has really helped me express pretty complex function composition in a way that is still concise without using a bunch of intermediate variables.
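For readers without toolz installed, here is a rough stdlib-only sketch of the same style. The `pipe` helper is hand-rolled to mimic the idea (toolz ships its own `pipe`), and the URL and step names are made up for illustration:

```python
from functools import partial, reduce


def pipe(value, *functions):
    """Thread a value through the functions, left to right (hand-rolled helper)."""
    return reduce(lambda acc, fn: fn(acc), functions, value)


def query_string(url):
    # Everything after the first '?'; empty string if there is none.
    return url.partition("?")[2]


def split_pairs(query):
    return query.split("&")


def value_of(pair):
    # Text after the first '='; empty string if there is none.
    return pair.partition("=")[2]


url = "https://example.com/search?lang=en&page=2&q=rule+of+six"

values = pipe(
    url,
    query_string,
    split_pairs,
    partial(map, value_of),
    list,
)
print(values)  # ['en', '2', 'rule+of+six']
```

Each step reads top to bottom, and no intermediate variable is left around to be modified later.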

readthenotes13y ago

Do you really write a very long comment after every function call?

And have you looked at code that's over a year old and modified by other people to see how poorly those comments now match the code?

bloaf3y ago

No. I did that for people having a first exposure to this non-idiomatic currying/function chaining.

In actual code I would have put most of those on one line.

strager3y ago

How do you debug that code?

bloaf3y ago

This particular case is special because it uses 100% library functions. Typically you're composing your own functions, so you just... put breakpoints in your functions.

If you want logging, I've added an example of how to auto-log the composed functions to the gist.
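The gist is not reproduced here, so the following is only a generic illustration of the idea, not the code it contains. Each step in a composed pipeline can be wrapped so that it logs its input and output; the `logged` and `pipe` helpers and the two step functions are hypothetical:

```python
import logging
from functools import reduce, wraps

logging.basicConfig(level=logging.DEBUG, format="%(message)s")
log = logging.getLogger("pipeline")


def logged(fn):
    """Wrap a one-argument function so every call is logged at DEBUG level."""
    @wraps(fn)
    def wrapper(arg):
        result = fn(arg)
        log.debug("%s(%r) -> %r", fn.__name__, arg, result)
        return result
    return wrapper


def pipe(value, *functions):
    # Same left-to-right threading as a plain pipe, with logging on each step.
    return reduce(lambda acc, fn: logged(fn)(acc), functions, value)


def split_words(text):
    return text.split()


def count_items(items):
    return len(items)


word_count = pipe("want cleaner code", split_words, count_items)
print(word_count)  # 3
```

Since each step is an ordinary named function, a breakpoint inside `split_words` or `count_items` works as usual.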

hardwaregeek3y ago· 4 in thread

This seems perfectly reasonable advice. However I do wonder how many people actually struggle with this sort of code quality. It's certainly more than a few, since I've encountered bad code with these issues. But it's not exactly the most pressing issue either. As the author demonstrated, you can refactor this with a little thought. It's the code equivalent of tidying your room, sweeping the floors and putting your stuff away.

Whereas the refactoring issues I'd love to learn more about are the equivalent of a sinkhole in your living room. Stuff like "you have data dependencies that go in, out, left, right, and through the code", or "the codebase is a mishmash of React combined with Vanilla JS that is hooked up to a custom PHP MVC". Basically refactoring that involves issues that cannot be cleaned up all at once, that involve deep architectural decisions and that require some amount of buy-in from the team.

Mostly I'd like to know more about this because I've realized that I'm not very good at it. My inclination is to just refactor everything and that's not a feasible strategy. I also struggle to balance it with getting feature work done. Definitely something I plan on reading more about.

BurningFrog3y ago

My opinion is that each line of code should be easily understandable. Without that, code is very hard to work with.

You're right that other code problems can be worse. But that's no excuse to avoid doing the basics.

To clean up system design issues, you must first know what a better system design would be. It's not enough to realize that what you have is bad.

I do this a lot, and part of my approach is to always be incremental. Improve one detail/aspect at a time. The worst, very tempting, idea in this field is to throw everything away and start over...

> I also struggle to balance it with getting feature work done

FWIW, I like to spend 1/3 of my time cleaning up and refactoring.

theptip3y ago

I’ve seen some engineers that think it’s clever to put everything into a one-line list comprehension where possible, even if that means rewriting named variables as letters to make them fit. The result is really hard to read.
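A made-up illustration of the difference in Python. The data and names are invented; both versions produce the same result:

```python
orders = [
    {"customer": "ada", "total": 120.0, "paid": True},
    {"customer": "bob", "total": 35.5, "paid": False},
    {"customer": "cy", "total": 80.0, "paid": True},
]

# Terse: everything on one line, single-letter names.
r = sorted([(o["customer"], o["total"]) for o in orders if o["paid"] and o["total"] > 50], key=lambda t: -t[1])

# Spelled out: one idea per line, each with a name.
MIN_TOTAL = 50
paid_orders = [order for order in orders if order["paid"]]
large_paid_orders = [order for order in paid_orders if order["total"] > MIN_TOTAL]
by_total_desc = sorted(large_paid_orders, key=lambda order: order["total"], reverse=True)
customer_totals = [(order["customer"], order["total"]) for order in by_total_desc]

print(r == customer_totals)  # True
print(customer_totals)       # [('ada', 120.0), ('cy', 80.0)]
```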

I’ve also (more common) encountered engineers that don’t actively try to be clever by being terse, but also don’t put their mind to writing clearly.

Put differently, I think one has to actively try to write easy-to-read code.

I agree with your point that this sort of micro-style point isn’t as big as architectural questions, but it’s definitely something you want to teach junior engineers so that it’s second nature by the time they are at the level where they are thinking about architecture.

For that you can try reading Bob Martin, Martin Fowler, Kent Beck, Domain Driven Design, Hexagonal, etc. - but you also just need to build for a decade while thinking about that stuff to really master it. Sadly architecture often seems more craft than formal engineering at this level.

brabel3y ago

Exactly. This post is helpful to beginners, sure.... but after some experience, the problem you describe becomes much more pressing.

I still don't have a good solution other than try to "keep things simple" from the beginning, then as soon as new features are introduced and everything becomes messy, mercilessly refactor the architecture itself to make things "make sense" again. I fully realize that this is not very helpful because what does "make sense" even mean in a code base? But that's the best I can come up with and I don't think there's anything more specific that can be said :(

9dev3y ago

What always helps me with architecture redesign projects like that is trying to define a goal to work towards: getting away from that custom PHP MVC and to a proper Symfony application, for example. If you know where you want to go, it’s easier to stay focused and to align what you do with what you want to achieve.

RajT883y ago· 4 in thread

I have written a lot of Powershell in the last few years. I eschew the clever powershell ways of doing things if someone else may end up owning it (think: where-object, foreach-object) in favor of expressions that resemble other languages (foreach, for).

If I'm writing it for myself, and only ever myself, I'll use the more clever powershell ways of doing things. Expressions like:

1..10 | % {$_}

If you're coming from another language, you're going to have to run it to understand it, or look it up. That is time lost.

blown_gasket3y ago

I primarily write in PowerShell for end-user shell tools and Go for network services.

Where-Object is going to let you cut down on the number of lines of code compared to foreach() and for(), and in my opinion will make the code more readable.

    # Pipeline version
    $vms | Where-Object -Property Name -match "sql"

    # Index-based for loop: note the element is $vms[$i], not $i
    $vmOutput = @()
    for($i = 0; $i -lt $vms.Count; $i++) {
        if($vms[$i].Name -match "sql"){
            $vmOutput += $vms[$i]
        }
    }

    # foreach loop
    $vmOutput = @()
    foreach($vm in $vms){
        if($vm.Name -match "sql"){
            $vmOutput += $vm
        }
    }

On the ForEach-Object point, that cmdlet also gives you the option to use begin{}, process{} and end{} blocks: begin{} runs before any of your objects are processed, process{} runs for each object, and end{} runs after all objects have been processed. With for and foreach, this logic would have to come before and after the loop statement.

I don't see this as "PowerShell being clever" but more as: PowerShell is a shell that uses pipelines like nix shells but it has everything as an object unlike nix shells. So you get to take advantage of that.

PeterWhittaker3y ago

> PowerShell is a shell that uses pipelines like nix shells but it has everything as an object unlike nix shells. So you get to take advantage of that.

That was one of my favourite PWSH features when I was using it regularly. I’m a UNIX CLI-and-filter guy from way back and after using PWSH for a while I longed for the same power in bash (my shell for reasons of history, availability, and muscle memory, I’m unlikely to change).

ParetoOptimal3y ago

Some things are short and self explanatory though. `1..10` is just syntax sugar for a stream or list from 1-10 right?

RajT883y ago

1..10 is powershell range operator, and % is an alias for foreach-object.

im3w1l3y ago· 4 in thread

Give mysterious things room. In this case the most mysterious is [-3:]. That, together with the split, should have its own line or maybe even multiple (function declaration, comment).

raldi3y ago

Right?! That should instead be -len("foo") or -NUM_PREFIX_PARAMS

xigoi3y ago

That doesn't improve anything in terms of knowing why the number is there.

ParetoOptimal3y ago

> [-3:].

That kind of index notation really isn't mysterious if you write a lot of python in my experience.

im3w1l3y ago

The mysterious part isn't what it does. The mysterious part is why. Why are we taking the last three url parameters? What is the meaning of those particular url parameters? Also url parameters are normally used as an unordered key=value dictionary, which makes it strange that we rely on a given order.

iafiaf3y ago· 3 in thread

Early in my career, I took to heart such books and articles and often felt guilty, like a lesser programmer, when I cut corners. Here's my 2 cents now:

- Some of this is the coding equivalent of "6 rules for financial freedom" or "6 ways to find your dream soulmate". Generic advice that doesn't reflect highly nuanced reality.

- These rules are guidelines at best. There are justifiable reasons to break them; which I do often. Albeit this requires experience (and dare I say, wisdom). For example, refactoring code into a separate function levies a cost (of indirection) on the reader. Therefore copy-paste is sometimes fine.

- Code "cleanliness" is a moving target. For a coder's mental health and value proposition for his project, he/she should know what code can afford to stay dirty.

PS: I love Jonathan Blow's opinions on coding/programming. Here are a few: https://www.youtube.com/watch?v=21JlBOxgGwY https://www.youtube.com/watch?v=ubWB_ResHwM https://www.youtube.com/watch?v=KcP1fXQv0iU

BlargMcLarg3y ago

Don't forget the most prominent part: 'your clean' and 'my clean' can differ greatly.

You can do your absolute worst and you will still find someone claiming there aren't enough comments, or the naming is bad, or the code is too dense, or the code isn't dense enough, or you should use typed objects instead of tuples and anonymous classes, or your code should be more functional, or your code should be more imperative, or it should be more event-driven, or it requires more logging, etc.

And it turns out, there is almost no research to tell you who is right and who is wrong. The only thing I can safely tell others, is all these discussions and additions will add 900% more work all things considered, and there's no guarantee it will be less bug free or more.

29athrowaway3y ago

Jonathan Blow is a creative, productive and overall smart guy, but reading his code will make you want to slam your head against the wall.

What irritates me the most are the long, non-linear comments full of distracting noise. It's like reading a choose your own adventure novel.

jstimpfle3y ago

When he talks about his approach he typically mentions the importance placed on getting feedback from an actually working prototype as quickly as possible. Don't judge him by most of the ad-hoc code you've perhaps seen on twitch. But I've seen some really good-looking stuff there as well - straight to the point, no noise, not overabstracted. I would be interested what a finished project looks like.

gilch3y ago· 3 in thread

The article starts with some reasonable premises, but the conclusion does not follow.

I think most APL programmers would disagree with this take. Dense code has real advantages, and naming everything has real costs that are hard to see. There's nothing magic about a "line" that suddenly allows for chunking. You have to build a parse tree in your head in any case.

I'm reminded of Doug McIlroy's challenge to Knuth.[1] It's worth a read. Would you rather have 6 lines of dense shell, or 10 pages of Fabergé egg? I'll take the shell, thanks.

Look at the source code for J (an APL derivative)[2]. It's written in C, but that C was written in APL style by APL programmers. Lines leverage macros and 1–2 character names, making them extremely dense. Some files have a comment on nearly every line. For an average C programmer, this code looks absolutely insane. But it's not. The J devs find this perfectly readable and maintainable. It's clean code! If written with the typical C idioms, it could easily be 10x as long, and therefore harder to maintain. Your first impression is a snap judgement due to a difference of culture. You can learn to read this style with practice. Whatever your current style, that took practice too.

[1]: http://www.leancrew.com/all-this/2011/12/more-shell-less-egg...

[2]: https://github.com/jsoftware/jsource

overgard3y ago

I think the comparative rarity of APL compared to every other programming language in existence says a lot. Even if I were an expert in APL, I can't think of a single place where I could get a job writing it.

gilch3y ago

I think language popularity in industry mostly comes down to path dependence[1]. It doesn't say as much as you seem to think.

A few approaches got lucky in the rapid inflationary period of the personal computer revolution (C), and the advent of the Web (Javascript), and became deeply entrenched in industry, while superior alternatives that had been known for decades missed the boat. Industry languages still haven't caught up to where Lisp, Prolog, Smalltalk, and APL were in the 1970's, but they are clearly (if slowly) trending in that direction.

APL and derivatives are still used extensively in finance, a highly competitive field, to say the least. That's where you find the jobs.

[1]: https://en.wikipedia.org/wiki/Path_dependence

ParetoOptimal3y ago

> I think the comparative rarity of APL compared to every other programming language in existence says a lot

That's just the whole "popularity means it must be good" argument, which I disagree with.

retrocryptid3y ago· 3 in thread

in the example given, I started with 10 things to keep in working memory. now that we've added a named function or a named variable, we have 11. I suggest it is at least as important to name things well (or add comments) as it is to break lines up.

Wowfunhappy3y ago

A nice feature of working memory is that although it's limited to around 6† "things", each thing can be any size, if your brain considers it a single unit. This is called "chunking".

So, it's easier to remember the three numbers 34, 765, 812 than the eight numbers 3, 4, 7, 6, 5, 8, 1, 2.

Refactoring code into separate functions with descriptive titles is probably a lot like combining numbers.

---

† Well, the article says 4–6; I'd always heard the average was around 7.

pavon3y ago

I'm working on a project that is following Uncle Bob's Clean Code guidelines of striving to have functions be ideally 3 lines or less, and no more than, say, 7. I have mixed feelings about it.

My initial prejudices have largely held. I do find the code harder to read and follow. Having to jump around, follow variables that change name as they are passed through functions, keeping track of state that was moved to a class member rather than in a function body (because breaking into pure functions resulted in too many function parameters). I can't fit as much code on screen because of all the additional function definitions.

Lastly, the success of the method relies heavily on how well you name your functions, which is often considered one of the hardest parts of programming. A name that makes perfect sense to me may not be as clear to others, or even to myself in two months. And the devil is in the details - there are so many implied semantic preconditions and postconditions with every function you write, there is no way to fit that into a function signature no matter how well chosen, and if you tried to document them all your comments would be larger than the code itself at this level of granularity. So you still end up having to read all the called code to understand the details of what is happening anyway, which is easier to do with more flat code.

On the other hand, I've found the process of "extract till you drop" to be very helpful in forcing me to find ways to clean up my code. It naturally tends towards maintaining separation of concerns, finding ways to DRY when the initial structure wasn't conducive to it, and generally disentangling things even more so than when I try to refactor to meet these goals directly. If I had all the time in the world on other projects, I think I would apply "extract till you drop" on my code, then after it is disentangled, recombine it back into reasonable size chunks.

artemonster3y ago

You forgot a KEY property of a function: this is an abstraction. You don't (and in most cases shouldn't) care how this "black box" does what it does, you identify it by the name and move on. So a function collapses N things to understand to 1, not how you have described it.

foolfoolz3y ago· 3 in thread

the only known metric for code complexity is as number of lines grows complexity grows

marginalia_nu3y ago

I'll just leave this here:

https://github.com/KxSystems/kdb/blob/master/c/c/odbc.c

joshuacc3y ago

I’m not sure what you’re trying to say, but that’s not true at all. There are other metrics for code complexity, including fairly simple but useful ones like number of logical branches.
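As a rough illustration of a branch-count metric, here is a small sketch using Python's standard ast module. The choice of which node types count as a "branch" is an assumption made for this example, and the result is only a crude proxy, not a standard cyclomatic complexity tool:

```python
import ast

# Node types treated as branching constructs (an illustrative choice).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler, ast.BoolOp)


def count_branches(source: str) -> int:
    """Count branching constructs anywhere in the given source code."""
    tree = ast.parse(source)
    return sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))


sample = """
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
"""

print(count_branches(sample))  # 3
```

The sample has two `if` statements and one `for` loop, hence 3, regardless of how many lines the function is spread over.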

f1shy3y ago

Irony, right?

jiggawatts3y ago· 2 in thread

Now I know where Rust got some of its syntax from...

As an aside, when I see samples like this, it makes me itchy. I hope and assume that they're being used as made-up snippets just to illustrate a point, and aren't being lifted from an actual codebase.

Because... ugh... isn't it obvious? Attacker-controlled input such as URLs should never be manipulated with naive string processing! Always use a proper parsing library. Not to mention the complexities of URL encoding, character escapes, etc...

The problem is that the author is using abstractions at the wrong level, with or without his fixes. The correct solution would be something like:

    var uri = new Uri( "http://foo/demo?test=a&blah=b%20c" );
    var map = System.Web.HttpUtility.ParseQueryString( uri.Query );
    
    Console.Out.WriteLine( "is blah equal to 'b c'?\n{0}", map["blah"] == "b c" );

The above example is C#, but similar code can be written in any language. It's simple, direct, and doesn't violate the "rule of six". It can be read like English:

1. Construct a URI from a given string.

2. Parse the query part of the URI into a map.

3. Test if the 'blah' value in the query is "b c" as expected, with the escaped space decoded properly.

The example of how to apply the "MORF" rule in the article still has low-level operations involved, which doesn't make the code more readable. It doesn't describe the intent, which is the key thing to writing code that doesn't need comments every second line.

Too3y ago

The Python equivalent is in the urllib.parse module, part of the standard library.
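
A minimal sketch of what that could look like, reusing the URL from the C# snippet above. One difference worth flagging: `parse_qs` maps each key to a list of values, so the comparison is against `["b c"]` rather than a bare string.

```python
from urllib.parse import urlsplit, parse_qs

# Same URL as the C# example; its query string is "test=a&blah=b%20c".
uri = urlsplit("http://foo/demo?test=a&blah=b%20c")

# parse_qs percent-decodes the values and maps each key to a list,
# because a key may legally appear more than once in a query string.
params = parse_qs(uri.query)

print("is blah equal to 'b c'?")
print(params["blah"] == ["b c"])
```

If only one value per key is expected, `parse_qsl` (a list of key/value pairs) fed into `dict()` is another option.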

steveklabnik3y ago

… and Ruby got it from Smalltalk. :)

overgard3y ago· 2 in thread

This really resonates with me. I remember when I started programming (at like 10 or so), my dad tried to teach me Smalltalk. Smalltalk is a great language, but there were just too many concepts and abstractions happening on each line of code. To understand even basic code required understanding messages, objects, classes, blocks, etc. Maybe to an 18 year old that would have been ok, but for my 10 year old brain it was too much.

A few months later though, I started with QBASIC. BASIC of course gets an awful rap, but it was so much more intuitive for me at the time. I started out with just global variables and GOTO's everywhere. Over time, I worked up to loops, and subroutines, etc. etc. However, the simplicity of "program runs one line at a time, each line does something obvious" was incredibly important to beginner-me.

Even once I moved to C, when I was an amateur I still had a tendency towards one line per thing happening. I really hated code like

    while(i++ < 10) { doSomethingWith(i); }

(Actually, I still do).

As I got more sophisticated in my 20s, I started packing a lot more ideas into a single line. If I'm being perfectly honest, I think some of it was just showing off. You certainly look clever if you can put 3 list comprehensions on one line or use some of the more advanced collections apis. However, besides understandability, I found that style of code had two really big problems:

1) It's a lot harder to debug. Either you can't get a breakpoint in the precise place you want, or you can't insert a print statement easily into a complex expression, or iteration variables become implicit and you lose context.

2) It's hard to add error handling to that type of code. When a lot of things happen in a complex expression, you're depending on the entire expression working.
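
To make both points concrete, here is a rough sketch of the article's URL example written one step per line. The function name, the error messages, and the sample URL are my own assumptions, not anything from the article. Every intermediate has a name you can print or break on, and each step has an obvious place for a check.

```python
def last_three_values(url):
    # Step 1: isolate the query string; partition() reports whether "?" was found.
    _, sep, query = url.partition("?")
    if not sep:
        raise ValueError("URL has no query string")

    # Step 2: keep the last three "key=value" pairs (fewer if there are fewer).
    pairs = query.split("&")[-3:]

    # Step 3: pull the value out of each pair, failing loudly on malformed ones.
    values = []
    for pair in pairs:
        _, eq, value = pair.partition("=")
        if not eq:
            raise ValueError(f"parameter {pair!r} has no '='")
        values.append(value)
    return values


print(last_three_values("https://example.com/search?a=1&b=2&c=3&d=4"))
```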

Luckily I've grown out of that phase, although ironically now my much more mature code looks a lot like the very simplistic code I wrote as a teenager.

jstimpfle3y ago

3) It's also harder to edit with an editor (like vim), and (IMO) harder to read. I never even do "int x, y = 3;". I always put each variable declaration on its own line.

mafuy3y ago

That avoids bugs and misunderstandings, too.

You know this, but for those unaware, in the previous example, x is not initialized to 3. Similarly, in "int* p1, p2;", p2 is an int, not an int*. Easy to misread.

happyweasel3y ago· 2 in thread

Use statically typed programming languages. Favor composition over inheritance. Develop bottom-up (reusable classes) instead of large-scale up-front design. SOLID principles (SRP being the most important). The bottom-up approach also favors unit testing. Code reviews, clear code formatting rules (simple editor plugins do the trick). Use static code analyzers. IMHO this kind of object-oriented programming leads to NEW code being written to implement features and NOT old code being tampered with. Ideally, the (tested) units of code (classes) have such clear responsibility that you do not have to touch them once they are implemented. If super-classes begin to emerge, refactor.

ParetoOptimal3y ago

> Develop bottom-up (reusable classes) instead of large-scale up-front design.

I tend to really hate the UX of bottom-up designed APIs and find them incoherent.

MH153y ago

First point is killer. There's little reason for dynamic languages nowadays outside of scientific computing (e.g. Jupyter notebooks). A decade ago static typing did incur a real overhead in verbosity, but with popular languages adopting type inference algorithms there's no excuse anymore IMHO. Static types have won.

twblalock3y ago· 2 in thread

The "bad" Python code in that example is perfectly fine. I'm not a Python programmer but I can read Python a little bit, and the example uses basic programming concepts like string splitting and array ranges.

If you don't understand that, multiple smaller lines won't help you, because you just don't know what you are doing.

In addition, that code example is easily testable. Testability is more important than readability in modern programs that follow modern CI/CD principles -- and the readability is not really that bad either. Also, modern debuggers don't have issues with nested/lambda statements like these.

If the article's author had a legitimate bone to pick, they would have better examples.

overgard3y ago

The question isn't if you can figure it out, but how long does it take you? The simple version might take me 2 seconds to read. The original version might take me 10 to 15 seconds. Multiply that out over a day and you're hurting quite a bit.

macintux3y ago

> In addition, that code example is easily testable.

I'm skeptical, because typically a line like that is embedded in the middle of a larger function.

Extracting the logic into a dedicated, pure function helps with testing.

gilch3y ago· 2 in thread

At least for the contrived example from the article, the solution isn't to break up the code, but to use denser code. Use a regex.

Does anybody really think that e.g. sregex[1] is better than just learning and using the regex language directly? Because that's where this kind of thinking leads.

[1]: https://github.com/jwiegley/emacs-release/blob/master/lisp/o...

toiletduck3y ago

I know it's not the point the author is trying to make, but I couldn't help getting the feeling this example isn't good enough to carry the point.

from stdlib:

  from urllib.parse import urlparse, parse_qsl

  url = 'https://www.example.com/some_path?some_key=some_value&foo=bar'
  parsed_url = urlparse(url)
  values = [v for _, v in parse_qsl(parsed_url.query)]
  print(values)

which I guess you could oneliner back to this..

  [v for _, v in parse_qsl(urlparse(url).query)]

cavisne3y ago

I think for any code that's meant to be read and maintained by someone else a regex is a bad idea.

You are saving a few lines on the surface, but adding a potential backtracking bug in the future.

skitter3y ago· 2 in thread

In this case `query_params` works well, but it's sometimes hard to find descriptive and reasonably concise names for the intermediate value. In those cases, the ideal would be using only postfix chaining, so that you can read it by only keeping the intermediate value and the next operation in mind:

    s.split('?')[1]
     .split('&')[-3:]
     .map(lambda x: x.split('=')[1])

Unfortunately, that's not how Python's map(), len() and such were designed.

gilch3y ago

Idiomatic Python wouldn't use a map here, but a generator expression:

    (x.split('=')[1] for x in s.split('?')[1].split('&')[-3:])

Removing the lambda cuts down on the noise considerably.

And honestly, with this many splits with fixed indexes, I'd probably use a regex. Now there's a dense language for you.
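
For the curious, a rough sketch of what a regex version might look like. The pattern and the sample URL are mine, not from the article, and it is not a drop-in replacement: the split-based code raises IndexError on a parameter with no "=", while this pattern silently skips it.

```python
import re

# Hypothetical input; the article's snippet assumes a URL with a query string.
s = "https://example.com/search?a=1&b=2&c=3&d=4"

# Everything after the first "?".
query = s.partition("?")[2]

# For each "key=value" pair, capture the text after "=" up to the next "&".
values = re.findall(r"(?:^|&)[^=&]*=([^&]*)", query)[-3:]

print(values)  # ['2', '3', '4']
```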

chii3y ago

i don't really see why or how a generator expression is any easier to read than a chained call like the OP's example.

In fact, for people unfamiliar with python, this expression is even more strange - you read expression starting from the middle (the _in_ ... part), and then return to the beginning. It makes your eye dart forward and backwards on the text.

thisismyswamp3y ago· 2 in thread

Skimmed the article and the comments and got no answer - can someone tell me what the rule of six is?

_dain_3y ago

>That gives us a rule for deciding if a line of code is too complex:

>A line of code containing 6+ pieces of information should be simplified.

he put it in bold.

krapp3y ago

The rule is literally described, in bold text, within the article.

Try actually reading instead of skimming next time. Or at least skim more slowly.

dcow3y ago· 1 in thread

This is why setting an arbitrarily short max line length matters. And consequently why auto-formatters suck.

A short line length, while yes imperfect, forces complex lines to be decomposed into individual concepts. And it allows the code to read like a book rather than <there is literally no other media format that you read sideways>. Ultra-wide monitors be damned.

And auto-formatters suck because they don’t split concepts onto individual lines. They can’t. They just mangle code and scrunch it into whatever space is allowed without regard to how the code reads. The idea of them is great and intensely alluring, but the implementation leaves much to be desired. If an auto-formatter could make my code look and read like a LaTeX document, I’d shut up already.

So if you want people to implicitly start structuring their code as advised in this post, set an 80 or 100 char line length. And adopt a fuzzy “one statement per line” philosophy.

alpaca1283y ago

Agreed, auto-formatters have the single purpose of making code on screen visually more readable for humans, but seem to not consider that humans are not machines. There is no regard for "visual code density", no attempt to use vertical alignment to highlight similarities and differences between consecutive lines.

To be fair considering such visual details is a complex task and probably hell to implement.

lbriner3y ago· 1 in thread

Perhaps a more helpful principle I heard a long time ago was that all methods are either a specific method doing a specific thing (like splitting up a string) or they call a series of methods of the first type. When we mix the two, it becomes harder to reason about since type 1 is generally logically complex, so keeping these small makes them testable and readable, and the logic of the high level is more easily encapsulated as a series of DoThis(), DoThat(), ThenDoThat() calls.

If I've only helped one person today, it was worth it ;-)

layer83y ago

There’s a balance to be struck if most of the Do methods need a common and/or interdependent set of parameters. Inlined code can be clearer because you can directly see how/why those parameters are used. You rarely have

  DoThis();
  DoThat();
  DoTheOtherThing();

Instead you usually have something like:

  x = DoThis(a, b, c);
  y, z = DoThat(c, x, a);
  w = DoTheOtherThing(a, z, x, y, b);

…and on top of that have to add error handling for those calls.

jonnycomputer3y ago· 1 in thread

Then you have to name things. And naming things sucks, especially because not every intermediate has an obvious name for it, distinct enough to distinguish it from the next intermediate chunk.

overgard3y ago

Ok, but if you're reading code for the first time, you're going to have to store the intermediate parts in your working memory somewhere. And if you don't have a mnemonic, like, a name, then "the return value of the lambda after a split" is a lot harder to remember.

Naming things is hard, but it's also important.

Krasnol3y ago· 1 in thread

Your cookie banner "manage settings" thing never stops loading.

tmtvl3y ago

Yeah, I bypassed it with Firefox's Reader Mode, but the original page slowed FF down to a crawl.

meitros3y ago· 1 in thread

This seems like an interesting heuristic for anything that could automatically either generate or "format" code - a little more semantic than just relying on a text parser

djmips3y ago

That is a cool idea. Show the code the way you understand it best without even having to change the underlying text.

charles_f3y ago· 1 in thread

Ok, quick rules that focus on single lines. That's neat, but from experience most of the complexity comes from the structure more than from how the code is written. Conventions about how to write a line of code won't fix corrupt indirections, misplaced coupling, lack of cohesion, undue repetitions, missing tests, etc.

Clean code is not just a few rules about how to write a line. You can write nice lines that still don't make sense and amount to shit code

ARandomerDude3y ago

Ah yes. I remember when I read the Clean Code about 10 years ago and produced the "cleanest" code I had ever seen – only to have it destroyed by a senior dev during a code review because the task was relatively complex and my overall structure was garbage. One of the saddest days of my career. Probably the most helpful day of my career too.

xfz3y ago· 1 in thread

I can't access the linked page without accepting cookies.

gilch3y ago

Can't you just use an incognito tab? It'll delete all the cookies when you're done.

kazinator3y ago· 1 in thread

This rewrite is more performant than the original:

  query_params = s.split('?')[1].split('&')[-3:]
  map(lambda x: x.split('=')[1], query_params)

The calculation of query_params, having no dependency on the lambda parameters or anything being mutated, has been lifted out of the lambda, and thus spared from repeated execution by map. The compiler for that language won't do this automatically.

gilch3y ago

What? No it isn't! You didn't parse that correctly.

The query params were never in the lambda to begin with. Python function calls have strict (not lazy) semantics, i.e. "applicative order", i.e. both expressions passed as arguments to map() are evaluated before the map body gets them as parameters, thus the query params would only be evaluated once, even when inlined as they were originally.

Same with the lambda definition: it's evaluated only once. It's just the lambda body that gets reevaluated each loop, and only evaluated for the first time on the first loop.
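
You can check this yourself with a couple of counters. A small sketch: the `calls` dict and the two helper functions are made up purely for instrumentation, and the sample URL is hypothetical.

```python
calls = {"query_params": 0, "lambda_body": 0}


def query_params(s):
    # Stands in for the inlined s.split('?')[1].split('&')[-3:] argument.
    calls["query_params"] += 1
    return s.split("?")[1].split("&")[-3:]


def value_of(pair):
    # Stands in for the lambda body.
    calls["lambda_body"] += 1
    return pair.split("=")[1]


s = "https://example.com/search?a=1&b=2&c=3&d=4"

# The second argument is evaluated once, right here, before map() runs.
lazy = map(lambda x: value_of(x), query_params(s))
counts_before = dict(calls)  # snapshot: the lambda body has not run yet

values = list(lazy)  # consuming the iterator runs the body once per item

print(counts_before)  # {'query_params': 1, 'lambda_body': 0}
print(calls)          # {'query_params': 1, 'lambda_body': 3}
print(values)         # ['2', '3', '4']
```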

michaelwww3y ago· 1 in thread

If you're like me and like to step through code with a debugger, shorter lines are better for setting breakpoints and checking values.

convolvatron3y ago

this is important and true.

but I really wish debugger evolution hadn't stopped at the line.

artemonster3y ago· 1 in thread

I always liked the quote "you need to be twice as smart to debug a code. If you write smart code, you, by definition, cannot debug it" (sorry I have no idea who said this). This is why I still code in C. No smartass bullshit, just plain old undefined behaviour and out of bounds access. Lovin it.

kazinator3y ago

This is from Brian Kernighan (the 'K' in the "K&R C Book" and AWK), known as Kernighan's law:

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

pavon3y ago· 1 in thread

Never cleaned up the most obtuse part of that code snippet - why are we only keeping the last three parameters?

lbriner3y ago

I think it's a contrived example. In truth, there is much more likely to be a way of tidying this up with a nice reusable function like "get_querystring_params" which returns an array and then take the last 3 with a comment like "only the last 3 parameters are used for the search".

Taking a subset of query params smells in its own right so, again, might be a bad example.

he00013y ago· 1 in thread

This is so subjective. Some people do want to write such code as that is “cleaner” because it’s compact. Some want to explain every single step because that’s “cleaner”. Some try to do something in between and it’s somehow “cleaner”. But in the end, it’s mostly subjective.

overgard3y ago

Here's Conway's Game of Life in APL:

    life ← {⊃1 ⍵ ∨.∧ 3 4 = +/ +⌿ ¯1 0 1 ∘.⊖ ¯1 0 1 ⌽¨ ⊂⍵}

Is that shorter than essentially every other language's implementation? Yep!

However, to even begin to understand it you have to read an article from the original writer:

https://aplwiki.com/wiki/John_Scholes%27_Conway%27s_Game_of_...

To me, that is objectively, not subjectively, less clear than the longer implementations.

glintik3y ago· 1 in thread

«Every line does only one thing» - that’s not related to real clean code. And there are a bunch of languages where it’s OK to have a few things on the same line - perl, ruby, groovy, scala and even php.

gilch3y ago

And Python!

sdoering3y ago· 1 in thread

Offtopic:

The fact that the site uses a consent solution that fakes a loading screen when trying to configure (read disable) tracking/advertising is an instant bounce for me.

niea_113y ago

It's not fake. In my case, it's not working because of my adblocker. When I disable it, it works.

stagas3y ago

Another useful rule is to think in terms of intentions and split to those individual intentions. You can always reduce code down to a single function call, but was that the original intention? Try to think as a reader that just stumbled on it without any other context. `result = DoEverything(payload)` is often less readable than `step1Result = DoImportantStep1(payload); finalResult = DoImportantStep2(step1Result);` if Step1 and Step2 mirror the actual process that goes on in your mind when solving that particular problem, so when re-visiting you can understand what's going on faster, without having to visit the implementation of the single `DoEverything` function.

Edit: To clarify a bit more, in contrast to the rule of six, i'd definitely keep a line that is more complex than usual but conveys the intention of my thinking, rather than splitting it to multiple lines and losing that important information, losing the original intention.

rzimmerman3y ago

“Rule of six” is generally interesting - I came upon the concept when reading the book “Nightfall” by Isaac Asimov as a kid. There’s a line in the book about the number of stars in the sky, and how people can’t really grasp numbers more than 5-10. It got me thinking about trying to visualize a set of 3, 4, or 5 distinct objects without splitting them into groups. I genuinely can’t do it for more than 5 or 6 of something.

I also remember reading about a study where chess masters and non-experts were asked to memorize chess boards. Average people could only remember 5-7 piece locations where chess masters could remember the entire board. But when the piece layout was random (rather than from real chess matches) the experts weren’t much better than the non-experts. It speaks to the abstractions our brain creates to deal with limited working memory.

That cumbersome line of python is a good example. As an experienced python person, I immediately found myself giving names to the chunks to understand it.

Overall very good advice. Your code should explain the steps it takes to solve a problem (or in a more functional language, explain the solution), not be as terse and clever as possible. Keystrokes are cheap; thinking is expensive.

WastingMyTime893y ago

It’s funny because what I found confusing initially reading the code is the behaviour of split and I still do after the article. I now see that this is because the article uses a magic value and magic values are the bane of readability.

See, the first split use made me think it always returned the left and right part after splitting at the first match - 0 being left and 1 right. This is not the case. The code implicitly relies on this being a url.

But then the second split is accessed with the weird [-3:] which I have to assume to mean the last 3 elements. I assumed then that split must return a list but started wondering: why 3 elements only? I still don’t know. I wasn’t helped by the single letter named variables either.
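
For anyone else tripped up by the same thing, a quick sketch of the behaviour in question. The URL is a made-up one with a second "?" in it, chosen to show where `split` and `partition` differ.

```python
# Hypothetical URL with a second "?" inside the query string.
url = "https://example.com/search?q=what?&a=1&b=2"

# split() cuts at EVERY match and returns a list of all the pieces,
# so index 1 is only "the part between the first and second '?'".
print(url.split("?"))
# ['https://example.com/search', 'q=what', '&a=1&b=2']

# partition() cuts at the first match only and always returns 3 items.
print(url.partition("?"))
# ('https://example.com/search', '?', 'q=what?&a=1&b=2')

# [-3:] means "the last three elements", or all of them if there are fewer.
pieces = "a=1&b=2".split("&")
print(pieces[-3:])
# ['a=1', 'b=2']
```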

I think people might want to focus on the basics before venturing into grand consideration about splitting lines and putting code in function. The one liner with proper names is too long but understandable:

  last_three_url_param_values = map(lambda query_string: query_string.split('=')[1], url.split('?')[1].split('&')[-3:])

longrod3y ago

If you are going for human readability then making your code expressive is the only way. Abstract away the code parts under a layer of very simply named functions/classes and boom! even a child will be able to understand what's going on.

Obviously, that isn't always possible. I find this approach especially useful in writing e2e browser tests. You write an abstraction over the testing framework's (playwright, puppeteer etc) interaction and then use that in your tests.

So instead of writing:

    await page.click(".play-button");

You do:

    await app.play();

This also has the benefit of extreme reusability. Doesn't work for everything though.

dan-robertson3y ago

I want code that is easy to read, write, update, and debug. It isn’t obvious that this means it should be ‘clean’, or indeed, what ‘clean’ is. Some other things described as clean code (eg uncle Bob) seem pretty bad to me. But then lots of people who complain about that also suggest things that seem bad. Perhaps lots of these things are too insignificant compared to other business or design decisions that one can’t really learn from experience or past successful or failed projects.

userbinator3y ago

Counterpoint: I'm sure most of us wrote far shorter sentences when we were learning our first (human) language, or perhaps even subsequent ones; yet now I suspect we can all read and write sentences with dozens of words.

Yet when it comes to programming languages, the majority of "advice" seems to be about absolute dumbing-down and propagating an attitude of "it's too hard, you can't possibly learn, just give up"?

I've always wondered about this dichotomy. Some languages like the APL family appear to have gone far into the "it's a language, to be learned like any other" territory, while more "mainstream" ones are drifting further in the opposite "don't even bother trying harder" direction.

Kernighan's Lever: http://www.linusakesson.net/programming/kernighans-lever/ind... (look at the rest of his site; he has clearly leveraged that attitude with great success)

(I looked at the one-line example in Python and, despite having very little experience in the language, it was actually faster to read and understand as a whole than the 3-line version.)

hirundo3y ago

I know it was just used as an example, but in reality when I see code like this I think "I really don't want to reinvent URL parsing for this, I'll import the library for that instead," resulting in much cleaner code.

But I do agree on keeping individual lines, if not entire statements, small and simple. The ruby chainsaw is a great tool for that. I find a chain of simple statements arranged in a flow to be more readable than using lots of intermediate variables.

quickthrower23y ago

This seems a little light to me, doesn’t give me the feeling of being written by a veteran coder. Unless the idea is to dumb it down for a particular audience.

The real answer is write code like you write words: Rework it to make sense to the reader. How many newlines you need as a hint for your editor to wrap and where you put them will fall out of that.

Or autoformat! I love autoformatters!

Edit: edited to make easier to parse mentally.

davesque3y ago

I'm sure there are specific programs that would benefit from a treatment from these rules. However, there's one thing pretty fundamental to this article that I have a hard time agreeing with. And that is the notion that there are these three different memory types, two of which can only store "4 to 6" things.

I'm inclined to believe that there are probably many gradations of long vs. short term memory in the structure of the brain. In fact, I bet the gradations even vary by topic and of course depend on what sorts of tasks a person is accustomed to performing from day to day.

I imagine that the "4 to 6" figure fell out of a study that aggregated a large amount of data collected across subjects and that the figure itself can't capture much of the nuance or even the nuance of cohorts.

In other words, it may very well be that a large percentage of people who work professionally as software developers are capable of keeping more than 4 to 6 "facts" about code they're looking at in their head. But that they would also appear to have the same capacity as random people when it comes to arbitrary facts that one would be asked to memorize in a psychological study.

tgv3y ago

The starting assumption is highly dubious: "Short lines of code require less brainpower to read than long ones."

I'm not going to nitpick the incredibly bullshitty term "brainpower" and what is less and if that's actually advantageous, but if you write short lines of code, you're going to write more lines, which requires "more brainpower" to understand. You don't simply "chunk" lines in memory. If that were true, you could just as easily chunk function calls.

That memory plays a role is fairly certain. There is a pretty hard finding from psycholinguistics: it's hard to understand nested structures. The sentence "the rat the cat the cook hit chased escaped" is much harder to understand than its right-branching equivalent "the cook hit the cat that chased the rat that escaped". However, reading code is not the same as reading natural language.

If you want to know if what you wrote is understandable, try reading your code without falling back to remembering why you wrote it. Try to read what you wrote. Wait a few days if your recollections get in the way.

d_burfoot3y ago

Split Into Multiple Lines has a real problem, which appears in the example code.

Let's say you have a long code block that includes the revised snippet:

> query_params = s.split('?')[1].split('&')[-3:]

> mylist = map(lambda x: x.split('=')[1], query_params)

> ...

> (some more complex transformations, that only depends on mylist)

When you're reading the later stages of the code, you still have to maintain a memory of what "query_params" does, even though it's no longer relevant. That actually increases the burden on your working memory. The one-liner is more complex to understand initially, but it self-documents that the only info that is relevant to the downstream is the result of the map(...).

In general, the more variables that are declared in a code block, the more effort it is to understand, and the effect is probably superlinear with the number of variables. I'd say if you have to declare more than 5-6 variables, you should split into a separate function.
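
A minimal sketch of that last suggestion, with a function name I made up for illustration. The intermediate `query_params` still gets its descriptive name, but it only lives inside the function, so the later stages of the calling code see nothing but `mylist`.

```python
def last_three_param_values(s):
    # query_params is local: it goes out of scope when the function returns,
    # so readers of the calling code never have to keep track of it.
    query_params = s.split("?")[1].split("&")[-3:]
    return [pair.split("=")[1] for pair in query_params]


# Hypothetical input URL.
s = "https://example.com/search?a=1&b=2&c=3&d=4"
mylist = last_three_param_values(s)

print(mylist)  # ['2', '3', '4']
```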

jollybean3y ago

I object to the 're-write as function'.

Functions come with abstraction overhead. You don't know who will consume them, so you may have to put up type checks, null checks, and other BS.

Also - functions split up the logic all over the place, it's confusing.

I think what we need are 'nested functions' which serve to kind of create a scope pushed to the stack - with an implicit 'return' - which we can then 'collapse' in the GUI etc..

I mean, it's purely cosmetic from a CS point of view, but it might help to organize things a bit better and hand off abstractions in long function implementations.

Huge projects with 1 or 2 line functions drive me crazy - you have to constantly jump around all over the place to figure out what's going on. I actually believe it's a historic anti-pattern.

I make functions when the code is 1) used in different places or 2) a meaningful abstraction.

Otherwise, well documented longer functions for me.

ppierald3y ago

If there is some legitimate reason (say performance) to keep a tighter form (inline assembly, Python 1-liner, whatever), then making the unfurled equivalency as a comment nearby to allow the next developer to have a fighting chance would be really helpful. Also, error handling tends to be not included in the 1-liners.
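
Something like this, as a sketch using the article's one-liner. The sample URL and the comment wording are mine.

```python
# Hypothetical input URL.
s = "https://example.com/search?a=1&b=2&c=3&d=4"

# Unfurled equivalent of the one-liner below:
#   query = s.split('?')[1]                    # text after the first '?'
#   pairs = query.split('&')[-3:]              # last three "key=value" pairs
#   values = [p.split('=')[1] for p in pairs]  # keep only the values
# No error handling: raises IndexError if '?' or '=' is missing.
values = list(map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:]))

print(values)  # ['2', '3', '4']
```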

rekrsiv3y ago

The original code is perfectly readable until it does something completely unexpected, and the human parser has to start over to make sure they didn't miss anything. But unfortunately, the context for that "get the last 3 parts specifically" is never explained, so the entire line never makes sense. The human has to think a lot to come up with a (hopefully correct) explanation for the "why".

The solution isn't to extract every token from the expression to separate lines, but to document the "why" of the unexpected token. That can take many forms: a new variable with a meaningful name, a new function with a meaningful name, or a meaningful comment that warns the reader about the upcoming reason for getting just the last 3 parts.

irrational3y ago

This is the main reason I don’t like arrow functions in JavaScript. People overuse them to create “clever” code - lots of things going on in a single line. Then they try to claim that by having everything on a single line the code is easier to read and understand.

cc1013y ago

If I use elaborate camel-case variable names, it seems to reduce the load on my short-term memory because I don't have to remember what a variable name represents. Its meaning is there when I need it and can be forgotten otherwise.

Waterluvian3y ago

I flexibly agree with the “does one thing” approach. But what a “thing” is can be up to you.

Sometimes my one thing is “turns a JSON file into an in-memory dictionary” which might be three operations on one line.

xigoi3y ago

When I clicked on “More settings” in the cookie dialog, it displayed a loading animation (ignoring my prefers-reduced-motion setting) and got stuck. Just straight-up user-hostile design.

jwilliams3y ago

All comes down to good naming in the end. The craft is finding both compact and specific names.

I think the mantra for all names to be short can be counterproductive here. If the code span of a variable is short, a long name can be fine (and very clarifying, perhaps even resulting in a comment not being needed).

Shorter names for longer spans are much better. But you’d hope they’re the very obvious subject of that span.

dasil0033y ago

Although I agree the original line is a bit long, and the first refactoring is clearly more readable, after that it starts to feel like bike-shedding. FWIW I don't believe in refactoring things into tiny methods that are just used once; it's a lot of boilerplate which makes zero sense if you are not going to reuse it, but it's not the hill I'm going to die on.

Overall a lot of this boils down to minor style issues. I care very little if you give me 5 short lines with named intermediate steps versus a dense one-liner, however I do care very much if your code leverages pure functions, minimizes cyclomatic complexity, encapsulates messy bits, and has some form of test coverage. The former might take me a minute or two longer to grok (depending on my personal context), but the latter compounded over a wide surface area can lead to a completely unmaintainable system and a pathological fear of touching anything.

hedora3y ago

I think this article is missing the forest for the trees.

I've found that dividing software into layers, and making sure that each file relies on the same set of invariants from its dependencies, and also maintains a (different) consistent set of invariants for its callers works much better.

For instance, I'd prefer a function that takes a string and confirms it is a valid URL.

That would delegate to URL character escaping logic and DNS validation. (Are & or ? valid DNS name characters? Will they be in the future? I neither know nor care.)

On top of that, there would be a parser for key=value config file lines.

Then, the example in the article becomes something like:

    keyvalue = parseConfLine(input)
    URL(keyvalue.value).params[-3:]

Plus a few more lines to confirm key is as expected and that value has enough query parameters.

Alternatively, I'd use a perl oneliner with a regexp. I see no purpose for code that lands in the middle ground between these extremes.

mkoubaa3y ago

A reviewer can usually tell which code is easier to understand side by side, even when it's yourself as the reviewer.

Applying rules like these to your code may or may not result in cleaner code, but that's a testable hypothesis.

I've seen all too often some clean code recommendation or other applied to code and it gets harder to understand. And the person doing the refactoring (often myself) gets caught in sunk cost.

Now my recommendation is always:

1. Use your intuition to predict if a change makes code cleaner.

2. Try to make that change, and be open to doing things a little differently than you first imagined.

3. Test your hypothesis to see what others think. Decide what to do, but be mentally willing to throw it away.

4. Repeat

Articles like this are good resources to help train your intuition, but there is no substitute for developing your personal and team "flavor profile" for what styles suit your way of thinking.

GuB-423y ago

Writing "clean" code is more of an art form, you can't really have easy rules.

I think the general idea is that clean code is short code, that's the base guideline. Generally shorter code does fewer things, reducing cognitive load. It may also have performance benefits. It also takes less space on-screen, which is also a good thing: less scrolling, ability to use bigger, more readable fonts, etc... And as explained, short-term memory is limited. Short code also tends not to repeat itself, another common piece of advice.

But that's the baseline, all the art is in appropriate breaking of that guideline, to have short code that doesn't look like it came out of a minifier.

Splitting lines makes longer code, bad, but sometimes it is justified. So what is your justification? The article focuses on "one liners" being hard to understand, but really, it depends on many things. For example you may use a longer form if you think that it is an essential part of your code and it is critical that you should pay attention to it. On the other hand, you can use a shorter form if it is a common pattern, what is "common" depends on who is going to read your code, or the project you are working on. For example, bit manipulation can make a good part of your code base, or be a one-off thing and it will have an influence on how you write that code.

Moving code into functions is generally a good thing if that function is used often (shorter code). I think it is the origin for the term "refactoring": factoring ax+bx+cx+dx becomes x(a+b+c+d), only a single "x" remains and it is shorter. But if that function is only called once, or if the operation is hard to extract from its context, it can lead to longer, harder to understand code, and again you have to exercise judgment. For example you may want to write a specific function because it is a tricky, specific part that you want to separate from the boilerplate. There are interesting considerations to using functions, because it actually reorders code, for example "a(){do_x}; do_y; a(); do_z" is written as x,y,z and does y,x,z, which is often, but not always, unintuitive.
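The reordering point can be seen in a tiny Python sketch. The names a and log are placeholders, and the list only records the order in which the steps actually run:

```python
log = []


def a():
    # Written first in the file, but nothing happens until a() is called.
    log.append("x")


log.append("y")  # runs first
a()              # "x" is recorded only here
log.append("z")  # runs last
```

Reading top to bottom you meet x, y, z, while the recorded order is y, x, z.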

catlifeonmars3y ago

The examples and improvements in the article feel obvious like “common sense” — which is a good thing. I’m not 100% sold on the reasoning though. It kind of feels like a just-so explanation without much justification

soulofmischief3y ago

> STM and WM are small. Both can only store about 4 to 6 things at a time!

I'm sorry but this is such an asinine statement. Your brain doesn't store "things" and the number of working items depends on so many factors: the complexity of the information, the level of association between items, the attention span of the individual, which can be trained, and a multitude of other things.

Neuroscience is a useful tool for self-programming but you must be careful peddling absolutist statements like this which can do more harm than good.

geewee3y ago

I'm really tired of hearing the 4 items +/- 2 being parroted around in cases like these. The studies that come to that number are basically "Remember these completely arbitrary things such as numbers or words in order". That's nothing like reading lines of code where you have variable names, and you're able to construct meaning and relationship between the things in your mind.

Sure, it might be relevant if all variables were named "x", "xx", "xxx" - but they're not.

avnigo3y ago

A map lambda example is what I had in mind when reading the article. I'm not a big fan of the temporary variables, though.

Admittedly the example below is not a perfect solution, but that's where I thought the article was heading when splitting that code over multiple lines for readability.

  map(
      lambda x: x.split('=')[1],
      (url
          .split('?')[1]
          .split('&')[-3:]
      )
  )

Is this still too unreadable or more messy?
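For comparison, here is that snippet made runnable against a made-up URL (the example.com URL is an assumption), next to a list-comprehension spelling of the same steps. map() is lazy in Python 3, so it is wrapped in list() to see the values.

```python
# Made-up URL for illustration; any URL with at least three query parameters works.
url = "https://example.com/item?id=1&a=2&b=3&c=4"

# The snippet above, wrapped in list() because map() returns a lazy iterator.
values_from_map = list(
    map(
        lambda x: x.split('=')[1],
        (url
            .split('?')[1]
            .split('&')[-3:]
        )
    )
)

# The same steps as a list comprehension over one named intermediate.
last_three_pairs = url.split('?')[1].split('&')[-3:]
values_from_comprehension = [pair.split('=')[1] for pair in last_three_pairs]
```

Both spellings produce the same list; which one reads better is the judgment call the thread is arguing about.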

shadowofneptune3y ago

As other people have noted, this seems like a criticism of expression-heavy languages. I'm not sure the working memory idea really is a good argument for short lines. Assembly language is entirely short statements, and has only a few operands per line, but is so tedious to read because of how much state/working memory is occupied. Complex expressions can actually reduce the mental overhead by reducing the number of used variables to a minimum.

Beltiras3y ago

Can't get past the obnoxious cookies. Anyone have the text?

schemathings3y ago

For the example in the text I'd typically just include a one line comment above to show what an example string would look like and leave the code as is

# URL with params https://news.ycombinator.com/item?id=32963021&something=valu...

map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:])

AtlasBarfed3y ago

The issue is that short code lines increase the length of the code, aka waste vertical screen real estate, aka visible code, so your overburdened short-term memory has to context switch to scroll.

"Simple, put code in small methods"

Oh great, now I do a nav jump or a string search as a context switch rather than scroll.

Comments? increase vertical screen pollution.

Proper chunking is hard.

Maybe APL was right.

djmips3y ago

Coming from a background in lower level languages where you can only express one simple thing per line my tendency has always been to be more verbose than my colleagues. The worst time I ever had was when I had to work on someone's Perl code that one time. So I really like this heuristic to make code more readable.

ravenstine3y ago

The main reason I stopped using the old school for loop in JavaScript is that it's doing too much in a single line. If I can't do for-of, I much prefer a while loop because it does effectively the same job as for-in but each step gets its own line. I find it easier to follow at a glance.

exabrial3y ago

The suggestions here are so not 1337. The whole point of writing code is to show off how much smarter you are. During code reviews, you can teach everyone else a lesson; you’re basically doing them a favor by making them read your 1337 code. If they can’t read your code they aren’t your equal.

Lame.

schwartzworld3y ago

> Is that hard for you to read? Me too. There's a good reason why. You have to know what map, lambda, and .split() are.

This is pretty weak. Not knowing what a function or language feature does, doesn't make it inherently unreadable.

nmz3y ago

This is forth code 101, factorization is an absolute must when writing forth code.

notjustanymike3y ago

Engineers would benefit from talking to designers more often. The rule of 5 +/- 2 has been around in UX design forever. When you write code for others you're designing a human interface for solving a problem.

nottorp3y ago

My, the very dark pattern in the cookie dialog... closed the page when manage cookies didn't load in 20 seconds. I bet there's an explicit delay in there.

readthenotes13y ago

The author misunderstands Miller's research on working memory, often reported as "7 +/- 2".

But, Miller states that limit is valid only for unrelated items.

iLoveOncall3y ago

Readable code and clean code are two different things and I think this article does a pretty poor job of writing clean code.

whiddershins3y ago

Code has a typo ‘slit’ instead of split after bringing up the concept of moving something into a function.

gauddasa3y ago

The magic number for programmers is 7. 4 to 6 is for the rest of the world.

NonNefarious3y ago

And also: Use tabs.

kouteiheika3y ago· 34 in thread

In this case there's, I think, a better alternative; the equivalent-ish code in Ruby for the example code here would be something like this:

   values = s
      .partition('?')[-1]
      .split('&')
      .map { |key_value| key_value.partition('=')[-1] }

This is one of the reasons why I vastly prefer Ruby over Python for most data processing tasks. I wish more languages would support this style of programming.

davnicwil3y ago

> the code can't be read top-to-bottom

Where this breaks down is when the code you're trying to split out is not at a different level of abstraction, and how it works is meaningful to the surrounding code in the calling function.

lamontcg3y ago

Each function becomes something new that needs to stick in your brain.

I have an actual track record of taking code that someone had MORF'd to hell and rewriting it, and making it about 40% shorter, with much fewer concepts to process.

wpietri3y ago

This is my experience too. I love hiding detail for readability, but you have to hide the right details!

kristopolous3y ago

The problem is that those who create the messes that need that advice will probably take the advice and create even bigger messes

In practice, pithy adages don't get us any closer to sanity

Really the only news you can use is

1. Try to modify things with code you didn't write by not simply throwing parts away

2. If you think it's really difficult to deal with, figure out if other people agree with you.

3. understand why everybody thinks this

4. If you are doing that in your own code, then stop doing that.

You have to viscerally understand why a practice is bad and how doing it affects other people.

This is how you can intuitively avoid such practices in the future. Not through things that rhyme or acronyms that spell words but through social intelligence. It's fundamentally behavior

danmaz743y ago

I understand the idea, but it requires:

* great ability at naming methods - which isn't very common

* massive discipline at RE-naming methods when they get changed even slightly to do something more

shellback33y ago

I agree, my thoughts were along this line especially when I noticed the 'magic' number -3.

khendron3y ago

marginalia_nu3y ago

Emphasis on well-named. Naming things is hard.

Maybe not relevant in simple toy examples, but you don't have to look far to find a function that isn't so easy to name.

rocqua3y ago

On the whole I think such functions are valuable. But they do have downsides.

patrick4513y ago

layer83y ago

mdaniel3y ago

    .map { | *String* key_value | key_value.partition... }

where String shows up in light grey text indicating that IJ knows `key_value` is a String

WastingMyTime893y ago

I have far less issue with the piping operator in Ocaml (|>) which works exactly like a shell pipe because the code is in sequential order and that makes a huge difference.

Python like JS weren’t designed as functional languages and it shows in their syntax. Still grateful for the added functionality however.

e_i_pi_23y ago

Definitely a good point, refactoring is great and has a bunch of benefits but can also be overdone and create problems. I've heard/read Sandi Metz talk about this as "The Wrong Abstraction"[1].

[1]: https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction

chrisweekly3y ago

Yes! IME (24y and counting in the profession) devs reach too quickly for DRY while neglecting its counterbalancing principle: AHA (Avoid Hasty Abstractions).

xigoi3y ago

irq-13y ago

WalterBright3y ago

D does too:

    import std.algorithm, std.array, std.stdio;
    // Print sorted lines of a file.
    void main()
    {
     auto sortedLines = File("file.txt")   // Open for reading
                        .byLineCopy()      // Read persistent lines
                        .array()           // into an array
                        .sort();           // then sort them
     foreach (line; sortedLines)
         writeln(line);
    }

somehnguy3y ago

Looks similar to Streams in Java. We try to use that style where appropriate as it is much more readable and compact than imperative style imo.

vbezhenar3y ago

jonnycomputer3y ago

I like this style too, though it can make debugging trickier.

wizofaus3y ago

atoav3y ago

I build most of my classes that way in python.

    class Motor:
        def __init__(self):
            self.max_speed = 10.0
            self.clockwise = True
            self.controller = Controller()

        def set_max_speed(self, max_speed: float=10.0) -> 'Self':
            self.max_speed = max(0.0, max_speed)
            return self

        def start(self) -> 'Self':
            self.controller.start()
            return self

        etc..

Then you can use it as follows:

    motor = Motor().set_max_speed(5.0).start()

There is nothing stopping you from building iterators that work the same way
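A minimal sketch of what such a chainable iterator could look like. Pipeline is a hypothetical class, not part of the standard library, and unlike the Motor example it returns a new wrapper from each step instead of mutating and returning self; the sample URL is made up.

```python
from typing import Callable, Iterable, Iterator


class Pipeline:
    """Hypothetical wrapper that makes iterator steps chainable."""

    def __init__(self, items: Iterable):
        self._items = items

    def map(self, fn: Callable) -> "Pipeline":
        # Wrap the lazy map() result so the next step can be chained.
        return Pipeline(map(fn, self._items))

    def filter(self, predicate: Callable) -> "Pipeline":
        # Keep only the items for which predicate(item) is truthy.
        return Pipeline(filter(predicate, self._items))

    def __iter__(self) -> Iterator:
        return iter(self._items)


# Assumed sample input, in the spirit of the article's query-string example.
s = "https://example.com/item?id=1&a=2&b=3"
values = list(
    Pipeline(s.partition("?")[-1].split("&"))
    .map(lambda key_value: key_value.partition("=")[-1])
    .filter(bool)
)
```

Each step reads top to bottom, much like the Ruby version, at the cost of maintaining one small wrapper class.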

BiteCode_dev3y ago

I don't know, it doesn't seem very far from the python version:

    values = (
        key_value.partition('=')[-1]
        for key_value in 
        s.partition('?')[-1].split('&')
    )

oriolid3y ago

leobg3y ago

Any way to do that in Python? Basically an anonymous function across multiple lines, which can be collapsed in the IDE view?

gilch3y ago

vorticalbox3y ago

Not exactly the same but in python you can use pipe to create pipelines

https://pypi.org/project/pipe/

watwut3y ago

No, this pipeline is not more readable. It is annoying to constantly have to read it. It makes it harder to figure out what the larger algorithm and design is.

KerrAvon3y ago

This style is increasingly common in Swift, especially for Combine pipelines and SwiftUI modifiers.

_8j503y ago

Sub routines are underused. Make it a function right where it is?

tomlin3y ago

Ruby maps are so ugly.

  .map { |key_value|
  key_value.partition('=')[-1] }

samtheprogram3y ago

This sounds like a Windows user who can’t stand macOS because they don’t know where anything is.

Your post downvote edit assumes your opinion here is objective. It isn’t.

xtracto3y ago

I'm curious, can you show an example of how to achieve the same in another language that you would consider pretty?

noncoml3y ago· 12 in thread

We break everything down and then we reach one of the most difficult problems in software engineering: Coming up with good and short names for all these extra intermediate variables and functions.

mike_hock3y ago

lbriner3y ago

I wouldn't attack the example too much, it does seem a little contrived to make the point - the very thing I don't like about contriving examples!

overgard3y ago

I know a lot of programmers are against comments, but I also think this is exactly the kind of code where a comment is handy..., the purpose of the [-3:] part wasn't obvious to me at all.

onion2k3y ago

short names

patrick4513y ago

Short names are easier to read, because they fit on fewer lines. Doubly so if the statement fits on one line.

jstimpfle3y ago

kevin_thibedeau3y ago

leetcrew3y ago

mixedCase3y ago

It takes some deliberate practice and being able to create decent contexts within your code.

nicwolff3y ago

Crisis == opportunity ツ

Although `last_3_query_params` would be more precise, and something that explains why TF you'd want that would be better... ツ

quickthrower23y ago

zhte4153y ago

Just be consistent, whatever it is.

standardUser3y ago· 10 in thread

f1shy3y ago

ethbr03y ago

I was taking my first multi-threaded resource allocation course when I first ran into the famous Kernighan quote.

> “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”

It clicked and instantly disabused me of the notion that smart people write code that's any smarter than the minimum required to solve the problem at hand.

whynotminot3y ago

Comments being required are also another smell that the code doesn't explain itself. I know this is said so often it's a cliche, but it really is true.

I think both of these things point back to the same problem: out of control data structures.

guenthert3y ago

syntheticcdo3y ago

As an example, given:

  const sales = [ {month: "Jan", day: 1, total: 120 }, ... ]

You could determine, say, the highest sales day of a given month as follows:

  const highestSalesDayByMonth = _.chain(sales)
    .groupBy("month")
    .mapValues((salesForMonth) => _.maxBy(salesForMonth, "total"))
    .mapValues("total")
    .value()

  // highestSalesDayByMonth = { Jan: 140, Feb: 90, ... }

Naturally, minimizing the complexity of the iteratee functions and carefully naming of their arguments is very important to ease debuggability.
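For readers who think in Python, roughly the same group-then-reduce pipeline can be sketched with plain dicts. The sample rows below are invented (the original elides them with "..."), and the variable names are only illustrative.

```python
from operator import itemgetter

# Invented sample data in the same shape as the `sales` array above.
sales = [
    {"month": "Jan", "day": 1, "total": 120},
    {"month": "Jan", "day": 2, "total": 140},
    {"month": "Feb", "day": 1, "total": 90},
    {"month": "Feb", "day": 2, "total": 75},
]

# groupBy("month"): collect each month's rows, preserving input order.
sales_by_month = {}
for sale in sales:
    sales_by_month.setdefault(sale["month"], []).append(sale)

# mapValues(maxBy "total"): pick the best row of each month.
best_sale_by_month = {
    month: max(month_sales, key=itemgetter("total"))
    for month, month_sales in sales_by_month.items()
}

# mapValues("total"): keep only the total of that best row.
highest_total_by_month = {
    month: best["total"] for month, best in best_sale_by_month.items()
}
```

Naming each intermediate dict is the opposite trade-off from the chained version: more names to read, but each one can be inspected in a debugger.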

ParetoOptimal3y ago

Concision can be used for emphasis as verbosity can be used to obscure.

> Instead, I find myself having to re-write ultra-dense blobs of code in order to debug or even simply understand what's going on.

Is this a problem because their method is inherently more complex or is it due to lack of familiarity?

Perhaps they debug that code differently than you do and it's incompatible with your previous mental model?

professorTuring3y ago

So you can call those: “50 cent expressions”

They help no one but their ego…

f1shy3y ago

ParetoOptimal3y ago

At least in Haskell, I feel like this doesn't hold true.

Typically more concise code takes advantage of core language abstractions.

It is actually simpler unless you are unfamiliar with core language abstractions, but I'd argue that's a you problem.

RajT883y ago

I had a professor who did that. He'd write a solution to the assignments he was giving us, and then spend another 5-8 hours on it trying to make it fit on a single overhead slide.

He was a much beloved professor.

jstimpfle3y ago· 9 in thread

Karellen3y ago

-- Fred Brooks, The Mythical Man-Month, 1975

-- Abelson & Sussman, The Structure and Interpretation of Computer Programs, 1984

sidlls3y ago

The first quote is an over-simplification that often does not hold in practice: one usually needs both. The exceptions are mostly trivial programs.

Shorel3y ago

Well, dealing with graph algorithms, none of them are obvious from the representation. Which is usually a matrix.

In fact, Gauss Jordan is also not obvious, just seeing a matrix.

overgard3y ago

In that context, I don't think this is superficial at all. If your code is hard to read, I suspect your design is hard to read too.

jstimpfle3y ago

It is superficial. No amount of trying to pretty up the code can fix underlying design deficiencies. That's what I said, so we're not even disagreeing here :)

29athrowaway3y ago

What is easier to read:

a) 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1

b) 20

If all the code in your project was written like a), how would you feel? Does it make your job easier or harder?

I'll tell you how most people feel when they read code that looks like a):

- The author didn't care about other maintainers.

- The author is selfish and does not have empathy for others.

- The author ruined my fucking day.

- Team members are competing by sabotaging each other's productivity.

- If I clean this, by the time I am done, the author would have pushed 10 more commits that look exactly like this and eventually become my boss.

- The author is wasting everyone's time.

- The author is forcing others to volunteer to clean up after them.

- Why does management tolerate code following the a) style? A simple intervention would make it go away and my job would be so much better.

- It's sad that everyone is too busy looking at Jira and nobody cares about the actual fucking product.

- This is slowing everyone down and I have stuff to do.

- The code is error prone, one day I'll break it.

- Why should I contribute quality code if low quality is acceptable?

"Ah ah ah, you didn't say the magic word!! ah ah ah!" Don't be the fucking Dennis Nedry of the team. Format your code, make it readable by your team and your future self.

Do you want everyone to love you? Write code like this:

https://norvig.com/spell-correct.html

mod3y ago

If your comment were code, I think it's an example of a)

teo_zero3y ago

Your example is artificially designed to make your point, but imagine a case like this:

  # Add the left & right margins
  width = calculatedWidth + 1 + 1

The "+1+1" might indeed be more readable than "+2".
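One way to keep the two separate "+1" terms visible without relying on the comment is to name them. This is only a sketch: LEFT_MARGIN, RIGHT_MARGIN and total_width are hypothetical names, and the margin values are taken from the "+ 1 + 1" above.

```python
LEFT_MARGIN = 1   # hypothetical constant for the first "+ 1"
RIGHT_MARGIN = 1  # hypothetical constant for the second "+ 1"


def total_width(calculated_width: int) -> int:
    """Add the left and right margins to the calculated width."""
    return calculated_width + LEFT_MARGIN + RIGHT_MARGIN
```

If only one margin changes later, the edit touches one constant instead of a "+2" whose meaning has to be reverse-engineered.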

jstimpfle3y ago

What the f* is wrong with you?

upsideDownBlue3y ago· 5 in thread

- The size of the short-term store is normally said to be 7 plus or minus 2 (the 'magic' number 7)

- A single chunk is usually considered to take up a 'slot' in short term memory

If anyone wants papers/sources for the above, let me know.

ethbr03y ago

Not high priority, but I'd love any references or names / key words I could look into it with.

I'm traditional wide comp sci by academic training, but spend my day job as a low-code enabler for non-programmers with varied backgrounds.

The working memory model explains and fits well with what I see them get and struggle with in day to day work, and I'd welcome references I could use to optimize my approach.

ooloncoloophid3y ago

An overview of the model and its history: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4207727/

The Wikipedia entry for WM is also very good: https://en.wikipedia.org/wiki/Working_memory

pramodbiligiri3y ago

Check out Anders Ericsson’s book on Deliberate Practice or Barbara Oakley’s A Mind for Numbers.

gilch3y ago

This means it should be the rule of five. We can't count on everyone having full capacity all the time.

ooloncoloophid3y ago

3pm3y ago· 5 in thread

Reminded me of 'Object Calisthenics' by Jeff Bay. Basically an exercise for a toy project where you adhere to 9 rules:

1. Only One Level Of Indentation Per Method

2. Don’t Use The ELSE Keyword

3. Wrap All Primitives And Strings

4. First Class Collections

5. One Dot Per Line

6. Don’t Abbreviate

7. Keep All Entities Small

8. No Classes With More Than Two Instance Variables

9. No Getters/Setters/Properties

https://williamdurand.fr/2013/06/03/object-calisthenics/

BlargMcLarg3y ago

>Wrap All Primitives And Strings

>Don’t Abbreviate

Meaning the variable name will end up long anyway. With tuples, you get deconstruction without the hassle, too.

3pm3y ago

overgard3y ago

Oof, these all seem absurd to me.

> 1. Only One Level Of Indentation Per Method

> 2. Don’t Use The ELSE Keyword

Not using the else statement just obscures the fact that there's a branch in the code. Obscuring something important seems to be the opposite of what you should do.

> 3. Wrap All Primitives And Strings

> 4. First Class Collections: Any class that contains a collection should contain no other member variables

> 5. One Dot Per Line... Basically, the rule says that you should not chain method calls.

This is the first one I roughly agree with, but I wouldn't consider it a hard rule. Chaining .map and .filter together for instance is a very common pattern.

> 6. Don’t Abbreviate

> 7. Keep All Entities Small... No class over 50 lines and no package over 10 files

> 8. No Classes With More Than Two Instance Variables... I thought people would yell at me while introducing this rule, but it didn’t happen

They were being polite. I'll do it for them. What the fuck?

> 9. No Getters/Setters/Properties ... My favorite rule. It could be rephrased as Tell, don’t ask.

My brain feels like it's going to explode.

> It is okay to use accessors to get the state of an object, as long as you don’t use the result to make decisions outside the object.

Why else would you want to get the state of an object?

> Any decisions based entirely upon the state of one object should be made inside the object itself.

If your classes are 50 lines long, I guarantee you that other classes will be making decisions on other objects' behalf.

> Then again, they violate the Open/Closed Principle.

3pm3y ago

You seem to be criticizing the 'rules' as if they are suggested for production code. You couldn't seriously be thinking someone would suggest maximum-of-2-fields as some sort of guideline for the real world.

bruce3434343y ago

Draconic

didibus3y ago· 4 in thread

What the author is missing is that "easy to read/reason about/understand" only matters within the context of making a change to the code: to fix a bug, add a feature or make some non-functional improvement to it.

This is what most of the "easy to read" articles forget.

Show me why it is easier to fix a bug, add a feature or make a non-functional improvement to the code with their style than without.

How easily can you now introduce new behavior before, in the middle, after, and anywhere in-between?

I wish more people focused on "easy to change/modify" than simply on "easy to read/understand".

michaelcampbell3y ago

edgyquant3y ago

If you are unit testing you should not have to worry about tweaking a function and it breaking everywhere else.

Supermancho3y ago

> If you are unit testing you should not have to worry about tweaking a function and it breaking everywhere else.

"should" is a word loaded with authority.

Why?

sidlls3y ago

aaronbrethorst3y ago· 4 in thread

My opinion is that maintainable code is written first for reading by humans and second for executing by computers.

Edit: Incidentally, this also applies to my commit messages. I’m writing them primarily for my future self so that I can figure out WHY I made a change, not WHAT the change was.

Shorel3y ago

My opinion is that you can write code that's easy to understand, and it is also good for the computer to run.

One thing is not in contradiction with the other.

It could lower the reusability of the code, by not having many abstractions, but it will be easy to understand, concise, and it will do what it was written for very well.

moffkalast3y ago

It should be written so it's easy for humans to understand first.

If that's too slow, then it can be optimized so it's fast, if a bit less readable.

9 times out of 10 it won't be too slow in the first place these days, unless you know beforehand that you need maximum speed for valid reasons.

Karellen3y ago

> My opinion is that...

You make it sound like you came up with that all by yourself

aaronbrethorst3y ago

Nah, I stand on the shoulders of generations of developers, just like everyone else here.

I didn’t claim my opinion was novel, just that it’s mine. I hope others share my opinion, because I’d find codebases that fit my criteria easier to maintain than many other types.

Also, do you agree or disagree with any of the ideas I put forth?

bloaf3y ago· 4 in thread

I have pretty mixed feelings about this. Personally I find it much easier to debug code that:

1) fits entirely on my screen and

2) doesn't involve much state modification

Every intermediate variable is a chance for me to miss some modification (e.g. it was passed to a func that modifies its arguments) and consequently misunderstand what is happening.

https://gist.github.com/ZeroBomb/8ac470b1d4b02c11f2873c5d4e0...

readthenotes13y ago

Do you really write a very long comment after every function call?

And have you looked at code that's over a year old and modified by other people to see how poorly those comments now match the code?

bloaf3y ago

No. I did that for people having a first exposure to this non-idiomatic currying/function chaining.

In actual code I would have put most of those on one line.

strager3y ago

How do you debug that code?

bloaf3y ago

This particular case is special because it uses 100% library functions. Typically you're composing your own functions, so you just... put breakpoints in your functions.

If you want logging, I've added an example of how to auto-log the composed functions to the gist.

hardwaregeek3y ago· 4 in thread

BurningFrog3y ago

My opinion is that each line of code should be easily understandable. Without that, code is very hard to work with.

You're right that other code problems can be worse. But that's no excuse to avoid doing the basics.

To clean up system design issues, you must first know what a better system design would be. It's not enough to realize that what you have is bad.

I do this a lot, and part of my approach is to always be incremental. Improve one detail/aspect at a time. The worst, very tempting, idea in this field is to throw everything away and start over...

> I also struggle to balance it with getting feature work done

FWIW, I like to spend 1/3 of my time cleaning up and refactoring.

theptip3y ago

I’ve also (more common) encountered engineers that don’t actively try to be clever by being terse, but also don’t put their mind to writing clearly.

Put differently, I think one has to actively try to write easy-to-read code.

brabel3y ago

Exactly. This post is helpful to beginners, sure.... but after some experience, the problem you describe becomes much more pressing.

9dev3y ago

RajT883y ago· 4 in thread

If I'm writing it for myself, and only ever myself, I'll use the more clever powershell ways of doing things. Expressions like:

1..10 | % {$_}

If you're coming from another language, you're going to have to run it to understand it, or look it up. That is time lost.

blown_gasket3y ago

I primarily write in PowerShell for end-user shell tools and Go for network services.

Where-Object is going to let you cut down on the number of lines of code compared to foreach() and for(), and in my opinion will make the code more readable.

$vms | Where-Object -Property Name -match "sql"

$vmOutput = @()

for($i = 0; $i -lt $vms.count; $i++) {

    if($vms[$i].Name -match "sql"){

        $vmOutput += $vms[$i]

    }

}

$vmOutput = @()

foreach($vm in $vms){

    if($vm.Name -match "sql"){

        $vmOutput += $vm

    }

}

PeterWhittaker3y ago

> PowerShell is a shell that uses pipelines like nix shells but it has everything as an object unlike nix shells. So you get to take advantage of that.

ParetoOptimal3y ago

Some things are short and self explanatory though. `1..10` is just syntax sugar for a stream or list from 1-10 right?

RajT883y ago

1..10 is powershell range operator, and % is an alias for foreach-object.

im3w1l3y ago· 4 in thread

Give mysterious things room. In this case the most mysterious is [-3:]. That, together with the split, should have its own line or maybe even multiple (function declaration, comment).

raldi3y ago

Right?! That should instead be -len("foo") or -NUM_PREFIX_PARAMS

xigoi3y ago

That doesn't improve anything in terms of knowing why the number is there.

ParetoOptimal3y ago

> [-3:].

That kind of index notation really isn't mysterious if you write a lot of python in my experience.

im3w1l3y ago

iafiaf3y ago· 3 in thread

Early in my career, I took to heart such books and articles and often felt guilty and like a lesser programmer when I cut corners. Here's my 2 cents now:

- Some of this is the coding equivalent of "6 rules for financial freedom" or "6 ways to find your dream soulmate". Generic advice that doesn't reflect highly nuanced reality.

- Code "cleanliness" is a moving target. For a coder's mental health and value proposition for his project, he/she should know what code can afford to stay dirty.

BlargMcLarg3y ago

Don't forget the most prominent part: 'your clean' and 'my clean' can differ greatly.

29athrowaway3y ago

Jonathan Blow is a creative, productive and overall smart guy, but reading his code will make you want to slam your head against the wall.

What irritates me the most are the long, non-linear comments full of distracting noise. It's like reading a choose your own adventure novel.

jstimpfle3y ago

gilch3y ago· 3 in thread

The article starts with some reasonable premises, but the conclusion does not follow.

I'm reminded of Doug McIlroy's challenge to Knuth.[1] It's worth a read. Would you rather have 6 lines of dense shell, or 10 pages of Fabergé egg? I'll take the shell, thanks.

[1]: http://www.leancrew.com/all-this/2011/12/more-shell-less-egg...

[2]: https://github.com/jsoftware/jsource

overgard3y ago

gilch3y ago

I think language popularity in industry mostly comes down to path dependence[1]. It doesn't say as much as you seem to think.

APL and derivatives are still used extensively in finance, a highly competitive field, to say the least. That's where you find the jobs.

[1]: https://en.wikipedia.org/wiki/Path_dependence

ParetoOptimal3y ago

> I think the comparative rarity of APL compared to every other programming language in existence says a lot

That's just the whole "popularity means it must be good" argument, which I disagree with.

retrocryptid3y ago· 3 in thread

Wowfunhappy3y ago

A nice feature of working memory is that although it's limited to around 6† "things", each thing can be any size, if your brain considers it a single unit. This is called "chunking".

So, it's easier to remember the three numbers 34, 765, 812 than the eight numbers 3, 4, 7, 6, 5, 8, 1, 2.

Refactoring code into separate functions with descriptive titles is probably a lot like combining numbers.

---

† Well, the article says 4–6; I'd always heard the average was around 7.

pavon3y ago

I'm working on a project that follows Uncle Bob's Clean Code guidelines of striving to keep functions ideally 3 lines or less, and no more than, say, 7. I have mixed feelings about it.

artemonster3y ago

foolfoolz3y ago· 3 in thread

the only known metric for code complexity is line count: as the number of lines grows, complexity grows

marginalia_nu3y ago

I'll just leave this here:

https://github.com/KxSystems/kdb/blob/master/c/c/odbc.c

joshuacc3y ago

I’m not sure what you’re trying to say, but that’s not true at all. There are other metrics for code complexity, including fairly simple but useful ones like number of logical branches.
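To make the "number of logical branches" idea concrete, here is a minimal Python sketch that counts branching constructs with the standard `ast` module. The name `count_branches` and the choice of which node types count as a branch are assumptions for illustration, not an established metric definition.

```python
import ast

# Node types treated as "branches" here; this selection is an assumption.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def count_branches(source: str) -> int:
    """Count branching constructs in a piece of Python source code."""
    tree = ast.parse(source)
    return sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))


# The dense one-liner style discussed in the thread (comprehensions are not counted).
one_liner = "values = [x.split('=')[1] for x in s.split('?')[1].split('&')[-3:]]"

# A longer, loop-and-condition style snippet.
branchy = """
for vm in vms:
    if 'sql' in vm:
        out.append(vm)
"""

print(count_branches(one_liner), count_branches(branchy))
```

By this crude count the long one-liner has fewer branches than the three-line loop, so line count and branch count can disagree about which snippet is "more complex".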

f1shy3y ago

Irony, right?

jiggawatts3y ago· 2 in thread

Now I know where Rust got some of its syntax from...

The problem is that the author is using abstractions at the wrong level, with or without his fixes. The correct solution would be something like:

    var uri = new Uri( "http://foo/demo?test=a&blah=b%20c" );
    var map = System.Web.HttpUtility.ParseQueryString( uri.Query );
    
    Console.Out.WriteLine( "is blah equal to 'b c'?\n{0}", map["blah"] == "b c" );

The above example is C#, but similar code can be written in any language. It's simple, direct, and doesn't violate the "rule of six". It can be read like English:

1. Construct a URI from a given string.

2. Parse the query part of the URI into a map.

3. Test if the 'blah' value in the query is "b c" as expected, with the escaped space decoded properly.

Too3y ago

The python equivalent is in the urllib.parse module, part of standard library.
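A rough Python counterpart of the C# snippet above, assuming the same example URL, could look like this. It only uses `urlsplit` and `parse_qs` from the standard library; the variable names are made up for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Construct a parsed URI from the given string.
uri = urlsplit("http://foo/demo?test=a&blah=b%20c")

# Parse the query part into a dict; each key maps to a list of decoded values.
params = parse_qs(uri.query)

# Check that the escaped space in 'blah' was decoded.
print(params["blah"] == ["b c"])
```

Note that `parse_qs` returns a list per key because a query string may repeat a key.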

steveklabnik3y ago

… and Ruby got it from Smalltalk. :)

overgard3y ago· 2 in thread

Even once I moved to C, when I was an amateur I still had a tendency towards one line per thing happening. I really hated code like

    while(i++ < 10) { doSomethingWith(i); }

(Actually, I still do).

2) It's hard to add error handling to that type of code. When a lot of things happen in a complex expression, you're depending on the entire expression working.

Luckily I've grown out of that phase, although ironically now my much more mature code looks a lot like the very simplistic code I wrote as a teenager.

jstimpfle3y ago

3) It's also harder to edit with an editor (like vim), and (IMO) harder to read. I never even do "int x, y = 3;". I always put each variable declaration on its own line.

mafuy3y ago

That avoids bugs and misunderstandings, too.

You know this, but for those unaware, in the previous example, x is not initialized to 3. Similarly, in "int* p1, p2;", p2 is an int, not an int*. Easy to misread.

happyweasel3y ago· 2 in thread

ParetoOptimal3y ago

> Develop bottom-up (reusable classes) instead of large-scale up-front design.

I tend to really hate the UX of bottom-up designed APIs and find them incoherent.

MH153y ago

twblalock3y ago· 2 in thread

If you don't understand that, multiple smaller lines won't help you, because you just don't know what you are doing.

If the article's author had a legitimate bone to pick, they would have better examples.

overgard3y ago

macintux3y ago

> In addition, that code example is easily testable.

I'm skeptical, because typically a line like that is embedded in the middle of a larger function.

Extracting the logic into a dedicated, pure function helps with testing.
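As a sketch of what that extraction might look like for the article's one-liner (the function name `last_three_param_values` and the example URL are assumptions, not taken from the article):

```python
def last_three_param_values(url: str) -> list[str]:
    """Return the values of the last three query parameters in `url`."""
    query_params = url.split('?')[1].split('&')[-3:]
    return [param.split('=')[1] for param in query_params]


def test_last_three_param_values() -> None:
    # The function depends only on its argument, so the test needs no setup.
    url = "https://example.com/page?a=1&b=2&c=3&d=4"
    assert last_three_param_values(url) == ["2", "3", "4"]


test_last_three_param_values()
```

Because the function is pure, the test can feed it awkward inputs directly instead of driving the larger function that used to contain the line.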

gilch3y ago· 2 in thread

At least for the contrived example from the article, the solution isn't to break up the code, but to use denser code. Use a regex.

Does anybody really think that e.g. sregex[1] is better than just learning and using the regex language directly? Because that's where this kind of thinking leads.

[1]: https://github.com/jwiegley/emacs-release/blob/master/lisp/o...

toiletduck3y ago

I know it's not the point the author is trying to make, but I couldn't help get the feeling this example isn't good enough to carry the point.

from stdlib:

  from urllib.parse import urlparse, parse_qsl

  url = 'https://www.example.com/some_path?some_key=some_value&foo=bar'
  parsed_url = urlparse(url)
  values = [v for _, v in parse_qsl(parsed_url.query)]
  print(values)

which I guess you could oneliner back to this..

  [v for _, v in parse_qsl(urlparse(url).query)]

cavisne3y ago

I think for any code that's meant to be read and maintained by someone else a regex is a bad idea.

You are saving a few lines on the surface, but adding a potential backtracking bug in the future.

skitter3y ago· 2 in thread

    s.split('?')[1]
     .split('&')[-3:]
     .map(lambda x: x.split('=')[1])

Unfortunately, that's not how Python's map(), len() and such were designed.

gilch3y ago

Idiomatic Python wouldn't use a map here, but a generator expression:

    (x.split('=')[1] for x in s.split('?')[1].split('&')[-3:])

Removing the lambda cuts down on the noise considerably.

And honestly, with this many splits with fixed indexes, I'd probably use a regex. Now there's a dense language for you.
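For comparison, one possible regex version in Python is sketched below. The pattern and the example URL are assumptions; it is not exactly equivalent to the split-based code, since a value containing a second '=' would be captured whole here but cut short by split('=')[1].

```python
import re

s = "https://example.com/page?a=1&b=2&c=3&d=4"

# Keep only the part after the first '?'.
query = s.partition('?')[2]

# Capture each run of non-'&' characters that follows an '=', then keep the last three.
values = re.findall(r'=([^&]*)', query)[-3:]
print(values)
```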

chii3y ago

i don't really see why or how a generator expression is any easier to read than a chained call like the OP's example.

thisismyswamp3y ago· 2 in thread

Skimmed the article and the comments and got no answer - can someone tell me what the rule of six is?

_dain_3y ago

>That gives us a rule for deciding if a line of code is too complex:

>A line of code containing 6+ pieces of information should be simplified.

he put it in bold.

krapp3y ago

The rule is literally described, in bold text, within the article.

Try actually reading instead of skimming next time. Or at least skim more slowly.

dcow3y ago· 1 in thread

This is why setting an arbitrarily short max line length matters. And consequently why auto-formatters suck.

So if you want people to implicitly start structuring their code as advised in this post, set an 80 or 100 char line length. And adopt a fuzzy “one statement per line” philosophy.

alpaca1283y ago

To be fair considering such visual details is a complex task and probably hell to implement.

lbriner3y ago· 1 in thread

If I've only helped one person today, it was worth it ;-)

layer83y ago

  DoThis();
  DoThat();
  DoTheOtherThing();

Instead you usually have something like:

  x = DoThis(a, b, c);
  y, z = DoThat(c, x, a);
  w = DoTheOtherThing(a, z, x, y, b);

…and on top of that have to add error handling for those calls.

jonnycomputer3y ago· 1 in thread

Then you have to name things. And naming things sucks, especially because not every intermediate has an obvious name for it, distinct enough to distinguish it from the next intermediate chunk.

overgard3y ago

Naming things is hard, but it's also important.

Krasnol3y ago· 1 in thread

Your cookie banner "manage settings" thing never stops loading.

tmtvl3y ago

Yeah, I bypassed it with Firefox's Reader Mode, but the original page slowed FF down to a crawl.

meitros3y ago· 1 in thread

This seems like an interesting heuristic for anything that could automatically either generate or "format" code - a little more semantic than just relying on a text parser

djmips3y ago

That is a cool idea. Show the code the way you understand it best without even having to change the underlying text.

charles_f3y ago· 1 in thread

Clean code is not just a few rules about how to write a line. You can write nice lines that still don't make sense and amount to shit code

ARandomerDude3y ago

xfz3y ago· 1 in thread

I can't access the linked page without accepting cookies.

gilch3y ago

Can't you just use an incognito tab? It'll delete all the cookies when you're done.

kazinator3y ago· 1 in thread

This rewrite is more performant than the original:

  query_params = s.split('?')[1].split('&')[-3:]
  map(lambda x: x.split('=')[1], query_params)

gilch3y ago

What? No it isn't! You didn't parse that correctly.

Same with the lambda definition: it's evaluated only once. It's just the lambda body that gets reevaluated each loop, and only evaluated for the first time on the first loop.
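This is easy to check with a small experiment. In the sketch below, `source` and `second_field` are hypothetical stand-ins for the split expression and the lambda body, instrumented with counters:

```python
calls = {"source": 0, "body": 0}


def source() -> list[str]:
    """Stand-in for s.split('?')[1].split('&')[-3:]; counts how often it runs."""
    calls["source"] += 1
    return ["a=1", "b=2", "c=3"]


def second_field(item: str) -> str:
    """Stand-in for the lambda body; counts how often it runs."""
    calls["body"] += 1
    return item.split('=')[1]


# The argument expression source() is evaluated once, before map() starts iterating.
values = list(map(lambda x: second_field(x), source()))
print(values, calls)
```

The argument expression runs once and the lambda body runs once per element, whether or not the argument is first stored in a named variable.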

michaelwww3y ago· 1 in thread

If you're like me and like to step through code with a debugger, shorter lines are better for setting breakpoints and checking values.

convolvatron3y ago

this is important and true.

but I really wish debugger evolution hadn't stopped at the line.

artemonster3y ago· 1 in thread

kazinator3y ago

This is from Brian Kernighan (the 'K' in the "K&R C Book" and AWK), known as Kernighan's law:

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

pavon3y ago· 1 in thread

Never cleaned up the most obtuse part of that code snippet - why are we only keeping the last three parameters?

lbriner3y ago

Taking a subset of query params smells in its own right so, again, might be a bad example.

he00013y ago· 1 in thread

overgard3y ago

Here's conway's game of life in APL:

    life ← {⊃1 ⍵ ∨.∧ 3 4 = +/ +⌿ ¯1 0 1 ∘.⊖ ¯1 0 1 ⌽¨ ⊂⍵}

Is that shorter than essentially every other language implementation? Yep!

However, to even begin to understand it you have to read an article from the original writer:

https://aplwiki.com/wiki/John_Scholes%27_Conway%27s_Game_of_...

To me, that is objectively, not subjectively, less clear than the longer implementations.

glintik3y ago· 1 in thread

gilch3y ago

And Python!

sdoering3y ago· 1 in thread

Offtopic:

The fact that the site uses a consent solution that fakes a loading screen when trying to configure (read disable) tracking/advertising is an instant bounce for me.

niea_113y ago

It's not fake. In my case, it's not working because of my adblocker. When I disable it, it works.

stagas3y ago

rzimmerman3y ago

That cumbersome line of python is a good example. As an experienced python person, I immediately found myself giving names to the chunks to understand it.

WastingMyTime893y ago

  last_three_url_param_values = map(lambda query_string: query_string.split('=')[1], url.split('?')[1].split('&')[-3:])

longrod3y ago

So instead of writing:

    await page.click(".play-button");

You do:

    await app.play();

This also has the benefit of extreme reusability. Doesn't work for everything though.

dan-robertson3y ago

userbinator3y ago

Yet when it comes to programming languages, the majority of "advice" seems to be about absolute dumbing-down and propagating an attitude of "it's too hard, you can't possibly learn, just give up"?

Kernighan's Lever: http://www.linusakesson.net/programming/kernighans-lever/ind... (look at the rest of his site; he has clearly leveraged that attitude with great success)

(I looked at the one-line example in Python and, despite having very little experience in the language, it was actually faster to read and understand as a whole than the 3-line version.)

hirundo3y ago

quickthrower23y ago

This seems a little light to me, doesn’t give me the feeling of being written by a veteran coder. Unless the idea is to dumb it down for a particular audience.

The real answer is write code like you write words: Rework it to make sense to the reader. How many newlines you need as a hint for your editor to wrap and where you put them will fall out of that.

Or autoformat! I love autoformatters!

Edit: edited to make easier to parse mentally.

davesque3y ago

tgv3y ago

The starting assumption is highly dubious: "Short lines of code require less brainpower to read than long ones."

d_burfoot3y ago

Split Into Multiple Lines has a real problem, which appears in the example code.

Let's say you have a long code block that includes the revised snippet:

> query_params = s.split('?')[1].split('&')[-3:]

> mylist = map(lambda x: x.split('=')[1], query_params)

> ...

> (some more complex transformations, that only depend on mylist)

jollybean3y ago

I object to the 're-write as function'.

Functions come with abstraction overhead. You don't know who will consume them, so you may have to put up type checks, null checks, and other BS.

Also - functions split up the logic all over the place, it's confusing.

I think what we need are 'nested functions' which serve to kind of create a scope pushed to the stack - with an implicit 'return' - which we can then 'collapse' in the GUI etc..

I mean, it's purely cosmetic from a CS point of view, but it might help to organize things a bit better and hand off abstractions in long function implementations.

Huge projects with 1 or 2 line functions drive me crazy - you have to constantly jump around all over the place to figure out what's going on. I actually believe it's a historic anti-pattern.

I make functions when they are 1) used in different places or 2) a meaningful abstraction.

Otherwise, well documented longer functions for me.
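Python's local function definitions come fairly close to the "nested functions" idea above, minus the implicit return and the editor-side collapsing. A minimal sketch, with `report` and `last_three_values` as made-up names:

```python
def report(url: str) -> str:
    # Local helper: only visible inside report(), so no outside caller
    # has to be defended against with extra type or null checks.
    def last_three_values() -> list[str]:
        params = url.split('?')[1].split('&')[-3:]
        return [p.split('=')[1] for p in params]

    values = last_three_values()
    return ", ".join(values)


print(report("https://example.com/page?a=1&b=2&c=3&d=4"))
```

The helper still reads top-to-bottom with the code that uses it, and its intermediate names (`params`, `p`) do not leak into the enclosing function.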

ppierald3y ago

rekrsiv3y ago

irrational3y ago

cc1013y ago

Waterluvian3y ago

I flexibly agree with the “does one thing” approach. But what a “thing” is can be up to you.

Sometimes my one thing is “turns a Json file into an in memory dictionary” which might be three operations on one line.

xigoi3y ago

When I clicked on “More settings” in the cookie dialog, it displayed a loading animation (ignoring my prefers-reduced-motion setting) and got stuck. Just straight-up user-hostile design.

jwilliams3y ago

All comes down to good naming in the end. The craft is finding both compact and specific names.

Shorter names for longer spans are much better. But you’d hope they’re the very obvious subject of that span.

dasil0033y ago

hedora3y ago

I think this article is missing the forest for the trees.

For instance, I'd prefer a function that takes a string and confirms it is a valid URL.

That would delegate to URL character escaping logic and DNS validation. (Are & or ? valid DNS name characters? Will they be in the future? I neither know nor care.)

On top of that, there would be a parser for key=value config file lines.

Then, the example in the article becomes something like:

keyvalue = parseConfLine(input)

URL(keyvalue.value).params[-3]

Plus a few more lines to confirm key is as expected and that value has enough query parameters.

Alternatively, I'd use a perl oneliner with a regexp. I see no purpose for code that lands in the middle ground between these extremes.

mkoubaa3y ago

A reviewer can usually tell which code is easier to understand side by side, even when it's yourself as the reviewer.

Applying rules like these to your code may or may not result in cleaner code, but that's a testable hypothesis.

I've seen all too often some clean code recommendation or other applied to code and it gets harder to understand. And the person doing the refactoring (often myself) gets caught in sunk cost.

Now my recommendation is always:

1. Use your intuition to predict if a change makes code cleaner.

2. Try to make that change, and be open to doing things a little differently than you first imagined.

3. Test your hypothesis to see what others think. Decide what to do, but be mentally willing to throw it away.

4. Repeat

Articles like this are good resources to help train your intuition, but there is no substitute for developing your personal and team "flavor profile" for what styles suit your way of thinking.

GuB-423y ago

Writing "clean" code is more of an art form, you can't really have easy rules.

But that's the baseline, all the art is in appropriate breaking of that guideline, to have short code that doesn't look like it came out of a minifier.

catlifeonmars3y ago

soulofmischief3y ago

> STM and WM are small. Both can only store about 4 to 6 things at a time!

Neuroscience is a useful tool for self-programming but you must be careful peddling absolutist statements like this which can do more harm than good.

geewee3y ago

Sure, it might be relevant if all variables were named "x", "xx", "xxx" - but they're not.

avnigo3y ago

A map lambda example is what I had in mind when reading the article. I'm not a big fan of the temporary variables, though.

Admittedly the example below is not a perfect solution, but that's where I thought the article was heading when splitting that code over multiple lines for readability.

  map(
      lambda x: x.split('=')[1],
      (url
          .split('?')[1]
          .split('&')[-3:]
      )
  )

Is this still too unreadable or more messy?

shadowofneptune3y ago

Beltiras3y ago

Can't get past the obnoxious cookies. Anyone have the text?

schemathings3y ago

For the example in the text I'd typically just include a one line comment above to show what an example string would look like and leave the code as is

# URL with params https://news.ycombinator.com/item?id=32963021&something=valu...

map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:])

AtlasBarfed3y ago

The issue is that short code lines increase the length of the code, i.e. waste vertical screen real estate (visible code), so your overburdened short-term memory has to context switch to scroll.

"Simple, put code in small methods"

Oh great, now I do a nav jump or a string search as a context switch rather than scroll.

Comments? increase vertical screen pollution.

Proper chunking is hard.

Maybe APL was right.

djmips3y ago

ravenstine3y ago

exabrial3y ago

Lame.

schwartzworld3y ago

> Is that hard for you to read? Me too. There's a good reason why. You have to know what map, lambda, and .split() are.

This is pretty weak. Not knowing what a function or language feature does, doesn't make it inherently unreadable.

nmz3y ago

This is forth code 101, factorization is an absolute must when writing forth code.

notjustanymike3y ago

nottorp3y ago

My, the very dark pattern in the cookie dialog... closed the page when manage cookies didn't load in 20 seconds. I bet there's an explicit delay in there.

readthenotes13y ago

The author misunderstands Miller's research on working memory, often reported as "7 +/- 2".

But, Miller states that limit is valid only for unrelated items.

iLoveOncall3y ago

Readable code and clean code are two different things and I think this article does a pretty poor job at writing clean code.

whiddershins3y ago

Code has a typo ‘slit’ instead of split after bringing up the concept of moving something into a function.

gauddasa3y ago

The magic number for programmers is 7. 4 to 6 is for the rest of the world.

NonNefarious3y ago

And also: Use tabs.
