Taming names in software development

(simplethread.com)

77 points · jetheredge · 3y ago · 43 comments

natch 3y ago

What this misses, and every other article I’ve seen about naming misses, is that the truly hard part of good naming is none of the stuff that was mentioned.

The hard part is this: taking the deserved level of care with naming often has to be done in a context with other humans who wrongly think that it’s simply not that big a deal, and their annoyance with having it brought up becomes an often unspoken source of friction. This not only leads to rancor but also has a chilling effect.

Even making the unspoken spoken often doesn’t help. The response will be like: oh yeah, choose some good names for us, knock yourself out, it’s great that somebody cares (but let’s not stop calling the timestamp a counter, or the database temmp_v3_old_Udpate_RstdnewV2, because reasons).

It can depend on the team, but in my experience successfully getting past these issues is usually way harder than any of the other factors mentioned.

drekipus 3y ago

This came up at my work Christmas party. One colleague on my team (1) hates my reviews because the naming issues I point out are way too pedantic. Another colleague (2) on the same team loves my reviews because they make them think about how the code is read and understood.

It's difficult to convey why naming is important. (1) also feels that the code needs a lot more comments and berates others for not commenting their code. I have a hard time pointing out the irony to them.

durandal1 3y ago

One thing that I've found over and over again is that if something is hard to name, there is often a problem with the abstraction itself. The most common issue is that the thing that resists naming is doing multiple things, and it should be split up into smaller parts that each focus on something small that can be simply named.
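
A minimal sketch of that point, using made-up names (`handle_order` and the three helpers are hypothetical, not from the article or any real codebase). The first function validates, prices, and formats, so no short name describes it honestly; once split, each piece more or less names itself.

```python
# Before: one function doing three jobs, so the only honest name is vague.
def handle_order(order):
    if order["quantity"] <= 0:
        raise ValueError("quantity must be positive")
    total = order["quantity"] * order["unit_price"]
    return f"{order['sku']}: {total:.2f}"


# After: each part does one small thing that can be simply named.
def validate_quantity(quantity):
    if quantity <= 0:
        raise ValueError("quantity must be positive")


def order_total(quantity, unit_price):
    return quantity * unit_price


def format_receipt_line(sku, total):
    return f"{sku}: {total:.2f}"
```

Whether the split is worth it depends on the codebase; the sketch only shows how the naming difficulty tends to disappear along with the mixed responsibilities.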

ipaddr 3y ago

And this is why code reviews, while helpful, can be wasteful. Reviews should be based on existing standards, not the preferred methods of each reviewer. Some people feel they need to comment, so they over-comment. Some feel they need to prove how much they know. Others may mark you down for doing it the way the previous review recommended.

thaumasiotes 3y ago

> but let’s not stop calling ... the database temmp_v3_old_Udpate_RstdnewV2

I worked briefly on a codebase that was otherwise produced by Chinese programmers (in China).

It did feel somewhat surreal seeing variable names that were in English but misspelled, particularly when e.g. several classes all had a field of the same name, except that in one of them, the name was misspelled.

I assume it didn't bother them because (a) they weren't English speakers anyway, and (b) they autocompleted everything. Why would it matter that `update_history` happens to be spelled `udpate_history` in one out of five classes? You just type `u` and pick the right field.

dmurray 3y ago

I've seen the same thing many times with English-speaking programmers. Plenty of people just don't spot that 'udpate_history' is misspelt.

mysterydip 3y ago

I used to think descriptive variable names were a waste of time (I used a, b, c, even things like booya, foo, etc.), until I had to maintain my own code from years ago. I literally had no one to blame but myself for the extra deciphering work I had to do.

DrBazza 3y ago

Naming is simple when the function or method is simple.

Complex naming is often, but not always, a good indicator of complexity.

Unless of course, it's Java... https://projects.haykranen.nl/java/

beeforpork 3y ago

How locally a variable is used should matter for selection of a name, too. I don't find i, j, k, o, p, k, v at all offensive if their scope is just a few lines of code. Usage is often idiomatic (e.g., k, v for iterating a map or i for an integer loop variable) and using a longer name would just make it less idiomatic and less obvious.

anon121122 3y ago

This makes a lot of sense.

I also prefer for-loops that use "i". It is instantly clear that "i" is the current index used by the loop. Even though it is a single-character variable name, it has a specific meaning by convention.

If I see a variable named "people_result_list_index", it actually hurts readability and adds complexity: I can't tell that it is local to the for-loop, as it could have been defined anywhere in the code or even passed as an argument to the function.

Using "i", "k", "v" and other single-character variables outside for-loops is often not advisable. An exception to this could be "x", "y" and "z" if they refer to positions in 2D / 3D space. Personally I would probably wrap them in a structure, so that you could refer to them as pos.x and pos.y. But I wouldn't hold it against someone if they thought the code was readable without it. It is basically part of the domain knowledge. Other domains may have similar exceptions. The "R" value in terms of growth rate comes to mind.

TL;DR: Single-character variables can make sense in the right context, when they are used as part of a convention or domain terminology.
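
A short sketch of the conventions described above, with hypothetical helper names (`first_negative_index`, `invert`, `Position`, and `translate` are made up for illustration): `i` as a loop index, `k`/`v` when iterating a map, and `x`/`y` wrapped in a small structure so call sites read as `pos.x` and `pos.y`.

```python
from dataclasses import dataclass


def first_negative_index(values):
    # `i` is the conventional loop index; it only lives for these few lines.
    for i, value in enumerate(values):
        if value < 0:
            return i
    return -1


def invert(mapping):
    # `k` and `v` read as key and value by convention when iterating a map.
    return {v: k for k, v in mapping.items()}


@dataclass(frozen=True)
class Position:
    # x and y are domain terms for coordinates, so single letters are fine here.
    x: float
    y: float


def translate(pos, dx, dy):
    # Wrapping the coordinates lets callers write pos.x and pos.y.
    return Position(pos.x + dx, pos.y + dy)
```

In each case the single letter is scoped to a few lines or backed by a convention the reader already knows, which is the condition the comment above puts on using them.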

js8 3y ago

Personally, I found there are two important aspects of things being named that are hard to express in names:

1. How important the thing is when it comes to the big picture. Is this function where the meat of the program is, or is it just a technicality? Is this the main data structure or just something temporary? Good names should tell me what to focus on when reading the code.

2. Whether the name describes only the thing as it is, or actually prescribes what its use is. For example, LinkedList is descriptive because it tells only that the thing is a data structure, but it's up to you how to use it. On the other hand, CustomerRecord is prescriptive - it might be just a bunch of strings, but it also tells me what the intended use is, which is not necessarily contained in the code itself - it might be just some boilerplate to manage it in the database.

kitd 3y ago

> In JavaScript, you’ll often see i, j and subsequent letters as iteration variables. i is not descriptive, and j is somehow even less so.

Using i, j, etc. for indices dates back to older versions of FORTRAN, where the variable type depended on the first character of the name, with i through n reserved for integers.

Quite why the convention has persisted for so long is one of SW Engs little mysteries.

furyofantares 3y ago

Surely it predates that as well. I assume FORTRAN chose ijklmn as the integers due to i, j, k being used as indices in mathematical convention (as well as n for sequences and m, n for matrices).

maccard 3y ago

> Quite why the convention has persisted for so long is one of SW Engs little mysteries.

I like it. It's a convention that has clear context once it's initially understood. It saves the argument of "is it an index, counter, loop iterator, <something else>": it's the i/j/k'th loop's index.

xg15 3y ago

> I understand exactly what BasicReviewableFlaggedPostSerializer is on my first time seeing it.

Good for the author, I certainly wouldn't. Things I'd have to research in the codebase before understanding the name are:

- what is Post?

- what do Flagged and Reviewable mean here? Are they attributes of the Post or the serializer?

- what does Basic mean? Again, what is this referring to? Is it indicating that the class is some kind of base class for an object hierarchy?

TheFattestNinja 3y ago

"In software, really good names are meaningful, descriptive, SHORT, consistent, and distinct." (emphasis mine)

I hate this general recommendation style that names need to be short. This only made sense in the old times of programming where you had to actually type them. The reality of IDEs bringing all forms of intellisense and autocomplete means you almost never type a name out, thus being short brings no benefit if not "habit". You should really try to have understandable names, detailed names, but not care about shortness. "timeout" is a good variable name if your language has some great type system and your coding works with that. "timeoutInSeconds" is a better one if you are just using an int/long, to distinguish it from "timeoutInMillis" and avoid silly mistakes.

wging 3y ago

I disagree. The utility of short names is not just that it takes less effort to write them. They're also far easier to read and understand. IDE autocompletion doesn't help with that, nor does any other tooling, really. Since code is read much more frequently than it's written (including but not limited to any time that related code has to be changed), names should be as short as possible without sacrificing clarity. Excessively long names are harder to parse, and can slow you way down when trying to understand code.

(I'm arguing against what I see as your central point, but to be fair, 'without sacrificing clarity' is doing a bit of work in the paragraph above... your example is actually a good case of a bit of additional length being actually worth it. I would say timeoutMs or timeoutSecs are good shorter alternatives, "ms" and "secs" being widespread and clear abbreviations. You're completely right that 'timeout' is insufficiently clear for a purely numeric type, though I'd disagree that you need all that much type system magic to make it OK. For example calling a `java.time.Duration` `timeout` seems fine.)
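
A rough Python analogue of the `timeoutMs` versus `java.time.Duration` point, using the standard library's `datetime.timedelta` in place of `Duration` (the function names here are hypothetical). With a bare integer the unit has to live in the name; with a duration type the plain name `timeout` carries no ambiguity.

```python
from datetime import timedelta


def is_expired_ms(elapsed_ms: int, timeout_ms: int) -> bool:
    # Bare ints: the `_ms` suffix is the only thing stopping a seconds/millis mix-up.
    return elapsed_ms >= timeout_ms


def is_expired(elapsed: timedelta, timeout: timedelta) -> bool:
    # The type carries the unit, so the short name `timeout` is clear enough.
    return elapsed >= timeout
```

Callers of the second version can pass `timedelta(seconds=1)` or `timedelta(milliseconds=1000)` interchangeably, which is the kind of mistake-proofing the longer names were trying to achieve by hand.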

rTX5CMRXIfFG 3y ago

The problem with `timeoutMs` or `timeoutSecs` is that if you have a policy that you shouldn't contract words (possibly founded on a first principle that you value clarity in your coding standards), then you're going to spend time justifying why a pull request gets rejected when someone names a type `SearchCntrlr` or `SubmtBtn`. Before you know it, you'll have spent hours just debating and getting no work done, whereas you wouldn't have the problem if you just spelled out `timeoutInMilliseconds` or `timeoutInSeconds` fully.

I mean, what value do the shortenings `ms` and `secs` provide, anyway? Saving keystrokes? You could still type out `timeoutms` and the IDE's autocomplete would suggest it for you, right?

ParetoOptimal 3y ago

> thus being short brings no benefit if not "habit".

IPersonallyFind tonsOfCode likeThisReally hardToParse, especially when the least important parts have the longest names.

As a result code like that can be overwhelming and sometimes make me dread working with it.

Falkon1313 3y ago

I find they also often indicate over-abstraction or over-complicated generic stuff that is often kind of irrelevant to the domain.

Equipment_Maintenance_Criteria isn't super short, but it appears to actually mean something. Definitely tells you more than just "Selection" would. But all too often it's called something like AbstractServiceFactoryBuilderManagerLocatorPlugin which really doesn't tell you anything at all and leaves you at the end knowing less than you did when you started reading it.

tartoran 3y ago

Yes, long names can be taxing too if taken to extremes. I use descriptive names that spell out the domain or business logic so code becomes as close to self-documenting as it gets. However, locally, when I have to reference these multiple times, I use a short alias, usually the acronym of the long name, so it’s the best of both worlds: I don’t have to carry around the long names everywhere but can still fall back on them when I forget what they represent.
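
One way to read the local-alias idea, as a sketch with invented names (`equipment_maintenance_criteria_is_met` and `overdue_equipment` are hypothetical): the long descriptive name is the definition, and a short acronym alias is introduced only inside the function that uses it repeatedly.

```python
def equipment_maintenance_criteria_is_met(hours_since_service, service_interval_hours):
    # The long name documents the business rule where it is defined.
    return hours_since_service >= service_interval_hours


def overdue_equipment(fleet, service_interval_hours):
    # Local alias: short to read here, and the long name is one line away.
    emc_met = equipment_maintenance_criteria_is_met
    return [
        name
        for name, hours in fleet.items()
        if emc_met(hours, service_interval_hours)
    ]
```

The alias stays confined to one small function, so a reader who forgets what `emc_met` stands for only has to look up a line to recover the full name.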

ipaddr 3y ago

do_you_prefer_underscores? iKindOfDo i_kind_of_do

hakunin 3y ago

Long names should be a signal that you’re breaking away from current context or doing something unusual. Unnecessary length and redundant context make names more difficult to discern. I wrote about it a bit more in the “what?” section of this post: https://max.engineer/maintainable-code

blowski 3y ago

Perhaps "short" is a shorter way of saying "quick to parse". Even with IDEs, I still want that, lest I end up having to choose between names like this all the time:

- MinimumPriceCalculatorFactory

- MinimumPriceCalculatedPriceFactory

The extra couple of seconds every time becomes distracting and irritating.

jffhn 3y ago

In small methods I tend to use shorter names, even very short non-descriptive names, because there is less context so less chance of confusion, and it makes it easier to see what's going on at a glance (and check that it matches the method name).

On the other hand, if some public method does multiple things that need to be known to decide where it can be called, I put them all in its name.

furyofantares 3y ago

> I understand exactly what BasicReviewableFlaggedPostSerializer is on my first time seeing it.

I don't. I think I figured it out after reading it half a dozen times (except for Basic, no clue there), once I worked out that Post is probably a noun. So even this requires context just to read it and know what it does; on my first read I only knew what Serializer meant.

furyofantares 3y ago

Or maybe I'm still not getting it, my read is there are posts, they can be flagged, flagged posts can be reviewed, and this is a "basic" serializer for flagged posts that have yet to be reviewed.

But why you'd need such a specialized serializer is beyond me, let alone a presumably less basic one as well. It seems like such drastic overkill that maybe I still don't get what the name means.

anon121122 3y ago

> Or maybe I'm still not getting it, my read is there are posts, they can be flagged, flagged posts can be reviewed, and this is a "basic" serializer for flagged posts that have yet to be reviewed.

I think you are right. The name is pretty clear to me, but that may be because I have worked on similar code bases where this naming is used by convention. Reading code requires knowing the domain and I'm not sure if a shorter name is more clear. You need domain knowledge to know what a post is, and what it means that it has been flagged.

You probably made a very accurate guess based on your knowledge of forums and moderator systems. This may not be apparent to all, and shorter names will probably not help much. In addition, if they shortened the name to "Serializer", "PostSerializer" or even "FlaggedPostSerializer" it could conflict with other serializers in the project.

> But why you'd need such a specialized serializer is beyond me,

I totally agree with your point. They may have their reasons, but it seems to me that a "ReviewableFlaggedPost", "FlaggedPost" and "Post" should have very similar needs and could be solved by structuring them differently (perhaps by using composable classes that can each take care of their own serialization)

Regarding the use of "Basic", it also triggers a "code smell" reaction from me. It may make sense to them, and it's hard for me to make any definitive comments without knowing the rationale behind it. My guess is that they have different types of responses based on the same "post" object. "Basic" may include a subset of the "Full" response, such as id and title only.

In those cases I tend to prefer separate DTOs, like "PostSummaryDTO" and "PostDTO", that can be re-used by composability for different responses (flagged for review etc.). This may of course not be the best choice for all usages, so I would need to know more to say something conclusive about this particular case.
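
A sketch of what that composition might look like, under heavy assumptions: `PostSummaryDTO` and `PostDTO` are the names from the comment above, the `id` and `title` fields follow its guess about the "Basic" subset, and `body`, `FlaggedPostDTO`, and `flag_reason` are hypothetical additions for illustration. It is not how the article's codebase is actually structured.

```python
from dataclasses import asdict, dataclass


@dataclass
class PostSummaryDTO:
    # The small "basic" subset: id and title only.
    id: int
    title: str


@dataclass
class PostDTO:
    # The full response reuses the summary instead of repeating its fields.
    summary: PostSummaryDTO
    body: str  # hypothetical field


@dataclass
class FlaggedPostDTO:
    # A flagged-for-review response composed from the same summary.
    post: PostSummaryDTO
    flag_reason: str  # hypothetical field


def serialize(dto):
    # asdict() recurses into nested dataclasses, so one helper covers every DTO.
    return asdict(dto)
```

With this shape a single generic `serialize` handles all three responses, which is roughly the alternative to a dedicated serializer class per combination.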

pastacacioepepe 3y ago

Once you understand the domain you're working on and you've architected your solution in a way that makes sense, only then will naming in your code come out right, and without much additional effort. Forget about naming; it is a side effect of your understanding of the issue at hand.

hbrn 3y ago

Exactly. Variable naming is not an isolated skill you can hone, and the fact that people are treating it like one means they are missing the point.

You won't become a great author by perfecting your grammar instead of storytelling.

aliedginess 3y ago

In previous projects, data dictionaries helped name things like database tables and columns. In one project, the DBA team used ERwin (data modeling software) to maintain a data dictionary and data model.

Are data dictionaries still in use today? Are there open source examples, books, etc. to learn from?

jacrys 3y ago

There are only two hard things in software development:

* naming things

* cache-invalidation

* off-by-one errors

khanzaib 3y ago

Ap ko kakdbekkjdnsicdjd9jje disixjdjd sjxjenx xo ospenfneosjdbe do lcdlnrnepdcd
