The hard part is this: taking the deserved level of care with naming often has to be done in a context with other humans who wrongly think that it’s simply not that big a deal, and their annoyance with having it brought up becomes an often unspoken source of friction. This not only leads to rancor but also has a chilling effect.
Even making the unspoken spoken often doesn’t help. The response will be like: oh yeah, choose some good names for us, knock yourself out, it’s great that somebody cares (but let’s not stop calling the timestamp a counter, or the database temmp_v3_old_Udpate_RstdnewV2, because reasons).
It can depend on the team, but in my experience successfully getting past these issues is usually way harder than any of the other factors mentioned.
It's difficult to convey why naming is important. (1) also feels that the code needs a lot more comments and berates others for not commenting their code. I have a hard time pointing out the irony to them.
I worked briefly on a codebase that was otherwise produced by Chinese programmers (in China).
It did feel somewhat surreal seeing variable names that were in English but misspelled, particularly when e.g. several classes all had a field of the same name, except that in one of them, the name was misspelled.
I assume it didn't bother them because (a) they weren't English speakers anyway, and (b) they autocompleted everything. Why would it matter that `update_history` happens to be spelled `udpate_history` in one out of five classes? You just type `u` and pick the right field.
Complex naming is often, but not always, a good indicator of complexity.
Unless of course, it's Java... https://projects.haykranen.nl/java/
I also prefer for-loops that use "i". It is instantly clear that "i" is the current index used by the loop. Even though it is a single character variable name, it has a specific meaning by convention.
If I see a variable named "people_result_list_index", it actually hurts readability. I can't tell that it is local to the for-loop, as it could have been defined anywhere in the code or even passed as an argument to the function. The long name adds complexity rather than removing it.
Using "i", "k", "v" and other single character variables outside for-loops is often not advisable. An exception to this could be "x", "y" and "z" if they refer to positions in 2D / 3D space. Personally I would probably wrap them in a structure, so that you could refer to them as pos.x and pos.y. But I wouldn't hold it against someone if they thought the code was readable without it. It is basically part of the domain knowledge. Other domains may have similar exceptions. The "R" value in terms of growth rate comes to mind.
TL;DR: Single character variables can make sense in the right context, when they are used as part of a convention or domain terminology.
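A minimal sketch of the two conventions described above, assuming a 2D domain. `Position` and `first_index_outside` are hypothetical names invented for this illustration; the point is only that `i` reads as "the loop index" and `pos.x` / `pos.y` read as coordinates without further explanation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical wrapper so coordinates are referred to as pos.x and pos.y."""
    x: float
    y: float


def first_index_outside(points, limit):
    """Return the index of the first point outside the square [-limit, limit], or -1."""
    # "i" is the loop index by convention; its scope is obviously this loop.
    for i, pos in enumerate(points):
        if abs(pos.x) > limit or abs(pos.y) > limit:
            return i
    return -1
```

Renaming `i` to something like `points_result_list_index` would not add information here, since the loop header already says what it indexes.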
1. How important the thing is when it comes to the big picture. Is this function where the meat of the program is, or is it just a technicality? Is this the main data structure or just something temporary? Good names should tell me what to focus on when reading the code.
2. Whether the name describes only the thing as it is, or actually prescribes what its use is. For example, LinkedList is descriptive because it tells only that the thing is a data structure, but it's up to you how to use it. On the other hand, CustomerRecord is prescriptive - it might be just a bunch of strings, but it also tells me what the intended use is, which is not necessarily contained in the code itself - it might be just some boilerplate to manage it in the database.
Using i & j etc. for indices dates back to older versions of FORTRAN, where the variable type depended on the first character of the name, with i-n reserved for integers.
Quite why the convention has persisted for so long is one of SW Eng's little mysteries.
I like it - it's a convention that has clear context once it's initially understood. It saves the argument of "is it an index, counter, loop iterator, <something else>" - it's the i/j/k'th loop's index.
Good for the author, I certainly wouldn't. Things I'd have to research in the codebase before understanding the name are:
- what is Post?
- what do Flagged and Reviewable mean here? Are they attributes of the Post or the serializer?
- what does Basic mean? Again, what is this referring to? Is it indicating that the class is some kind of base class for an object hierarchy?
I hate this general recommendation style that names need to be short. This only made sense in the old days of programming when you actually had to type them. With IDEs bringing all forms of IntelliSense and autocomplete, you almost never type a name out, so being short brings no benefit beyond habit. You should really try to have understandable, detailed names, and not care about shortness. "timeout" is a good variable name if your language has a great type system and your code works with that. "timeoutInSeconds" is a better one if you are just using an int/long, to distinguish it from "timeoutInMillis" and avoid silly mistakes.
(I'm arguing against what I see as your central point, but to be fair, 'without sacrificing clarity' is doing a bit of work in the paragraph above... your example is actually a good case of a bit of additional length being actually worth it. I would say timeoutMs or timeoutSecs are good shorter alternatives, "ms" and "secs" being widespread and clear abbreviations. You're completely right that 'timeout' is insufficiently clear for a purely numeric type, though I'd disagree that you need all that much type system magic to make it OK. For example calling a `java.time.Duration` `timeout` seems fine.)
I mean, what value do the shortenings `ms` and `secs` provide, anyway? Saving keystrokes? You could still type out `timeoutms` and the IDE's autocomplete would suggest it for you, right?
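The timeout naming trade-off discussed above can be sketched roughly as follows. This uses Python's `datetime.timedelta` as a stand-in for the `java.time.Duration` mentioned earlier; `wait_budget_ms` and the variable names are hypothetical, chosen only for illustration.

```python
from datetime import timedelta

# Plain numbers: the unit exists only in the name, so the name has to carry it.
timeout_in_seconds = 30
timeout_in_millis = timeout_in_seconds * 1000

# A duration type carries its own unit, so a bare "timeout" is unambiguous.
timeout = timedelta(seconds=30)


def wait_budget_ms(timeout: timedelta) -> int:
    """Convert to whole milliseconds only at the boundary that needs a raw number."""
    # Dividing one timedelta by another with // yields an exact integer.
    return timeout // timedelta(milliseconds=1)
```

With the plain-number version, passing `timeout_in_seconds` where milliseconds are expected is a silent bug that only the name can guard against. With the duration version, the conversion happens in one place.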
IPersonallyFind tonsOfCode likeThisReally hardToParse, especially when the least important parts have the longest names.
As a result code like that can be overwhelming and sometimes make me dread working with it.
Equipment_Maintenance_Criteria isn't super short, but it appears to actually mean something. Definitely tells you more than just "Selection" would. But all too often it's called something like AbstractServiceFactoryBuilderManagerLocatorPlugin which really doesn't tell you anything at all and leaves you at the end knowing less than you did when you started reading it.
- MinimumPriceCalculatorFactory
- MinimumPriceCalculatedPriceFactory
The extra couple of seconds every time becomes distracting and irritating.
On the other hand, if some public method does multiple things that need to be known to decide where it can be called, I put them all in its name.
I don't. I think I figured it out (except for Basic, no clue there), but only after reading it half a dozen times and working out that Post is probably a noun. So even this requires context just to read it and know what it does; on my first read I only knew what Serializer meant.
But why you'd need such a specialized serializer is beyond me (let alone, presumably, a less basic one as well). It seems like such drastic overkill that maybe I still don't get what the name means.
I think you are right. The name is pretty clear to me, but that may be because I have worked on similar code bases where this naming is used by convention. Reading code requires knowing the domain and I'm not sure if a shorter name is more clear. You need domain knowledge to know what a post is, and what it means that it has been flagged.
You probably made a very accurate guess based on your knowledge of forums and moderator systems. This may not be apparent to all, and shorter names will probably not help much. In addition, if they shortened the name to "Serializer", "PostSerializer" or even "FlaggedPostSerializer" it could conflict with other serializers in the project.
> But why you'd need such a specialized serializer is beyond me,
I totally agree with your point. They may have their reasons, but it seems to me that a "ReviewableFlaggedPost", "FlaggedPost" and "Post" should have very similar needs and could be solved by structuring them differently (perhaps by using composable classes that can each take care of their own serialization)
Regarding the use of "Basic", it also triggers a "code smell" reaction from me. It may make sense to them, and it's hard for me to make any definitive comments without knowing the rationale behind it. My guess is that they have different types of responses based on the same "post" object. "Basic" may include a subset of the "Full" response, such as id and title only.
In those cases I tend to prefer separate DTOs, like "PostSummaryDTO" and "PostDTO" that can be re-used by composability for different responses (flagged for review etc.). This may of course not be the best choice for all usages, so I would need to know more to say something conclusive about this particular case
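A rough sketch of what that composition could look like, assuming small immutable DTOs. Every class and field name here (`PostSummaryDTO`, `PostDTO`, `FlaggedPostDTO`, `flag_reason`, and so on) is hypothetical and not taken from the codebase under discussion.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PostSummaryDTO:
    """The "basic" subset: id and title only."""
    id: int
    title: str


@dataclass(frozen=True)
class PostDTO:
    """The fuller response, built on top of the summary."""
    summary: PostSummaryDTO
    body: str


@dataclass(frozen=True)
class FlaggedPostDTO:
    """A flagged-for-review response that reuses the summary by composition."""
    post: PostSummaryDTO
    flag_reason: str


def to_payload(dto) -> dict:
    """Serialize any of the DTOs above; asdict recurses into nested dataclasses."""
    return asdict(dto)
```

For example, `to_payload(FlaggedPostDTO(PostSummaryDTO(1, "Hello"), "spam"))` yields a nested dict with `post` and `flag_reason` keys, so no dedicated serializer class per combination is needed in this sketch.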
You won't become a great author by perfecting your grammar instead of storytelling.
Are data dictionaries still in use today? Are there open source examples, books, etc. to learn from?
* naming things
* cache-invalidation
* off-by-one errors