https://github.com/sindresorhus/ama/issues/10#issuecomment-1...
TL;DR: Small modules are easy to reason about, and encourage code reuse and sharing across the entire community. This allows these small modules to get a tremendous amount of real world testing under all sorts of use cases, which can uncover many corner cases that an alternative naive inlined solution would never have covered (until it shows up as a bug in production). The entire community benefits from the collective testing and improvements made to these modules.
I also wanted to add that widespread use of these small modules over inlining everything makes the new module-level tree-shaking algorithms (that have been gaining traction since the advent of ES6 modules) much more effective in reducing overall code size, which is an important consideration in production web applications.
Yes they are, in the same way that a book in which every page consists of a single word is easier to understand than one with more content per page.
By focusing on the small-scale complexity to such an extreme, you've managed to make the whole system much harder to understand, and understanding the big picture is vital to things like debugging and making systems which are efficient overall.
IMHO this hyperabstraction and hypermodularisation (I just made these terms up, but I think they should be used more) is a symptom of a community that has mainly abandoned real thought and replaced it with dogmatic cargo-cult adherence to "best practices" which they think will somehow magically make their software awesome if taken to extremes. It's easy to see how advice like "keep functions short" and "don't implement anything yourself" could lead to such absurdity when taken to their logical conclusions. The same mentality with "more OOP is better" is what led to Enterprise Java.
Related article that explains this phenomenon in more detail: http://countercomplex.blogspot.ca/2014/08/the-resource-leak-...
Other languages handle this with compilers, may be strongly typed, and/or have features that don't allow nulls. Javascript doesn't exactly have that luxury. So maybe it makes sense in Javascript, where it wouldn't in other languages (though one could argue this is a flaw in the language design).
edit: to be clear I'm not really defending the practice, but trying to give a little perspective.
In practice, many, if not most, of these one-line modules don't even work correctly, and it is difficult to get people to collaborate on fixing them instead of just replacing them with a new one-line module that works slightly better. The concept of careful collaboration to perfect one line of code is foreign to a generation of developers who would rather throw code at the world and never look back. The issue is made all the worse because tiny bugs in these one-liners can almost seem dangerous to fix when you aren't really sure how they are being used: maybe it actually is better to not fix anything and just have people rely on the better replacement module :/
#include <math.h> and "-lm"
with
#include <sin.h> #include <cos.h> #include <tan.h> #include <sinh.h> …
and "-lsin -lcos -ltan -lsinh …"
Which is nuts no matter what language you are coding in.
What's funny to me is that leftpad(null, 6, ' ') outputs "  null".
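A minimal sketch (not the actual npm source) shows why: the first argument is string-coerced, so null becomes the four-character string "null" before padding is applied.

```javascript
// Hedged sketch of left-pad's behavior, not the published source.
function leftpad(str, len, ch) {
  str = String(str);        // null -> "null", four characters
  ch = ch || ' ';
  while (str.length < len) {
    str = ch + str;         // prepend the pad character
  }
  return str;
}

console.log(JSON.stringify(leftpad(null, 6, ' '))); // two leading spaces
```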
A more apt metaphor would be that separating a book into many small paragraphs, each serving a single purpose, makes the book easier to understand. Regardless, the book metaphor misses a lot of the nuance.
Of course taking the approach to an extreme would be detrimental. However, a vast majority of these small modules that are actually widely used consist of functions that may seem trivial at first sight, but actually contain a lot of corner cases and special considerations that a naive inline implementation could miss.
Sure, there are also plenty of trivial one-line modules on npm that don't fit such a description, but those are simply side effects of the unprecedented popularity of the platform and its completely open nature, and shouldn't be used to infer any kind of general trend towards "hypermodularisation", because very few reasonable developers would ever import them into their projects.
No, that would not be an apt metaphor for the problem he is describing.
> Regardless, the book metaphor misses a lot of the nuance.
Not if you are trying to understand the point he is trying to make.
Is there any cost to creating modules, uploading them to npm, and using them in other projects? Clearly there is, as is shown by the world-wide breakage from yesterday. This is probably only one of many such failure modes.
The point is, breaking everything into micro-modules can have benefits and costs. Ultimately, it is an engineering trade-off.
Now, it is possible that npm modules are at the right level of abstraction, even though other module systems in the world are not this granular. If this is due to the special nature of JavaScript then the point must be argued from that perspective.
Well put. I've noticed a curious blind spot in how people account for complexity: we count the code that 'does' things much more than the code that glues those things together. This distorts our thinking about system complexity. A cost/benefit analysis that doesn't consider all the costs isn't worth much.
An example is when people factor complex code into many small functions, then say it's much simpler because smaller functions are easier to understand. In fact this may or may not be true, and often isn't. To get a good answer you must consider the complexity you've added—in this case, that of the new function declarations and that of the calls to them—not just the complexity you've removed. But it's hard to consider what you don't see. Why is it so easy not to see things like this? I think it's our ideas about programming, especially our unquestioned assumptions.
The complexity of glue code goes from being overlooked to positively invisible when it gets moved into things like configuration files. Those are no longer seen as part of the system at all. But of course they should be.
https://youtu.be/l1Efy4RB_kw?t=343
He refactored a Java application into 1400 classes!
Frankly, some of the comments I'm seeing really do reinforce his point: people really don't know how to program. I suspect this is the flipside of Javascript being popular and accessible - people can churn out products without really knowing what they are doing...
left-pad isn't even slightly abstract.
The hyperabstraction is in how even tiny functions like this, isPositive, isArray, etc. are being abstracted and turned into individual modules.
Small modules are easy to reason about
They really aren't. When I'm reading is-positive-integer(x) and wonder whether 0 is positive, I need to hunt down the definition of "positive" through two packages and as many files. And it gets worse if both your code and one of your dependencies require is-positive-integer, and I have to also figure out which version each part of the code base is using.

If you had written (x > 0) I would have known immediately. It also wouldn't be doing the same thing as is-positive-integer(x), but how many calls to is-positive-integer are actually correct in all the corner cases that is-positive-integer covers?
And then there's the other problem with dependencies: you are trusting some unknown internet person to not push a minor version that breaks your build because you and Dr. is-positive-integer had different definitions of 'backwards compatibility'.
function isPositiveInteger(x) {
return x > 0 && Number.isInteger(x);
}
and always pass it a number. (Of course, this shouldn’t even be a function, because JavaScript is confused about the definition of “integer” – maybe you really want `x >>> 0 === x` – and apparently is-positive-integer is confused about the definition of “positive”, so it’ll be clearer to just put things inline.)

JS does not have an Integer type, so (x > 0) does not work
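A few concrete cases show why (standard coercion semantics, nothing package-specific):

```javascript
// A bare (x > 0) accepts values that are not positive integers,
// because > coerces its operands and all JS numbers are doubles.
console.assert(('5' > 0) === true);              // string coerced to number
console.assert((1.5 > 0) === true);              // positive but fractional
console.assert(Number.isInteger(1.5) === false);
console.assert(Number.isInteger('5') === false); // no coercion here

// Combining both checks, as in the function quoted above:
function isPositiveInteger(x) {
  return Number.isInteger(x) && x > 0;
}
console.assert(isPositiveInteger(3) === true);
console.assert(isPositiveInteger('3') === false);
```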
It depends on what the surrounding code actually does whether (x > 0) works or not. Depending on the version of is-positive-integer you are using, a positive instance of the class Number may return true, false, or cause an error.

PS. Never mind that the implementation of is-positive-integer is extremely inefficient.
How does that even remotely apply to the "is positive integer" test, and even more so to "is-positive" and "is-integer"?
What's next? is-bigger-than-5? word-starts-with-capital-letter? add-1-to-a-number?
Having pre-written modules that account for all edge cases that have been battle-tested by hundreds, thousands, or even millions of users does have merit.
The main problem here is with the Javascript language, which makes code evolve to things like 'is-positive-integer'.
The issue I have with this is readability. Type '>' and I know exactly what it does, I know what implicit conversions are involved, and how it would react to a null object. Type 'isPositiveInteger' and I need to check. I can not read your code fluently anymore.
We're doing a GSoC for efficient templating of real numbers though.
This seems to be a very common example in educational materials about how to create a function in some programming language. Perhaps they should all be prefaced with "NOTE: you should not actually make such a function, because its triviality means that every time you use it you will be increasing the complexity of your code", or a longer discussion on when abstraction in general is appropriate and when it is not.
Ditto for any other abstraction --- I've found there's too much of an "abstraction is good so use it whenever you can" theme in a lot of programming/software design books and not enough "why abstraction can be bad" sort of discussion, which then leads to the "you can never have too much" mentality and the resultant explosion of monstrous complexity. Apparently, trusting the reader to think about the negatives and exercise discretion didn't work so well...
That's the whole point. If you use is-positive-integer, you won't have to think about the negatives!
Of course we have a number of packages that are truly trivial and should probably be inlined. Npm is a completely open platform. Anyone can upload any module they feel like. That doesn't mean anyone else will feel the need to use them.
I think you'll find that the vast majority of small modules that are widely used are ones that also cover obscure corner cases and do benefit from widespread use and testing.
module.exports = exec("===");
module.exports = function bigger(a, b) { return a > b; };
module.exports = () => "than";
module.exports = (a) => a.search('ABCD...');
.. or something like that. These are all plugins of course.
I feel like you haven't spent enough time with Unicode.
Someone might do something like ~~n to truncate a float. That works fine until you go above 32 bits. Math.trunc should be used instead. Someone might do something like n >>> 0 === n, and using any bitwise operators will always bake in the assumption that the number is 32 bits. Do you treat negative 0 as a negative integer? Detecting negative 0 is hairy. So, to avoid bad patterns, it makes sense.
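A quick demonstration of the 32-bit trap (assuming an ES2015+ engine with Math.trunc):

```javascript
// Bitwise operators run ToInt32, so values past 2^31 - 1 wrap around.
const big = 2 ** 31;                        // 2147483648
console.assert(~~big === -2147483648);      // wrapped into negative range
console.assert(Math.trunc(big) === big);    // handles the full double range

console.assert(~~4.7 === 4);                // fine for small values
console.assert(Object.is(-0, 0) === false); // -0 really is distinct from 0
```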
For is-bigger-than-5(x), x > 5 is not hairy. For add-1-to-a-number(n), n + 1 is not hairy.
For word-starts-with-capital-letter(word)? That one is actually pretty hairy. There are programmers that would write a regular expression /^[A-Z]/, or check the charCode of the first character being within the A-Z range, amongst other solutions. An appropriate solution, however, would be (typeof word == "string" && word.length && word[0] == word[0].toUpperCase() && word[0].toUpperCase() != word[0].toLowerCase()), because the toUpperCase and toLowerCase methods are unicode-aware, and presently you can't access unicode properties using Javascript's flavor of regular expressions.
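A sketch of the Unicode-aware check described above (hypothetical function name, not an actual package): toUpperCase/toLowerCase know about non-ASCII letters, while /^[A-Z]/ does not.

```javascript
// A character is "cased" if its upper- and lowercase forms differ;
// it is a capital if it equals its own uppercase form.
function startsWithCapital(word) {
  return typeof word === 'string' &&
         word.length > 0 &&
         word[0] === word[0].toUpperCase() &&
         word[0].toUpperCase() !== word[0].toLowerCase();
}

console.assert(startsWithCapital('Hello') === true);
console.assert(startsWithCapital('Ävar') === true);  // non-ASCII capital
console.assert(/^[A-Z]/.test('Ävar') === false);     // the regex misses it
console.assert(startsWithCapital('1abc') === false); // digits have no case
```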
i++
Discoverability though is so poor that most of those modules are most likely just used by the author and the author's co-workers.
If a typical npm user writes hundreds of packages, how the hell am I supposed to make use of them, when I can't even find them? Npm's search is horrendous, and is far from useful when trying to get to "the best/most supported module that does X" (assuming that random programmers rely on popularity to make their choice, which in itself is another problem...).
Intuitively, this thread wouldn't exist if your assertion were correct.
Why search through millions of unknown packages when a standard library would have it all right there?
We can't exactly search the code to find out what it does; otherwise you'd basically be reading all the code to discover its true behavior, which negates the usefulness of modules in the first place...
Any search will have to rely on developer-made documentation, and/or meta data. This is great in theory, but documentation is rarely well maintained, and/or someone changes the code but neglects to update the documentation.
This leaves us with the situation we have today: a search that kind of works, and mostly you rely on googling the feature you need and looking for what seems to be the most popular choice.
I'm not sure how this situation can be made better, especially if we continue down a path of having all these "one-liner" libraries/modules that anyone can make and stick out there into the unsuspecting world. When I need a square root function, and my search returns 800 one-liner modules, how am I supposed to pick one that I know is reasonably well done, and does what it says it will do, without looking under the hood - you'll end up just picking whatever seems to be popular...
This is why types are better than textual documentation - the system enforces that they're kept up to date.
That would be very cool, but I'm not sure how much easier it would actually turn out to be. Also to do anything useful you'd probably have to restrict the language to be non-Turing-complete.
Would I write tests for my own leftpad implementation if I were not farming it out to a dependency? It's possible. More likely I would want to understand the intent behind using leftpad in my application or library, and have tests for "formatTicket" or whatever it is that I'm actually doing. But for all the talk about tiny modules that cover surprisingly tricky edge cases, this is not one of them.
https://github.com/sindresorhus/ama/issues/10#issuecomment-1....
If anything looking at sindresorhus's activity feed: (https://github.com/sindresorhus) perfectly supports the author's point. Maybe some people have so little to do that they can author or find a relevant package for every single ~10 line function they need to use in their code and then spend countless commits bumping project-versions and updating package.json files. I have no idea how they get work done though..
"...LOC is pretty much irrelevant. It doesn't matter if the module is one line or hundreds. It's all about containing complexity. Think of node modules as lego blocks. You don't necessarily care about the details of how it's made. All you need to know is how to use the lego blocks to build your lego castle. By making small focused modules you can easily build large complex systems without having to know every single detail of how everything works..."
By LOC he's referencing 'ye ole Lines of Code paradigm, and trying to make the point that in the end it just doesn't measure up to the prime directive of "containing complexity".
... and that's where I beg to differ.
What I think is being completely overlooked here is the net cost of abstracting away all that complexity... It's performance.
Every extra line of code, extraneous function call, unnecessary abstraction ( I'm looking at you promise ), every lazy inclusion of an entire library for the use of a shortcut to document.getElementById -- these all add unnecessary costs from the network down to the cpu.
And who pays these costs?
The end users of your slow, memory hogging, less-than-ideal set of instructions to the cpu that you call an application... but hey, it's easier for you, so there's that.
Little-known fact: the *y in old signs is actually a 'þ', which is an Old English letter named 'thorn' and pronounced like modern 'th.' Thus, the old books & signs were really just saying, 'the.' Incidentally, þ is still used in modern Icelandic.
Completely off-topic, but I þought you might be interested.
Rather than copy paste someone else's unmaintained thing and handle all the bugs, or write their own code and unit tests, they use the same library as a few thousand other people?
If it's a popular issue, lots of people had the same issue, many will be nice enough to add their edge cases and make the answer better, most will not. Same goes for contributing to a package.
With a package you would be able to update when someone adds an edge case, but it might break your existing code, and that edge case may be something that is not particular to your system.
If you don't want to get too deep in the issue, you can copy paste from SO, just the same you can just add a package.
If you want to understand the problem, you can read the answers, comments, etc. With the package you rely on reading code, I don't know how well those small packages are documented but I wouldn't count on it.
The only arguments that stand are code reuse and testability. But code reuse at the cost of the complexity the dependencies add is, IMO, not worth it compared to the time it'd take you to copy and paste some code from SO. Testability is cool, but with an endless spiral of dependencies that quite often use one or more of the different (task|package|build) (managers|tools) the node ecosystem has, I find it hard to justify adding a dependency for something trivial.
How do I systematically make sure that I have the latest version of every stackoverflow code snippet? If it's a new post, it may not have all the edge cases fixed yet. So now I have to check back on each of the X number of snippets I've copied.
In the npm approach, I can easily tell if there's a new version. For prod, I can lock to a specific version, but in my test environment, I can use ^ to get newer versions and test those before I put them in production.
If the edge case of new version of a package breaks my code, I've learned that I'm missing a unit test. Plus, the question isn't whether this bad thing might happen on occasion, the question is whether this approach is, on balance, superior to cutting and pasting random code snippets into my code. I think the downside of the npm approach is less than the downside of the copypasting from stackoverflow approach.
And every moderately useful npm package I've looked at has very good to great documentation.
https://news.ycombinator.com/item?id=11349606
Basically, how does tree shaking deal with dynamic inclusions? Are dynamic inclusions simply not allowed? But in that case, what about eval? Is eval just not allowed to import anything?
I've been reading posts like these, but they are pretty unsatisfying regarding technical detail: https://medium.com/@Rich_Harris/tree-shaking-versus-dead-cod...
Maybe someone else was wondering the same thing, so I decided to post it here before wandering off to the ES6 spec to figure it out.
I don't want to claim that it would be a directly calculated career move, more like starting a blog: you admire a good blog, you want to be more like the blogger, you start your own one. On the dim chance that it will become both good and not abandoned after the third post. Nanomodules can be just like that, the air guitar of would-be rockstar coders. This is certainly not the only reason for their existence (and even the air-guitar aspect has a positive: low barrier of entry that might lead to more serious contribution), but the discussion would be incomplete if it was ignored.
Step 1: Create a culture in which "having open source contributions" is a requirement to entering said culture.
Step 2: Remove all friction from introducing open source contributions into the culture.
Step 3: Watch the Cambrian explosion.
Step 4: (Two years later) Point to the Cambrian collapse and how the new hot thing will solve everything.
I don't know what sort of shit show Step 4 will turn into, but it will also definitely be the result of folks taking a simple and good rule of thumb (this time, it won't be "small and composable"), and implementing it without ever stopping to think what problems it solves and doesn't solve.
On the contrary, I would certainly mark "maintains moderately popular open source project" as a calculated career move, just like how prostitutes dress up right for their clientele. Sorry for the crude example, but I am not placing the blame on the devs here but on the absurd notion of employers considering a GitHub repo the 'in' thing before actual capability. Candidate: "Brah, I got 10 of those packages with 10 mil downloads in total, but I can't code for shit." Hiring manager: "Since we are suckers who look at stats, you're hired!"
The problem with this nanomodules approach is that it results in complacent and naive developers that eschew the basics and just publish random crap. Anything can make it to be a NPM package and be judged by...popularity? Since when is code turning into pop music :)?
in python: https://docs.python.org/3.3/library/stdtypes.html#str.ljust
There is always a balance between getting some extra productivity by using somebody else's code, or getting the knowledge, expertise and control of writing it yourself. Far too often I see young programmers choose something off the shelf so that they don't have to understand it. This is a recipe for disaster -- for your project, for your team and for yourself.
IMHO, as programmers, we should err on the side of writing code. Often it is obvious that you should use standard libraries, or a particular framework or or another reuse library. But where it is not really obvious, I would start with writing your own code and see where it gets you. You can always replace it later if it doesn't work out. I think you will find that you and your team will be miles ahead if you take this approach.
Anyone that has been around long enough has war stories about getting two relatively simple pieces of software working with each other. In my experience, integration problems are often the most difficult to deal with.
Strictly speaking, you're definitely right. Each dependency is an additional point of failure, but so is each additional line of code you inline.
The benefit of these small modules is that they're very thoroughly tested by the entire community. I'd say for most cases they will be much more robust than any solution an individual developer can spontaneously whip up.
Of course, these modules don't exist in a vacuum, and infrastructure failures such as this one do pose a point of failure you wouldn't have with inlined code. But I think in this particular case it had more to do with npm's failure to follow package-management best practice by allowing packages to be unpublished at all.
If someone can't make the right trade-off regarding "should I import a module called isArray or is-positive-integer?", then they should not be programming...
This is madness.
[1] https://github.com/sindresorhus/float-equal/blob/master/inde...
and no, the edge cases argument does not apply to either of them. Wow.
Flip-side: it isn't easy to reason about a large and complicated graph of dependencies.
"It's all about containing complexity." - this completely ignores the complexity of maintaining dependencies. The more dependencies I have the more complicated my project is to maintain.
Dependencies are pieces of software managed by separate entities. They have bugs and need updates. It's hard to keep up to date.
When I update a piece of software I read the CHANGELOG, how am I expected to read the CHANGELOG for 1,000 packages?
Depending on a bigger package (handled by the same entities, who write one changelog, in the same form) is more straight forward.
I'm not saying this is wrong - but there's a balance here, and you must not ignore the complexity of increasing your number of dependencies. It does make things harder.
Not to mention the cognitive overhead of stopping programming, going to NPM, searching/finding/installing the module, then reading the documentation to understand its API. Isn't it simpler to `while (str.length < endLength) str = padChar + str;`? How can there be a bug in that "alternative naive inlined solution"? Either it works or it doesn't!
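Fair enough for single-character pads, though even that loop has one quiet corner case worth knowing about: a multi-character padChar overshoots the target length. A sketch, with the one-liner wrapped so it can be exercised:

```javascript
// The inline one-liner from the comment above, wrapped in a function.
function pad(str, endLength, padChar) {
  while (str.length < endLength) str = padChar + str;
  return str;
}

console.assert(pad('7', 3, '0') === '007');    // the common case is fine
console.assert(pad('7', 4, 'ab') === 'abab7'); // length 5, not 4: overshoot
```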
But naturally, with any code reuse there's a benefit and a cost to each instance of internal or external reuse.

The benefits of external reuse include the reliability you describe, as well as not having to create the code. The costs of external reuse include having your code tied not just to an external object but also to the individuals and organizations creating that object.

I think that means that unless someone takes their hundreds of modules from the same person or organization and is capable of monitoring that person, they are incorporating a layer of risk into their code that they don't anticipate at all.
Risk of indirectly hosing a project with N module owners providing dependencies: 1-((1-H)^N)
Let's say H is very small, like 0.05% of module owners being the type who'd hose their own packages.
3 module owners: 0.15% chance your own project gets hosed
30 module owners: 1.49% chance your own project gets hosed
300 module owners: 13.93% chance your own project gets hosed
Keep in mind it's not just your direct dependencies, but your entire dependency chain. And if you think a module owner might hose some modules but not others, maybe N is actually the number of modules, in which case 300 starts getting pretty attainable.
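The figures above can be reproduced directly from the formula, with H = 0.05% per owner:

```javascript
// P(at least one of N independent owners hoses you) = 1 - (1 - H)^N
const risk = (H, N) => 1 - (1 - H) ** N;
const pct = (x) => (100 * x).toFixed(2) + '%';

console.log(pct(risk(0.0005, 3)));    // 0.15%
console.log(pct(risk(0.0005, 30)));   // 1.49%
console.log(pct(risk(0.0005, 300)));  // 13.93%
```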
Upshot:
Not everyone is trustworthy enough to hang your project on. The more packages you include, the more risk you incur. And the more module owners you include, definitely more risk.
The micromodule ecosystem is wonderful for all the reasons described, but it's terrible for optimizing against dependency risk.
Takeaways:
Host your own packages. That makes you the module owner for the purposes of your dependency chain.
If you're not going to do that, don't use modules dynamically from module owners you don't trust directly with the success of your project.
I love collaborative ecosystems, but some people suck and some perfectly non-sucky people make sucky decisions, at least from your perspective. The ecosystem has to accommodate that. Trust is great...in moderation.
Yes, at some point the complexity cost of gluing together the standard library functions to do something becomes greater than the lookup cost of finding a function that does what I want; but I am saying that adding more functions is not costless.
The derision is unwarranted, due to a failure in critical thinking from otherwise smart people.
There's even some guy calling for a "micro-lodash". To me, as a Python engineer, lodash [1] is already a tiny utility library.
I guess it's also about the fact that JS is a pretty bad language. That you need a one-line `isArray` dependency to `toString.call(arr) == '[object Array]'` is crazy.
That said, lodash core is only 4KB, and lodash gives you the ability to build it with only the categories or exact functions you want, so I don't understand what the purpose of a "micro-lodash" would be.
That's not a Lego block; that's an excuse.
isArray(arr)
turns into toString.call(arr) == '[object Array]'
...then I guess that's more reasonable than if you use something sane.

Too many modules do not necessarily become a good thing. They may appear to get rid of complexity, but in reality you will have to face that complexity at some level above, and in fact the sheer number of small modules will most probably add complexity of their own.
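For context, the toString trick predates the ES5 built-in, and unlike instanceof it works on arrays from other frames/realms. A quick comparison, assuming an ES5+ engine:

```javascript
// The old cross-realm-safe check vs. the ES5 built-in.
console.assert(Object.prototype.toString.call([]) === '[object Array]');
console.assert(Object.prototype.toString.call({}) === '[object Object]');
console.assert(Array.isArray([]) === true);   // the modern replacement
console.assert(Array.isArray({}) === false);
```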
Nope.