Unit tests run fast. If they don’t run fast, they aren’t unit tests.
Other kinds of tests often masquerade as unit tests. A test is not a unit test if:
1. It talks to a database.
2. It communicates across a network.
3. It touches the file system.
4. You have to do special things to your environment (such as editing configuration files) to run it.
Tests that do these things aren’t bad. Often they are worth writing, and you generally will write them in unit test harnesses. However, it is important to be able to separate them from true unit tests so that you can keep a set of tests that you can run fast whenever you make changes.
[0]: https://www.google.com/books/edition/Working_Effectively_wit...
No-one said that integration tests can't also be very valuable.
From the little context I get that you write integration tests, and that is fine. They are useful, valuable! But they are not unit-tests.
edit: on re-reading, I get the feeling that for you "integration tests" are a synonym for "end to end tests". But -at least in most literature- end-to-end tests are a kind of integration-test. But not all integration tests are end-to-end tests. In my software, I'll often have integration tests that swap out some adapter (e.g. the postgres-users-repository, for the memory-users-repository, or fake-users-repository. Or the test-payment for the stripe-payment) but that still test a lot of stuff stacked on top of each-other. Integration tests, just not integration tests that test the entire integration.
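A minimal sketch of the adapter swap described above, assuming a hypothetical `UsersRepository` port and a `SignupService` stacked on top of it (none of these names come from the original comment). The in-memory adapter stands in for the Postgres one, while the service logic is still exercised for real:

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class User:
    user_id: int
    email: str


class UsersRepository(Protocol):
    """Port that both the real and the in-memory adapter would satisfy."""

    def add(self, email: str) -> User: ...
    def find_by_email(self, email: str) -> Optional[User]: ...


class MemoryUsersRepository:
    """Hypothetical in-memory stand-in for a Postgres-backed repository."""

    def __init__(self) -> None:
        self._by_email: Dict[str, User] = {}

    def add(self, email: str) -> User:
        user = User(user_id=len(self._by_email) + 1, email=email)
        self._by_email[email] = user
        return user

    def find_by_email(self, email: str) -> Optional[User]:
        return self._by_email.get(email)


class SignupService:
    """Logic stacked on top of the repository; this is what the test exercises."""

    def __init__(self, users: UsersRepository) -> None:
        self._users = users

    def sign_up(self, email: str) -> User:
        normalized = email.strip().lower()
        if self._users.find_by_email(normalized) is not None:
            raise ValueError(f"{normalized} is already registered")
        return self._users.add(normalized)


def test_signup_rejects_duplicates() -> None:
    service = SignupService(MemoryUsersRepository())
    first = service.sign_up("  Ada@Example.com ")
    assert first.email == "ada@example.com"
    try:
        service.sign_up("ada@example.com")
    except ValueError:
        pass
    else:
        raise AssertionError("duplicate sign-up should be rejected")
```

The same test could later be pointed at the real adapter to turn it into a slower, fuller integration test.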
> Tests that do these things aren’t bad. Often they are worth writing, and you generally will write them in unit test harnesses.
You hinted at the value of separating unit tests and integration tests with your observation about 40-minute unit test runs being way too slow. The process friction it creates means people will check in “obviously correct” changes without running the tests first.
Feathers continues:
> However, it is important to be able to separate them from true unit tests so that you can keep a set of tests that you can run fast whenever you make changes.
You want your unit tests to be an easy habit for a quick sanity check. For the situation you described, I’d suggest moving the integration tests to a separate suite that runs at least once a day. Ripping that coverage out of your CI may make you uncomfortable. That’s solid engineering intuition. Let your healthy respect for the likelihood of errors creeping in drive you to add at least one fast (less than one-tenth of a second to run is the rule of thumb from Feathers, p. 13) test in the general area of the slower integration tests.
The first one may be challenging to write. From here forward, it will never be easier than today. Putting it off is how your team got to the situation now of having to wait 40 minutes for the green bar. One test is better than no tests. Your first case with the fixture and mocks you create will make adding more fast unit tests easier down the road.
Yes, just as it’s possible to make mistakes in production code, it’s certainly possible to make mistakes in test code. Unit tests are sometimes brittle and over-constrain. Refactoring them is fair game too and far better than throwing them away.
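One possible way to sketch the fast/slow split suggested above, using only the standard library. The `RUN_INTEGRATION` environment variable and the `shipping_band` function are assumptions made up for illustration; a nightly job would set the variable, while the default run stays fast:

```python
import os
import unittest

# Hypothetical opt-in switch; the nightly job would set RUN_INTEGRATION=1.
RUN_INTEGRATION = os.environ.get("RUN_INTEGRATION") == "1"


def shipping_band(weight_kg: float) -> str:
    """Tiny pure function standing in for logic worth a fast test."""
    return "light" if weight_kg < 2.0 else "heavy"


class FastTests(unittest.TestCase):
    def test_band_boundary(self) -> None:
        self.assertEqual(shipping_band(1.99), "light")
        self.assertEqual(shipping_band(2.0), "heavy")


@unittest.skipUnless(RUN_INTEGRATION, "set RUN_INTEGRATION=1 to run slow tests")
class SlowIntegrationTests(unittest.TestCase):
    def test_against_real_services(self) -> None:
        # Placeholder body: a real suite would talk to a database or network here.
        self.assertTrue(True)


def run_suite() -> unittest.TestResult:
    """Run both classes; the slow one is skipped unless explicitly enabled."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(FastTests))
    suite.addTests(loader.loadTestsFromTestCase(SlowIntegrationTests))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Skipped tests still show up in the report, so the slow coverage stays visible even when it is not run.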
I ask because in my team we also, for a long time, made the distinction between unit/integration based on a stupid technicality in the framework we are using.
We stopped doing that and now we mostly write integration tests (which in reality we did for a long time).
Of course this is all arguing over definitions and kind of stupid, but I do agree with the definition of the parent commenter.
I very much agree that there's benefit to considering the pace of feedback and where it falls in your workflow; immediate feedback is hugely valuable, but it can come from unit tests, other tests, or things which are not tests.
Meanwhile, some tests of a single unit might have to take a long time. Exhaustive tests are rarely applicable, but when they are it's going to be for something small and it's likely to be slow to run. That should not be in your tightest loop, but it is probably clearer to evict it simply because it is slow, rather than because it is not a unit test for being slow.
In my definition, it would still be a unit test if it is fast and it talks to a database, a network server, or the file system, so long as such communication is done entirely in a "mock" fashion (it only does such communication as the test sets up, and it's done in such a fashion that tests can be run in parallel with no issues).
Instead, I classify units of logical coherence and write unit tests for those. I write financial trading systems - not hft, but still latency sensitive. I will test an order through our pipeline as a unit of work. This will necessarily touch multiple classes. So as an example, a test will cover orders which are accepted and then filled, orders which are rejected, and so on.
Many people would classify these as integration tests, and to be fair, I don't really care what you name them. To me these are much more valuable than the traditional "one class, one test" mechanism because it means I am free to refactor the internals of our pipeline as much as I want with very low impact on the test code.
One of the whole points of test code, that I think has been lost, is that it should be there to give you confidence in the correctness of your application under change. Writing "one class, one test" is a bad way to achieve this.
1. Is it kinda slow? (The test suite for a single module should run in a few seconds; a large monorepo should finish in under a minute)
2. Is there network access involved? (Could the test randomly fail?)
3. Do I need to set up anything special, like a database? (How easy is it for a new developer to run it?)
If the answer to any of those is Yes, then your test might fall in the liminal space between unit tests and integration tests -- they're unit-level tests, but they're more expensive to run. For example, data access layer tests that run against an actual database.
On the other hand, even if a test touches the filesystem, then it's generally fast enough that you don't have to worry about it (and you did make the test self-cleaning, right?) -- calling that test "not a unit test" doesn't help you. Likewise, if the database you're touching is sqlite, then that still leaves you with No's to the three questions above.
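As a rough illustration of the sqlite point above, here is a sketch of a data-access test against an in-memory database. The `orders` table and `open_order_ids` function are hypothetical; the test needs no setup, no network, and leaves nothing behind:

```python
import sqlite3
from typing import List


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")


def open_order_ids(conn: sqlite3.Connection) -> List[int]:
    """Hypothetical data-access function under test."""
    rows = conn.execute(
        "SELECT id FROM orders WHERE status = ? ORDER BY id", ("open",)
    ).fetchall()
    return [row[0] for row in rows]


def test_open_order_ids() -> None:
    # ":memory:" gives each test its own private database.
    conn = sqlite3.connect(":memory:")
    try:
        create_schema(conn)
        conn.executemany(
            "INSERT INTO orders (id, status) VALUES (?, ?)",
            [(1, "open"), (2, "filled"), (3, "open")],
        )
        assert open_order_ids(conn) == [1, 3]
    finally:
        conn.close()
```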
>3. It touches the file system.
These are BS. Maybe they made sense in the beforetimes when we didn't have Docker containers or SSDs, but nowadays there's no reason you can't stand up a mini test database as part of your unit test suite. It's way simpler than mocking.
The problem, and there will be people that disagree with me, is that unit tests make refactoring of other people's code a lot harder.
STAY WITH ME!
If the unit tests were "good" and helped document what the code does, then they don't. You won't believe this, but in dogmatic high-breadth-coverage (low depth coverage) codebases, there is a ton of test code that is SO TIED TO IMPLEMENTATION rather than interface that any monkeying of the presumed encapsulated logic breaks the unit tests, so you have double the things to fix.
You'll never believe what happens next. Some developer in some Agile thing that got assigned 2 unicorn shits for the task panics because the unit tests are SERIOUSLY slowing down his "velocity". So what does he do? Delete tests, change tests to make them work at any costs.
But expensive to write. Especially if you want them to be fast and cheap to run.
I for one do not believe in Unit Tests and try to get LLM tooling to write them for me as much as possible.
Integration Tests however, (which I would argue is what this story is actually praising) are _critical_ components of professional software. Cypress has been my constant companion and better half these last few years.
1) Cases where you have some sort of predefined specification that your code needs to conform to
2) Weird edge cases
3) Preventing reintroducing known bugs
In actual practice, about 99% of unit tests I see amount to "verifying that our code does what our code does" and are a useless waste of time and effort.
If you rephrase this as, "verifying that our code does what it did yesterday" these types of tests are useful. When I'm trying to add tests to previously untested code, this is usually how I start.
1. Method outputs a big blob of JSON
2. Write test to ensure that the output blob is always the same
3. As you make changes, refine the test to be more focused and actionable
That’s my experience too, especially for things like React components. I see a lot of unit tests that literally have almost the exact same code as the function they’re testing.
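The "verify that the code does what it did yesterday" steps listed above might look roughly like this. `build_report` and the snapshot string are invented for illustration; in practice the snapshot would be captured from a known-good run:

```python
import json


def build_report(order_id: int) -> dict:
    """Hypothetical legacy function that returns a big blob."""
    return {
        "order": {"id": order_id, "lines": [{"sku": "A-1", "qty": 2, "price": 4.5}]},
        "total": 9.0,
        "currency": "EUR",
    }


# Step 2: a snapshot captured from a known-good run, stored next to the test.
GOLDEN = (
    '{"currency": "EUR", "order": {"id": 7, "lines": '
    '[{"price": 4.5, "qty": 2, "sku": "A-1"}]}, "total": 9.0}'
)


def test_report_matches_snapshot() -> None:
    # sort_keys makes the comparison independent of dict insertion order.
    assert json.dumps(build_report(7), sort_keys=True) == GOLDEN


def test_total_is_sum_of_lines() -> None:
    # Step 3: a narrower assertion that survives unrelated changes to the blob.
    report = build_report(7)
    lines = report["order"]["lines"]
    assert report["total"] == sum(line["qty"] * line["price"] for line in lines)
```

The snapshot test is deliberately blunt; the second test shows the kind of focused check it can be refined into once the behavior is understood.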
When I was learning unit testing, my mentor taught me this strategy when fixing production bugs. First, write the unit test to demonstrate the bug. Second, fix the bug.
Imagine the dumbest function you have to write: a product A and a street address as input, and the shipping cost as an output.
How many test cases would you write to be absolutely sure that function actually does what you want it to do, and be confident it doesn't have weird exceptions that the LLM injected randomly? I'd assume you'd still vet the code written by the LLM, but if it's hundreds of rambling lines doing weird stuff to get the right result, is it really faster than writing it yourself?
It seems to be taught in a fucked up way though where you imagine you want a car object and a banana object and you want to insert the banana into a car or some other kind of abstract nonsense.
For example, the first result on Google states that a unit test calls one function, while an integration test may call a set of functions. But as soon as you have a function that has side effects, then it will be necessary to call other functions to observe the change in state. There is nothing communicated by calling this an integration test rather than a unit test. The intent of the test is identical.
I feel like we did testing a disservice by specifying the unit to be too granular. So in most systems you end up with hundreds of useless tests testing very specific parts of code in complete isolation.
In my opinion a unit should be a "full unit of functionality as observed by the user of the system". What most people call integration tests. Instead of testing N similar scenarios for M separate units of code, giving you NxM tests, write N integration tests that will test those for all of your units of code, and will find bugs where those units, well, integrate.
I very vividly remember writing a test for a ~40 loc class of pure functions. I started out thinking the exercise was a waste of time. This class is simple, has no mutable state, and should have no reason to change. Why bother testing it?
By the time I was done writing the test I had found three major bugs in that 40 loc, and it was a major aha moment. Truly enlightening.
It took a couple years to be accepted.
I built a property based testing library for ActionScript 3 (a fun journey in itself, with full test case reduction).
I was testing my testing library, and tried one of the most basic tests:
For any object A
A == decode(encode(A))
And discovered the fun of floating point values not being perfectly representable as strings.
The more significant one came from testing a UI library we'd built for TVs (so you have up, down, left, right as movement). We had in the spec that if you moved focus by pressing right, pressing left would take you back to the thing you were on before. The test looked something like
For an arbitrary series of API calls generating the UI:
For an arbitrary list of movements the user makes:
If the focus changes, pressing the opposite direction moves your focus back where you came from
Now, this was actually very easy to write as a test, but it's extremely powerful. It found a bug in an interesting corner case, so I fixed it. Fixing the bug broke an existing unit test. I checked and the unit test correctly tested something in the spec.
The spec was inconsistent, but because we'd tested explicit examples it had never been spotted. I've been a convert since.
I've never implemented property tests in an existing project without finding some bug.
I don't recommend building your own property testing library unless you really want to. In Python, I highly recommend Hypothesis: https://hypothesis.readthedocs.io/en/latest/
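To make the round-trip property `A == decode(encode(A))` from earlier in this comment concrete, here is a hand-rolled sketch that uses the standard library's `random` as a stand-in for a real property-testing library such as Hypothesis. The two encoders are assumptions chosen to show the floating-point pitfall, not the original ActionScript code:

```python
import random
from typing import Callable, Optional


def encode_lossy(x: float) -> str:
    """Fixed six-decimal formatting: looks harmless, but drops precision."""
    return f"{x:.6f}"


def encode_exact(x: float) -> str:
    """repr() of a Python float is designed to round-trip through float()."""
    return repr(x)


def decode(s: str) -> float:
    return float(s)


def find_counterexample(
    encode: Callable[[float], str], trials: int = 1000, seed: int = 0
) -> Optional[float]:
    """Return a float that does not survive encode/decode, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-1e6, 1e6)
        if decode(encode(x)) != x:
            return x
    return None
```

Calling `find_counterexample(encode_lossy)` turns up a failing value almost immediately, while `find_counterexample(encode_exact)` finds none. A real library would add input shrinking and smarter generation on top of this loop.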
So that puts it at about 1 bug per 20 LoC
Some estimates run as high as 75 bugs per 1,000 LoC, though many of those bugs don't make it out to customers because of QA / developer actions. So yeah, right on the money.
If you need to mock out 80% of a system to make your unit test work, then yes, it's potentially pointless. In that case I'd argue that you should consider rewriting the code so that it's more testable in isolation, that will also help you debug more easily.
What I like to do is write tests for anything that's just remotely complex, because it makes writing the actual code easier. I can continuously find mistakes by just typing "tox" (or whatever tool you use). Or perhaps the thing I'm trying to write functionality for is buried fairly deep in an application, then it's nice to be reasonably sure about the functionality before testing it in the UI. Unit tests just make the feedback loop much shorter.
Unlike others I'd argue that MOST projects are suited for unit testing, but there might be some edge cases where they'd provide no value at all.
One caveat is that some developers write pretty nasty unit tests. Their production code is nice and readable, but then they just went nuts in the unit tests and created a horrible unmaintainable mess, I don't get why you'd do that.
This is also where the dogma of “only test public methods” fails. If your public method requires extensive mocking but the core logic you need to protect is isolated in a private method that requires little mocking, the most effective use of developer resources may be to just test your private method.
> One caveat is that some developers write pretty nasty unit tests. Their production code is nice and readable, but then they just went nuts in the unit tests and created a horrible unmaintainable mess, I don't get why you'd do that.
I have also seen this a lot and usually it’s when people try to add too much DRY to their unit tests. I recall being as a junior dev told by our lead that boilerplate and duplication in tests is not strictly a bad thing, and I have generally found this to be true over the years. Tests are inherently messy and each one is unique. Trying to get clever with custom test harnesses to reduce duplication is more likely to lead to maintainability issues than it is test nirvana. And if your code requires so much setup to test, that is an indicator of complexity issues in the code, not the test.
You're looking at the tested code as immutable. If you're not allowed to touch the code being tested, then yes, you'll sometimes need to test private methods, and that is fine. "Don't test private methods" is actually more about how to architect the primary code, not a commandment on the test code. If you find that you're having to do extensive mocking to call a public method in order to test the functionality in some private method, that's a major smell indicating that your code could be organized in a better way.
While there is nothing wrong with testing an internal function if it helps with development, so long as it is clearly identifiable as such, you still need the public interface tests to ensure that the documented API is still conformant when the internals are modified. Remember that public tests are not for you, they are for future developers.
This is where Go did a nice job with testing. It provides native language support for "public" and "private" tests, identifying to future developers which can be deleted as implementation evolves and which must remain no matter what happens to the underlying implementation.
When I did unit tests in C++, I found a simpler (and better) solution: Shrink the class by splitting it up into multiple classes. Often the logic in the private methods could be grouped into 1-3 concepts, and it was quite logical to create classes for each of them, give them public methods, and then have an instantiation of that class as a private member.
Now all you need to do is write unit tests for those new classes.
Really, it led to code that was easier to read - the benefit was not just "easier to test". Not a single colleague (most of whom did not write unit tests) complained.
I've yet to run into a case where it was hard to test private behavior via only public methods that couldn't be solved this way.
You can split things out and decompose some things, even if it's just some util functions, and start sending out chunks of code for review. It doesn't even necessarily have to be separate files.
Your reviews will be faster and smoother too.
The first is getting over the hurdle of trusting that a unit test is good enough; a lot of them only trust end-to-end tests, which are usually very brittle.
The second reason is, I think, a lot of them don't know how to systematically break down a test into pieces to validate, e.g. I'll do a test for null, then a separate test for something else _assuming_ not null because I've already written a test for that.
The best way I've been able to get buy-in for unit testing is giving a crash course on a new structure that has a test suite per function under test. This allows for a much lower loc per test that's much easier to understand.
When they're ready I'll give tips on how to get the most out of their tests with things like boundary value analysis, better mocking, IoC for things like date time, etc.
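Two of the tips mentioned above, inversion of control for date/time and boundary value analysis, can be sketched together. `TrialAccount` and its 14-day limit are hypothetical; the point is that the clock is passed in, so the test can stand exactly on the boundary:

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


class TrialAccount:
    """Hypothetical class; `clock` is injected so tests control the time."""

    TRIAL_LENGTH = timedelta(days=14)

    def __init__(self, started_at: datetime, clock: Callable[[], datetime] = utc_now) -> None:
        self._started_at = started_at
        self._clock = clock

    def is_expired(self) -> bool:
        return self._clock() - self._started_at >= self.TRIAL_LENGTH


def test_expiry_boundaries() -> None:
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    edge = start + TrialAccount.TRIAL_LENGTH
    one_second = timedelta(seconds=1)
    # Boundary value analysis: just before, exactly on, and just after the limit.
    assert not TrialAccount(start, clock=lambda: edge - one_second).is_expired()
    assert TrialAccount(start, clock=lambda: edge).is_expired()
    assert TrialAccount(start, clock=lambda: edge + one_second).is_expired()
```

Production code keeps the default clock, so callers that do not care about time are unaffected.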
The idea that unit testing should be the default go to test I find to be horrifying.
I find that unit test believers struggle with the following:
1) The idea that test realism might actually matter more than test speed.
2) The idea that if the code is "hard to unit test" that it is not necessarily better for the code to adapt to the unit test. In general it's less risky to adapt the test to the code than it is the code to the test (i.e. by introducing DI). It seems to be tied up with some sort of idea that unit testability/DI just makes code inherently better.
3) The idea that integration tests are naturally flaky. They're not. Flakiness is caused by inadequate control over the environment and/or non-deterministic code. Both are fixable if you have the engineering chops.
4) The idea that test distributions should conform to arbitrary shapes for reasons that are more about "because google considered integration tests to be naturally flaky".
5) Dogma (e.g. uncle bob or rainsberger's advice) vs. the idea that tests are an investment that should pay dividends and to design them according to the projected investment payoff rather than to fit some kind of "ideal".
Honestly, this pedantry around "unit tests must only test one thing" is counter-productive. Just test as many things as you can at once; it's fine. Most tests should not be failing. Yes, it's slightly less annoying to get 2 failed tests instead of 1 fail that you fix and then another fail from that same test. But it's way more annoying to have to duplicate entire test setups to have one that checks null and another that checks even numbers and another that checks odd numbers and another that checks near-overflow numbers, etc. The latter will result in people resisting writing unit tests at all, which is exactly what you've found.
If people are resisting writing unit tests, make writing unit tests easier. Those silly rules do the opposite.
Breaking a test down helps to clarify what you're testing and helps to prevent 80 loc unit tests. When I test for multiple things, I look for the equivalent of nunit's assert.multiple in the language that I'm in.
The approach I advocate for typically simplifies testing multiple scenarios with clear objectives and tends to make it easier when it comes time to refactor/fix/or just delete a no longer needed unit test. The difference I find, is that now you know why, vs having to figure out why.
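In Python, one rough equivalent of NUnit's `Assert.Multiple` mentioned above is `unittest`'s `subTest`, which reports each case separately without duplicating setup. The `classify` function here is a made-up example covering the null, even, and odd cases from the discussion:

```python
import unittest


def classify(n):
    """Hypothetical function under test."""
    if n is None:
        return "missing"
    return "even" if n % 2 == 0 else "odd"


class ClassifyTests(unittest.TestCase):
    def test_classify_cases(self) -> None:
        cases = [(None, "missing"), (0, "even"), (7, "odd"), (-4, "even")]
        for value, expected in cases:
            # Each subTest is reported separately, and a failure in one
            # does not stop the remaining cases from running.
            with self.subTest(value=value):
                self.assertEqual(classify(value), expected)


def run_classify_tests() -> unittest.TestResult:
    suite = unittest.TestLoader().loadTestsFromTestCase(ClassifyTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

This sits between the two positions in the thread: one test method, one shared setup, but each scenario still fails with its own label.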
I definitely think it's OK for the overall standard of test code to be lower than production code though (I guess horrible unmaintainable tests is maybe a bit much). A few reasons I can think of off the top of my head:
- You can easily delete and rewrite individual tests without any risk
- You don't ship your tests, bugs and errors in tests suites have a way smaller chance of causing downstream issues for customers (not the same as no chance but definitely a lot smaller)
- I'd rather have a messy, hard to understand test than no test at all in most cases. That isn't true of production code at all, there are features that if they can't be produced in a coherent way with the rest of the codebase just don't have the value add to justify the maintenance burden.
For example, in double(x) -> y you can use types to say x belongs to the set of all integers and y must also be in that set, but that’s about all you can say in Python.
Unit testing lets you express that y must be an even number with the same sign as x. It is like formal verification for the great unwashed, myself included.
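A small sketch of the `double(x) -> y` example above, checking the two properties the type system cannot express: `y` is even and has the same sign as `x`. The sampling loop uses the standard library's `random` and is only an illustration:

```python
import random


def double(x: int) -> int:
    return 2 * x


def check_double_properties(trials: int = 1000, seed: int = 0) -> None:
    rng = random.Random(seed)
    # Always include the obvious edge cases alongside random samples.
    samples = [0, 1, -1] + [rng.randint(-10**9, 10**9) for _ in range(trials)]
    for x in samples:
        y = double(x)
        assert y % 2 == 0, f"double({x}) = {y} is not even"
        # Same sign as x (zero maps to zero).
        assert (y > 0) == (x > 0) and (y < 0) == (x < 0), f"sign changed for {x}"
```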
demands that people rewrite all their production code in service of unit tests are probably a big reason of why a lot of programmers don't unit test.
> One caveat is that some developers write pretty nasty unit tests. Their production code is nice and readable, but then they just went nuts in the unit tests and created a horrible unmaintainable mess, I don't get why you'd do that.
probably they write bad unit tests because they can't rewrite all their code but they have a mandate that all changes must be unit tested.
if strict purity could be relaxed and programmers were allowed to write more functionalish unit tests with multiple collaborators under test then there would likely be less resistance to testing and there shouldn't be any mocking-hell tests written.
higher level functional/integration tests also shouldn't be missed since your unit tests are only as good as your understanding of the interfaces of the objects and people write buggy unit tests that allow real bugs to slip between the cracks.
As for the quality of tests, that's usually a combination of factors and capacity is one of them. In the end, if POs don't see business value in tests, they won't be prioritized.
If you're finding more mistakes by running unit tests than by thinking through and re-reading your code, you're not finding most of your mistakes. Because you're not understanding your own code. How can you even write great unit tests if you don't understand what you're doing?
There are, of course, times when writing the tests first can help you think through a problem -- great! Especially when thinking through how some API would look. But TDD as a methodology gets a hard reject from me.
I certainly reject the argument "unit testing is too hard" -- then your code is bad and you should focus on fixing it. Well-written code is automatically easy to unit test, among 60 other benefits. That's not a reason to avoid unit testing.
I still hate writing them and it grates on my aesthetic sense to structure code with consideration to making it testable, but if we want to call ourselves engineers we need to hold ourselves to engineering standards. Bridge builders do not get to skip tests.
Bravo. We need more of this mindset in the world, and also more collective will to encourage it in one another.
YOU are the kind of engineer I want writing the code that goes in my Dad's pacemaker or the cruise control in my wife's car.
In any critical system work, there are multiple layers and you can't really skip any of them.
It's also sort of meaningless to talk about such testing without requirements and spec to test against. Traceability is as much a part of it as any of the testing.
By the time you get to the "thick book/full run" as you put it, there has typically been a metric crapload of testing done already.
Unit tests - actually, all automated tests - are comparatively cheap. The developer can run them immediately.
All code will have bugs. The "trick" to building a productive development pipeline is to catch as many of those bugs as possible as early as possible, and thereby reduce both the temporal and monetary cost of resolving them.
I'm not even looking for a particularly high level of test coverage, just a basic "I wrote an API, here's a test (integration, unit, doesn't matter) for the happy path"-level of coverage would be great.
On the opposite end, I worked at places that wanted unit tests for every new function, even if it was something simple (like a getter or setter) used elsewhere. That's also terrible.
Even better than manually written unit tests are automatically generated property-based tests (which can be unit or integration tests). One can literally run millions of tests in a day this way, far, FAR more than could ever be manually verified. All because computation is so darned cheap now.
Updated the original to clarify. Hope that helps!
Either I am writing really good code so there are no bugs, or I am really bad at writing unit testing code to find those bugs.
Honestly, having literally had a scenario 20 minutes ago where I wrote a test for what I figured was absolutely trivial code, and having it _fail_ on me and pick up a bug that I hadn't considered (and this is not the first time this has happened) I would strongly suggest it's the latter.
Do your unit tests check the output and side effects exactly, or do they just make sure the function returned without error?
Just because function/method/whatever has 100% coverage, doesn't mean you have tested all the potential scenarios.
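A tiny illustration of the coverage point above, with a hypothetical `average_price` function. The first test alone executes every line, so a line-coverage report would read 100%, yet the zero-quantity scenario is only exercised by the second test:

```python
def average_price(total: float, quantity: int) -> float:
    """Hypothetical function; every line is executed by the first test below."""
    return total / quantity


def test_average_price_happy_path() -> None:
    # This single test yields 100% line coverage of average_price.
    assert average_price(10.0, 4) == 2.5


def test_average_price_zero_quantity() -> None:
    # Same lines, different scenario: coverage tools would not flag its absence.
    try:
        average_price(10.0, 0)
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError for zero quantity")
```

Whether raising is the right behavior for zero quantity is a design question; the point is only that coverage numbers say nothing about which scenarios were considered.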
Similarly I've seen people add tests that will ensure that code coverage doesn't go down, but that don't actually do anything to help anyone. I'd argue that having random coverage goals is a problem on its own, but it's the only way to force some people to write even the most basic of tests.
1. Decide that was the goal. Start measuring. Who knows, maybe I have to do nothing.
2. Discover that I seriously underestimated the number of bugs I produce. No one is available to review my code, changing languages (to something with a stronger type system) was out of the question. Only option appears to be methodical testing of every line of code.
3. Print (on dead trees - this was decades ago) all my code. Manually test all of it, running a highlighter down the listing. Continue until the entire code listing has a solid highlighted line down the left. It worked! Bugs per delivered lines of code dropped off a cliff. But geezz it took a loooong time, longer than writing the code. Finding an input sequence that exercised some code was surprisingly hard. And it was boring. Still, success - and people who used it immediately noticed the improvement in quality and commented.
4. Then new features have to be added. Does this mean I have to test it all again? Surely not - I'll just test the bits I changed. Result: bugs per line of code rapidly start to ramp up again.
5. So I test everything again. That works, but it's horribly inefficient. I can spend days releasing a few line change. I can't get small changes out in anything like a reasonable time frame.
6. The solution is obvious to a programmer: automate your work, which in this case translates to writing code to do the tests. So I write unit tests for new code. It ends up being slower than doing manual tests :( Code size doubles. It works in keeping the bug count down, but can I afford to keep doing this time wise?
7. Then I add features to new code with unit tests. Initially this is painful - I move at perhaps 1/2 the speed because now I have to change at least twice the amount of code (actual and unit-test). Still, it's a success bug count wise, and running unit tests is much, much faster than manually testing.
8. Keep doing this, notice that despite me having to change twice the lines of code (actual code and tests) when I'm adding new features I'm producing more debugged lines of code than before. Even more interesting, I'm fearlessly making much larger changes now. Turns out I'm using the unit tests as guard rails. I no longer minimise my changes to reduce the odds of introducing a bug.
9. Finally, notice that unit testing has changed the way I write code. And it's for the better. Code that's easy to test is also easy to understand. For example, it's much easier to test a pure function than something with side effects, so you minimise side effects as much as possible. You make your interfaces (which are the thing you focus your testing on) as small as possible. Testing deep inside a complex module is difficult and the torturous unit test code you have to write to do that is hard to understand. So you split things into smaller modules, each of which has those clean interfaces, to give your tests greater visibility. Turns out writing code so unit tests can understand it is oddly similar to writing code so that humans can easily understand it.
So it turns out unit testing is a win in every way, when done well. Well, except for the "it's boring" bit. (Numerous comments here hint at copilot being a real help.)
But writing code that's amenable to unit tests isn't something you do naturally. Fortunately just getting practice at writing unit tests is enough to teach the skill. Sadly, that takes time and frustration. While you learn your productivity will drop for a while. And worse, when writing new code adding unit tests is always slower than the old way. The pay back only comes when you later make changes.
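Point 9 above, that pure functions are easier to test than code with side effects, can be sketched as a pure core wrapped by a thin I/O shell. The report functions and their names are invented for illustration:

```python
import io
from typing import Iterable, List, TextIO, Tuple


def overdue_lines(balances: Iterable[Tuple[str, float]], limit: float) -> List[str]:
    """Pure core: same inputs always give the same output, no I/O."""
    return [f"{name}: {owed:.2f}" for name, owed in balances if owed > limit]


def write_overdue_report(
    balances: Iterable[Tuple[str, float]], limit: float, out: TextIO
) -> None:
    """Thin shell: the only place that performs a side effect."""
    for line in overdue_lines(balances, limit):
        out.write(line + "\n")


def test_overdue_lines() -> None:
    data = [("ann", 120.0), ("bob", 20.0), ("cy", 100.5)]
    assert overdue_lines(data, 100.0) == ["ann: 120.00", "cy: 100.50"]


def test_write_overdue_report() -> None:
    # The shell is tested once, against an in-memory stream instead of a file.
    buffer = io.StringIO()
    write_overdue_report([("ann", 120.0)], 100.0, buffer)
    assert buffer.getvalue() == "ann: 120.00\n"
```

Most of the test effort goes to the pure function, where no fixtures or mocks are needed.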
in corporate codebases, overwhelmingly, unit tests are just mocked tests that enforce a certain implementation at the class or even individual method/function level and pretend that it works, making it impossible to refactor anything or even fix bugs without breaking tests
such tests are not just useless, they're positively harmful
https://gist.github.com/androidfred/501d276c7dc26a5db09e893b...
All too many times I've broken a unit test in a code base that I did not intend to break, just to have an aha moment that I would have introduced a bug had that test not been present.
Unit tests are a trade off between development speed and stability (putting aside other factors, such as integration tests, etc). In large corporate settings, that stability could mean millions of dollars saved per bug.
That example you provided is a poor one and not really consistent with your point that unit tests are useless - the point is being made that that specific test of UserResource is useless, which I also agree with. Testing at the Resource level via integration test and Service level via unit test is probably sufficient.
Nightmares... =)
If you work with malicious or incompetent staff at times (over 100 there is always at least 1)... it is the only way to enforce actual accountability after a dozen people touch the same files over years.
"The sculpture is already complete within the marble block, before I start my work. It is already there, I just have to chisel away the superfluous material." ( Michelangelo )
Admittedly, in-house automated testing for cosmetic things like GUI or 3D rendering pipelines is still nontrivial.
Best of luck =)
"Look, if it only fails with requirement changes, then maybe we're better off not having them."
This only made people uncomfortable. They just don't like walking down the mental path that leads them to the conclusion that their high coverage unit tests are not worth the tradeoff. Or even that there is a tradeoff present at all.
Meanwhile, PRs constantly ask for more coverage.
Not very often, but sometimes someone will mention: "but unit tests are the specification of the code"
test ShouldReturnOutput {
    _mock1.setup( /* complicated setup code returning mock2 */ );
    _mock2.setup( /* even more complicated setup code */ );
    let output = _obj.Method( 23.5 );
    Assert( output == 0.7543213 );
}
/* hundreds of lines above this test case */
setup {
    if ( _boolean ) {
        _obj = new obj(_mock1);
    }
    else {
        _obj = new obj(new mock());
    }
}
I'm just not sure I can get there.
A coworker once thought unit tests were dumb, and ended up writing code that repeated the call to an application 10x for the same info. This didn’t result in a changed UI because it was a read, but it’s not good to just suddenly 10x your reads for no good reason.
TFA also describes discovering weird side effect race conditions as a result of unit tests.
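The 10x-reads story is the kind of regression that never shows up in the UI but that a small test can pin down. A minimal sketch, assuming a hypothetical `load_profile` function and a hypothetical backend client with a `fetch_user` method (neither name comes from the story):

```python
from unittest.mock import Mock


def load_profile(client, user_id):
    """Hypothetical code under test: builds a profile from one backend read."""
    user = client.fetch_user(user_id)  # read once, then reuse the result
    return {"id": user["id"], "display_name": user["name"].title()}


def test_load_profile_reads_backend_once():
    # Stand-in for the real backend client; fetch_user is an assumed method name.
    client = Mock()
    client.fetch_user.return_value = {"id": 7, "name": "ada lovelace"}

    profile = load_profile(client, 7)

    assert profile == {"id": 7, "display_name": "Ada Lovelace"}
    # The output would look identical with ten reads; the call count would not.
    assert client.fetch_user.call_count == 1


test_load_profile_reads_backend_once()
```

Asserting on call counts does couple the test to one implementation detail, so it is probably only worth it where the number of reads is itself a requirement.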
not sure what this is referring to, but I'll give an example
say you have a requirement that says if you call POST /user with a non-existing user, a user should be created and you should get a 2xx response with some basic details back
you could test this by actually hitting the endpoint with randomly generated user data known to not already exist, check that you get the expected 2xx response in the expected format, and then use the user id you got back to call the GET /user/userId endpoint and check that it's the same user that was just created
this is a great test! it enforces actual business logic while still allowing you to change literally everything about the implementation - you could change the codebase from Java Spring Boot to Python Flask if you wanted to, you could change the persistence tech from MySQL to MariaDB or Redis etc etc - the test would still pass when the endpoint behaves as expected and fail when it doesn't, and it's a single test that is cheap to write, maintain and run
OR
you could write dozens of the typical corporate style unit test i'm referring to, where you create instances of each individual layer class, mocks of every class it interacts with, mocked database calls etc etc which 1) literally enforce every single aspect of the implementation, so now you can't change anything without breaking the tests 2) pretend that things work when they actually don't (eg, it could be that the CreateUserDAO actually breaks because someone stuffs up a db call, but guess what, the CreateUserResource and CreateUserService unit tests will still pass, because they just pretend (through mocks) that CreateUserDAO.createUser returns a created user)
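To make the first option concrete, here is a rough sketch of the create-then-fetch check. In a real project the two calls would be HTTP requests to POST /user and GET /user/userId; the `InMemoryUserApi` class and its method names are hypothetical stand-ins so the example stays self-contained:

```python
import uuid


class InMemoryUserApi:
    """Hypothetical stand-in for the service behind POST /user and GET /user/<id>."""

    def __init__(self):
        self._users = {}

    def post_user(self, data):
        user_id = str(uuid.uuid4())
        self._users[user_id] = {"id": user_id, **data}
        return 201, self._users[user_id]

    def get_user(self, user_id):
        user = self._users.get(user_id)
        return (200, user) if user else (404, None)


def check_create_then_fetch(api):
    """Behavioural check: only touches the two public calls, never the internals."""
    # Random name, so the user is known not to exist yet.
    new_user = {"name": f"user-{uuid.uuid4().hex[:8]}"}

    status, created = api.post_user(new_user)
    assert 200 <= status < 300
    assert created["name"] == new_user["name"]

    status, fetched = api.get_user(created["id"])
    assert status == 200
    assert fetched == created


check_create_then_fetch(InMemoryUserApi())
```

Because `check_create_then_fetch` only knows about the two public calls, the same function could be pointed at any other implementation that exposes them, which is the property the comment is arguing for.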
They are often sufficient for a great deal of projects. If all it takes to convince you it's "good enough" is a handful of examples, then that's it. As much as you need and no less.
However I find we programmers tend to be a dogmatic bunch and many of us out there like to cling to our favoured practices and tools. Unit tests aren't the only testing method. Integration tests are fine. Sometimes testing is not sufficient: you need proof. Static types are great but fast-and-loose reasoning is also useful and so you still need a few tests.
What's important is that we sit down to think about specifying what it means for our programs to be, "correct." Because when someone asks, "is it correct?" You need to ask, "with respect to what?" If all you have are some hastily written notes from a bunch of meetings and long-lost whiteboard sessions... then you don't really have an answer. Any behaviour is, "correct," if you haven't specified what it should be.
Unit tests or not, so much code I interact with is like this. This is part of why I love integration tests. It's usually at the point of integrating one thing with another that things go bad, where bugs occur, and where the intention of APIs is misunderstood.
I like unit tests for the way that they encourage composition and dependency injection, but if you're already doing that, then (unit tests or not) I prefer integration tests. They might not be as neat and tidy as a unit test OR as an e2e test, and they might miss important implementation edge cases, but well made integration tests can find all sorts of race conditions, configurations that we should help users avoid, and much much more precisely because they are looking for problems with the side effects that no amount of pure-function unit-tested edge-cased code will make obvious or mitigate.
Integration tests are like the "explain why" comments that everyone clamors for, but in reproducible demo form. "Show me" vs "tell me"
If they fail, they prove there's a bug (in either the test or the code.)
This is like literally any other kind of test.
There are many places in programming where you don't care to prove properties of your program to this level of rigor; that's fine -- sufficiency is an important distinction: if a handful of examples are enough to convince you that your implementation is correct with regards to your specifications, then it's good enough. Does the file get copied to the right place? Cool.
However there are many more places where unit tests aren't sufficient. You can't express properties like, "this program can share memory and never allows information to escape to other threads." Or, with <= 10 writers all transactions will always complete. A unit test can demonstrate an example of one such case at a time... but you will never prove things one example at a time.
It’s easy to write a test that doesn’t provide useful information across time. Harder to write a test that does.
But it doesn't really matter if you want to call a given test an "integration" test or a "unit" test. The point of any test is to fail when something breaks and pass when something works, even if the implementation is changed. If it does the opposite in either of those cases, it's not a good test.
The point of any test is to document API expectations for future developers (which may include you).
That the documentation happens to be self-validating is merely a nice side effect.
I think it's hard to beat e2e testing. The thing is, e2e tests are expensive to write and maintain, and in my opinion you really need a software engineer to write them and write them well. Manual e2e testing, by contrast, is cheap and can be outsourced. All the companies I've worked for in the US have had testing departments, and they did manage to write a few tests, but they weren't developers and so, to be frank, they were really bad at writing them. They did probably 80 or 90% of their testing manually. At that point, who are we kidding? Just say you do manual testing, pay your people accordingly and move on.
That sounds like you work alone and haven't worked for a long time on a code base with unit tests. Or the unit tests are bad.
Then i would realize that their definition of a real issue was completely removed from any business or user impact, but geared more towards their understanding of the process detail in question.
I would argue that there certainly are some good places for unit tests, like if you have some domain-driven design going and can have well defined unit-tests for your business logic, but this usually is the smallest part of the codebase.
Mocking things that talk to databases etc. usually gives a false sense of security while that thing could break for a whole number of reasons in the real world. So just dropping the mock here and testing the whole stack of the application can really do wonders here in my experience.
Yes, exactly what I thought, that's what you would hear from somebody who has experience working on large code bases with many contributors.
Or he is actually not realizing that unit tests bring to attention code that is impacted by the change... Or his tests just do for a dynamically typed language whatever static typing does on compilation :)
> Now, what happens when MyThread::singlepassThreadWork() uses a member variable of MyThread like foobar and we delete the MyThread object while the thread is still running? The destruction sequence is such that MyThread is deleted first and after that, the destructor of its parent object Thread runs and the thread is joined. Thus, there is a race condition: We risk accessing the vector foobar in singlepassThreadWork() after it was already deleted. We can fix the user code by explicitly stopping the thread in its destructor
What does it mean when they say 'the destructor of its *parent* object Thread runs'? I've always thought that when you inherit from one class to another and then instantiate an object of said class, they're just one object, so what do they mean when they make the distinction between 'parent' and 'child' object? When you have inheritance of say two classes, those would be two distinct objects instantiated in memory? Is there something I'm missing?
Anyway I think it is odd design to stop the thread in the destructor. You'd normally stop the thread first and then destroy the object, not the other way around?
But, I would probably do that by having a class that contains both the thread and the data that the thread needs to access. Then its destructor could first join the thread and then clean up the data. For example, instead of a WorkerThread that contains a vector of WorkItem, have a BackgroundWorker that contains a Thread and a vector of WorkItem.
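A rough sketch of that composition idea, translated to Python purely for illustration. Python has no deterministic destructors the way C++ does, so an explicit `close()` plays the role the comment gives to the destructor; the `submit`, `close` and `results` names and the doubling "work" are all made up for the example:

```python
import queue
import threading


class BackgroundWorker:
    """Owns both the thread and the data the thread touches (composition, not inheritance)."""

    def __init__(self):
        self._items = queue.Queue()  # work items handed to the thread
        self.results = []            # data the thread writes to
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def submit(self, item):
        self._items.put(item)

    def _run(self):
        while True:
            item = self._items.get()
            if item is None:         # sentinel: stop request
                return
            self.results.append(item * 2)

    def close(self):
        # Order matters: stop and join the thread first, release the data second.
        self._items.put(None)
        self._thread.join()
        done, self.results = self.results, []
        return done


worker = BackgroundWorker()
for n in (1, 2, 3):
    worker.submit(n)
print(worker.close())  # [2, 4, 6]
```

The point carried over from the C++ discussion is only the ordering inside `close()`: the thread is joined while the data it uses is still alive, and the data is released afterwards.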
Take a look at "Destruction sequence" but basically the destructors are chained together and called one after another to free all resources rather than forming one destructor for the derived object. That being said it is still effectively one object in memory.
Not everything needs this level of rigor but there are plenty of cases where the tests are very cheap to write and reason about (for many pure functions) or are worth the cost as they validate critical behavior. Unit tests also add some design pressure to keep more logic pure/side-effect free; sure, it may take a bit more work to factor your code accordingly to keep i/o interactions separated to the shell of the application but I find this to be a useful pressure.
I've found that if I'm encountering pain when writing unit tests, then the pain is due to one of the following things:
1. The code is growing too complex and I need to decompose the logic or refactor the tests
2. The code has grown too many unintentional side effects and I need to move those side effects to discrete components
3. The code under test has fundamental side effects and those side effects require testing, thus the unit tests need to be converted to an integration test
4. The code under test is sufficiently complex that it demands full system/acceptance testing
There are some cases where refactoring the tests is generally too painful and I'll throw away all the tests entirely, maybe sprinkle in a few tests for logic that seems critical, and move on. Tests can accumulate technical debt, but in contrast to implementing code it's pretty cheap to cut your losses on tests and wipe them out.
I see a lot of people conflating unit testing with the idea that all code must have tests, and there's a ton of code that's phenomenally painful to test and can be easily checked by the developer. Tests should be a supporting tool and an augment to developer practices; it's better to have some tests that work well and throw out the ones that are miserable to write rather than require 95% test coverage, drown in testing, and throw out all tests entirely.
For business logic I would change the tests first so that it represents the new expected result. Then you refactor the code until the tests pass.
If the business logic actually changes, the tests should break IMO because they are there to ensure that the business logic remains consistent. When you test the business logic (without testing the implementation) the code becomes much safer to modify and refactor.
But in reality, unit testing every single function and method is where the vast majority of the benefit lies. Details really matter.
It took me some time to learn this, even after being told. It's the same for most people. This little post will probably convince no one.
But maybe remember it when you finally get there yourself :)
Very much no, that's the bad kind of unit test that locks your code into a specific structure and makes it a pain to update because you also have to change all the related tests even if the actual interface used by the rest of the codebase didn't change. I would call this the rookie mistake of someone new to unit tests.
You want to encapsulate your code with some sort of interface that matches the problem space, then test to that interface. How its internals are broken down doesn't matter: it could be one big function, it could be a dozen functions, it could be a class. As long as the inputs and outputs match what the test is looking for, you can refactor and add/remove features without having to spend extra time changing the tests. Makes it much less of a pain to work with in general.
One way of looking at it I've used before with coworkers: For this new feature you're writing, imagine a library for it already exists. What is the simplest and most straightforward way to use that library? That's your interface, the thing you expose to the world and what you run your tests against.
This is what unit testing originally meant: semantic units, not code units.
It's like app Hungarian notation vs system Hungarian notation, the original idea got overtaken by people who didn't understand the idea and only mimicked the surface level appearance.
To me, this is actual rookie mentality. You end up testing the same thing multiple times over different lines of code, mocking and providing various sets of testing data... When you could just test specified and/or observable behaviour of your system, and achieve the exactly same result with fewer tests.
There are many ways of doing things, and I guess we do unit tests differently.
> When you could just test specified and/or observable behaviour of your system, and achieve the exactly same result with fewer tests.
In my experience, it turns out to be very difficult to test a specific behaviour 5-10 layers deep from an external interface. Also, when one of those intermediate layers changes, you tend to have to rewrite many of those tests.
Then I joined a project where they were just starting to add tests to an existing project, and the lead developer was adamant on the following philosophy: "Virtually all tests will run the full program, and we'll mock out whatever slows the tests down (e.g. network access, etc)". I whined but I had to comply. After a year of this, I give him credit for changing my mindset. The majority of bugs we found simply would not have been found with unit tests. On the flip side, almost none of the unit test failures were false alarms (function signature changed, etc).
Since then, I've dropped categorizing tests as unit tests vs other types of tests. Examine the project at hand, and write automated tests - preferably fast ones. Focus on testing features, and not functions.
If any of those conditions doesn't hold, the cost/benefit certainly goes way down, and sometimes even the absolute utility does.
If I have to mock anything, in particular, or more generally care at all about any implementation details (ie side effects), then I figure I might as well make this a full-on automated functional test.
As soon as fake code is introduced into the test its utility rapidly decays in time as the things it fakes themselves change.
If you follow SOLID principles to the extreme, you'll find that your code is separated into logic code that is pure and easy to unit test, and IO code that is very simple and can be tested by a relatively small number of integration tests.
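One way to picture that split, as a sketch only: the pure function gets plain example-based unit tests, and the thin IO wrapper gets a single integration-style test against a temporary directory. The order/discount domain, the field names and the 10% rule are invented for the illustration:

```python
import json
import tempfile
from pathlib import Path


def apply_discount(order):
    """Pure logic: no I/O, amounts in integer cents (hypothetical 10% rule)."""
    total = sum(line["price_cents"] * line["qty"] for line in order["lines"])
    discount = total // 10 if total >= 10_000 else 0
    return {"total": total, "discount": discount, "due": total - discount}


def process_order_file(src: Path, dst: Path):
    """Thin I/O shell: read, delegate to the pure function, write."""
    order = json.loads(src.read_text())
    dst.write_text(json.dumps(apply_discount(order)))


def test_apply_discount():
    small = {"lines": [{"price_cents": 1000, "qty": 2}]}
    large = {"lines": [{"price_cents": 6000, "qty": 2}]}
    assert apply_discount(small) == {"total": 2000, "discount": 0, "due": 2000}
    assert apply_discount(large) == {"total": 12000, "discount": 1200, "due": 10800}


def test_process_order_file():
    # One integration-style test is enough to cover the read/write plumbing.
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "order.json", Path(tmp) / "invoice.json"
        src.write_text(json.dumps({"lines": [{"price_cents": 6000, "qty": 2}]}))
        process_order_file(src, dst)
        assert json.loads(dst.read_text()) == {
            "total": 12000, "discount": 1200, "due": 10800,
        }


test_apply_discount()
test_process_order_file()
```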
I agree that's preferable, but sometimes you want to test the logic of the code that's actually making decisions about how and when the IO is called.
You can do it with integration tests of course, but in more complex environments with lots of complex IO dependencies mocking is cheaper. It's also hard to simulate specific failures in integration tests, like a specific request failing. Pretty much mocking with extra steps.
So mocking has its place as well.
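Simulating "a specific request failing" is where the standard library's `unittest.mock` is handy. A minimal sketch, assuming a hypothetical `fetch_with_retry` helper as the decision logic under test:

```python
from unittest.mock import Mock


def fetch_with_retry(fetch, attempts=3):
    """Hypothetical decision logic around an I/O call: retry on timeout."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except TimeoutError as err:
            last_error = err
    raise last_error


def test_retries_after_one_timeout():
    # side_effect as a list: the first call raises, the second returns.
    fetch = Mock(side_effect=[TimeoutError("slow"), "payload"])
    assert fetch_with_retry(fetch) == "payload"
    assert fetch.call_count == 2


def test_gives_up_after_all_attempts_fail():
    fetch = Mock(side_effect=TimeoutError("down"))
    try:
        fetch_with_retry(fetch, attempts=3)
    except TimeoutError:
        pass
    else:
        raise AssertionError("expected TimeoutError")
    assert fetch.call_count == 3


test_retries_after_one_timeout()
test_gives_up_after_all_attempts_fail()
```

Getting a real dependency to time out exactly once and then recover is awkward in an integration test; here it is one line of setup.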
They confidently broke everything historically and looking forward. Then blamed it on me because it was my test suite that didn’t catch it. The language should not have been broken.
Everything only works if you understand what you are doing so every argument should be posed as both sides.
Their code changed behavior and good unit tests catch change in behavior. Someone somewhere is probably depending on that behavior.
There are multiple problems with unit tests, as they are implemented in the industry. And to make the unit tests usable and productive you need to make them so productive that it can offset those problems.
First of all, for unit tests to work everybody has to contribute quality unit tests. One team member writing unit tests well for his part of functionality is not going to move the needle -- everybody has to do this.
Unfortunately, it is rarely the case that all team members are able to write quality code, and the same is true for unit tests.
Usually, the reality is that given deadlines and scope, some developers will deprioritize focusing on writing good unit tests to instead deliver what business people do really care about -- functionality. Give it enough time and unit tests can no longer be trusted to perform their job.
Second, it is my opinion that refactoring is extremely important. Being able to take some imperfect code from somebody else and improve it should be an important tool in preventing code rot.
Unfortunately, unit tests tend to calcify existing code making it more expensive to change the functionality. Yes, more, not less expensive. To move a lot of stuff around, change APIs, etc. you will usually invalidate all of the unit tests that work around this code. And fixing those unit tests in my experience takes more effort than refactoring the code itself.
Unit tests are good for catching errors AFTER you have made the error. But my personal workflow is to prevent the errors in the first place. This means reading the code diligently, understanding what it does, figuring out how to refactor code without breaking it. Over the years I invested a lot of effort into this ability to the point where I am not scared to edit large swaths of code without ever running it, and then have everything work correctly on the first try. Unit tests are usually standing in the way.
I think where unit tests shine is small library code, utilities, where things are not really supposed to change much. But on the other hand, if they are not really supposed to change much there also isn't much need to have unit tests...
The most paradoxical thing about unit tests is that teams that can write unit tests well can usually produce code of good enough quality that they have relatively little use of unit tests in the first place.
What I do instead of unit tests? I do unit tests. Yes, you read that correctly.
The trouble with unit tests is that everybody gets the "what is a unit" part wrong. Unit does not have to mean "a class". Units can be modules or even whole services.
What I do is I test a functionality that matters to the client -- things I would have to renegotiate with the client anyway if I was to ever change it. These tests make sense because once they are written -- they do not need to change even as the functionality behind them is being completely rewritten. These test for what clients really care about and for this they bring a lot of bang for the buck.
Too many times I’ve made “simple” changes that were “obviously correct” and whose effects were “completely localized” only to wind up eating healthy servings of crow. If correct up-front analysis were possible to do reliably, we would have no need for profilers to diagnose hotspots, debuggers, valgrind, etc., etc.
So I enlist cheap machine support to check my work.
Maybe CPU cycles are cheap, but writing that code is not. Which is exactly the point of my rant.
My position is that it makes much more sense to focus on tests that test observable behaviour that is not supposed to change a lot because it is a contract between the service and whoever the client is.
Writing this code is still expensive, but at least now it is much easier to make sure the return is higher than the investment.
This takes 3 minutes, 1 if you use tmpfs. It only takes <10 seconds if you don't run writing tests.
These actually cover most real world use cases for a query-engine we maintain.
Unit tests have their place for pieces of code that run based on a well defined spec, but all in all this integration or component-level testing is really what brings me the most value always.
You're preaching to the choir. The overwhelming majority of people worship unit tests like dogma. There's almost no point in saying the above. It's like saying it's a little sad to see some people who are so dismissive about eating and breathing to stay alive.
Your next part is the one that's interesting. Mocking 80 percent of a system to get unit tests to work. I've seen so much of this from developers who don't even realize the pointlessness of what they're doing that it's nuts. They worship tests so much that they can't see the nuance and the downside.
Take this article. This article is literally presenting evidence for why unit tests are bad. He literally created an error that would not have existed in the first place were it not for his tests. Yet he has to spin it in such a strange way to make it support the existing dogma of test test test.
Whether that's because most software isn't tested competently or because software testing practices don't deliver robust software is not yet clear.
I suspect that unit tests, and tests in general, will be considered a historical artifact from the time before we worked out how to write software properly.
For example, we don't generally unit test things that a static type system checks for us. Maybe good enough type systems will remove the rest of them.
Wrt typing, that’s a very narrow set of errors, and I would dare say even a small minority of the things that can and do go wrong in software are type related. That said, effective typing is another orthogonal tool to unit tests that can help create robust software. On that front, what we are missing is a language with robust typing that catches these type errors, but also gets out of developers way the rest of the time.
How this topic can sometimes be about belief is beyond me. It's like if a person found a screw driver and says, I now believe in screw drivers.
The topic of how people believe in unit tests, to me is proof that the world is screwed. We're all screwed and everything is a screw driver.
Pretending to misunderstand clear communication then making smug points about it isn't clever either.
https://www.merriam-webster.com/dictionary/believe%20in definition 2, "to have trust in the goodness or value of (something)".
Words (and phrases) in English usually have more than one meaning. Ranting about correct use of a phrase because you're pretending the only extant meaning is a different one is not clever.
If it helps, think of "unit tests" as "atomic tests". Your goal in writing a unit test is to test the smallest possible amount of logic at a time, with the least possible overhead (i.e., mocking).
The advantages of this approach are many: it helps keep the level of complexity of individual methods low enough to be quickly understandable, documents the interface provided by your methods, ensures that the tests run quickly, and allows new tests to be written with minimal effort.
Obviously there are disadvantages, too. Unit tests - any tests - take time to write. This is sometimes offset by the time saved by catching issues as early in the development cycle as possible, but not always.
For "greenfield" projects especially, I tend to take a different approach than in my other work. For those, I start by "writing the README". It doesn't matter if it's an actual README.md; the point is to write down some examples showing how you think the new functionality should be used. Once that's done, I'll stub out an implementation of that, then refine it with increasing granularity until the overall architecture of the project begins to be defined. Sometimes, that architecture is complex enough that it's worthwhile to break it into smaller pieces and start the process over for those. Other times, I get to a working "happy path" pretty quickly.
Once I have a minimally working feature, I write tests for the public-facing interface. Then the interfaces between domains inside the project. Then unit tests for individual methods. I mostly work in Python, so this is also the point where I pause and apply type annotations, write/expand my docstrings, ensure that my `__all__` objects are set properly, make sure any "internal use" methods of publicly exported types are prefixed with `_`, etc.
On the other hand, when I'm writing a feature or making a change to a more mature codebase, I often _start_ by writing tests. Sometimes that's a new interface that I'll be using elsewhere, so I'll write tests defining that. Sometimes it's a change in behavior on an existing implementation, so I'll write tests for that. Either way, from that point on I repeatedly run _only_ the new tests that I've written as I build out the feature. Only once the feature works and those tests pass do I re-run the whole test suite to check that I've not broken something I hadn't considered. When those pass, I'll go back over my code one more time to make sure that I've added tests for all of the relevant internal stuff before submitting the patch.
The former must run quickly, and it's ok if the exact same test is run over and over. The latter need not run quickly, but benefits if new tests can be created and run, or if the tests incorporate randomness so they don't do the same thing each time they are run.
Here, it seems he was using tests intended for the first purpose for the second purpose instead. That can work, as it did here, but I don't think it's optimal. Better to have more exploratory, randomized, property-based tests chugging away in the background to find weird new ways the code can fail.
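For a flavour of what a randomized, property-based check looks like, here is a small hand-rolled sketch using only the standard library. The run-length codec is just a stand-in subject chosen for the example; dedicated libraries such as Hypothesis generate and shrink inputs far more thoroughly:

```python
import random


def rle_encode(text):
    """Run-length encode: 'aaab' -> [('a', 3), ('b', 1)]."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    return "".join(ch * count for ch, count in runs)


def check_roundtrip(trials=1000, seed=None):
    """Property: decoding an encoding gives back the input, for any input."""
    rng = random.Random(seed)  # seed=None means different inputs on every run
    for _ in range(trials):
        text = "".join(rng.choice("ab") for _ in range(rng.randint(0, 20)))
        assert rle_decode(rle_encode(text)) == text, f"failed for {text!r}"


check_roundtrip()
```

Leaving the seed unset gives new inputs on each run, which suits the background, exploratory role; passing a fixed seed makes a failure reproducible.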
It would be nice if unit tests were more like interlocking evidence of system correctness, but right now we just have integration tests with poorer coverage for that.
For what it's worth, I find Copilot to be quite an exceptional help in writing unit tests! A real game changer for me. Not only does it take care of most boilerplate code, it also kind of 'guesses' what case I'm about to write, and sometimes even points me in a direction I would otherwise miss.
It also motivates me to get small pieces working and tested before I get to the finish line. Each successful test is a victory!
I'm happy that this article praises unit tests without forcing a TDD perspective on the reader. It presents it like a tool, not a religion, and that's very refreshing.
We can argue about what granularity they should be, talk about functional programming, debate whether they should hit the database or not, but IMO all of those things miss the point. For me, in order of priority, unit tests provide the following benefits:
1) Make me write better, more decoupled code
2) Serve as documentation as to the intent of the code, and provide some expected use cases
3) Validate the code works as expected, (especially when "refactoring", which is basically how I write all my code even from the start)
4) Help you when deleting code by exposing unexpected dependencies
You can argue against all of those points, and I often will, myself. It depends on the scale, importance, and lifetime of the project as to whether I will write unit tests. But, as soon as I think someone else will work on the code, I will almost always provide unit tests. In that scenario, they:
- Provide a way to quickly validate setup and installation was correct and the application functions
- Signal that the code was "curated" in some way. Someone cared enough to set up the test environment and write some tests, and that gives me a certain comfort in proceeding to work on the code.
- Provide a gateway into understanding why the application exists, and what some of the implementation details are.
So, thinking about the advantages I've outlined above, for me it would be very hard to say I don't "believe" in unit tests. I just don't always use them.
https://multicians.org/thvv/threeq.html
I hope he got permission to reproduce the comic.