
snovv_crash · 5y ago · 0 points

Yes, of course there are cases where it doesn't work. But what percentage of code is that, really?

3 comments · 1 top-level

Chris_Newton · 5y ago

I don’t know what percentage of all code that gets written would be better tested in other ways. I doubt anyone else does either.

For me, unit testing (in the typical xUnit style associated with TDD) is most useful for basic data processing code with predictable outputs. That might include a wide range of code, from little utility functions on strings to the business rules for whole CRUD applications.
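As a rough illustration of the kind of code meant here, this is a minimal xUnit-style sketch using Python's `unittest`. The `slugify` helper is hypothetical, made up only for this example.

```python
import re
import unittest


def slugify(title: str) -> str:
    """Hypothetical string utility: lowercase, hyphen-separated form of a title."""
    # Replace each run of non-alphanumeric characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Drop hyphens left over at either end.
    return slug.strip("-")


class SlugifyTest(unittest.TestCase):
    """Each test pins one specific input to one known-correct output."""

    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("  TDD: good, or bad?  "), "tdd-good-or-bad")

    def test_empty_input(self):
        self.assertEqual(slugify(""), "")
```

Saved in a module, this can be run with `python -m unittest`. It works well here precisely because the correct output for each input is known in advance.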

On the other hand, some kinds of code don’t tend to fit well with a TDD approach and that style of unit testing as the main test strategy, IMHO: anything with input or output data in an awkward format, anything communicating with external equipment or a remote API, anything involving nondeterminism or heuristics, and anything whose purpose is to perform a calculation where you don’t know the correct answer in advance. Those cover a pretty wide range of code as well.

snovv_crash (OP) · 5y ago

Then it's really a matter of the granularity of the tests, right? Maybe they should target a higher level of abstraction, where the nasty details (which can hopefully be simplified in the future) aren't an issue.

If you don't know what the code is meant to do (non-determinism, heuristics, etc.), to me that suggests you're targeting the wrong level of abstraction. At some level you know what it's meant to do, unless you're working on an abstract art project.

TDD done dogmatically is a mess, sure, but then so is anything.

Chris_Newton · 5y ago

I agree that it may be more useful to test a larger part of the code together instead of trying to do everything at the finest levels of detail. However, my point here is about more than just the level(s) where you perform the tests. It’s also about what kinds of testing you are doing. Here are some of the most common possibilities:

• xUnit-style tests for specific cases

• Property-based testing

• Snapshot-based testing

• Manual testing

• Formal verification

• Peer code review

All of these can be useful under the right circumstances.

If your code produces output that is best checked by human inspection but shouldn’t then change (or change very much), then snapshots may be a good choice.
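A minimal sketch of the snapshot idea, assuming a hypothetical `render_report` function as the code under test and a hand-rolled `check_snapshot` helper rather than any particular snapshot library:

```python
import tempfile
from pathlib import Path


def render_report(rows):
    """Hypothetical code under test: formats rows as a plain-text table."""
    lines = [f"{name:<10}{total:>8.2f}" for name, total in rows]
    return "\n".join(lines) + "\n"


def check_snapshot(actual: str, snapshot_path: Path) -> bool:
    """Return True if `actual` matches the stored snapshot.

    On the first run there is nothing to compare against, so the output is
    recorded; a human is expected to read the file and approve it.
    """
    if not snapshot_path.exists():
        snapshot_path.write_text(actual, encoding="utf-8")
        return True
    return snapshot_path.read_text(encoding="utf-8") == actual


# Demo in a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "report.snapshot.txt"
    rows = [("widgets", 12.5), ("gadgets", 3.0)]
    first_run = check_snapshot(render_report(rows), snapshot)    # records the snapshot
    second_run = check_snapshot(render_report(rows), snapshot)   # same output, passes
    changed = check_snapshot(render_report(rows[:1]), snapshot)  # output differs, fails
```

The test itself never states what the correct output is. It only flags that the output is no longer what a person last approved.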

If your code won’t deterministically produce the same correct output every run but whatever output it does produce should always satisfy certain conditions, maybe property-based testing is a good way to go.
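A hand-rolled sketch of that idea, using only Python's `random` module. The `random_partition` function is a hypothetical stand-in for nondeterministic code; a dedicated library such as Hypothesis would normally handle input generation and shrinking of failing cases.

```python
import random


def random_partition(items, k, rng):
    """Hypothetical nondeterministic code under test.

    Splits `items` into `k` groups in a random order, so two runs rarely
    produce the same output.
    """
    shuffled = list(items)
    rng.shuffle(shuffled)
    # Deal the shuffled items round-robin into k groups.
    return [shuffled[i::k] for i in range(k)]


def check_partition_properties(items, k, groups):
    """Conditions every valid output must satisfy, whatever the random order."""
    assert len(groups) == k
    # Nothing is lost or duplicated.
    assert sorted(x for g in groups for x in g) == sorted(items)
    # Group sizes are balanced to within one item.
    sizes = [len(g) for g in groups]
    assert max(sizes) - min(sizes) <= 1


def run_property_test(trials=200, seed=None):
    """Generate many random inputs and check the properties on each one."""
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
        k = rng.randint(1, 8)
        check_partition_properties(items, k, random_partition(items, k, rng))
    return trials
```

Calling `run_property_test()` checks 200 generated cases without ever naming a single expected output; passing a `seed` makes a failing run reproducible.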

If your code involves communication with external equipment that requires operator interaction to do anything interesting, fully automated testing might simply not be possible. In that case, manual integration testing with someone physically operating the equipment might be appropriate.

Formal verification covers a wide range of possibilities from the likes of basic static type checking all the way up to the use of automated theorem provers in specialised programming languages. It almost always has some extra cost in terms of annotating the code but it can sometimes produce far more powerful evidence of correctness in the general case than any test suite checking individual cases ever could.

Code reviews are pretty much universally good, as long as you’ve got enough people available with the relevant knowledge to do them.

Sometimes, these testing techniques are complementary and using more than one of them together might be beneficial. At other times, a coding style or process that favours one might make another more difficult.

This HN discussion is mainly about the style of “testable” code that TDD tends to produce, with many small units and lots of dependency injection, which is of course very friendly to small-scale unit testing. However, it might also be more difficult to review because of all the configurability and indirection. If it relies on doubles to stand in for external resources, it can end up testing the accuracy of the simulation more than anything else, so adding very little (justified) confidence that the real system is operating correctly. And as a rule of thumb, individual testing of specific cases may be less effective than testing large numbers of generated cases with property testing, which in turn may be less effective than proving that all cases work via rigorous analysis.
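To make the point about doubles concrete, here is a small sketch. The thermometer device, `FakeThermometer`, and `is_overheating` are all hypothetical names invented for illustration.

```python
class FakeThermometer:
    """Hand-written double standing in for a real device (hypothetical)."""

    def __init__(self, reading_celsius):
        self._reading = reading_celsius

    def read_celsius(self):
        return self._reading


def is_overheating(thermometer, limit_celsius=80.0):
    """Code under test: the dependency is injected so a double can replace it."""
    return thermometer.read_celsius() > limit_celsius


# These checks pass, but they only show that is_overheating() agrees with
# the fake. If the real device reported Fahrenheit, or raised an error on a
# timeout, nothing here would notice.
assert is_overheating(FakeThermometer(95.0)) is True
assert is_overheating(FakeThermometer(20.0)) is False
```

The injected dependency makes the unit test trivial to write, but the confidence gained is only as good as the fake's resemblance to the real hardware.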

And this brings me back to where I came in, which was that prioritising “testable” code in the TDD sense isn’t necessarily a good thing. The real meaning of “testable” depends greatly on what type of code you’re writing and which testing strategies are most helpful.
