TDD is often sold as a fix-all solution, which is incredibly appealing to mgmt and quite fun for most programmers as a new paradigm, allowing for quick adoption.
It also has its uses, especially in the enterprise space where requirements aren't often clear. But I don't know many good programmers that truly stick to the dogma after the honeymoon period. It becomes just another tool in your toolset.
Uncle Bob is a salesman, not a "craftsman".
TDD encourages you to write many times the number of lines of code for your tests as the code you're testing, often 10X or more. So when you inevitably decide "there's a much better way to do this" (which happens to me 100% of the time) then you're not only changing your code, you're changing all of those tests. That's a lot of weight that you now have to deal with.
Most of the time, that's enough weight that the better design simply doesn't happen and becomes yet another chunk of tech debt that prevents certain things from happening in the future. I've seen that happen, and it's given me a very bad taste in my mouth for "TDD" because "TDD inertia" is the reason that better designs aren't implemented. If that piece of software lives for a while, and grows in scope, someone is going to have to deal with that, and eventually make the architectural change anyway.
By all means, test your code, of course. If you can find a way to easily generate test cases for an arbitrary code base, by all means, do that, because the test cases are no longer influencing your decision to change how your application works.
Otherwise, being "test-driven" is bad, IMHO. Software development is never as simple as the various dogmas would lead you to believe.
See specifically the "Fig. 5 - Test dependencies" section of the post.
Personally I think mocks are far over-used in tests, and I much prefer the solution kovarex outlines. I think dealing with mocks is a bigger source of test-update friction here than the tests themselves. I was already layering my tests instead of using mocks. As in having no clear line between "unit" and "integration" tests, everything just uses the "real" implementations of things and tests just get naturally more complex the higher up the stack they're testing. Sorting by dependency depth is a cool idea I hadn't ever considered, though, and I'll be borrowing it.
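To make the "sort by dependency depth" idea concrete, here is a minimal sketch of how I read it. The test names and the dependency table are made up for illustration, and this is not the Factorio team's actual implementation. It also assumes the dependency graph has no cycles.

```python
from functools import lru_cache

# Hypothetical test names, each mapped to the tests whose subject it builds on.
TEST_DEPENDENCIES = {
    "test_inventory": [],
    "test_belt": ["test_inventory"],
    "test_inserter": ["test_inventory", "test_belt"],
    "test_assembler": ["test_inserter"],
}


@lru_cache(maxsize=None)
def dependency_depth(test_name: str) -> int:
    """Depth 0 means no dependencies; otherwise 1 + the deepest dependency."""
    deps = TEST_DEPENDENCIES[test_name]
    if not deps:
        return 0
    return 1 + max(dependency_depth(dep) for dep in deps)


def sort_by_depth(test_names):
    """Order tests so the most fundamental ones come first (ties by name)."""
    return sorted(test_names, key=lambda name: (dependency_depth(name), name))


# When several tests fail at once, look at the shallowest failure first.
failing = ["test_assembler", "test_belt", "test_inserter"]
print(sort_by_depth(failing))
```

With everything using real implementations, a broken low-level piece tends to fail many higher-level tests too, so reading failures in this order points at the likely root cause first.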
Isn't the point of TDD that you can change implementation at will, safe in the knowledge that the tests will help guarantee that you're not inadvertently changing behavior?
The better way to test your specification and behavior, IMO, is to use comprehensive E2E tests. The traditional test pyramid is upside down. If you have an E2E test for every user story in your spec, you can develop with confidence that your new code will not disrupt your users' activity. Unit tests are cattle, E2E tests are pets.
If that happens I'd say you're doing TDD too early. TDD can be very useful, but early on in the process of transferring a design to code, all your interfaces are highly unstable and there's a lot of experimentation, so sticking strictly to TDD would naturally be frustrating. I would start a bit later when at least some of the API has stabilized and you don't have to do major changes.
TDD helps with creating user friendly APIs, since you experience it from the user's perspective. And it forces you to actually write testable code and not incur technical debt that's very costly to remove later (having to refactor code to make it testable).
> Most of the time, that's enough weight that the better design simply doesn't happen and becomes yet another chunk of tech debt that prevents certain things from happening in the future.
This reads like you're saying that tests themselves are technical debt...? Because you're going to run into this issue (or should run into it) regardless if you use TDD or not. Eventually you'll want to refactor parts of your codebase and, sure, you'll have to change some of your, hopefully mostly, unit tests. So in that sense you can say that doing TDD early creates a lot of technical debt that needs to be resolved quickly, but like I said above, that doesn't have to be the case.
TDD can be a dogma just like any practice (Agile is my favorite), but it doesn't mean that it's not useful if used correctly.
Kudos to the Factorio team for adopting it, which I think is rare in the gaming industry. The idea alone of testing the complexities of a video game is mind boggling to me as a web developer. Especially in an industry where it's popular to hype and sell broken products with the promise of patches and DLC.
I learned TDD ~20 years ago from Beck's "TDD by Example" book. For me it's far more than "just another tool". I'll certainly do exploratory work with throwaway code. And in the early stages of something, I might be a bit slack. But the lesson over and over for me is that if I don't get to significant test coverage soon, I end up wasting a lot of time on bugs. And TDD is the easiest way to get to test coverage. Test-first ends up feeling like a set of small successes with the red-green-refactor loop. With test-after, going back and adding tests feels more like drudgery, and I'm less likely to have written the code in a testable way.
So TDD is not dogma for me, just the inevitable place I end up if I want to maximize the amount of time getting things done while working on a non-trivial codebase.
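For anyone who hasn't seen the loop, a tiny sketch of one red-green-refactor cycle. The `slugify` function and its expected behaviour are invented for this example and are not taken from the book.

```python
# Step 1 (red): write the test first. Running it now would fail with a
# NameError, because slugify() does not exist yet.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  TDD  by   Example ") == "tdd-by-example"


# Step 2 (green): the simplest implementation that makes the test pass.
def slugify(title: str) -> str:
    words = title.lower().split()  # split() also collapses repeated spaces
    return "-".join(words)


# Step 3 (refactor): clean up the implementation, rerun the test, and only
# keep the change if the test stays green.
test_slugify()
print("green")
```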
1.) I'd much rather lean on the type system to prove things than automated tests if possible. But of course depending on the language that often isn't possible.
2.) I find that only a small fraction of my code really benefits from automated testing. This is the logic/calculation parts of the code. The rest is slinging IO and SQL queries which all ends up being mocked out anyways and the tests just become secondary implementations of the original code.
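A rough sketch of the split described in point 2, with the calculation kept pure and the SQL kept in a thin shell. The `OrderLine` type, the `order_lines` table, and the discount rule are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderLine:
    unit_price_cents: int
    quantity: int


def order_total_cents(lines, discount_percent=0):
    """Pure calculation: no database or network, so no mocks are needed."""
    subtotal = sum(line.unit_price_cents * line.quantity for line in lines)
    return subtotal * (100 - discount_percent) // 100


def load_order_lines(cursor, order_id):
    """Thin IO shell over a DB-API cursor (hypothetical table and columns).

    There is little logic here to unit test; an integration test against a
    real database covers it better than a mock would.
    """
    cursor.execute(
        "SELECT unit_price_cents, quantity FROM order_lines WHERE order_id = ?",
        (order_id,),
    )
    return [OrderLine(price, qty) for price, qty in cursor.fetchall()]


# The part worth testing needs nothing but plain values.
lines = [OrderLine(250, 2), OrderLine(100, 1)]
print(order_total_cents(lines, discount_percent=10))
```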
Once you understand how to design testable code, tdd offers minimal benefits.
The real value comes from thorough, automated testing suites, whatever their origin is.
Automated testing is useful, for sure. However, TDD purists tend to focus on one very specific type of testing: automated unit testing, in the small, where you already know the expected output for given input and can easily specify that output using assertions in code. By its nature, TDD emphasises being testable in that specific sense above all else. I don’t think that is necessarily a good thing, partly because that type of universal unit testing may not be a good strategy for every software system, and partly because other useful properties of the code might be diminished because of the changes needed to make it “testable” in the TDD sense.
Unit tests' inability to handle state is way too often viewed as a problem with the code they can't properly test, rather than as the general crappiness of this form of test.
It all comes down to return on investment. For example, I used to agree with TDD that for every bug, first write a test that fails, and then fix the bug. That way you prevent regression.
So I proposed this to my manager, and he responded: we tracked all bugs in our system for the last 10 years, and when you look at fixed bugs that get broken again, it only occurred very rarely. So in the end, doing that was not the best investment of effort.
In return, I get to have some extra confidence that a bug doesn't return (which would be embarrassing, even if it's infrequent!) and I get a more thorough test suite that lets me refactor more quickly and aggressively. And if an old bug does come up again, the advantage is not only that the test will catch it before a release, but also that the person fixing the bug won't have to go through the effort of figuring out how to reproduce it from scratch—it's reproduced right in the test suite!
So I am not sure that just looking at how often regressions actually happen in the existing codebase is sufficient to make any real conclusion by itself.
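To show what "reproduced right in the test suite" can look like, here is a minimal sketch. The `parse_percentage` function and the bug report are hypothetical.

```python
def parse_percentage(text: str) -> float:
    """Parse strings such as '45%' or ' 12.5 % ' into a fraction."""
    # The fix for the hypothetical bug: strip whitespace again after
    # removing the percent sign, so '12.5 %' no longer raises ValueError
    # in some later, stricter parsing step.
    cleaned = text.strip().rstrip("%").strip()
    return float(cleaned) / 100


def test_regression_space_before_percent_sign():
    # Written first, against the unfixed code, so it was seen to fail.
    # It now doubles as the reproduction steps for the original report.
    assert parse_percentage(" 12.5 % ") == 0.125


def test_plain_percentage_still_works():
    assert parse_percentage("50%") == 0.5


test_regression_space_before_percent_sign()
test_plain_percentage_still_works()
print("regression tests pass")
```

The test name carries the bug's description, so whoever sees it fail later knows which old report to read.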
I have often observed an evolutionary behavior on tests: tests that pass easily survive
That sounds very hand wavy.
It is making the assumption that:
- your bug tracking system and people are so good that they have found all the duplicate tickets.
- they don't make mistakes and find all duplicates for tickets.
- your code is so good that it hasn't had any side effects that caused regressions.
- your boss is so good that he has a full grasp on 10 years worth of bugs.
edit: formatting
In my opinion, as soon as a test suite finds a bug it has added a lot of value, even if it's rare.
I've worked at a TDD-focused company and it always felt like I was coding through molasses. Some of us weren't as strict about TDD and I didn't notice a difference in our code quality.
Would you feel that way about the linux kernel for example?
I pointed out the 10 year old Norvig vs Ron Jeffries fight[1] which demonstrates that TDD is useless when you don't already know what you're writing, but they just looked confused.
The other problem here is that unit tests never break (since you've mocked everything that can break) and therefore aren't worth running; it might be more productive to write them for TDD and then just not commit them.
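A small sketch of the "mocked tests never break" failure mode. `PriceService` and `format_price` are invented names, and the dollars-to-cents change is an assumed scenario.

```python
from unittest.mock import Mock


class PriceService:
    """Stand-in for a real dependency that changed its contract.

    Assumption for this example: price_of() used to return whole dollars
    and now returns cents.
    """

    def price_of(self, item: str) -> int:
        return 1999  # cents


def format_price(service, item: str) -> str:
    # Still written as if price_of() returned whole dollars.
    return f"${service.price_of(item)}"


def test_format_price_with_mock():
    service = Mock()
    service.price_of.return_value = 20  # the old contract, frozen in the mock
    assert format_price(service, "belt") == "$20"


test_format_price_with_mock()  # still passes after the contract change
print(format_price(PriceService(), "belt"))  # the real result is now wrong
```

The mock encodes what the dependency used to do, so the test keeps passing while the real integration is broken.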
That's because it is
And it's not without drawbacks. Increased development time and the need to maintain the unit tests are costly, but rewarding.
Also I don't like too much interface and mocking only for the sake of testing. I find it usually breaks when integrated and makes code harder to maintain. Maybe I'm just inexperienced.
An important idea of TDD is that it allows you to discover those "unfulfilled points" in the tests: when writing code (the tests) that uses an API, instead of when writing the actual API.
When writing code that uses objects, methods, interfaces and so on, you are in a mindset of writing "what a user of the code would wish there was". This is probably the best place, mindset and moment to define those specs in detail.
They do already do this (hence the blogpost), but it's something kovarex hasn't explored before, so they're pretty new to TDD.
So the effort you put into writing a test for a bug most likely has a negative return on investment. That time could be better spent somewhere else.
> This is the beautiful thing about having a company that isn't on the stock market. Imagine you have a company that goes slower and slower every quarter, and then you confront the shareholders with the statement, that the way to solve it, is to do absolutely no new features for a quarter or two, refactor the code, learn new methodologies etc. I doubt that the shareholders would allow that. Luckily, we don't have any shareholders, and we understand the vital importance of this investment in the long run. Not only in the project, but also in our skill and knowledge, so we do better next time.
This reinforces my notion of what actually matters, of what the real essence of developing a product is, be that a piece of art and entertainment (like here) or a productivity tool etc.
There are creators and there are consumers. We split them up by developers, designers, domain experts and so on, but what matters is that all the other participants, especially those who can exert power traditionally are not part of the essence and if not being careful and responsible, can easily add complexity and limitations that are entirely accidental and can even be harmful.
This reminds me of the agile manifesto, modern UX approaches and other processes that are driven by creators, but are often and very unfortunately being bent over backwards to fit into hierarchical power structures.
> TDD actually is the constant fast switching between extending the tests and making them pass continuously. So as you write tests, you write code to satisfy them basically at the same time. This allows you to instantly test what you write, and mainly use tests as specification of what the code should actually do, which guides the thought process to make you think about where you are headed to, and to write code that is more structured and testable from the very beginning.
The important notion here is that TDD is not about tests and correctness, but about development. It continuously checks assumptions and explores the surrounding code, state and data until a sufficient solution is found.
If we squint a little we can see how closely related TDD is to REPL Driven Development. In essence it is the same thing and even has similar results, where the tests or REPL code can be left as an artifact for further, likely historical understanding.
We know now that neither is sufficient for a high degree of correctness, but they are certainly useful for understanding and development.
Yup, writing tests helps me sleep at night. TDD helps me manage my mental resources and iterate.
(another reason is communication--we code for our colleagues first, then for the machine: https://sonnet.io/posts/code-sober-debug-drunk/)
IIRC smalltalk had a workflow where you'd debug and write your program at the same time. You'd just reach a path that has not been implemented yet, break, implement it and continue.
This is exactly what I do with PyCharm - hit a breakpoint, write the code as it should be, and execute it in the debug REPL to do basic initial testing [repeat]. Extremely productive.
> that the way to solve it, is to do absolutely no new features for a quarter or two, refactor the code, learn new methodologies
Even in a world without shareholders, we still are building things for users. 6 months without improvements has an effect on them, too. When possible, I think it's better to spread cleanup work out. Even if one spends 80% of the time on cleanup and 20% on features, that is much better for relationships than going dark for a quarter or two. And in my experience, continuing to do productive work during that period makes the behind-the-scenes improvements better.
They did exactly what you suggested! The relationship with the users never went dark. They kept up with bug fixes, and kept feeding new features at a more than acceptable pace. [Edit: Here's the pace of their status updates, as evidence at https://www.reddit.com/r/factorio/?f=flair_name%3A%22FFF%22 ]
I didn't quite believe that Uncle Bob's lessons worked in the real world, but if the Wube team is sold on them, that's about as good an endorsement as I'll ever get.
I took to TDD pretty easily because I was already used to doing short run-it-and-see-if-it-works cycles. The main difference was that instead of checking via eyeball, I tell the computer to check it. This is slightly slower early on, but so much faster once a program is big enough that manually checking everything would take a while.
We are advancing, but the current state is... not mind-blowing yet (albeit somewhat cool!). See [1] for an example interactive demo and [2] for the corresponding presentation.
But if you look at logic programming with more expressive systems you can have something like what you propose. We describe what we expect to have and the system deduces a result. Not quite what you want but it's closer.
Now there is also a ubiquitous logic system that many use: static typing. In a sense you are describing the general properties of something and the compiler infers optimizations based on your assertions. The concrete program is not a line by line translation from your code to machine code, but perhaps looked at in its entirety.
I agree that there is a lot of merit in pushing these things further and further. Right now we're kind of in a stage of patching things together. But I hope and assume that programming becomes more holistic in the future. Ironically we have to look at the past first, there was a lot of momentum in this direction up until the 80's roughly.
Aside from that, to produce a program that did what is actually required would require test cases and functions that cover the set of inputs and outputs. This is trivial for mathematical functions, but impractical (or impossible) for more general applications (e.g. anything dealing with human inputs).
This doesn't become practical until you can do it on a quantum computer with millions of qubits.
TDD is where you both write what you want, and you do the implementation also.
Show examples of French retirement pension computation, and watch if a computer can actually commit petit-suicide.
https://gist.github.com/daveray/1441520
Even if you don’t follow along and try it, you can probably get the gist of how experimental / exploratory you can be.
I feel like testing your behaviors is pretty hard, and even if you unit-test your behaviors, there’s still integration tests.
I only write games as a hobby and never use TDD, even if I’d like to, since the tooling is just either poorly documented or too slow, or both.
Usually this ends with me being frustrated with the slow development cycle and pushes me towards more unconventional methods of developing games in Javascript using Mocha to run the tests directly in the browser.
You need a large QA team (relative to your team size) to test for fun anyway. The game is constantly getting tested and bugs will get logged. The marginal benefit of automated tests is less than other places because of this.
Games have no specification. You have almost no idea of even the genre of game you'll end up with at the end of the dev cycle unless you're making a sequel that has to fit into a mold. Sure you can write tests as you go along and test that enemies with negative health die. The next day someone will suggest "what if they stay alive for a period of time and then explode!" The definition of correct is constantly changing.
Tests ossify functionality. They make it harder to change things because at the very least you need to also change the test. If you're just changing tests whenever you want to suit your new desires then it's hard to build trust in the tests.
Games don't need to be correct. They just need to be fun. This also decreases the marginal benefit of tests compared to other industries.
That said, it would be natural to unit test some data structure or some well defined system. Also, once your game is done, a la Factorio, you can go back and write tests for some refactor because you know the full design specs.
To be fair, I don't pretend to know how CD Projekt developed their games. They might already have had testing, and the timetable was the problem.
I’ve seen engines which used them, but not games. The rationale was that it was just too hard, which always felt like a cop-out to me.
I would dearly love to have more automated tests in the game I’m working on now, but I’ve never seen a model of it working well which I could copy, and part of me suspects that it’d be a huge investment of time to figure it out entirely on my own when I’m already vastly overworked as it is.
If anybody has references to indepth case studies of making game engines more friendly toward automated tests, I’d be super interested to see whether there were lessons I could apply toward my own situation!
A simple example might be simply starting and closing the engine after the automated build & package process to make sure the game actually runs, but I've seen things like using bots to emulate player behaviour to smoke test gameplay functionality. No Man's Sky used automated tools to evaluate the procedural-generation algorithms[1]. Here's a more comprehensive example of automated gameplay testing[2].
[1] https://youtu.be/sCRzxEEcO2Y?t=3100 [2] https://www.youtube.com/watch?v=VVq_hgaX8MQ
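The simplest case mentioned above (start the build, let it shut down, check that it exited cleanly) could be sketched roughly like this. The command line is a stand-in: a real pipeline would point at the packaged game binary with whatever "start and quit" flag that engine provides, which I'm not going to guess at here.

```python
import subprocess
import sys


def smoke_test(command, timeout_seconds=60):
    """Return True if the process starts and exits cleanly within the timeout."""
    try:
        result = subprocess.run(
            command, timeout=timeout_seconds, capture_output=True
        )
    except (subprocess.TimeoutExpired, OSError):
        # Hung on startup, or the binary could not be launched at all.
        return False
    return result.returncode == 0


# Stand-in for the packaged game: a Python one-liner that starts and exits.
ok = smoke_test([sys.executable, "-c", "print('engine started and shut down')"])
print("smoke test passed" if ok else "smoke test failed")
```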
Continuous integration and testing pipelines in games - case studies of The Talos Principle and Serious Sam https://www.youtube.com/watch?v=YGIvWT-NBHk
Yeah it was a painful experience, but I survived.
Artifact did it too (which is no longer being worked on though).
My team writes distributed systems, which drastically reduces the value of a TDD approach. There’s only so far you can take the technique with a database backed api before it just becomes absurd.
Modern games are notorious for using the first, most loyal customers as beta testers (hello Fallout 76...).
The reason is two-fold... while you absolutely can test some parts of the engine (e.g. collision detection, networking) you can't really "test" stuff that needs a human eye to see if it's working as intended (anything that's rendered) or involves randomness (e.g. fire, fog, water, opponent spawning, loot). That means you have to hire lots of skilled (!) humans, provide them with expensive rigs, and give them time. Which is incredibly expensive.
Game QA is more involved than that, of course. Content needs to go through a signoff and QA process that involves humans (we do that, too).
I interviewed at a game company a few years ago where one of my daily tasks would be to spend an hour just playing the game and seeing if I spot any bugs. I didn't end up taking the job so I didn't see how involved their actual code testing process was, but it was apparent that they actually cared a bit about quality control.
Maybe the devs haven't found a worthy new project yet?
Maybe they just want to leave their code in good shape so they (or someone else) can come back to it at a later time and pick it up relatively quickly.
> Imagine you have a company that goes slower and slower every quarter, and then you confront the shareholders with the statement, that the way to solve it, is to do absolutely no new features for a quarter or two, refactor the code, learn new methodologies etc. I doubt that the shareholders would allow that. Luckily, we don't have any shareholders, and we understand the vital importance of this investment in the long run. Not only in the project, but also in our skill and knowledge, so we do better next time.
This isn't necessarily the full explanation, but it's certainly something to keep in mind.
If you're building a new game and your current GUI paradigm sucks, overhauling it first makes a lot of sense.
And that's got a non-trivial chance of happening.
I think Factorio itself actually runs better on this laptop than that Factorio blog post does.
Possibly the google analytics is doing something heavy (although it doesn't look to be when spot checking with a profiler), but there's otherwise nothing JS that runs continuously.
> (2) This allows you to instantly test what you write, and mainly use tests as specification
> (3) the problem comes when you break something and a lot of tests start to fail suddenly
My favorites. Don't expect to give away an entire quarter, but at least some time would definitely be nice. All three are so fundamental, and often ignored. In my experience, if you get these right, it makes developer life so much easier. As someone mentioned earlier in the thread, TDD is kind of like REPL driven development.
I think, one immediate benefit of companies focusing on good code is that engineers can aim for much more ambitious projects, and they can be more brave with the codebase. Instead we often end up with 100 over-engineered components with no well defined/enforced contracts, and a set of monolithic tests which runs the entire stack to test the most basic case.
First you develop fast and break things with alpha / beta versions. Then you make it stable with bug fixes and minor enhancements. Finally you enforce the code with review, unit tests, code coverage, etc.
Theoretically they already have a good game engine (long lasting product) and are interested in developing it further. Without enforcement, any future changes have the potential to break things. Unit tests (enforcement) reduce that risk and keep any changes / refactoring closer to the specification.
if your game has a lot of interactions, and you want to make sure that your changes are not causing unintended interactions, tests like these would help a lot during development.
But anyway, I doubt tests help at all in the prototype phase (by any procedure you want to get them). My guess is that they are incredibly harmful.
If only game development worked this way!
--
Dealer: "Hey? You there? I got some meth for you."
Me: "Go away! I don't want any!"
Dealer: "Oh now don't say that. You remember how good it is? I know you do."
Me: "I can't, I can't afford it, the price is too high."
Dealer: "What? Come on, it's free! You already paid for it!"
Me: "I'll lose my job, I can't, just go away!"
Dealer: "Your JOB? This IS your job. Open the damn door, THE FACTORY MUST GROW"
Me: ::cowers in the closet chanting please leave please leave please leave::
--
Also it was never good. It was more like a mind virus. Like the sort of problem or project at work you can't stop thinking about until it's done. Only with Factorio, it's never done. Never.
My best defense against it is other video games I can stop & start when needed. Or booting up my VPN connection and picking a work task from my backlog until the cravings go away.
> now there are 9 programmers
Companies on the stock market don't have "9 programmers". They have a lot of teams of 9 programmers. So while it's true that it would probably be impossible for a stock market company to freeze completely for a quarter or two, individual teams can still do that.
If the Factorio team grows to tens of programmers (it probably won't and probably shouldn't), I would be very surprised if they find the need - and if they manage to - freeze all teams together for a big refactoring round. I am also unsure that it would be the right approach. That observation holds whether they go public or stay private.
It has certain characteristics that align with a game like Factorio or Oxygen Not Included etc. such as visual programming, backpressure, common interfaces, local retention etc.
I can imagine this being applied to distributed/cloud computing as a way to reason about high level interactions and perhaps functional/integrated testing.
It's interesting to think about the other HN thread discussion about comments vs one-time-call function abstractions in this light: https://news.ycombinator.com/item?id=27546135
I'm a big fan of "put code in one place" too. It was the biggest factor that convinced me that JSX was a great idea compared to separating the templating logic out.
I also misread TDD as TTD (related to the trains in factorio) first
IME most developers do not understand that each test has four possible outcomes:
* Code is good and test passes
* Code is bad and test fails
These are the only two possible outcomes developers focus on: When I ask what they should do if a test that used to pass now fails, they always tell me stories about how to debug the code under test.
There are two additional possibilities in test:
* Code is bad yet test passes (false negative)
* Code is good yet test fails (false positive)
Again, IME, most people do not look at the test again once it passes for the first time.
As a result, tests, which are themselves code, become the largest untested part of the code base. You get these thousands and thousands of lines of untested code, yet you have 100% code coverage.
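A minimal sketch of the "code is bad yet test passes" case from the list above. The buggy `median` and both tests are made up for illustration; they are not taken from any real library's test suite.

```python
def median(values):
    """Deliberately buggy: ignores the even-length case."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


def weak_test():
    # Passes even though median() is wrong: only an odd-length input is
    # checked, yet every line of median() is executed (100% coverage).
    assert median([3, 1, 2]) == 2


def stronger_test():
    assert median([3, 1, 2]) == 2
    # Even-length input: the mean of the two middle values is expected.
    assert median([4, 1, 3, 2]) == 2.5


def passes(test) -> bool:
    """Run a test function and report whether its assertions held."""
    try:
        test()
    except AssertionError:
        return False
    return True


print(passes(weak_test), passes(stronger_test))
```

Full line coverage and a green run say nothing about whether the test would notice the bug. One way to "test the tests" is to break the code on purpose and confirm something goes red.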
Some of my blog posts on testing:
* Deception in tests considered harmful https://www.nu42.com/2017/02/deception-in-tests-harmful.html
* Know what you are testing: The case of the test for median in Boost.Accumulators C++ Library <https://www.nu42.com/2016/12/cpp-boost-median-test.html>
* Who is testing the tests? https://www.nu42.com/2015/05/who-is-testing-the-tests.html
* Slashing one's feet with tests, or, how to fix 2,950 test failures in one fell swoop https://www.nu42.com/2015/08/fix-2950-test-failures.html
It is called restructuring and big companies do it all the time. Investors allow it, though they are rightly suspicious - sometimes it is good, but often it is change for the sake of change and not change for better.