Or hardware, where the advice is usually to mock the device under test. But if you don't own the hardware, the most you can do is try to emulate it and maybe check that your simulated state machine works. In my experience it's easier to run with the hardware connected and just skip those tests otherwise. There are also extremely subtle bugs that can crop up with hardware interfaces, like needing to insert delays into code (e.g. when sending serial data) that will otherwise fail in the real world.
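That skip-unless-connected pattern might look like this with pytest (the env var and the PING/PONG protocol are made up for illustration):

    import os
    import time
    import pytest

    # Opt in to hardware tests only when a device is actually attached;
    # HW_PORT is a hypothetical env var pointing at the serial device.
    HW_PORT = os.environ.get("HW_PORT")

    @pytest.mark.skipif(HW_PORT is None, reason="no hardware attached")
    def test_device_echo():
        import serial  # pyserial, only needed when hardware is present
        with serial.Serial(HW_PORT, 115200, timeout=1) as port:
            port.write(b"PING\n")
            time.sleep(0.05)  # hardware often needs a settle delay here
            assert port.readline().strip() == b"PONG"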
OpenCV has some interesting approaches to this, for example testing video storage in a given format by inserting a frame with a known shape (like a circle), then reading the video back and checking that the shape can still be detected.
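A rough sketch of that round trip in Python (codec choice and detector parameters are guesses, not OpenCV's actual test):

    import numpy as np
    import cv2

    # Write one frame containing a known circle...
    frame = np.zeros((240, 320, 3), dtype=np.uint8)
    cv2.circle(frame, (160, 120), 40, (255, 255, 255), -1)
    writer = cv2.VideoWriter("roundtrip.avi",
                             cv2.VideoWriter_fourcc(*"MJPG"), 30, (320, 240))
    writer.write(frame)
    writer.release()

    # ...then read it back and check the circle survives the codec.
    ok, decoded = cv2.VideoCapture("roundtrip.avi").read()
    assert ok
    gray = cv2.cvtColor(decoded, cv2.COLOR_BGR2GRAY)
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                               param1=100, param2=30, minRadius=30, maxRadius=50)
    assert circles is not None and len(circles[0]) == 1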
The basic idea is to start with some known-good inputs and outputs, and then generate ways to modify the input that should not change the output.
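As a toy instance of that idea, a metamorphic relation where the transformation provably should not change the output:

    import math
    import random

    # Metamorphic relation: sin(x) == sin(pi - x), so rewriting the input
    # this way must leave the output unchanged (up to float tolerance).
    for _ in range(1000):
        x = random.uniform(-10, 10)
        assert math.isclose(math.sin(x), math.sin(math.pi - x), abs_tol=1e-9)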
The OpenCV example is a pretty easy one. You have clear inputs with clearly defined outputs. The only thing you have to do is to create sample data.
Yup. I work in robotics.
I try to isolate the actual hardware interaction layer so that for testing you can mock the driver and hardware as one piece. Of course that does not test the driver. With any luck, the driver is pretty stable once it works, though. And the driver+hardware piece can have its own (physical) test bench so that manual testing is, well, maybe not efficient, but at least not painful.
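A sketch of that seam, with the driver+hardware mocked as one piece (all names here are invented, not from any real driver):

    from typing import Protocol

    class MotorDriver(Protocol):
        def set_velocity(self, rad_per_s: float) -> None: ...
        def velocity(self) -> float: ...

    # The fake stands in for driver + hardware together; it doesn't test
    # the real driver, but everything above this seam becomes unit-testable.
    class FakeMotorDriver:
        def __init__(self) -> None:
            self._v = 0.0
        def set_velocity(self, rad_per_s: float) -> None:
            self._v = rad_per_s
        def velocity(self) -> float:
            return self._v

    def test_controller_reaches_setpoint():
        drv = FakeMotorDriver()
        ctrl = VelocityController(drv)  # hypothetical code under test
        ctrl.command(2.0)
        assert abs(drv.velocity() - 2.0) < 1e-6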
Simulators are great but not always available. Or are too much work to get going.
One configuration often used for robots is the "boneless chicken". Take a bench, and bolt all the guts down to it in a configuration where they are easy to probe. Put the wheel motors someplace safe, with a synthetic load like a Prony brake. Of course you can't test the nav stack that way. (I once interviewed a firmware engineer who was coming off the Juicero shutdown -- say what you want about Juicero, but from the sounds of it their boneless chicken was outstanding, even integrated into the CI automation pipeline. Of course, they didn't have the nav problem).
Speaking of nav, I once saw a warehouse robot company's micro-warehouse for testing nav PRs. Not the full test warehouse, just a 500 square foot or so area dedicated to testing nav PRs. It was integrated with CI automation. I could tell from the accumulated tire marks on the floor that they had nav pretty much nailed.
I have done several robot-to-elevator interfaces (probably more than anyone else). In the end, final testing always required something akin to a few midnight to 4 AM test blocks on the real elevator. And then of course as you point out:
> the whole system has a ton of potential interactions that are hard to write test cases for.
They often don't show up until the system is under load.
When I'm testing thermal cameras there is a sequence of things I can check to ensure that the test worked: was the command sent without errors? Did I get an error back from the camera (e.g. a CRC failure)? Does the state of the camera change as I expect it to? If all of those things check out, the likelihood is that the command was sent OK. Of course, for states you should check various permutations (e.g. shutter open and shutter closed) to make sure that you don't have a bug in your state-reading code :)
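Sketched as code, with an entirely hypothetical camera API:

    def test_shutter_command(cam):
        # 1. Command goes out without a transport error.
        reply = cam.send_command("SHUTTER_CLOSE")  # hypothetical API
        # 2. Camera didn't reject it (e.g. CRC failure).
        assert reply.status == "ACK", reply.status
        # 3. Observable state changed as expected.
        assert cam.get_state("shutter") == "closed"
        # And the permutation, to catch bugs in the state-reading path:
        cam.send_command("SHUTTER_OPEN")
        assert cam.get_state("shutter") == "open"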
Here's a stereo matching example from OpenCV. This is a case where you do have the correct answer, but you don't expect to match it exactly, and the accuracy you can tolerate varies with the algorithm:
https://github.com/opencv/opencv/blob/055645080161c6af6083b6...
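The pattern boils down to an algorithm-specific error budget against ground truth; a numpy sketch with invented thresholds:

    import numpy as np

    def check_disparity(computed, ground_truth, max_bad_fraction):
        # Fraction of pixels whose disparity error exceeds 1 px; how large
        # a fraction is acceptable depends on which matcher is under test.
        err = np.abs(computed.astype(float) - ground_truth.astype(float))
        assert np.mean(err > 1.0) <= max_bad_fraction

    # e.g. check_disparity(bm_result, gt, 0.25)    # block matching: loose
    #      check_disparity(sgbm_result, gt, 0.10)  # SGBM: tighter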
The hard part is tightening our development feedback cycle. Since we outperform all competitors, we don't have an oracle to test against. We can automate testing with a small sample of input-output pairs, but the brunt of the work is still done by humans trained and paid to judge the quality of the results. It's an awful position to be in.
I have started looking for better ways of doing it, and the most promising I've found so far is metamorphic testing, mentioned in another comment.
Property testing only takes you a short way here, as far as I've been able to figure out.
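One place it does pull its weight without an oracle is round-trip properties, e.g. with Hypothesis (zlib here is just a stand-in for the system under test):

    from hypothesis import given, strategies as st
    import zlib

    # No oracle needed: we only claim that decompress inverts compress.
    @given(st.binary())
    def test_roundtrip(data):
        assert zlib.decompress(zlib.compress(data)) == data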
(I have also glanced at the techniques used in bioinformatics, since those guys are good at comparing sequences, but that's more specific to our case and not a general solution.)
When I think about my projects "working", I always try to answer the following questions:
1) Is my code doing what I believe it should be doing? That question always has an objective answer and is the subject of software engineering testing.
2) Is my solution solving my problem efficiently? That's often a domain-specific question, and different domains have different ways of doing quality assurance; there's no silver bullet.
One more category of tests I would add is meta tests (like mutation tests). These are tests that test the tests, checking whether they would actually catch errors and bugs or would just always report that everything is fine.
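A toy illustration of the idea; real tools such as mutmut automate the mutation step:

    def clamp(x, lo, hi):                  # original implementation
        return max(lo, min(x, hi))

    def clamp_mutant(x, lo, hi):           # mutant: max/min swapped
        return min(lo, max(x, hi))

    def suite_passes(impl):
        return (impl(5, 0, 10) == 5 and
                impl(-1, 0, 10) == 0 and
                impl(11, 0, 10) == 10)

    assert suite_passes(clamp)             # tests pass on the real code
    assert not suite_passes(clamp_mutant)  # a good suite "kills" the mutant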
This is often a good idea, but if you only need the flexibility to enable unit testing, it may make your system more complex than it needs to be. Only introduce indirection where it's really needed. See also "test induced design damage" and "write tests, not too many, mostly integration".
The advice to write mostly integration tests is a terrible one, particularly when they test the integration of everything. When such tests catch bugs, they don't tell you where the problem happened. They also take a long time to execute.
For example, if your code zips something, your test could verify the output with several independent zip engines.
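For instance, assuming the system unzip is available as a second, independent engine:

    import os
    import subprocess
    import tempfile
    import zipfile

    # Differential check: create an archive with Python's zipfile, then
    # have an independent implementation verify it.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "out.zip")
        with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.writestr("hello.txt", "hello world\n")
        # zipfile's own integrity check...
        assert zipfile.ZipFile(path).testzip() is None
        # ...cross-checked by a second engine (assumes `unzip` is installed).
        subprocess.run(["unzip", "-t", path], check=True)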
1. The argument x must not be 0.
2. The variable x must be smaller than the variable y.
3. The list foo must be non-empty.
4. The variable x should have the value 'Success' at the end of the function call if it had the value 'Try' at the beginning.
These 'invariants', or assertions, can be extremely useful for testing the correctness of the code. Put simply, if an invariant is violated (during unit, integration, or system tests), it indicates that either the design or the implementation is wrong. An article on testing methodology would be more appealing if it had some discussion of exploiting invariants/assertions.
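For example, the four invariants above as plain assertions (do_work stands in for the real body):

    def step(x, y, foo, state):
        assert x != 0                      # invariant 1: argument precondition
        assert x < y                       # invariant 2: ordering precondition
        assert len(foo) > 0                # invariant 3: non-empty list
        entered_as_try = (state == "Try")
        state = do_work(x, y, foo, state)  # hypothetical implementation
        if entered_as_try:                 # invariant 4: state transition
            assert state == "Success"
        return state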
Arguably, invariants are especially powerful in testing distributed systems.
You didn't search much: https://sqlite.org/testing.html