I also don't see how a robust type system really helps there. It might even be part of the thing that needs to be refactored. Besides, many languages don't have a very robust type system.
End-to-end tests and type systems certainly have their uses, but for refactoring messy code, I don't think there's a good substitute for thorough unit tests.
Although there are different kinds of refactoring, of course. The case I'm referring to was about one very messy module, which makes it very easy to unit test. If instead it's the entire architecture of your application that needs to be refactored, then you're looking at a very different case, and end-to-end tests become more important than covering every edge case.
End-to-end (e2e) tests exercise the entire stack: deploy the site, database and all, run a script that visits a page, clicks on things, and checks whether the right things become visible.
Unit tests are the other extreme: you take a single unit of code and test whether it does what it's supposed to, while mocking all communication with other units of code.
Integration tests are in between those two extremes, and probably the least understood as a result (at least by me). They look like unit tests, and don't generally test the UI, but they don't mock the other components, and ideally also set up a database to test against.
I think there's a bit of overlap between unit and integration tests; a sloppy unit test where you don't mock (some) other components but treat them as part of the unit you're testing starts to look like an integration test at some point. If you want a clear demarcation, you might consider a database connection essential for a test to count as an integration test.
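To make that concrete, here's a minimal sketch of the unit-test end of the spectrum (UserStore, register_user and FakeStore are made-up names): the code under test only sees an interface, the test hands it an in-memory fake, and an integration test would run the same checks against an implementation of that interface backed by a real database.

    #include <cassert>
    #include <set>
    #include <string>

    // The dependency boundary: the code under test only sees this interface.
    struct UserStore {
        virtual ~UserStore() = default;
        virtual bool exists(const std::string& name) const = 0;
        virtual void add(const std::string& name) = 0;
    };

    // The "unit" under test.
    bool register_user(UserStore& store, const std::string& name) {
        if (name.empty() || store.exists(name)) return false;
        store.add(name);
        return true;
    }

    // Unit test: fake the other component entirely, no real database involved.
    struct FakeStore : UserStore {
        std::set<std::string> users;
        bool exists(const std::string& n) const override { return users.count(n) > 0; }
        void add(const std::string& n) override { users.insert(n); }
    };

    int main() {
        FakeStore store;
        assert(register_user(store, "alice"));   // first registration succeeds
        assert(!register_user(store, "alice"));  // duplicates are rejected
        assert(!register_user(store, ""));       // empty names are rejected
        // An integration test would run the same scenario against a UserStore
        // implementation that talks to an actual database.
    }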
In Ada, the programmer is discouraged from using the Ada equivalent of int directly, and is encouraged to instead introduce a subtype that reflects the specific use of int (including automatic range checking).
This isn't as natural in C++ but is still possible. Boost offers a BOOST_STRONG_TYPEDEF [0] to deliberately introduce an incompatible type. (I do recall having trouble getting it to behave, but it's been a while.)
Whether this makes sense in most mathematical code, I'm not sure, but it seems like it's an option.
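Roughly what that looks like, as a sketch (Meters and Seconds are made-up example types):

    #include <boost/serialization/strong_typedef.hpp>

    // Two deliberately incompatible wrappers around double.
    BOOST_STRONG_TYPEDEF(double, Meters)
    BOOST_STRONG_TYPEDEF(double, Seconds)

    double speed(Meters d, Seconds t) { return double(d) / double(t); }

    int main() {
        Meters d(100.0);
        Seconds t(9.58);
        double v = speed(d, t);     // compiles: arguments in the right order
        // double w = speed(t, d);  // rejected: Seconds doesn't convert to Meters
        return v > 0 ? 0 : 1;
    }

Passing the arguments in the wrong order then becomes a compile error rather than a silent bug.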
[0] https://www.boost.org/doc/libs/1_73_0/boost/serialization/st...
For an example, dig into some crypto libraries. They operate on bytes all over the place performing XORs, etc. A type system isn’t really gonna help you ensure you got the correct number of AES rounds and stitched the blocks in the right order.
IMO the only systems where this “type system eliminates most tests” philosophy seriously works are the ones that do little more than pass data between components, calling some serialization methods along the way.
Which is what 90% of programmers on this website are essentially doing. And for the remaining 10%, there is likely a better-suited, different programming language or type system available. Even if there isn't, unit tests would still be very niche, and the general case would be that by default you shouldn't be unit testing.
End-to-end tests are really slow, but if you can get them into the 300-1600 tests per second range then I have no beef. I value tests, but I seriously begrudge waiting for tests.
Also, have you ever seen a project where the code is a mess but the unit tests are perfect? Even if somehow you could write unit tests that would cover for super bad code (which you can't), it is extremely unlikely that your unit tests would be that amazing.
There's quite a lot you can refactor without breaking unit tests. If you've got a single 200-line function full of nested loops with cryptic variable names, modern IDEs make it really easy to extract those loops into their own functions. Figure out what they're meant to do, give them a descriptive name, and you've already improved the code a bit without breaking any unit tests. If your IDE does this well, you could even do this without unit tests, but you really will need those unit tests once you start reorganising the code to deal with the excessive number of parameters those extracted functions invariably end up with.
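As a tiny (hypothetical) before/after of that first step:

    #include <vector>

    // Before: a cryptic loop buried in a much larger function.
    //     double t = 0;
    //     for (size_t i = 0; i < v.size(); ++i) t += v[i] * v[i];

    // After extracting it into its own descriptively named function:
    double sum_of_squares(const std::vector<double>& values) {
        double total = 0;
        for (double value : values) total += value * value;
        return total;
    }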
Of course you can write unit tests for super bad code. If it's a function that returns something, it's trivial to unit test, no matter how badly written that function is. If it calls other code, you have to mock those calls and test that those mocks get called under the right conditions. If it messes with global variables, that's terrible, but even that can be mocked.
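For instance, a minimal sketch of that kind of test, with made-up Mailer/notify_if_overdue names: the mock just records calls so the test can check they happen under the right conditions.

    #include <cassert>
    #include <string>
    #include <vector>

    struct Mailer {
        virtual ~Mailer() = default;
        virtual void send(const std::string& to, const std::string& body) = 0;
    };

    // Imagine this is the messy legacy function under test.
    void notify_if_overdue(Mailer& mailer, int days_overdue, const std::string& email) {
        if (days_overdue > 30) mailer.send(email, "Your account is overdue");
    }

    // The mock records calls instead of sending anything.
    struct SpyMailer : Mailer {
        std::vector<std::string> recipients;
        void send(const std::string& to, const std::string&) override {
            recipients.push_back(to);
        }
    };

    int main() {
        SpyMailer spy;
        notify_if_overdue(spy, 45, "bob@example.com");
        assert(spy.recipients.size() == 1 && spy.recipients[0] == "bob@example.com");

        SpyMailer quiet;
        notify_if_overdue(quiet, 3, "bob@example.com");
        assert(quiet.recipients.empty());  // no mail when not overdue
    }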
If the code uses gotos to code outside the module, somebody needs to get shot, and I guess you need a unit testing framework that can mock those gotos. I've never seen one, because nobody uses gotos anymore.
With end-to-end tests, such as when driving a browser, it's not really easy to get things like tracebacks into the console output, for example.
If you're working on a very specific box that has well defined, well known inputs and outputs then TDD is an excellent tool.
But for anything with a non-specific "We'd like to do X and display the result on Y" it just gets in the way.
I've worked on mobile apps before with a small team, and inevitably, we'd find bugs that show up in the user interface when the user rotates their phone. It's hard to unit test for rotation changes, and it's also hard to code a rotation change into an end-to-end test on mobile. Animations are also something that's difficult to test in an automated fashion, and all the testing in the world won't be as good as showing the animation in front of a designer. So some level of manual testing is needed on mobile.
I've worked on web frontends where there would be bugs with scrolling jumping back and forth. An end-to-end test using Selenium may not catch the issue, but for a user, it can be painfully obvious. Similarly, animations are also hard to unit test on the web. So some level of manual testing is needed on the web.
The only place where I could see manual testing NOT being needed is backend development, since the inputs and outputs of a backend system are much more controlled. You could write an end-to-end test for any scenario a user could throw at your system.
In summary, don't underestimate the value of manual QA!