Testcontainers is the library that convinced me that shelling out to docker via bash calls embedded in a library is a bad abstraction. Not because containerization as an abstraction is a bad idea. Rather, it's that a library that makes custom shell calls to the docker CLI as part of its core functionality creates problems and complexity as soon as you introduce other containerized workflows. The library has the nasty habit of assuming it's running on a host machine and that nothing else docker-related is running, and footguns itself with limitations accordingly. This makes it not much better than a non-dockerized library in most cases, and oftentimes much, much worse.
Not sure this matters for the core argument you are making, just thought I’d point it out.
Currently, in my integration testing project, I use Testcontainers to spin up a PostgreSQL database in Docker and then use that for testing. I can control the database lifecycle from my test code. It works perfectly, both on my local PC and in the CI pipeline. To date it also has not interfered or conflicted with other Docker containers I have running locally (like the development database).
From what I gather, that is exactly the use case for Testcontainers. How would you solve this instead? I'm on Windows by the way, the CI pipeline is on Linux.
Edit: Just checked the documentation. According to the docs it is using the Docker API.
That's probably why you can and do use it: Jenkins. Jenkins lets you install whatever you want on the hosts, whereas more modern systems' default context is a docker container, or at least they speak it natively.
> Can you specify what issues you had?
Some of my devs have coded test containers into their ITs. These are the only pipelines we can't containerize, because test containers don't seem to work inside docker, and won't work in k8s either.
I do not know how the project has developed, but at the time I tried it, it felt very orthogonal or even incompatible with more complex (as in multi-language monorepo) projects, CDE and containerized CI approaches.
I do not know how this has developed since; the emergence of CDE standards like devcontainer and devfile might have improved the situation. Yet all projects I have started in the past 5 years were plain multilingual CDE projects based on (mostly) a compose.yml file and not much more, so no idea how widespread their usage really is.
That being said, having specific requirements for the environment of your integration tests is not necessarily bad IMO. It's just a question of checking these requirements and reporting any mismatches.
https://github.com/testcontainers/testcontainers-java/blob/m...
https://github.com/testcontainers/testcontainers-java/blob/m...
> The library has the nasty habit of assuming it’s running on a host machine and nothing else docker related is running
To be honest, given that most tests run as part of an isolated CI/CD pipeline, this is a very reasonable assumption to make.

I did not realize Rust wasn't officially supported until I went to their GitHub and saw in the readme that it's a community project, and not their "official" one.
I don't know this library, but it looks like something that I started writing myself for exactly the same reasons, so it would be great to know what's wrong with this implementation or why I shouldn't migrate to use that, thanks
- Testcontainers running in a DinD configuration is complex and harder to get right
- Testcontainers needing to network or otherwise talk with other containers not orchestrated by testcontainers
- general flakiness of tests which are harder to debug because of the library abstraction around Docker
In general if anything else in your workflow other than testcontainers also spawns and manages container lifecycle, getting it to work together with testcontainers is basically trying to reconcile two different configuration sets of containers being spawned within Docker. I think the crux of the issue is that testcontainers inverts the control of tooling. Typically containers encapsulate applications, and in this case it's the other way around. Which is not necessarily a bad thing (indeed, I am a huge proponent of using code to control containers like this), but when you introduce a level of "container-ception" by having two different methodologies like this it creates a lot of complexity and subsequent pain.
Compose is much more straightforward in terms of playing well with other stuff and being simple, but obviously isn't great for this kind of per-test setup that testcontainers excels at.
Pretty much every project I create now has testcontainers for integration testing :)
I set up CI so it lints, builds, runs unit tests, then integration tests (using testcontainers)
https://github.com/turbolytics/latte/blob/main/.github/workf...
Their language bindings provide nice helper functions for common database operations (like generating a connection uri from a container user)
https://github.com/turbolytics/latte/blob/main/internal/sour...
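For illustration, here is roughly what that helper looks like with testcontainers-go's postgres module. This is a sketch using the RunContainer/WithImage-era API; exact method names can differ between versions:

    package example_test

    import (
        "context"
        "testing"
        "time"

        "github.com/testcontainers/testcontainers-go"
        "github.com/testcontainers/testcontainers-go/modules/postgres"
        "github.com/testcontainers/testcontainers-go/wait"
    )

    func TestConnectionURIHelper(t *testing.T) {
        ctx := context.Background()

        // Start a throwaway postgres and wait until it reports readiness in its log.
        pg, err := postgres.RunContainer(ctx,
            testcontainers.WithImage("postgres:15.3-alpine"),
            postgres.WithDatabase("test-db"),
            postgres.WithUsername("postgres"),
            postgres.WithPassword("postgres"),
            testcontainers.WithWaitStrategy(
                wait.ForLog("database system is ready to accept connections").
                    WithOccurrence(2).
                    WithStartupTimeout(30*time.Second)),
        )
        if err != nil {
            t.Fatal(err)
        }
        t.Cleanup(func() { _ = pg.Terminate(ctx) })

        // The helper resolves the host and the randomly mapped port for you.
        uri, err := pg.ConnectionString(ctx, "sslmode=disable")
        if err != nil {
            t.Fatal(err)
        }
        t.Log(uri) // e.g. postgres://postgres:postgres@localhost:32768/test-db?sslmode=disable
    }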
I use them at my $day job, I use them in side projects, I use them everywhere :)
You mention this as an afterthought but that's the critical feature. Giving developers the ability to run integration tests locally is a massive win in a "ball of mud" environment. There are other ways to accomplish this locally, but the test-infrastructure-as-test-code approach is a powerful and conceptually elegant abstraction, especially when used as a tool to design testcontainers for your own services that can be imported as packages into dependent services.
Test containers provide a middle ground.
For example we have pure unit tests. But also some tests that boot up Postgres, test the db migration, and give you a db to play with for your specific “unit” test case.
No need for a complete environment with Kafka etc. It provides a cost effective stepping stone to what you describe.
What would be nice is if test containers could create a complete environment on the test machine and delete it again.
Still, a deploy with some smoke tests on a real env is nice.
To test integration with 1 dependency at class level we can use test containers.
But to test the integration of the whole microservice with other microservices + dependencies we use a test environment and some test code. It's a bit like an E2E test for an API.
I would argue that the test environment is more useful if I had to choose between the two, as it can test the service contract fully, unlike lower-level testing, which requires a lot of mocking.
I advocate for not having any integrated environment for automated testing at all. The aim should be to be able to run all tests locally and get a quicker feedback loop.
I don't know why integration testing like this is considered a gamechanger. The testing pyramid is a testing pyramid for a reason, and it has always considered these tests important. Sometimes starting with integration tests in your project is right because you don't waste time doing manual point-and-clicks. Instead you design your system around being able to integration test, and that includes when you choose dependencies. You think to yourself "how easily will that be able to be stood up on its own from a command?" If the answer is "not very easily" then you move on.
Meanwhile, Testcontainers is done quite well. It's not perfect, but it's sure better than the in-house stuff I built in the past (for the same basic concept).
No, it does not start faster than other Docker containers.
I do challenge the testing pyramid, though. At the risk of repeating my other comment on a different branch of the discussion: the value of integration tests is high, and as the cost of integration tests has decreased, it makes sense to do more integration testing at the expense of unit testing. The cost has decreased precisely because of Docker and mature application frameworks (in Java, for example, Spring). (See: Testing Trophy.)
https://www.docker.com/blog/docker-whale-comes-atomicjar-mak...
This seems to be quite a contradiction. If it's so easy to just write from scratch, then why would it be scary to depend on? Of course, it's not that easy to write from scratch. You could make a proof-of-concept in maybe an hour... Maybe. But they already took the proof of concept to a complete phase. Made it work with Podman. Added tons of integration code to make it easy to use with many common services. Ported it to several different languages. And, built a community around it.
If you do this from scratch, you have to go through most of the effort and problems they already did, except if you write your own solution, you have to maintain it right from the get-go, whereas if you choose Testcontainers, you'll only wind up having to maintain it if the project is left for dead and starts to bitrot. The Docker API is pretty stable though, so honestly, this doesn't seem likely to be a huge issue.
Testcontainers is exactly the sort of thing open source is great for; it's something where everyone gets to benefit from the wisdom and battle-testing of everyone else. For most of the problems you might run into, there is a pretty decent chance someone already did, so there's a pretty decent chance it's already been fixed.
Most people have GitHub and Dockerhub dependencies in their critical dependency path for builds and deployment. Services go down, change their policies, deprecate APIs, and go under, but code continues to work if you replicate the environment it originally worked in. The biggest risk with code dependencies (for non-production code like test code) is usually that it blocks you from updating some other software. The biggest risk with services is that they completely disappear and you are completely blocked until you fully remove the dependency.
I think people depending on Testcontainers are fine and doing very well with their risk analysis.
I believe the next step, once using test containers, would be automating data generation and validation. Then you will have an automated pipeline of integration tests that are independent, fast and reliable.
I find integration tests that exercise actual databases/Elasticsearch/Redis/Varnish etc to be massively more valuable than traditional unit tests. In the past I've gone to pretty deep lengths to do things like spin up a new Elasticsearch index for the duration of a test suite and spin it down again at the end.
It looks like Testcontainers does all of that work for me.
My testing strategy is to have as much of my application's functionality covered by proper end-to-end integration-style tests as possible - think tests that simulate an incoming HTTP request and then run assertions against the response (and increasingly Playwright-powered browser automation tests for anything with heavy JavaScript).
I'll use unit tests sparingly, just for the bits of my code that have very clear input/output pairs that afford unit testing.
I only use mocks for things that I don't have any chance of controlling - calls to external APIs for example, where I can't control if the API provider will be flaky or not.
Unit tests are great, but if you significantly refactor how several classes talk to each other, and each of those classes had their own, isolated unit tests that mocked out all of the others, you're suddenly refactoring with no tests. But a black box integration tests? Refactor all your code, replace your databases, do whatever you want, integration test still passes.
Unit test speed is a huge win, and they're incredibly useful for quickly testing weird little edge cases that are annoying to write integration tests for, but if I can write an integration test for it, I prefer the integration test.
Even better? Take your integration test, put it on a cronjob in your VPN/vpc, use real endpoints and make bespoke auth credentials + namespace, and now you have canaries. Canaries are IMHO God tier for whole system observability.
Then take your canary, clean it up, and now you have examples for documentation.
Unit tests are for me mostly testing domain+codomain of functions and adherence to business logic, but a good type system, along with discipline for actually making schemas/POJOs etc instead of just tossing around maps, strings, and ints everywhere, already accomplishes a lot of that (still absolutely needed though!)
Sure, integration tests "save" you from writing pesky unit tests, and changing them frequently after every refactor.
But how do you quickly locate the reason that integration test failed? There could be hundreds of moving parts involved, and any one of them malfunctioning, or any unexpected interaction between them, could cause it to fail. The error itself would likely not be clear enough, if it's covered by layers of indirection.
Unit tests give you that ability. If written correctly, they should be the first to fail (which is a good thing!), and if an integration test fails, it should ideally also be accompanied by at least one unit test failure. This way it immediately pinpoints the root cause.
The higher up the stack you test, the harder it is to debug. With E2E tests you're essentially debugging the entire system, which is why we don't exclusively write E2E tests, even though they're very useful.
To me the traditional test pyramid is still the best way to think about tests. Tests shouldn't be an afterthought or a chore. Maintaining a comprehensive and effective test suite takes as much hard work as, if not more than, maintaining the application itself, and it should test all layers of the system. But if you do have that, it gives you superpowers to safely and reliably work on any part of the system.
For example, assuming you have a test database with realistic data (or scrubbed production data), write tests that are based on generalizable business rules, e.g. the total line of an 'invoice' GET response should be the sum of all the 'sections' endpoint responses tied to that invoice id. Then, just have a process that runs before the tests and creates a bunch of test cases (invoice IDs to try), randomly selected from all the IDs in the database. Limit the number of cases to something reasonable for total test duration.
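A sketch of what one such rule-based test could look like in Go. The endpoints, field names, base URL, and the invoiceIDs sampling step are all hypothetical:

    package invoices_test

    import (
        "encoding/json"
        "fmt"
        "math"
        "net/http"
        "testing"
    )

    // Hypothetical response shapes; the real field names depend on the API.
    type invoice struct {
        Total float64 `json:"total"`
    }
    type section struct {
        Total float64 `json:"total"`
    }

    const baseURL = "http://localhost:8080" // test server under test; assumption

    // invoiceIDs would be filled by a pre-test step that samples random IDs from the
    // test database, capped to keep the total run time reasonable.
    var invoiceIDs []int

    func getJSON(t *testing.T, url string, out any) {
        t.Helper()
        resp, err := http.Get(url)
        if err != nil {
            t.Fatal(err)
        }
        defer resp.Body.Close()
        if err := json.NewDecoder(resp.Body).Decode(out); err != nil {
            t.Fatal(err)
        }
    }

    func TestInvoiceTotalEqualsSumOfSections(t *testing.T) {
        for _, id := range invoiceIDs {
            var inv invoice
            getJSON(t, fmt.Sprintf("%s/invoices/%d", baseURL, id), &inv)

            var sections []section
            getJSON(t, fmt.Sprintf("%s/invoices/%d/sections", baseURL, id), &sections)

            var sum float64
            for _, s := range sections {
                sum += s.Total
            }
            // Loose tolerance on purpose; overly tight assertions create false alarms.
            if math.Abs(inv.Total-sum) > 0.01 {
                t.Errorf("invoice %d: total %.2f != sum of sections %.2f", id, inv.Total, sum)
            }
        }
    }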
As one would expect, overly tight assertions can often lead to many false positives, but really tough edge cases hidden in diverse/unexpected data (null refs) can be found that usually escape the artificial or 'happy path' pre-selected cases.
Testing that you actually run "sum()" is a unit test.
I’d prefer a dozen well written integration tests over a hundred unit tests.
Having said that, both solve different problems, ideally you have both. But when time-constrained, I always focus on integration tests with actual services underneath.
> Creating reliable and fully-initialized service dependencies using raw Docker commands or using Docker Compose requires good knowledge of Docker internals and how to best run specific technologies in a container
This sounds like a <your programming language> abstraction over docker-compose, which lets you define your docker environment without learning the syntax of docker-compose itself. But then
> port conflicts, containers not being fully initialized or ready for interactions when the tests start, etc.
means you'd still need a good understanding of docker networking, dependencies, healthchecks to know if your test environment is ready to be used.
Am I missing something? Is this basically just changing what starts your docker test containers?
Shows how you can embed the declaration of db for testing in a unit test:
    pgContainer, err := postgres.RunContainer(ctx,
        testcontainers.WithImage("postgres:15.3-alpine"),
        postgres.WithInitScripts(filepath.Join("..", "testdata", "init-db.sql")),
        postgres.WithDatabase("test-db"),
        postgres.WithUsername("postgres"),
        postgres.WithPassword("postgres"),
        testcontainers.WithWaitStrategy(
            wait.ForLog("database system is ready to accept connections").
This does look quite neat for setting up test specific database instances instead of spawning one outside of the test context with docker(compose). It should also make it possible to run tests that require their own instance in parallel.
    pgContainer, err := postgres.RunContainer(ctx,
        testcontainers.WithImage("postgres:15.3-alpine"),
        postgres.WithInitScripts(filepath.Join("..", "testdata", "init-db.sql")),
        postgres.WithDatabase("test-db"),
        postgres.WithUsername("postgres"),
        postgres.WithPassword("postgres"),
        testcontainers.WithWaitStrategy(
            wait.ForLog("database system is ready to accept connections").

A better approach is to create a single postgres server one time before running all of your tests. Then, create a template database on that server, and run your migrations on that template. Now, for each unit test, you can connect to the same server and create a new database from that template. This is not a pain in the ass and it is very fast: you run your migrations one time, and pay a ~20ms cost for each test to get its own database.
I've implemented this for golang here — considering also implementing this for Django and for Typescript if there is enough interest. https://github.com/peterldowns/pgtestdb
Indeed all they do is provide an abstraction for your language, but this is soo useful for unit/integration tests.
At my work we have many microservices in both Java and Python, all of which use testcontainers to set up the local env or integration tests. The integration with localstack, and the ability to programmatically set it up without fighting with compose files, is something I find very useful.
I ask because I really like this and would love to use it, but I'm concerned that that would add just an insane amount of overhead to the point where the convenience isn't worth the immense amount of extra time it would take.
I built a new service registry recently, its unit tests spins up a zookeeper instance for the duration of the test, and then kills it.
Also very nice with databases. Spin up a clean db, run migrations, then test db code with zero worries about accidentally leaving stuff in a table that poisons other tests.
I guess the killer feature is how well it works.
Are you spinning up a new instance between every test case? Because that sounds painfully slow.
I would just define a function which DELETEs all the data and call it between every test.
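A minimal sketch of that reset function, assuming database/sql, Postgres, and hypothetical table names (using TRUNCATE instead of per-table DELETEs, since it also resets sequences and is usually faster):

    package dbtest

    import (
        "database/sql"
        "testing"
    )

    // resetData wipes the tables the tests touch; call it between test cases.
    func resetData(t *testing.T, db *sql.DB) {
        t.Helper()
        if _, err := db.Exec(
            `TRUNCATE TABLE orders, order_items, customers RESTART IDENTITY CASCADE`,
        ); err != nil {
            t.Fatal(err)
        }
    }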
- on a Mac
- on a Linux VM
- in a Docker container on a Linux VM, with a Docker socket mounted
The networking for each of these is completely different. I had to make some opinionated choices to get code that could run in all cases. And running inside Docker prevented the test from being able to mount arbitrary files into the test containers, which turns out to be a requirement often. I ended up writing code to build a new image for each container, using ADD to inject files.
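For what it's worth, the shape of that workaround is roughly this (a hypothetical helper that shells out to `docker build`; not how testcontainers itself does it):

    package imagetest

    import (
        "fmt"
        "os"
        "os/exec"
        "path/filepath"
        "testing"
        "time"
    )

    // buildTestImage bakes files into a throwaway image with ADD instead of bind-mounting
    // them, which doesn't work when the tests themselves run inside a container that only
    // has the Docker socket mounted. files maps target path in the image -> source path on disk.
    func buildTestImage(t *testing.T, baseImage string, files map[string]string) string {
        t.Helper()
        dir := t.TempDir()

        dockerfile := "FROM " + baseImage + "\n"
        for target, src := range files {
            data, err := os.ReadFile(src)
            if err != nil {
                t.Fatal(err)
            }
            name := filepath.Base(src)
            if err := os.WriteFile(filepath.Join(dir, name), data, 0o644); err != nil {
                t.Fatal(err)
            }
            dockerfile += fmt.Sprintf("ADD %s %s\n", name, target)
        }
        if err := os.WriteFile(filepath.Join(dir, "Dockerfile"), []byte(dockerfile), 0o644); err != nil {
            t.Fatal(err)
        }

        tag := fmt.Sprintf("testimage:%d", time.Now().UnixNano())
        if out, err := exec.Command("docker", "build", "-t", tag, dir).CombinedOutput(); err != nil {
            t.Fatalf("docker build failed: %v\n%s", err, out)
        }
        return tag
    }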
I also wanted all the tests to run in parallel and spit out readable logs from every container (properly associated with the correct test).
Not sure if any of these things have changed in testcontainers since I last looked, but these are the things I ran into. It took maybe a month of off and on tweaking, contrary to some people here claiming it can be done in an hour. As always, the devil is in the details.
edit: I did end up stealing ryuk. That thing can’t really be improved upon.
What does that mean in this case? What does a hand rolled version of this look like?
My version is more opinionated than testcontainers and can really only be used inside Go tests (relies on a testing.TB)
Wait what? They think you don't need unit tests because you can run integration tests with containers?
It's trivial to set up a docker container with one of your dependencies, but starting containers is painful and slow.
2) While unit tests are cheaper and quicker than (project-level) integration tests, they also in many cases don't provide as good a result and level of confidence, because a lot of run-time aspects (serialization, HTTP responses, database responses, etc.) are not as straightforward to mock. There's been some noise about the Testing Trophy, instead of the Testing Pyramid, where, in short, there are still unit tests where it makes sense, but a lot of testing has moved to the (project-level) integration tests. These are slower, but only by so much that the trade-off is often worth it. Whether it's worth it depends heavily on what you're testing. If it's a CRUD API: I use integration tests. If it's something algorithmic, or string manipulation, etc.: I use unit tests.
When I saw the Testing Trophy presented, it came with the asterisk that (project-level) integration testing has gotten easier and cheaper over time, thus allowing a shift in trade-off. Testcontainers is one of the primary reasons why this shift has happened. (And... I respect that it's not for everyone.)
Some references: https://kentcdodds.com/blog/the-testing-trophy-and-testing-c... https://www.youtube.com/watch?v=t-HmXomntCA
But on the test philosophy angle, my take on what's happening is just that developers traditionally look for any reason to skip tests. I've seen this in a few different forms.
- right now containers make it trivial to run all of your dependencies. That's much easier than creating a mock or a fake, so we do that and don't bother creating a mock/fake.
- compiler folks have created great static analysis tools. That's easier than writing a bunch of tests, so we'll just assume static analysis will catch our bugs for us.
- <my language>'s types system does a bunch of work type checking, so I don't need tests. Or maybe I just need randomly generated property tests.
- no tests can sufficiently emulate our production environment, so tests are noise and we'll work out issues in dev and prod.
What I've noticed, though, looking across a wide number of software projects, is that there's a clear difference in quality between projects that have a strong testing discipline and those that convince themselves they don't need tests because of <containers, or types, or whatever else>.
Sure it's possible that tests don't cause the quality difference (maybe there's a third factor for example that causes both). And of course if you have limited resources you have to make a decision about which quality assurance steps to cut.
But personally I respect a project more if they just say they don't have the bandwidth to test properly so they're just skipping to the integration stage (or whatever) rather than convince themselves that those tests weren't important any way. Because I've seen so many projects that would have been much better with even a small number of unit tests where they only had integration tests.
Note, not integration, E2E. I can go from bare vm to fully tested system in under 15 minutes. I can re run that test in 1-5 (depending on project) ...
I'm creating 100s of records in that time, and fuzzing a lot of data entry. I could get it to go "even faster" if I went in and removed some of the stepwise testing... A->B->C->D could be broken out to a->b, a->c, a->d.
Because my tests are external, they would be durable across a system re-write (if I need to change language, platform etc). They can also be re-used/tweaked to test system perf under load (something unit tests could never do).
Yeah, those are called end to end tests and you run them after integration tests which you run after unit tests. It sounds to me like they're saying just skip to the end to end tests.
> For many apps and use cases, the overhead in managing container state is worth it.
Yeah, and typically you'd run them after you run unit and integration tests. If I have 10 libraries to test that have database access, I have to run 10 database containers simultaneously every few minutes as part of the development process? That's overkill.
Testcontainers is awesome and all the hate it gets here is undeserved.
Custom shell scripts definitely can't compete.
For example one feature those don't have is "Ryuk": A container that testcontainers starts which monitors the lifetime of the parent application and stops all containers when the parent process exits.
It allows the application to define dependencies for development, testing, and CI itself, without needing to manually run some command to bring up docker compose beforehand.
One cool usecase for us is also having an ephemeral database container that is started in a Gradle build to generate jOOQ code from tables defined in a Liquibase schema.

Especially if there are complex dependencies between required containers it seems to be pretty weak in comparison. But I also only used it like 5 years ago, so maybe things are significantly better now.
However, the new tests could not be run in parallel with the existing ones, as the changes in global state in the database caused flaky failures. I know there will be other tests like them in the future, so I want a robust way of writing these kinds of "global" tests without too much manual labor.
Spinning up a new postgres instance for each of these specific tests would be one solution.
I would like to instead go for running the tests inside of transactions, but that comes with its own sorts of issues.
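For what that could look like, a minimal Go sketch assuming database/sql:

    package txtest

    import (
        "database/sql"
        "testing"
    )

    // withTx gives each test its own *sql.Tx that is rolled back when the test finishes,
    // so nothing it writes is visible to other tests running against the same database.
    // One of the issues alluded to above: code under test that opens its own connections
    // or manages its own transactions won't see this transaction's writes.
    func withTx(t *testing.T, db *sql.DB) *sql.Tx {
        t.Helper()
        tx, err := db.Begin()
        if err != nil {
            t.Fatal(err)
        }
        t.Cleanup(func() { _ = tx.Rollback() })
        return tx
    }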
Integration tests are fine, but they test something else - that your component integrates as intended with <something>, while a unit test moreso tests that your unit behaves in accordance with its specification.
It's pretty much required when you want to set up/tear down in between tests though. This just usually isn't the case for me.
But I need it to catch the bugs you commit to CI so you can fix them right away, instead of letting me catch them and report them and wait, which wrecks my productivity.
(this is of course not directed at you personally, feel free to replace you/I/me with whatever names you can imagine!)
That wheel file is only 2.9KB, so I grabbed a copy to see how it works. I've put the contents in a Gist here: https://gist.github.com/simonw/c53f80a525d573533a730f5f28858...
It's pretty neat - it depends on testcontainers-core, sqlalchemy and psycopg2-binary and then defines a PostgresContainer class which fires up a "postgres:latest" container and provides a helper function for getting the right connection URL.
See https://github.com/cogini/phoenix_container_example for a full example. This blog post describes it in detail: https://www.cogini.com/blog/breaking-up-the-monolith-buildin...
You often need to add custom behavior like waiting for the app to load and start serving, healthchecks, etc. Having it all in code is pretty useful, and it's self-contained within the code itself vs having to set up the environment in different places (CI, Github actions, local dev, etc).
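For instance, with testcontainers-go that kind of readiness logic lives right next to the test. A sketch using the generic container API; the image name and health endpoint are made up:

    package apptest

    import (
        "context"
        "testing"

        "github.com/testcontainers/testcontainers-go"
        "github.com/testcontainers/testcontainers-go/wait"
    )

    func startApp(t *testing.T) testcontainers.Container {
        t.Helper()
        ctx := context.Background()
        c, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
            ContainerRequest: testcontainers.ContainerRequest{
                Image:        "myorg/myapp:latest", // hypothetical image
                ExposedPorts: []string{"8080/tcp"},
                // Don't hand the container to the test until the app actually serves traffic.
                WaitingFor: wait.ForHTTP("/health").WithPort("8080/tcp"),
            },
            Started: true,
        })
        if err != nil {
            t.Fatal(err)
        }
        t.Cleanup(func() { _ = c.Terminate(ctx) })
        return c
    }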
The negative is that code isn't portable to prod, it doesn't test your environment as well (important for staging), and you're missing out on sharing some environment settings.
I feel like it definitely has its place in the stack and in certain companies.
Things in the software world are very trendy. If this starts a trend of making people think that they're writing unit tests when they are writing integrations tests, we are fucked.
If I need to change code that you wrote I need a lightning fast way to figure out that I haven't broken your code according to the tests that you wrote. That's unit tests.
My changes might break the whole system. That's integration tests. I just need to run that once, and then I can go back to unit tests while I fix the mess I've made.
https://github.com/juspay/services-flake
We actually do this in Nammayatri, an OSS project providing "Uber" for autos in India.
https://github.com/nammayatri/nammayatri
There is a services-flake module allowing you to spin up the entire nammayatri stack (including postgres, redis, etc.) using a flake app. Similarly, there's one for running load tests, which is also run in Jenkins CI.
With Testcontainers I've found that running integration tests / a single test repeatedly locally is extremely slow, as the containers are shut down when the java process is killed. This approach avoids that while also keeping things consistent - for example, just mount the migrations folder into the startup volume of your DB container and you have a like-for-like schema of your prod DB ready for integration tests.
I've found the https://github.com/avast/gradle-docker-compose-plugin/ very useful for this.
One small note: test run time will probably increase. If a person has an outdated computer, I suspect they will have a hard time running the IT suite. Especially if it’s a complicated system with more than one dependency.
Except where everyone is saying that's too slow, and instead they have a long-lived instance which they manually tear down each time. That's even what the examples do (some, at least, I didn't check them all).
If you've already bought into the container world then why not embrace a few more. For everyone else, not sure there's much point in extra complexity (they call it simplicity) or bloat.
Why not keep this information in code .. often the developers end up doing those tasks anyway. (not recommended .. but seen it so many times)
Link: Microsoft aspire (https://learn.microsoft.com/en-us/dotnet/aspire/get-started/...)
Go has a lot of in-memory versions of things for tests, which run so much quicker than leaning on docker. Similarly, I found C# has in-memory versions of deps you can lean on.
I really feel that test containers, although solving a problem, often introduces others for no great benefit
That's an integration test. These are integration tests. You're literally testing multiple units (e.g., Redis, and the thing using Redis) to see if they're integrating.
Why do we even have words.
These are valuable in their own right. They're just complicated & often incredibly slow compared to a unit test. Which is why I prefer mocks, too: they're speedy. You just have to get the mock right … and that can be tricky, particularly since some APIs are just woefully underdocumented, or the documentation is just full of lies. But the mocks I've written in the past steadily improve over time. Learn to stop worrying, and love each for what they are.
(Our CI system actually used to pretty much directly support this pattern. Then we moved to Github Actions. GHA has "service containers", but unfortunately the feature is too basic to address real-world use cases: it assumes a container image can just … boot! … and only talk to the code via the network. Real world use cases often require serialized steps between the test & the dependencies, e.g., to create or init database dirs, set up certs, etc.)
My biased recommendation is to write a custom Dagger function, and run it in your GHA workflow. https://dagger.io
If you find me on the Dagger discord, I will gladly write a code snippet summarizing what I have in mind, based on what you explained of your CI stack. We use GHA ourselves and use this pattern to great effect.
Disclaimer: I work there :)
Testcontainers does have a docker compose integration [1].
https://www.redhat.com/sysadmin/kubernetes-workloads-podman-... << as a for instance ;)
I use a very similar thing via pytest-docker: https://github.com/avast/pytest-docker The only difference seems to be you declare your containers via a docker-compose file which I prefer because it's a standard thing you can use elsewhere.
I don't like layering abstractions on top of abstractions that were fine to begin with. Docker-compose is pretty much perfect for the job. An added complexity is that the before/after semantics of the test suite in things like JUnit are a bit handwavy and hard to control. Unlike testng, there's no @BeforeSuite (which is really what you want). The @BeforeAll that junit has is actually too late in the process to be messing around with docker. And more importantly, if I'm developing, I don't want my docker containers to be wasting time restarting in between tests. That's 20-30 seconds I don't want to add on top of the already lengthy runtime of compiling/building, firing up Spring and letting it do its thing before my test runs in about 1-2 seconds.
All this is trivially solved by doing docker stuff at the right time: before your test process starts.
So, I do that using good old docker compose and a simple gradle plugin that calls it before our tests run and then again to shut it down right after. If it's already running (it simply probes the port) it skips the startup and shut down sequence and just leaves it running. It's not perfect but it's very simple. I have docker-compose up most of my working day. Sometimes for days on end. My tests don't have to wait for it to come up because it's already up. On CI (github actions), gradle starts docker compose, waits for it to come up, runs the tests, and then shuts it down.
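The probe itself is tiny; roughly this, as a Go sketch of the idea (in the setup described it's the gradle plugin doing it, and the host/port are assumptions):

    package compose_test

    import (
        "net"
        "time"
    )

    // composeAlreadyUp: if something is already listening on the database port, assume the
    // compose stack is running and skip the start/stop sequence entirely.
    func composeAlreadyUp() bool {
        conn, err := net.DialTimeout("tcp", "localhost:5432", 500*time.Millisecond)
        if err != nil {
            return false
        }
        _ = conn.Close()
        return true
    }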
This has another big advantage: the process of running a standalone development server for manual testing, running our integration tests, and running our production server are very similar. Exactly the same, actually; the only difference is configuration and some light bootstrapping logic (schema creation). Configuration basically involves telling our server the hosts and ports of all the stuff it needs to run. Which in our case is postgres, redis, and elasticsearch.
Editing the setup is easy; just edit the docker compose and modify some properties. Works with jvm based stuff and it's equally easy to replicate with other stuff.
There are a few more tricks I use to keep things fast. I have ~300 integration tests that use db, redis, and elasticsearch. They run concurrently in under 1 minute on my mac. I cannot emphasize how important fast integration tests are as a key enabler for developer productivity. Enabling this sort of thing requires some planning but it pays off hugely.
I wrote up a detailed article on how to do this some years ago. https://www.jillesvangurp.com/blog/2016-05-25-functional-tes...
That's still what I do a few projects and companies later.
We also want homogeneity in tech when possible (we already heavily use kubernetes, we don't want to keep docker hosts anymore).
Teams of testers need to be accounted for in terms of resource quotas and RBAC.
What exactly do you see as an overkill in wanting to run short-lived containers in kubernetes rather than in docker (if we already have kubernetes and "cook" it ourselves)?
Integration tests are significantly more complicated to write and take longer to run.
There are also race conditions that you need to account for programmatically, such as waiting for a db to come up and the schema to be applied, or waiting for a specific event to occur in the daemon.
That being said, this looks like a decent start. One thing that seems to be missing is the ability to tail logs and assert specific marks in the logs. Often you need to do an operation and wait until you see an event.
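For comparison, testcontainers covers part of this: wait.ForLog handles "don't start until you see X" at startup, and the Go binding has a log-consumer hook for tailing output mid-test. A rough sketch using the older log-producer API (this API has been reshuffled in recent releases, so treat the method names as an assumption):

    package logtest

    import (
        "context"
        "strings"
        "time"

        "github.com/testcontainers/testcontainers-go"
    )

    // markWaiter collects container log lines and signals when an expected mark shows up.
    type markWaiter struct {
        mark string
        seen chan struct{}
    }

    func (w *markWaiter) Accept(l testcontainers.Log) {
        if strings.Contains(string(l.Content), w.mark) {
            select {
            case w.seen <- struct{}{}:
            default:
            }
        }
    }

    // waitForLogMark tails a running container's output until the mark appears or the timeout hits.
    func waitForLogMark(ctx context.Context, c testcontainers.Container, mark string, timeout time.Duration) bool {
        w := &markWaiter{mark: mark, seen: make(chan struct{}, 1)}
        c.FollowOutput(w)
        if err := c.StartLogProducer(ctx); err != nil {
            return false
        }
        defer c.StopLogProducer()

        select {
        case <-w.seen:
            return true
        case <-time.After(timeout):
            return false
        }
    }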