It's built by the creators of Puppeteer, which came out of the Chrome team. Some things I like about it:
1. It's reliable and implements auto-waiting as described in the article. You can use modern async/await syntax, and it ensures elements are attached to the DOM, visible, stable (not animating), able to receive events, and enabled (a minimal sketch follows this list): https://playwright.dev/docs/actionability
2. It's fast: it spawns multiple worker processes and runs tests in parallel, unlike e.g. Cypress.
3. It's cross-browser: it supports Chromium, Firefox, and WebKit (the engine behind Safari) out of the box.
4. The tracing tools are incredible: you can step through the entire test execution, get a live DOM snapshot that you can inspect with your browser's existing developer tools, see all console.logs, and so on.
5. The developers and community are incredibly responsive. This is one of the biggest ones: issues are quickly responded to and addressed, often by the founders; pull requests are welcomed; and the Slack is highly active and respectful.
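To make points 1 and 2 concrete, here's a minimal sketch of a Playwright test; the URL, selectors, and expected text are hypothetical, but the auto-waiting behavior is built into every action:

```ts
// checkout.spec.ts: a minimal Playwright test. The URL, selectors, and
// expected text are made-up examples.
import { test, expect } from '@playwright/test';

test('completes a checkout', async ({ page }) => {
  await page.goto('https://example.com/checkout');

  // click() auto-waits until the button is attached to the DOM, visible,
  // stable, enabled, and able to receive events; no manual sleeps needed.
  await page.getByRole('button', { name: 'Pay now' }).click();

  // Web-first assertions retry until the condition holds or the test
  // times out.
  await expect(page.getByText('Order confirmed')).toBeVisible();
});
```

Parallelism (point 2) is controlled via the `workers` and `fullyParallel` options in playwright.config.ts, and traces (point 4) can be recorded with `npx playwright test --trace on` and opened with `npx playwright show-trace`.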
My prior experience with end-to-end tests was that they were highly buggy and unreliable, so Playwright was a welcome surprise and inspired me to fully test every variation of our checkout flow.
Cypress was a surprisingly nice experience as well and led me to research other modern e2e tools. Most of the points above can be compared against Cypress: Playwright supports parallel execution of tests within the same file on the same machine, which Cypress doesn't, so it is much faster. Cypress doesn't use modern async/await syntax. And due to its architecture, Playwright can test across multiple tabs and work with iframes easily, which Cypress can't.
The UI for Cypress's developer tools is nice, but... as I said, Playwright's tracing UI is really excellent, and the documentation is also really well done. This is also a personal thing, but I trust tools that came out of browser teams (Chrome) to emulate browsers in a more efficient way, e.g. spinning up cheap, isolated browser contexts in Chrome, getting the details of waiting for an element to be ready right, etc.
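As an illustration of those cheap contexts and the cross-tab support mentioned above, here's a sketch using Playwright's library API; the URLs and link name are invented:

```ts
// Isolated browser contexts and cross-tab testing in Playwright.
// URLs and the link name are made-up examples.
import { chromium } from 'playwright';

(async () => {
  const browser = await chromium.launch();

  // Each context is an isolated, incognito-like profile (own cookies,
  // storage, cache) and is far cheaper to create than a new browser.
  const userA = await browser.newContext();
  const userB = await browser.newContext();
  const pageA = await userA.newPage();
  const pageB = await userB.newPage();
  await pageA.goto('https://example.com/login');
  await pageB.goto('https://example.com/login');

  // Cross-tab: capture a popup opened by the page under test.
  const [popup] = await Promise.all([
    pageA.waitForEvent('popup'),
    pageA.getByRole('link', { name: 'Terms' }).click(),
  ]);
  console.log(await popup.title());

  await browser.close();
})();
```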
Another post on this: https://alisterbscott.com/2021/10/27/five-reasons-why-playwr...
Are you discovering these for the first time? Great, happy that you are getting exposed to them! If you read these and think, "we could utilize these concepts with our engineers (test or not)," I would encourage you to look at it from an organizational perspective. You may want to add someone with these skill sets to your team(s). Most automation testers understand these concepts well and can help you with the next-level maturity items.
If you have one or more integrations to external systems where you cannot control your test data, it becomes much harder to write stable E2E tests.
Some don't have test environments, some have too few. Most don't allow you to set up data easily either way.
You can, of course, mock the external systems, but if they play a large enough part, your tests start looking more like integration tests again, but with the added overhead of something like browser automation.
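For reference, the network-level stubbing itself is straightforward in Playwright; this sketch fakes a third-party payments API (the endpoint and payload are invented):

```ts
import { test } from '@playwright/test';

test('checkout with a stubbed payment provider', async ({ page }) => {
  // Intercept calls to the (invented) external payments host and answer
  // with canned data, so the test no longer depends on external test data.
  await page.route('**/payments.example.com/**', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ status: 'approved', id: 'fake-123' }),
    })
  );

  await page.goto('https://example.com/checkout');
  // ...drive the UI as usual; requests to the payment host now hit the stub.
});
```

The tradeoff described above still applies, though: the more of the system you stub this way, the less end-to-end the test really is.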
It's a hard balance to strike.
I suspect this is a good idea, but it raises some red flags for me. People may not want to fix tests if they don't feel they have time, or if they don't believe fixing tests will help their promotion (i.e. a culture problem). Of course, if you have a good engineering culture, this is probably a useful signal for which tests to remove.
One pattern that we can apply to increase visibility and ownership is stability metrics. Whether a test must or merely should be fixed can be teased out once you can view these metrics. On failure, display how the test has performed in this configuration over the past x runs:
- Passed the last 100 runs? High likelihood the test is highlighting a real bug and must be engaged on.
- 95% pass rate in the last 100 runs? It may be time to quarantine the test and add it to the remediation backlog.
Your acceptable false-positive rate may differ depending on team velocity and suite runtimes.
"How many tests are in quarantine, what is the average time-to-fix, and what direction is this trending" are valuable metrics that we can utilize to find ownership and highlight the technical debt.
As you said, culture around such patterns isn't always there.
I'm a solopreneur building an app with Flutter. Flutter's testing support is mostly broken and/or unwritten. It's very frustrating.
A lot of the reliability in that system came from being able to quickly iterate on different levels of the system. The easier it is to solve a failure where it's happening, the more likely bugs get fixed quickly, and the healthier the system stays (as opposed to suffering from entropy and tech debt).
Especially in mobile/web applications, where you are often consuming loads of services/libraries/SDKs (some in-house, some external), you are often running only a tiny amount of your own code. Adding tonnes of unit tests to that is sort of missing the big picture: you need to test that it all works together, as a user would experience it.
On the spectrum from isolated to realistic, unit tests fall on the far left; workload tests/E2E tests/testing-in-production fall on the far right.
It turns out that there's no 'wrong' level; there are just different tradeoffs. I've worked at a lot of companies that embraced the realism of E2E tests, but then suffered from the maintenance/performance/diagnosability/instability problems of those tests. I have colleagues who worked at places that avoided E2E at all costs and suffered because they would have a green test run while user scenarios that a simple E2E test would have caught were completely broken.
IMO there is a lot that can be done to improve E2E testing at most companies, but E2E tests definitely have the capacity to add value to your release/testing pipeline.