A wrapper around an API is by definition heavier (more code, more functions) than using the lower-level API directly.
It’s not using fewer resources.
It’s not faster (it has implicit waiting).
It’s not less code; it’s literally a superset of Selenium?
Feels like “Selenium framework” is a more accurate description than “lightweight web automation”?
Anyway, there’s no fixing automation tests with fancy APIs.
No matter what you try to do, if people are only interested in writing quick, dirty scripts, you’re doomed to a pile of spaghetti, whatever system or framework you have.
If you want sustainable automation, you have to do Real Software Engineering and write actual composable modules, and you can do that in anything, even raw Selenium.
So… I’d be more interested if this was pitched as “composable lego for building automation” …
…but, personally, as it stands all I can really see is “makes easy things easier with sensible defaults”.
That’s nice for getting started; but getting started is not the problem with automation tests.
It’s maintaining them.
Helium helps with maintaining automation tests as well. click("Compose") is infinitely more maintainable than document.getElementById("eIu7Db").click(). (I just took this example from Gmail's web interface.)
I would much rather rely directly on Selenium's stable APIs than on someone else's wrapper APIs, which are opinionated and could be incomplete, incorrect, outdated and potentially unmaintained someday. Far more resources will always go into Selenium than into these add-ons.
If I really want, I can choose a few APIs that I actually use and wrap them within my codebase. That's more reliable than this.
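To make that concrete, an in-house wrapper of this kind can be very small. The following is only a sketch: `click_by_id` and `type_into` are hypothetical helper names, and `FakeDriver` / `FakeElement` are stand-ins so the example runs without a browser. The helpers only assume a driver object with a `find_element(by, value)` method and elements with `click()` and `send_keys()`, which is the shape Selenium's Python bindings expose (the string `"id"` is the value of `By.ID`).

```python
class FakeElement:
    """Stand-in for a Selenium WebElement (assumption, for illustration only)."""

    def __init__(self):
        self.clicked = False
        self.typed = ""

    def click(self):
        self.clicked = True

    def send_keys(self, text):
        self.typed += text


class FakeDriver:
    """Stand-in for a WebDriver; maps (strategy, value) pairs to elements."""

    def __init__(self, elements):
        self._elements = elements

    def find_element(self, by, value):
        return self._elements[(by, value)]


def click_by_id(driver, element_id):
    """Find an element by its id, click it, and return it."""
    element = driver.find_element("id", element_id)
    element.click()
    return element


def type_into(driver, element_id, text):
    """Find an element by its id, type text into it, and return it."""
    element = driver.find_element("id", element_id)
    element.send_keys(text)
    return element


if __name__ == "__main__":
    # The id below is the Gmail example quoted earlier in the thread;
    # "subject" is a made-up id for the demo.
    compose = FakeElement()
    subject = FakeElement()
    driver = FakeDriver({("id", "eIu7Db"): compose, ("id", "subject"): subject})
    click_by_id(driver, "eIu7Db")
    type_into(driver, "subject", "hello")
    print(compose.clicked, subject.typed)
```

The trade-off is the one described above: you own these few functions, so nothing changes unless you change it, but you also get none of the text-based lookup or waiting that a library like Helium adds.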
loginAsUser(user)
id = createBooking(user)
loginAsAdmin()
approveBooking(id)
?
Is it the same as Selenium? Do whatever you want yourself?
That’s what I’m talking about. Unless you have high level composable modules that let you express high level test activities then your tests will always fall apart.
The syntax of the low level operations doesn’t matter because you will never ever care about a click(“compose”).
That’s not a test.
A test might be:
createEmail()
attachFile(…)
… whatever your bespoke business requirements are.
Having fancy wrappers?
Is it nicer? Sure.
Does it meaningfully improve the tests, or make them easier to maintain?
Nope.
Because at the end of the day the low level operations will be bespoke, nasty, messy and different for each website; that’s why you wrap them up in functions and compose them.
At least, that’s my experience. This looks a lot like Cypress: a high-level set of operations with sensible defaults for easy tasks.
…but, practically, I’m skeptical that hiding the low-level nasty details actually makes them go away. It smooths them over for the “happy path”, but automation tests are like 90% edge cases.
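As a rough illustration of "wrap them up in functions and compose them", here is a sketch of the booking flow asked about earlier in the thread. Everything in it is an assumption made for the example: `FakeBookingApp` is an in-memory stand-in for the site under test, and the four functions are snake_case versions of the names in that question. In a real suite each function body would contain the messy, site-specific clicks and waits, written in raw Selenium or anything else.

```python
class FakeBookingApp:
    """In-memory stand-in for the site under test (hypothetical)."""

    def __init__(self):
        self.current_user = None
        self.bookings = {}
        self.next_id = 1


def login_as_user(app, user):
    # A real implementation would hide the login form details here.
    app.current_user = user


def login_as_admin(app):
    app.current_user = "admin"


def create_booking(app, user):
    """Create a booking for the logged-in user and return its id."""
    if app.current_user != user:
        raise RuntimeError("must be logged in as the booking user")
    booking_id = app.next_id
    app.next_id += 1
    app.bookings[booking_id] = {"owner": user, "approved": False}
    return booking_id


def approve_booking(app, booking_id):
    """Mark a booking as approved; only the admin may do this."""
    if app.current_user != "admin":
        raise PermissionError("only the admin can approve bookings")
    app.bookings[booking_id]["approved"] = True


def test_booking_is_approved():
    # The test reads as business steps; no selectors or clicks appear here.
    app = FakeBookingApp()
    login_as_user(app, "alice")
    booking_id = create_booking(app, "alice")
    login_as_admin(app)
    approve_booking(app, booking_id)
    assert app.bookings[booking_id]["approved"]


if __name__ == "__main__":
    test_booking_is_approved()
```

When the site changes, only the bodies of the four functions change; the test itself stays as it is, which is the maintainability argument being made here.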
> It’s use can be lighter
I don’t think that’s the generally accepted meaning of a lightweight framework.
…but eh, fair enough. I understand what you mean.
“Lighter” may be used as an alternative adjective to the word easy or easier. Your post, which comes off as very rude, misses the point of how the project is marketed.
At least the OP did not call it Python automation for humans …
That is, again, not common usage. There’s a word for easier to use; it’s “easier”. But whatever. It doesn’t matter; it’s just branding.
My point, however, is that making easy-to-use frameworks for test automation is fundamentally misguided, and responses like “try it, you’ll be amazed it makes all the problems go away” are the type of “drinking the Kool-Aid” that displays a deep lack of understanding of the problem space.
Doing easy things does not solve doing hard things; not here. Not in Go. Not in Rust. Not ever.
So, my point was (and is):
How does this address doing hard things? As someone who is familiar with this space and has tried it, I can’t see anything that helps with the hard things, and no one who is heavily invested in automation really cares about doing easy things.
We can already do easy things.
Another way of doing easy things is like using prettier or not; it’s a style preference.
So, is that what this is?
Selenium with a function calling style preference, or something that actually helps with building automation?
There’s nothing wrong with making tools that make superficial cosmetic changes to the way you do things.
…but that’s not how the project is marketed, at least as I’ve understood it.
I'd love to see such an "open automation" format (it could even be more general than pure software and also automate your IoT or whatever, through extensions).
E.g. you could have a file "Type my bank login password" for bank websites that don't let you use keyboard input but force you to click on stuff, written as a self-documenting script using .md with code:
# Type my bank login password
## Trigger
```trigger:hotkey
key: cmd+l
filter: frontmost-app=Chrome and chrome.tab.url=~mybank.com/login
```
## Deps
```ensure-deps
shell-runner>=1.*
screen-ocr>=1.*
python-runner>=1.*
```
Ensure that my system has the proper extensions for the framework, to run all tasks
## What it does
This automation lets me input my password in a "click-only" input for my lousy bank UI
```run:shell /bin/sh:capture-output=password
echo $(op --vault personal --site mybank)
```
(the above runs the shell script and captures the output as a "password" variable I can use in other scripts below)
```run:screen-ocr:capture-output=ocr-result
window:chrome
```
...go on scripting using typescript/python to locate the numbers in the ocr-result
This looks largely like common workarounds that most people will write using Python-based browser automation. Most of the time, we accept that those capabilities aren't there by default because they are not explicit enough and can result in bugs and undefined behavior even when the elements that we expect to be on the page are actually there.
Given the adage "explicit is better than implicit", I worry that a layer like this might create more trouble than it's worth for the sake of readability. When we get into the nitty-gritty of browser automation, it might just make it harder to debug than going straight to Selenium or Playwright.
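For reference, the explicit alternative to implicit waiting is usually a small polling helper that the script calls where it needs one. A minimal sketch follows; `wait_until` and its parameters are names made up for illustration, and it does not describe how Helium or Selenium implement their own waits.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical condition that only becomes true on the third check,
    # standing in for "the element has appeared on the page".
    attempts = []

    def element_is_present():
        attempts.append(1)
        return len(attempts) >= 3

    wait_until(element_is_present, timeout=1.0, interval=0.01)
    print(f"ready after {len(attempts)} checks")
```

With this style every wait is visible at the call site, along with its timeout. That is more to read, but it is also where you look first when a test hangs or flakes.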
Yup, I would never do it in a .py file. But I do it all of the time in the interpreter, which is what the video shows.
> This looks largely like common workarounds that most people will write using Python-based browser automation. Most of the time, we accept that those capabilities aren't there by default because they are not explicit enough and can result in bugs and undefined behavior even when the elements that we expect to be on the page are actually there.
It sounds like you haven't tried Helium yet. I think you should, and see for yourself whether the trade-off you talk about actually exists.
> Given the adage "explicit is better than implicit", I worry that a layer like this might create more trouble than it's worth for the sake of readability.
You could make the same argument about using C / assembly instead of Python. I suggest you try Helium before making statements about the "trouble" it may create. I believe you will find that there is no trouble.
Or, if you have tried it, could you explain why you don’t think the tool makes the right tradeoffs?
Adages like “explicit is better than implicit” are incredibly context dependent; otherwise we’d all be writing assembly.
Fwiw, thanks for contributing this. It seems apt for a number of repetitive things I probably do dozens of times a week and don't even notice as cruft anymore.
I'm not sure why there were such hot takes on what this is or isn't. Maybe Big Selenium crisis actors? You made something cool, you shared it with the world; that should be the system prompt for people posting about it in my kinder world of things.
(Functional style: "method(thing)" vs object oriented style: "thing.method()")
We mostly abandoned the functional style when we merged with the WebDriver project (aka Selenium 2), but that functional style still lives on in the Selenium IDE record/playback tool.
That is all to say, there are fans of many different styles for automation APIs. No single API will please everyone. (But I personally like the simpler, functional style, fwiw!)
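The two styles are close enough that one can be layered on the other in a few lines. A toy sketch, in which `Button` is a hypothetical element type and the module-level `click` is a made-up adapter rather than any library's API:

```python
class Button:
    """Hypothetical element type used only to compare the two call styles."""

    def __init__(self, label):
        self.label = label
        self.click_count = 0

    def click(self):
        # Object-oriented style: thing.method()
        self.click_count += 1
        return self


def click(thing):
    # Functional style: method(thing), delegating to the object's own method.
    return thing.click()


if __name__ == "__main__":
    compose = Button("Compose")
    compose.click()   # object-oriented call
    click(compose)    # functional call, same effect
    print(compose.click_count)
```

Both calls end up in the same method, so the choice between them is mostly about how the script reads.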
Side-note: This is also why I'm a fan of the Nim programming language. "method(thing)" and "thing.method" are supported syntax for literally the same thing. For others new to the idea, the fancy term for this is "Uniform Function Call Syntax".
Looks like a nice, almost natural language-like API around what is otherwise a quite cumbersome API.
I appreciate the effort. Thank you M. Hermann.
Roll in a captcha-solving service like DeathByCaptcha or AntiCaptcha and you've got yourself a quick and easy script that can do anything on any website regardless of captchas.