Yeah I thought the same, found this in the repo though.
When I need a multi-line pattern language, I just whip up an FSM-like program in code. And I'll continue to do that.
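For anyone wondering what such an ad-hoc "FSM-like program" tends to look like, here is a minimal sketch in Python. It is not the commenter's code: the `BEGIN`/`END` markers, the `State` enum, and the `extract_blocks` name are all made up for illustration. It walks the input line by line and collects whatever sits between a start marker and an end marker.

```python
from enum import Enum, auto


class State(Enum):
    OUTSIDE = auto()  # not inside a block
    INSIDE = auto()   # between a start marker and an end marker


def extract_blocks(lines, start="BEGIN", end="END"):
    """Collect the lines found between each start/end marker pair.

    The marker strings are hypothetical placeholders; swap in whatever
    delimits the multi-line records in your own data.
    """
    state = State.OUTSIDE
    current = []
    blocks = []
    for line in lines:
        stripped = line.strip()
        if state is State.OUTSIDE:
            if stripped == start:
                state = State.INSIDE
                current = []
        elif stripped == end:
            blocks.append(current)
            state = State.OUTSIDE
        else:
            current.append(line.rstrip("\n"))
    # A block that is never closed is silently dropped.
    return blocks
```

Calling `extract_blocks(open("some.log"))` on a file would return a list of blocks, each a list of lines. It is short, but choices such as what to do with an unterminated block are exactly the kind of detail that ad-hoc matchers rarely get tested on.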
curl -sL news.ycombinator.com | rosie grep -o subs net.url
How many LOC is your ad-hoc "FSM-like program", and have you done correctness and performance testing on it? If I had a nickel for every incorrect email validity checker I ran across, I'd have an awful lot of nickels.
I won't argue with that. Email is a clusterf---. Making a validator that is truly "correct" is essentially impossible. Three RFCs have successively created standardized regexes and the latest RFC 5322 still has a bug. [1]
You'd have to convince me that you'd have significantly fewer nickels had they used Rosie.
> library of useful patterns
That's cool. I certainly use a library if I need to validate emails.
Not that this is an invalid approach, not at all, but I feel like there's a load-bearing "just" here that indicates this project could be useful for a lot of people :)
If I started from "scratch", perhaps the overhead of learning Rosie would be worth it.
The main selling points appear to be named patterns that can be reused from libraries, functions, and more systematic grouping and lookahead operators. Increased expressive power, which probably matters only in complex situations far beyond the limits of appropriate use of grep-like tools, is only a minor benefit.
Regex is terrible to read. Give me examples of things we frequently match for: IP addresses, credit card numbers, dates ... I read the example for date parsing and I'm ... not sure what the equivalent is. I suggest the authors put up a Rosetta stone of sorts, e.g. in regex-speak: [0-1][0-9]-[0-3][0-9]-201[0-9], in Rosie-speak: xyz. What about capture groups? That's what makes regex powerful: not just matching, but capturing, plus look-ahead and look-behind.
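To make the regex half of that Rosetta stone concrete, here is a sketch of the date pattern above with named capture groups, using Python's standard `re` module. The group names (`month`, `day`, `year`) and the `parse_date` helper are invented for illustration, and no Rosie equivalent is attempted here.

```python
import re

# The date regex from the comment, with each field wrapped in a named group.
DATE = re.compile(
    r"(?P<month>[0-1][0-9])-(?P<day>[0-3][0-9])-(?P<year>201[0-9])"
)


def parse_date(text):
    """Return the captured fields of the first date-shaped match, or None."""
    match = DATE.search(text)
    return match.groupdict() if match else None
```

The pattern only checks shape, not validity: `parse_date("19-39-2015")` still returns a match because `[0-1][0-9]` accepts 19 as a month and `[0-3][0-9]` accepts 39 as a day. That gap between "looks like a date" and "is a date" is the sort of thing a shared, tested pattern library is meant to close.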
Meta-comment: regex is buried in everything significant that I work with: grep, the language libraries, Splunk. It's going to be hard to dislodge. There's a deep moat because the tools and common use cases are ugly but well-understood. Why is regex still being used? Why has nothing better come along? How would I even regex-match extended Unicode?
Also, there was a Strange Loop talk by its creator: https://youtu.be/MkTiYDrb0zg