ESLint is so slow. I think it takes longer to lint our codebase than it does to run our hundreds and hundreds of tests over it. So beyond the technical deep-dive, it's really exciting that attention is being given to performance.
ESLint perf is destroyed by some of the plugins. At work we use lint-staged to run ESLint on staged files only, and do a full run in CI.
Wait, so in this part the author implies that you can only safely index into a JavaScript string one character at a time if it's ASCII, not Unicode?
That's true in languages like C, and even Rust, where an index is an exact byte offset. But I'm 95% sure JavaScript smooths over this, and will treat each index as one Unicode character (there's an exception for multi-character entities like diacritics, though). Did the author get this wrong?
For legacy reasons, JavaScript's "character unit", the basic component of a string, is a "UTF-16 character" (formally, a code unit), that is, sixteen bits that are interpreted as being UTF-16-encoded. That said, sixteen bits are not enough to represent all valid Unicode characters in the UTF-16 encoding. Instead, characters in the [supplemental planes] are represented in UTF-16 using two sixteen-bit "non-characters" (formally, surrogates), which do not individually map to any Unicode character in any plane, but in combination reference a Unicode codepoint in one of the supplemental planes.
JavaScript's internal representation of strings, as well as the APIs it exposes for dealing with strings, such as index access and string length, treat each of the sixteen-bit "halves" of the UTF-16 representation of a supplemental plane codepoint as an individual character.
This means that, when you index a string, you might get a UTF-16 character that represents a Unicode codepoint in the basic plane, or a UTF-16 "non-character" that, along with its other half, would represent a Unicode codepoint in one of the supplemental planes.
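To make that concrete, here is a small sketch you can paste into a console. The sample string and the variable names are only illustrative; U+1F600 is picked as an arbitrary codepoint outside the basic plane.

```javascript
// U+1F600 (grinning face) lives in a supplemental plane, so UTF-16
// encodes it as two sixteen-bit halves (a surrogate pair).
const s = "a\u{1F600}b";

// length and index access count sixteen-bit units, not codepoints.
const unitCount = s.length;              // 4, although there are 3 characters
const firstHalf = s.charCodeAt(1);       // 0xD83D, the first half of the pair
const secondHalf = s.charCodeAt(2);      // 0xDE00, the second half of the pair
const isLoneHalf = s[1] !== "\u{1F600}"; // true: s[1] is only one half

// codePointAt and the string iterator put the two halves back together.
const codePoint = s.codePointAt(1);      // 0x1F600
const codePoints = Array.from(s);        // 3 entries, one per codepoint
```

So indexing is always safe in the sense that it never throws, but `s[i]` is only guaranteed to be a whole character when the string stays within the basic plane.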
[supplemental planes]: https://en.wikipedia.org/wiki/Plane_(Unicode) (see planes 1 to 16)
That's great feedback! After reading your comment and re-reading the section in the article it does indeed sound wrong. Decided to remove that paragraph. Your explanation of the string representation is really good. Thanks for sharing!
Was this all for nought, or were PRs opened for the other changes?
15 open ones, last merged one from Feb 2021 :/
We decided to open the PR nonetheless as people kept asking us about it: https://github.com/estools/esquery/pull/134
I really hope after this series you consider consolidating the posts and making a book out of it, I would definitely buy it as I'm sure others would as well.
This content is great.
> I really hope after this series you consider consolidating the posts and making a book out of it, I would definitely buy it as I'm sure others would as well.
I'm really happy to hear this! With the blog posts I wanted to test the waters to see if there is enough interest in such a topic. It helps me practice my writing and storytelling too. I'd really love to write a proper book one day!