> website.rating = "great!"
Ok that’s new to me. Can anyone point out where this could be useful, if anywhere?
str.toUpperCase()
Is actually doing this: new String(str).toUpperCase()
...hand waving away obvious optimizations that surely occur for multiple property access. There may be cases, for instance a series of imperative synchronous hot loops, where this has a meaningful performance impact.
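A rough illustration of the boxing described above, using only standard built-ins. The variable names and the value are made up. The try/catch is there because writing a property to a primitive is silently ignored in sloppy mode but throws a TypeError in strict mode (which includes ES modules).

```javascript
const website = "example.com"; // hypothetical value

// Method calls on a primitive behave as if a temporary wrapper were created.
const viaPrimitive = website.toUpperCase();
const viaWrapper = new String(website).toUpperCase();

// A property written to a primitive never sticks: the temporary wrapper is
// discarded (sloppy mode) or the assignment throws (strict mode).
let assignmentThrew = false;
try {
  website.rating = "great!";
} catch (err) {
  assignmentThrew = err instanceof TypeError;
}

console.log(viaPrimitive === viaWrapper); // true
console.log(website.rating);              // undefined
console.log(typeof website);              // "string"
```

Whether engines really allocate a wrapper on every call is an implementation detail; this only shows the observable behavior.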
Another thing that’s important to know is that you can do this:
Object.assign(str, {
  anythingAt: 'all'
})
And what you get back is a String, not a string (strictly, Object.assign returns a new wrapper object and leaves the original primitive untouched). This breaks a buuuuunch of expected behaviors with primitives, such as === and the default Array#sort comparator.
Edit: this last one bit me while experimenting with a runtime implementation of a common pattern in TypeScript to define nominal types for primitives (e.g. to distinguish a known valid URL from a regular string) called branding[1]. The approach is useful if you're obsessive about type safety, but flawed because it almost certainly misrepresents the runtime type. I tried to use Object.assign to get that much more type safety and... I got the opposite! I got a String claiming to be a string, which broke all sorts of stuff, including the aforementioned === and Array#sort.
1: https://medium.com/@KevinBGreene/surviving-the-typescript-ec...
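A small sketch of what Object.assign does with a primitive target, in case it helps anyone reproduce this. The URL value is hypothetical; everything else is standard built-ins.

```javascript
const str = "https://example.com"; // hypothetical value

// Object.assign converts a primitive target to an object before copying.
const branded = Object.assign(str, { anythingAt: "all" });

console.log(typeof str);                // "string" (the primitive is untouched)
console.log(typeof branded);            // "object"
console.log(branded instanceof String); // true
console.log(branded.anythingAt);        // "all"
console.log(branded === str);           // false: object vs primitive
console.log(branded == str);            // true: loose equality unwraps the object
console.log(branded.valueOf() === str); // true: valueOf() recovers the primitive
```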
But I think the whole `new String` is problematic, because it breaks the strict comparison:
let a = "Foo";
let b = new String("Foo");
a === "Foo"
true
b === "Foo"
false
Yeah this sounds terrifying to me, if I were to debug something like this.
Also the fact they identify as objects:
typeof new String("Foo") === 'object';
Case closed, I will never use it :D
I personally would never expect a property which is treated as a string. That's what POJOs are for, but that is just opinion I guess. Still weird they let you do this.
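If you do end up receiving String objects from somewhere, one possible defensive approach is to normalize at the boundary. The helper name below is made up, and note that instanceof will not recognize String objects created in another realm (e.g. an iframe), so treat this as a sketch only.

```javascript
// Hypothetical helper: returns a primitive string for either kind of input,
// or throws for anything that is not string-like.
function toPrimitiveString(value) {
  if (typeof value === "string") return value;
  if (value instanceof String) return value.valueOf();
  throw new TypeError("expected a string or String object");
}

const a = "Foo";
const b = new String("Foo");

console.log(toPrimitiveString(a) === "Foo"); // true
console.log(toPrimitiveString(b) === "Foo"); // true
console.log(typeof toPrimitiveString(b));    // "string"
```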
> and making objects that can be inherited from (unlike primitives).
I mean the SO gives an example of how to do it but I would still love to see some place where this is needed or even pragmatic to any degree.
[^1]: It actually depends on how that string was made, if internally it still references the parent string then slicing it up into lines won't save you any memory. I'm referring to "flattened" strings.
[^2]: I don't remember what the exact character set is, I think it's not exactly ASCII but close enough.
Webkit and the JDK implement the same string optimization, while .NET unfortunately doesn't: https://github.com/dotnet/runtime/issues/6612
Thank you!
For example, the cited text says: "You can return the UTF-16 code for a character in a string using charCodeAt()...".
Not true. This only works if the character's code point fits in a single 16-bit code unit (i.e. it is in the BMP); if it doesn't, charCodeAt will only return one half of the surrogate pair, not the whole character.
There are lots of discussions about this, here's one: https://stackoverflow.com/questions/3744721/javascript-strin...
JavaScript can handle characters outside the BMP, but you sometimes have to be aware of the problem and carefully code around it when such characters are possible.
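To make that concrete, here is a short sketch using U+1F600 (a smiley) as an example of a character outside the BMP. The choice of character is mine; the methods are all standard.

```javascript
const smiley = "\u{1F600}"; // one character outside the BMP

console.log(smiley.length);                      // 2 (UTF-16 code units)
console.log(smiley.charCodeAt(0).toString(16));  // "d83d" (high surrogate only)
console.log(smiley.charCodeAt(1).toString(16));  // "de00" (low surrogate)
console.log(smiley.codePointAt(0).toString(16)); // "1f600" (the full code point)

// String iteration walks code points, not code units.
const codePoints = Array.from(smiley, (ch) => ch.codePointAt(0));
console.log(codePoints.length);                        // 1
console.log(String.fromCodePoint(0x1f600) === smiley); // true
```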
And it gets even worse, when you consider that for many purposes you're not even interested in code points but in graphemes which can be sequences of code points -- e.g. a single visible emoji might actually be a sequence of 5 code points, represented by 8 UTF-16 code units, taking up 16 bytes [2]. Similarly a single accented character will often be two code points (letter plus combining diacritic).
If you want to split a string by graphemes -- e.g. to count its visible length, or delete its last visible character -- you can either use a library for it [3], or the relatively new Intl.Segmenter [4] which is in Chrome and Safari, but hasn't made it to Firefox [5].
Kind of amazing it's 2021 and you still can't calculate the number of visible characters (graphemes) in a string using native functions across all modern browsers.
[1] https://blog.jonnew.com/posts/poo-dot-length-equals-two
[2] https://www.contentful.com/blog/2016/12/06/unicode-javascrip...
[3] https://github.com/orling/grapheme-splitter
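A minimal sketch of the grapheme counting discussed above, assuming Intl.Segmenter is available and falling back to a code point count (only an approximation) where it is not. The helper name and the default locale are made up for illustration.

```javascript
// Hypothetical helper: counts user-perceived characters where Intl.Segmenter
// exists, and falls back to counting code points (an approximation) elsewhere.
function countGraphemes(text, locale = "en") {
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
    return Array.from(segmenter.segment(text)).length;
  }
  return Array.from(text).length;
}

// Man + ZWJ + woman + ZWJ + girl: one visible emoji.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
// "e" followed by a combining acute accent.
const accented = "e\u0301";

console.log(family.length);             // 8 UTF-16 code units
console.log(Array.from(family).length); // 5 code points
console.log(countGraphemes(family));    // 1 where Intl.Segmenter is available
console.log(countGraphemes(accented));  // 1 where Intl.Segmenter is available
```

In an environment without Intl.Segmenter the last two calls return 5 and 2, which is exactly the gap a library would have to fill.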
In many cases strings are best considered units that can be concatenated at will, but it's best if you avoid splitting them, and if you must split them, generally only split them on ASCII character boundaries. Don't consider "lengths" as something that has a meaning to humans (it doesn't), and don't assume that a "character" is a single JavaScript character (it isn't). If you normally just work with strings as opaque sequences of "characters" that can be later displayed, you can avoid many complications (though obviously NOT all of them).
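A sketch of why splitting on ASCII delimiters is the safe case while cutting at an arbitrary index is not. The sample data and the hasLoneSurrogate helper are hypothetical, and the check only covers surrogate pairs, not combining sequences.

```javascript
const csv = "caf\u00E9,\u{1F600},na\u00EFve"; // "café,😀,naïve"

// Splitting on an ASCII delimiter never cuts through a surrogate pair.
const fields = csv.split(",");
console.log(fields.length); // 3

// Slicing at an arbitrary index can: this keeps only the high surrogate.
const broken = "\u{1F600}".slice(0, 1);
console.log(broken === "\uD83D"); // true

// Hypothetical check for a lone (unpaired) surrogate left behind by a bad cut.
const hasLoneSurrogate = (s) =>
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/.test(s);
console.log(hasLoneSurrogate(broken));    // true
console.log(hasLoneSurrogate(fields[1])); // false
```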
I'll read up on it until I understand it, and then add something to the article that covers it.
// Correct: returns '1'
'Résumé'.localeCompare('RESUME', undefined, {sensitivity: 'accent'})
localeCompare() returns 0 if the strings are equal and a negative or positive number (typically -1/+1) if they're different. Since this section is about comparing two strings that only differ in case and accents, I would expect to see a method I could use that would consider the strings to be equal. Instead, this example just shows two ways to compare strings (=== and localeCompare) that both consider the strings to be different.
This example was supposed to use: {sensitivity: 'base'}
I've corrected it.
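For anyone skimming, a quick sketch of the difference between the two sensitivity settings. I'm passing "en" as the locale so the result doesn't depend on the machine's default; that choice and the helper name are assumptions on my part.

```javascript
const a = "Résumé";
const b = "RESUME";

console.log(a === b); // false
console.log(a.localeCompare(b, "en", { sensitivity: "base" }));         // 0
console.log(a.localeCompare(b, "en", { sensitivity: "accent" }) !== 0); // true

// For many comparisons, a reusable collator avoids re-parsing the options.
const collator = new Intl.Collator("en", { sensitivity: "base" });
const sameIgnoringCaseAndAccents = (x, y) => collator.compare(x, y) === 0;
console.log(sameIgnoringCaseAndAccents("Résumé", "resume"));  // true
console.log(sameIgnoringCaseAndAccents("Résumé", "Resumes")); // false
```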
Short version: they accept an array of string segments and a variadic set of arguments corresponding to the runtime values provided in template positions, each positional value following the string segment that precedes it.
That array is more than it seems: it also has a raw property holding the same segments as they were typed in the source, with escape sequences left unprocessed. This is what String.raw (also not mentioned) uses to put the template back together much as an untagged template literal would, minus the escape processing.
It’s an odd design/interface but it can be used to do some pretty cool stuff. For example, Zapatos[1], a type-safe SQL library for TypeScript.
My only complaints:
1. I can’t think of a real reason for it to be variadic, and this makes authoring them a little more error prone. You should be able to expect one array of strings with a length N, and one array of (type checkable/inferrable) values with a length N-1.
2. Likewise I can’t think of a real reason for the raw values to be bolted onto a weird array subclass. It could just as easily have been an iterable third argument.
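A small sketch of what a tag function actually receives, in case the shape described above is hard to picture. Both tag names (inspect, plain) are made up for illustration; String.raw is the real built-in.

```javascript
// Hypothetical tag that just exposes what a tag function receives.
function inspect(strings, ...values) {
  return { strings: [...strings], raw: [...strings.raw], values };
}

const name = "world";
const parts = inspect`hello\t${name}!${42}`;

console.log(parts.strings.length);           // 3 string segments
console.log(parts.values.length);            // 2 values, always one fewer
console.log(parts.strings[0] === "hello\t"); // true: escape processed
console.log(parts.raw[0] === "hello\\t");    // true: escape left as typed

// Rebuilding what an untagged template would produce.
function plain(strings, ...values) {
  return strings.reduce(
    (out, segment, i) => out + segment + (i < values.length ? String(values[i]) : ""),
    ""
  );
}
console.log(plain`a${1}b${2}c` === "a1b2c");  // true
console.log(String.raw`a\n${1}` === "a\\n1"); // true
```

Collecting the variadic arguments with a rest parameter, as above, gets you the N strings plus N-1 values shape without any extra work.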
I've added a section on it:
https://www.baseclass.io/guides/string-handling-modern-js#ta...
Super-strict!
While SPLITting a string into an array is useful, JOINing array elements to create a string is also useful.
e.g. let x = commaSepList.split(',').join('\n')
I'm not sure how I managed to completely forget about joining strings. I'll add that.
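A short sketch of the split/join round trip mentioned above, with a trim step added since real comma-separated input often has stray spaces. The input value is hypothetical.

```javascript
const commaSepList = "apples, pears ,plums"; // hypothetical input

// split() and join() are inverses when given the same separator.
const roundTrip = commaSepList.split(",").join(",");
console.log(roundTrip === commaSepList); // true

// Split, tidy each item, then join with a different separator.
const lines = commaSepList
  .split(",")
  .map((item) => item.trim())
  .join("\n");
console.log(lines === "apples\npears\nplums"); // true
```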
" Trim Me ".trim() // "Trim"
should be " Trim Me ".trim() // "Trim Me"