What is a safe string length in JavaScript?

Given charAt(), charCodeAt() and codePointAt(), I found a mismatch in what the parameter means. Before I really thought about it, I assumed it was always safe to access the character at length - 1. But I read that the difference between charCodeAt() and codePointAt() is that charCodeAt() refers to 16-bit units (byte pairs), so besides reading i you also need i + 1 if the two form a surrogate pair (as in UTF-16), while codePointAt() takes a parameter that refers to the zero-based UTF-8 character position. So now I am unsure whether length counts the number of characters or the number of UTF-16 byte pairs. I believe JavaScript stores strings as UTF-16, but then using length - 1 with codePointAt() on a string containing many 4-byte characters would not land at the end of the string!

2 answers

The length of a string is counted in 16-bit unsigned integer values ("elements"), that is, UTF-16 code units (which together form a valid or invalid UTF-16 code unit sequence), and so are its indices. We could also call them "characters."

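So it does not matter whether you use charAt, charCodeAt or codePointAt: all three take a code unit index, and length - 1 is always a valid one. It may not give you a whole character (code point), though, only the trailing half of a surrogate pair. To iterate a string by code points, use for … of.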
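As an illustration of indexing by code unit versus iterating by code point, here is a minimal sketch. The helper names `codePoints` and `lastCodePoint` are hypothetical and not part of any standard API; the string uses U+10437 (the "𐐷" from the example further down), written as an escape.

```javascript
// Hypothetical helper: collect the code points of a string.
// for...of uses the string iterator, which yields whole code points
// (a surrogate pair comes out as one two-unit string).
function codePoints(str) {
  const out = [];
  for (const ch of str) {
    out.push(ch.codePointAt(0));
  }
  return out;
}

// Hypothetical helper: return the last full code point of a string,
// or undefined for an empty string.
function lastCodePoint(str) {
  if (str.length === 0) return undefined;
  const last = str.charCodeAt(str.length - 1);
  // A trailing (low) surrogate lies in 0xDC00..0xDFFF.
  const isLow = last >= 0xDC00 && last <= 0xDFFF;
  if (isLow && str.length >= 2) {
    const prev = str.charCodeAt(str.length - 2);
    // A leading (high) surrogate lies in 0xD800..0xDBFF.
    if (prev >= 0xD800 && prev <= 0xDBFF) {
      return str.codePointAt(str.length - 2);
    }
  }
  return last; // BMP character or an unpaired surrogate
}

const sample = "a\u{10437}";
console.log(sample.length);          // 3 (code units)
console.log(codePoints(sample));     // [ 97, 66615 ]
console.log(lastCodePoint(sample));  // 66615
```

Stepping back one unit when the last unit is a trailing surrogate is only one possible approach; spreading the string into an array and taking the last element would also work, at the cost of copying the whole string.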


To count code points rather than code units, use [...str].length.

var mb = "𐐷";
console.log(mb.length); // 2: length counts UTF-16 code units
console.log([...mb].length); // 1: "real" length in code points (ES6)
console.log(mb.charAt(0)); // The first two bytes (a lone high surrogate)
console.log(mb.codePointAt(0)); // The four bytes combined (ES6): 66615
console.log(mb.codePointAt(1)); // The second two bytes (trailing surrogate): 56375
console.log(mb.charCodeAt(0)); // The first two bytes: 55297
console.log(mb.charCodeAt(1)); // The second two bytes: 56375
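To show how the two values from charCodeAt relate to the single value from codePointAt, here is a small sketch that combines a surrogate pair by hand using the standard UTF-16 decoding arithmetic. The function name `combineSurrogates` is made up for this example.

```javascript
// Hypothetical helper: combine a high and a low surrogate
// (as returned by charCodeAt) into one code point.
function combineSurrogates(high, low) {
  return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
}

const s = "\u{10437}"; // same character as mb above
const manual = combineSurrogates(s.charCodeAt(0), s.charCodeAt(1));

console.log(manual);                      // 66615
console.log(manual === s.codePointAt(0)); // true
```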

Source: https://habr.com/ru/post/1671951/

