javascript - Unexpected behaviour of String.fromCodePoint / String#codePointAt (Firefox/ES6)


Since version 29 of Firefox, Mozilla provides the String.fromCodePoint and String#codePointAt methods, and has published polyfills on the respective MDN pages.

It so happens that I have been trying them out, and it seems I am missing something important: splitting the string "ä☺𠜎" into code points and reassembling these returns an, at least to me, unexpected result.

I've built a test case: http://jsfiddle.net/dcodeio/yhwp7/

var str = "ä☺𠜎"; ...split it, reassemble it... 

Am I missing something?

This is not a problem with .codePointAt(), but rather with the encoding of the character 𠜎. 𠜎 has a JavaScript string length of 2.

Why? Because JavaScript strings are encoded as UTF-16, which uses 2-byte code units. The code point of 𠜎 (132878) is greater than what a single 2-byte code unit can hold (0-65535). This means it needs to be encoded using 4 bytes, as a surrogate pair. Its UTF-16 representation is 0xD841 0xDF0E, consuming 2 characters in the string.
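As a rough sketch of where 0xD841 0xDF0E comes from, the standard UTF-16 surrogate arithmetic can be written out by hand. The helper name toSurrogatePair is hypothetical and only used for illustration; it is not a built-in.

```javascript
// Sketch: split a code point above 0xFFFF into its UTF-16 surrogate pair.
// toSurrogatePair is a hypothetical helper name, not part of any library.
function toSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;      // 20 bits remain
  var high = 0xD800 + (offset >> 10);    // top 10 bits -> high (lead) surrogate
  var low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits -> low (trail) surrogate
  return [high, low];
}

var pair = toSurrogatePair(132878);      // code point of 𠜎 (U+2070E)
console.log(pair[0].toString(16), pair[1].toString(16)); // d841 df0e

// The same character written as two escaped code units has length 2.
var ideograph = "\uD841\uDF0E";          // 𠜎
console.log(ideograph.length);           // 2
```

The two numbers in the pair are exactly the two "characters" that the string length counts.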

When using .charCodeAt() you see the individual code units:

var string = "𠜎"; console.log( string.charCodeAt(0), string.charCodeAt(1) ); // logs 55361 57102 (0xD841 0xDF0E)

Why doesn't your output display 228, 9786, 55361, 57102? That's because .codePointAt() correctly converts a 4-byte UTF-16 character into a single integer (132878).

So why is 57102 in the output then? Because you are iterating up to str.length in the loop, which returns 4 (because "𠜎".length == 2), so .codePointAt() is also executed on str[3], which is 57102.
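The original fiddle code is not reproduced here, so the following is only a sketch, assuming the split was done with one .codePointAt() call per index up to str.length. It contrasts that naive loop with one that skips the second half of a surrogate pair. The variable names are illustrative.

```javascript
var str = "\u00E4\u263A\uD841\uDF0E"; // "ä☺𠜎", written with escapes

// Naive split: one codePointAt() call per UTF-16 code unit.
var naive = [];
for (var i = 0; i < str.length; i++) {
  naive.push(str.codePointAt(i));
}
console.log(naive); // [228, 9786, 132878, 57102]

// Surrogate-aware split: after reading a code point above 0xFFFF,
// skip the following index, which holds the low surrogate.
var codePoints = [];
for (var j = 0; j < str.length; j++) {
  var cp = str.codePointAt(j);
  codePoints.push(cp);
  if (cp > 0xFFFF) {
    j++;
  }
}
console.log(codePoints); // [228, 9786, 132878]

// Reassembling the surrogate-aware result yields the original string.
var rebuilt = String.fromCodePoint.apply(null, codePoints);
console.log(rebuilt === str); // true
```

Reassembling the naive array instead appends a stray low surrogate to the end of the string, which would explain an unexpected result.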

