C# - How to decode a UTF-8 encoded string split across two buffers right in the middle of a 4-byte char?


A character in UTF-8 encoding can have up to 4 bytes. Now imagine I read from a stream into one buffer and then into another. Unfortunately, it just so happens that the end of the first buffer holds the first 2 bytes of a 4-byte UTF-8 encoded char, and the beginning of the second buffer holds the remaining 2 bytes.

Is there a way to partially decode that string (while leaving the 2 remaining bytes) without copying the two buffers into one big one?

string str = "hello\u263aworld";

Console.WriteLine(str);
Console.WriteLine("length of 'helloworld': " + Encoding.UTF8.GetBytes("helloworld").Length);
var bytes = Encoding.UTF8.GetBytes(str);
Console.WriteLine("length of 'hello\u263aworld': " + bytes.Length);
// Decode two slices that each cut into the 3-byte smiley
Console.WriteLine(Encoding.UTF8.GetString(bytes, 0, 6));
Console.WriteLine(Encoding.UTF8.GetString(bytes, 7, bytes.Length - 7));

This returns:

hello☺world

length of 'helloworld': 10

length of 'hello☺world': 13

hello�

�world

The smiley face is 3 bytes long.

Is there a class that deals with split decoding of strings? I would like to get "hello" first and then "☺world", reusing the remainder of the not-yet-decoded byte array, without copying both arrays into one big array. I really just want to use the remainder of the first buffer and somehow make the magic happen.

You should use a Decoder, which is able to maintain state between calls to GetChars - it remembers the bytes it hasn't decoded yet.

using System;
using System.Text;

class Test
{
    static void Main()
    {
        string str = "hello\u263aworld";

        var bytes = Encoding.UTF8.GetBytes(str);
        var decoder = Encoding.UTF8.GetDecoder();

        // Long enough for the whole string
        char[] buffer = new char[100];

        // Convert the first "packet"
        var length1 = decoder.GetChars(bytes, 0, 6, buffer, 0);
        // Convert the second "packet", writing into the buffer
        // from where we left off
        // Note: 6 not 7, because otherwise we're skipping a byte...
        var length2 = decoder.GetChars(bytes, 6, bytes.Length - 6,
                                       buffer, length1);
        var reconstituted = new string(buffer, 0, length1 + length2);
        Console.WriteLine(str == reconstituted); // true
    }
}
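The same idea extends to any number of buffers. The following is only a sketch under stated assumptions: `ChunkedUtf8` and `DecodeChunks` are hypothetical names that do not appear in the answer above, and the chunks are assumed to arrive as separate byte arrays, for example from successive `Stream.Read` calls. It relies on the same `Decoder` state-keeping, passing `flush: false` so that an incomplete trailing sequence stays inside the decoder until the next chunk arrives.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of the original answer.
static class ChunkedUtf8
{
    // Decodes UTF-8 bytes that arrive as separate chunks, where a chunk
    // boundary may fall in the middle of a multi-byte character.
    public static string DecodeChunks(IEnumerable<byte[]> chunks)
    {
        // One decoder for the whole sequence, so leftover bytes carry over.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var result = new StringBuilder();

        foreach (byte[] chunk in chunks)
        {
            // Ask the decoder how many chars this chunk will produce,
            // taking into account bytes it kept from the previous chunk.
            int needed = decoder.GetCharCount(chunk, 0, chunk.Length, false);
            char[] chars = new char[needed];

            // flush: false keeps an incomplete trailing sequence buffered
            // in the decoder instead of replacing it with U+FFFD.
            int written = decoder.GetChars(chunk, 0, chunk.Length, chars, 0, false);
            result.Append(chars, 0, written);
        }

        return result.ToString();
    }
}
```

With the bytes of "hello\u263aworld" split after byte 6, the first chunk yields only "hello" and the smiley appears once the second chunk supplies its last two bytes. One design point to be aware of: because this sketch never flushes, an incomplete sequence at the very end of the input is silently dropped. If that case matters, a final `GetChars` call with `flush: true` would surface it.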
