The functions written there work properly, that is, pack(unpack("string")) yields "string". But I would like to get the same result that "string".getBytes("UTF8") gives in Java.
The question is: how can I write a JavaScript function that provides the same functionality as Java's getBytes("UTF8")?
For Latin strings, unpack(str) from the article mentioned above provides the same result as getBytes("UTF8"), except that it adds a 0 at the odd positions. But with non-Latin strings it seems to work completely differently. Is there a way to work with string data in JavaScript like Java does?
"中".getBytes("UTF8") yields {-28, -72, -83}, but the function from the answer yields [78, 45]. – ivkremer Sep 20 '12 at 19:43
0s -- they're the upper half of each 16-bit code unit. That Hanzi character requires 3 bytes when encoded according to the UTF-8 scheme but only 2 bytes via UTF-16. – oldrinb Sep 20 '12 at 21:31
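To make the difference described in the comments concrete, here is a minimal sketch that produces both byte sequences. It assumes an environment with the standard TextEncoder API (current browsers and Node.js; it was not available in 2012). The names utf16BigEndianBytes and utf8SignedBytes are hypothetical. The first only approximates what unpack from the article appears to do, judging by the [78, 45] output quoted above. The second maps bytes above 127 to negative values because Java's byte type is signed.

```javascript
// Hypothetical helper: split each 16-bit code unit into two bytes, high byte first.
// This approximates the behaviour reported for unpack(); it is not the article's code.
function utf16BigEndianBytes(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);    // 16-bit UTF-16 code unit
    bytes.push(unit >> 8, unit & 0xff); // upper half, then lower half
  }
  return bytes;
}

// Hypothetical helper: UTF-8 bytes as signed values, mimicking Java's byte type.
function utf8SignedBytes(str) {
  const unsigned = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  return Array.from(unsigned, (b) => (b > 127 ? b - 256 : b));
}

console.log(utf16BigEndianBytes("ab")); // [0, 97, 0, 98]
console.log(utf16BigEndianBytes("中")); // [78, 45]
console.log(utf8SignedBytes("中"));     // [-28, -72, -83]
```

For plain ASCII input the UTF-8 result has no padding zeros, and for "中" it has three bytes instead of two, which matches the values quoted from Java above.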