I have a byte array, and I know it contains an XML-serialized object. Is there any way to get the encoding from it?
I'm not going to deserialize it, but I'm saving it in an xml field on a SQL Server, so I need to convert it to a string?
You could look at the first 40-ish bytes. Realistically, do you expect you'll ever get anything other than UTF-8 or UTF-16? If not, you could check for the patterns you get at the start of both of those and throw an exception if the data doesn't follow either pattern. Alternatively, if you want to try another approach, you could decode the document as UTF-8, re-encode it and see if you get the same bytes back. It's not ideal, but it might just work. I'm sure there are more rigorous ways of doing this, but they're likely to be finicky :)
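A minimal Python sketch of both suggestions, in case it helps to see them spelled out. The function names are hypothetical, and the "patterns" are an assumption on my part: the document is taken to start with `<` (an XML declaration or a root element), optionally preceded by a BOM.

```python
import codecs


def sniff_utf8_or_utf16(data: bytes) -> str:
    """Return a Python codec name, or raise ValueError if no pattern matches.

    Assumption: the document begins with "<", optionally preceded by a BOM.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the BOM while decoding
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"  # this codec reads the BOM to pick the byte order
    # No BOM: look at how the first "<" is laid out in memory.
    if data.startswith(b"<\x00"):
        return "utf-16-le"
    if data.startswith(b"\x00<"):
        return "utf-16-be"
    if data.startswith(b"<"):
        return "utf-8"
    raise ValueError("data does not start like a UTF-8 or UTF-16 XML document")


def round_trips_as_utf8(data: bytes) -> bool:
    """Decode as UTF-8, re-encode, and compare with the original bytes."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return text.encode("utf-8") == data
```

Python's strict decoder raises on invalid bytes, so the comparison is mostly a formality here; it matters in runtimes whose decoders substitute a replacement character instead of failing. Either way, a passing round trip does not prove the data really is UTF-8, since BOM-less UTF-16 text made of ASCII characters also round-trips cleanly.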
A solution similar to this question could solve this by using a Stream over the byte array. Then you won't have to fiddle at the byte level. Like this:
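One way to picture the stream idea, as a hedged Python sketch rather than the answer's own code: wrap the bytes in an in-memory stream and let an XML parser work out the encoding from the BOM or the XML declaration. The function name `xml_bytes_to_text` is hypothetical, and this assumes the parser supports the document's encoding.

```python
import io
import xml.etree.ElementTree as ET


def xml_bytes_to_text(data: bytes) -> str:
    """Parse XML from an in-memory stream and return it as a text string.

    The parser detects the encoding itself, so no byte-level inspection
    is needed in this function.
    """
    stream = io.BytesIO(data)  # a stream over the byte array
    root = ET.parse(stream).getroot()
    # encoding="unicode" makes tostring return str instead of bytes.
    return ET.tostring(root, encoding="unicode")
```

Note that this re-serializes the document, so the result is not byte-faithful: the XML declaration is dropped, along with anything outside the root element such as leading comments.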
The first 2 or 3 bytes may be a BOM, which can tell you whether the stream is UTF-8, Unicode-LittleEndian or Unicode-BigEndian (Unicode here meaning UTF-16).
UTF-8 BOM is 0xEF 0xBB 0xBF
Unicode-BigEndian is 0xFE 0xFF
Unicode-LittleEndian is 0xFF 0xFE
If none of these are present then you can use ASCII to test for ASCII is use up until
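The BOM check described in this answer can be sketched in Python as follows. The helper name `detect_bom` and its return shape are hypothetical; only the three BOMs listed in the answer are handled, so UTF-32 is deliberately ignored.

```python
import codecs

# BOM byte sequences from the answer, mapped to Python codec names.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def detect_bom(data: bytes):
    """Return (codec_name, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

When a BOM is found, something like `data[bom_length:].decode(codec_name)` yields the text without the BOM character at the front.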