
How do I convert a string to a byte[] in .NET (C#)?

Also, why should encoding be taken into consideration? Can't I simply get the bytes the string is stored in? Why is there a dependency on character encodings?

7  
Every string is stored as an array of bytes right? Why can't I simply have those bytes? – Agnel Kurian Jan 23 '09 at 14:05
70  
The encoding is what maps the characters to the bytes. For example, in ASCII, the letter 'A' maps to the number 65. In a different encoding, it might not be the same. The high-level approach to strings taken in the .NET framework makes this largely irrelevant, though (except in this case). – Lucas Jones Apr 13 '09 at 14:13
10  
To play devil's advocate: if you wanted to get the bytes of an in-memory string (as .NET uses them) and manipulate them somehow (e.g. CRC32), and NEVER EVER wanted to decode them back into the original string, it isn't straightforward why you'd care about encodings or how you'd choose which one to use. – Greg Dec 1 '09 at 19:47
36  
Surprised no-one has given this link yet: joelonsoftware.com/articles/Unicode.html – Bevan Jun 29 '10 at 2:57
8  
A char is not a byte and a byte is not a char. A char is both a key into a font table and a lexical tradition. A string is a sequence of chars. (Words, paragraphs, sentences, and titles also have their own lexical traditions that justify their own type definitions, but I digress.) Like integers, floating point numbers, and everything else, chars are encoded into bytes. There was a time when the encoding was a simple one-to-one mapping: ASCII. However, to accommodate all of human symbology, the 256 values of a byte were insufficient, and encodings were devised to selectively use more bytes. – George Aug 28 '14 at 15:43

33 Answers

You need to take the encoding into account because a single character can be represented by one or more bytes (UTF-8, for example, uses one to four bytes per code point), and different encodings map characters to bytes differently.

Joel has a posting on this:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
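To make the point concrete, here is a minimal sketch showing that the same string produces different bytes under different encodings. The class and method names (`EncodingDemo`, `ToHex`) are made up for illustration; only `Encoding.GetBytes` and `BitConverter.ToString` are framework calls.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: encodes `text` with the given encoding and
    // returns the resulting bytes as dash-separated hex, e.g. "C3-A9".
    public static string ToHex(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return BitConverter.ToString(bytes);
    }
}
```

For the single character "\u00E9" (é), `ToHex` gives `C3-A9` with `Encoding.UTF8`, `E9-00` with `Encoding.Unicode` (UTF-16 little-endian), and `E9-00-00-00` with `Encoding.UTF32`. Same character, three different byte sequences, which is why the encoding has to be chosen somewhere.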

4  
"1 character could be represented by 1 or more bytes" I agree. I just want those bytes regardless of what encoding the string is in. The only way a string can be stored in memory is in bytes. Even characters are stored as 1 or more bytes. I merely want to get my hands on them bytes. – Agnel Kurian Jan 23 '09 at 14:07
8  
You don't need the encodings unless you (or someone else) actually intend(s) to interpret the data, instead of treating it as a generic "block of bytes". For things like compression, encryption, etc., worrying about the encoding is meaningless. See my answer for a way to do this without worrying about the encoding. – Mehrdad Apr 30 '12 at 7:54
4  
@Mehrdad - Totally, but the original question, as stated when I initially answered, didn't specify what the OP was going to do with those bytes after converting them, and for future searchers that information is pertinent. This is covered by Joel's answer quite nicely, and as you state within your answer: provided you stick within the .NET world and use your methods to convert to and from, you're happy. As soon as you step outside of that, encoding will matter. – Zhaph - Ben Duguid Apr 30 '12 at 10:48
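The answer Mehrdad refers to is not shown in this excerpt. As a rough sketch of the idea discussed in these comments (not necessarily the code from that answer), the in-memory UTF-16 code units of a string can be copied out without naming an encoding. The class name `RawStringBytes` is an assumption for illustration.

```csharp
using System;

public static class RawStringBytes
{
    // Copies the string's UTF-16 code units into a byte array,
    // in the machine's native byte order. No Encoding is involved.
    public static byte[] GetBytes(string str)
    {
        byte[] bytes = new byte[str.Length * sizeof(char)];
        Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Reverses GetBytes. Only meaningful for arrays produced by GetBytes
    // on a machine with the same byte order.
    public static string GetString(byte[] bytes)
    {
        char[] chars = new char[bytes.Length / sizeof(char)];
        Buffer.BlockCopy(bytes, 0, chars, 0, chars.Length * sizeof(char));
        return new string(chars);
    }
}
```

This round-trips any string as long as both directions go through these two methods, which matches the "stick within the .NET world" caveat above. The moment another program has to read the bytes, they are effectively UTF-16 in the producing machine's byte order, and the encoding question is back.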
// C# to convert a string to a byte array.
public static byte[] StrToByteArray(string str)
{
    System.Text.ASCIIEncoding  encoding=new System.Text.ASCIIEncoding();
    return encoding.GetBytes(str);
}


// C# to convert a byte array to a string.
byte [] dBytes = ...
string str;
System.Text.ASCIIEncoding enc = new System.Text.ASCIIEncoding();
str = enc.GetString(dBytes);
3  
1) That will lose data due to using ASCII as the encoding. 2) There's no point in creating a new ASCIIEncoding - just use the Encoding.ASCII property. – Jon Skeet Jan 27 '09 at 6:35
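Both points in that comment can be seen in a small sketch. The helper below (`RoundTrip.Through`, a name invented here) encodes a string and decodes it again with the same encoding, using the shared `Encoding.ASCII` and `Encoding.UTF8` instances rather than constructing new ones.

```csharp
using System.Text;

public static class RoundTrip
{
    // Hypothetical helper: encodes `text` to bytes and decodes it back
    // with the same encoding, so any loss shows up in the result.
    public static string Through(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }
}
```

`RoundTrip.Through(Encoding.ASCII, "caf\u00E9")` returns `"caf?"`, because ASCII has no representation for é and substitutes a question mark. `RoundTrip.Through(Encoding.UTF8, "caf\u00E9")` returns the original string unchanged.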
byte[] strToByteArray(string str)
{
    System.Text.ASCIIEncoding enc = new System.Text.ASCIIEncoding();
    return enc.GetBytes(str);
}
3  
This doesn't always work. As I've found the hard way, some special characters can get lost when using such a method. – JB King Jan 23 '09 at 17:14
1  
if the charset was utf it wouldn't work! – ahmadali shafiee Sep 18 '12 at 6:27
