I'm trying to convert a Java byte array to a String as follows:

byte[] byteArr = new byte[128];
myFill(byteArr);
String myString = new String(byteArr);

myFill() populates byteArr with a string that's less than 128 characters long, and byteArr is zero padded. The code works fine, except that the zero padding ends up in myString as illegible characters. myString.length() also returns 128 instead of the length of the actual ASCII content.

How do I rectify this?

Thanks!

Java is not C. 0 is not the special "end of string" character in Java. Instead, it is just another character, so you need to pass the offset and length to the constructor. Also, you should be specifying a character set in the constructor (i.e. "US-ASCII"). –  jtahlborn 22 hours ago

1 Answer

Accepted answer (score: 2)

As jtahlborn pointed out, there is nothing special about NUL (char = 0) in Java strings; it's just another character. Because of this, the (or, at least, one) solution is to leave out the extra characters when converting the source data into a Java string.

To do that, use the String constructor overload that takes in an array offset/length and a charset:

byte[] byteArr = new byte[128];
myFill(byteArr);
String myString = new String(byteArr, 0, encodedStringLength, "US-ASCII");

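Then it's just a matter of finding out the "encodedStringLength", i.e. the index of the first NUL byte, which might look like so (after filling byteArr, of course, and before calling the constructor):

int encodedStringLength = 0;
while (encodedStringLength < byteArr.length && byteArr[encodedStringLength] != 0) encodedStringLength++;

A plain loop is used because Arrays.asList(byteArr) does not help here: it wraps the whole byte[] as a single list element rather than producing a list of bytes, so calling indexOf(0) on it always returns -1. The loop also covers the case where the source string uses all 128 bytes (i.e. is not NUL terminated): encodedStringLength then simply ends up equal to byteArr.length.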
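For reference, here is a minimal self-contained sketch of the offset/length approach. The class name NulPaddedDecode and the helper decode are made up for illustration, and System.arraycopy stands in for the question's myFill(), whose implementation isn't shown. It assumes Java 7 or later, since it uses StandardCharsets.US_ASCII rather than the "US-ASCII" string; the Charset overload also avoids the checked UnsupportedEncodingException that the charset-name overload declares.

```java
import java.nio.charset.StandardCharsets;

class NulPaddedDecode {
    // Hypothetical helper: decodes the bytes before the first NUL as US-ASCII.
    // If the buffer contains no NUL at all, the whole buffer is decoded.
    static String decode(byte[] buf) {
        int len = 0;
        while (len < buf.length && buf[len] != 0) {
            len++;
        }
        return new String(buf, 0, len, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] byteArr = new byte[128]; // zero-filled by default
        byte[] src = "hello".getBytes(StandardCharsets.US_ASCII);
        // Stand-in for myFill(byteArr) from the question (an assumption).
        System.arraycopy(src, 0, byteArr, 0, src.length);

        String myString = decode(byteArr);
        System.out.println(myString + " (" + myString.length() + ")"); // prints "hello (5)"
    }
}
```

Note that this stops at the first NUL, so anything after an embedded NUL in the buffer is dropped, which matches C-style string semantics.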

Also, one should generally (or, perhaps, always) specify a character encoding with String-from-byte[] constructors as the "default encoding" can vary across run-time environments. For instance, if the default encoding was UTF-16 then the original code would also have severely mangled the ASCII source data.
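To make the encoding point concrete, here is a small sketch that decodes the same four ASCII bytes with two explicitly chosen charsets. The class name CharsetMatters and the helper decodedLength are made up for illustration; UTF-16 is picked only because it is the example mentioned above, not because it is a likely platform default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMatters {
    // Hypothetical helper: number of chars produced when decoding with the given charset.
    static int decodedLength(byte[] data, Charset charset) {
        return new String(data, charset).length();
    }

    public static void main(String[] args) {
        byte[] data = {'H', 'i', '!', '?'}; // four ASCII bytes

        // One char per byte when decoded as US-ASCII.
        System.out.println(decodedLength(data, StandardCharsets.US_ASCII)); // prints 4

        // Decoded as UTF-16, each pair of bytes becomes a single, unrelated char.
        System.out.println(decodedLength(data, StandardCharsets.UTF_16)); // prints 2
    }
}
```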


Alternatively, if one didn't care about leading/trailing spaces or control characters then the following would also work (once again, note the explicit character encoding):

String myString = new String(byteArr, "US-ASCII").trim();

This is because trim removes all leading/trailing characters with values less than or equal to 0x20 (Space) - including NUL characters.
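A quick sketch of what trim() does and doesn't remove in this situation. The class name TrimNulPadding and the helper decodeAndTrim are made up for illustration, and the buffer contents are an invented example (a leading space and an embedded NUL) chosen to show the edge cases.

```java
import java.nio.charset.StandardCharsets;

class TrimNulPadding {
    // Hypothetical helper: decode the whole buffer as US-ASCII, then trim both ends.
    static String decodeAndTrim(byte[] buf) {
        return new String(buf, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) {
        byte[] buf = new byte[16]; // zero-filled by default
        // Contents: space, 'a', NUL, 'b', followed by NUL padding.
        byte[] src = " a\0b".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(src, 0, buf, 0, src.length);

        String s = decodeAndTrim(buf);
        // The leading space and the trailing NUL padding are gone,
        // but the NUL between 'a' and 'b' is still there.
        System.out.println(s.length()); // prints 3
        System.out.println(s.charAt(1) == '\0'); // prints true
    }
}
```

So trim() is fine for plain zero padding, but unlike the offset/length approach it keeps everything after an embedded NUL and it also eats legitimate leading or trailing whitespace.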

Thanks. trim() worked. –  user1118764 22 hours ago