
Should Lua string.format( "%c", value ) be equivalent to string.char( value )?

It seems not, at least when the character value is zero.

string.format( "%c", 0 ):len()

returns 0

string.char( 0 ):len()

returns 1

Even stranger,

string.format( "%c%s", 0, "abc" ):len()

returns 3, whereas any other value for %c (anything non-zero modulo 256) returns 4. So string.format is not truncating the whole string at the null byte, as a C string produced by sprintf would appear to be; it is just collapsing the %c field to an empty string instead of a one-byte string. Note that C sprintf writes the zero byte followed by the abc bytes in this case.

I couldn't find anything in the Lua docs describing the expected behavior in this case. Most other string handling in Lua seems to treat a zero byte as a valid string character.

This is on Lua 5.1.4-8 on OpenWrt.

Idiosyncrasy or bug?

What does ("%c%s"):format(0, "abc"):sub(1, 1) give? – Eric May 17 at 9:47

2 Answers

up vote 6 down vote accepted

I think this is a bug.

In Lua 5.1 and LuaJIT 2.0, string.format formats one item at a time (using the sprintf provided by the host C runtime). It then calls strlen on the formatted item to find out how many bytes to append to the output string. Since strlen stops at the null character, the zero byte written by %c is not counted and is dropped from the output.

This is documented behaviour for %s, but probably unintentional for %c.
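To make the mechanism concrete, here is a rough model of it written in Lua itself. It is only a sketch of the behaviour described above, not the actual C code from the Lua sources; the helper names c_strlen and append_item are made up for illustration.

```lua
-- Hypothetical helper: mimics C strlen by counting the bytes
-- that come before the first zero byte.
local function c_strlen(buf)
  local zero_pos = buf:find("\0", 1, true)  -- plain search, no pattern matching
  return zero_pos and (zero_pos - 1) or #buf
end

-- Hypothetical helper: appends only the bytes that strlen would count,
-- which is how the formatted item ends up in the output.
local function append_item(out, item_bytes)
  return out .. item_bytes:sub(1, c_strlen(item_bytes))
end

local out = ""
out = append_item(out, string.char(0))  -- bytes that sprintf("%c", 0) would write
out = append_item(out, "abc")           -- bytes that sprintf("%s", "abc") would write
print(#out)  --> 3
```

The first item has a strlen of 0, so nothing is appended for it, and only the three bytes of "abc" reach the result. That matches the length of 3 reported in the question.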

This is fixed in Lua 5.2. I wouldn't expect any more updates to 5.1.
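If you are stuck on 5.1, one possible workaround (a sketch, not something taken from the Lua manual) is to build the byte with string.char and concatenate it, instead of passing it through %c:

```lua
-- Build the zero byte with string.char and concatenate the rest,
-- so the byte never goes through sprintf and strlen.
local s = string.char(0) .. "abc"

print(#s)         --> 4
print(s:byte(1))  --> 0
```

Avoid routing the resulting string back through %s on 5.1, since that option is documented as not accepting embedded zeros.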

Note that Lua 5.2 only fixes %c, but not %s. LuaJIT 2.1 handles this correctly for both %c and %s. – Mike Pall May 17 at 8:48

The book "Programming in Lua" (2nd edition), section 2.4, says: "Strings in Lua have the usual meaning: a sequence of characters. Lua is eight-bit clean and its strings may contain characters with any numeric code, including embedded zeros. This means that you can store any binary data into a string."

So this is not a bug.

I interpret that statement to mean "embedded zeros are treated just like any other byte". But the documentation of string.format is more relevant here, because it states an exception to this rule. – finnw May 16 at 10:48
