
I've been trying to solidify my understanding of the binary number system by writing code. So, using what I currently understand about the binary system, I tried to write a function that returns a string containing the binary representation of its input.

Any feedback regarding how I may optimize the program is greatly appreciated. Thanks!

char *uint32_to_binarystr(uint32_t n) {
    if (n != UINT32_MAX) {
        int expb2 = 0;              // Exponent of 2.      
        static char binarystr[33];  // 32 bits + 1 termination NULL char.
        binarystr[32] = '\0';       // Add terminating NULL char.
        memset(binarystr, '0', 32); // Initialize 32-bit string as 0.
        while (n != 0) {                         // Continue until n has been completely represented by a sum of powers of 2.
            while (1 << expb2 <= n) { expb2++; } // Find the power of 2 that yields a result greater than n. 
            expb2 -= 1;                          // Once this number is found, we are sure that the previous power yields the largest power of 2 less than n.
            n -= 1 << expb2;                     // We've found a power of 2 to represent this "chunk" of n. Reduce n by the same amount.
            binarystr[31 - expb2] = '1';         // Set the bit at the index with expb2 digits following it to 1.
            expb2 = 0;                           // Reset exponent of 2 and repeat.
        }
        return binarystr;
    } else {
        /*
        In the case that UINT32_MAX is to be represented, just return hard-coded string. 
        Why? 
        A 32-bit shift is not doable in some cases:
        https://stackoverflow.com/questions/7401888/why-doesnt-left-bit-shift-for-32-bit-integers-work-as-expected-when-used
        */
        static char binarystr[33];  
        binarystr[32] = '\0';       
        memset(binarystr, '1', 32); 
        return binarystr;
    }
}
  • solidify my understanding of the binary number system: I pedagogically suggest attempting a general algorithm that converts any given (input) number, with its base, to any given (output) base. Binary, hex, decimal: it is all the same, really. But not Roman numerals, of course! Your code seems to be a bit-shifting exercise, and necessarily machine (CPU) specific. – radarbob yesterday
  • @radarbob 32-bit integers stored on digital (aka binary) computers are not machine specific! Bit shifting is always equivalent to multiplying by a power of two (left shifting) or dividing by a power of two, losing the remainder (right shifting). The only (unlikely) possible machine-specific representation is for negative numbers. AFAIK all modern CPUs use two's complement to represent negative binary numbers, since addition, subtraction, and left and right bit shifting all work the same as for positive numbers. – Reversed Engineer 18 hours ago
  • @radarbob: It's not all the same: bases that are powers of 2 let you get the most-significant digit first, because each digit represents a group of bits in the binary number. This is what makes it possible to efficiently convert to hex in parallel with SIMD, for example. (How to convert a binary integer number to a hex string?). But for bases that aren't powers of 2, you need actual division, and the low digit depends on all the higher bits. So you have to start with x % base; x /= base; and store the digits in reverse. – Peter Cordes 17 hours ago
  • @ReversedEngineer: Yes, and ISO C guarantees that integral types are binary. This code isn't using signed integers (except by mistake with 1 << expb2, which should be 1UL << expb2), so it doesn't even depend on two's complement. But yes, I don't know of any modern CPUs that use anything else either; IIRC C++20 is going to drop one's complement and sign/magnitude, leaving only two's complement for integral types (and hopefully guaranteeing that >> on signed types is an arithmetic right shift, not implementation-defined; unsigned >> shifts in zeros, instead of copies of the sign bit). – Peter Cordes 17 hours ago
  • Your program is missing test cases, like test(strcmp(uint32_to_binarystr(0), "0000..0000") == 0), where test is an assert-like function or macro for testing. Write a bunch of correct tests and then massage the code until they all pass. – Kaz 15 hours ago

The interface is problematic. What output does this code produce?

#include <stdio.h>
int main(void)
{
    const char *s1 = uint32_to_binarystr(0);
    const char *s2 = uint32_to_binarystr(1000000);
    printf("%s\n%s\n", s1, s2);
}

You might expect it to output 00000000000000000000000000000000 followed by 00000000000011110100001001000000, but that's not what we get. Do you see why?

The problem is

    static char binarystr[33];  
    ⋮
    return binarystr;

s1 and s2 both point to the same data!
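To make the aliasing concrete, here is a sketch you can run: the question's function, abridged to its non-UINT32_MAX path, and with 1UL shifts to sidestep the signed-shift issue so the demonstration itself is well-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The question's function, abridged to the non-UINT32_MAX path,
   to demonstrate the shared static buffer. */
char *uint32_to_binarystr(uint32_t n) {
    static char binarystr[33];
    binarystr[32] = '\0';
    memset(binarystr, '0', 32);
    int expb2 = 0;
    while (n != 0) {
        while ((uint32_t)(1UL << expb2) <= n) { expb2++; }
        expb2 -= 1;
        n -= (uint32_t)(1UL << expb2);
        binarystr[31 - expb2] = '1';
        expb2 = 0;
    }
    return binarystr;
}
```

Calling it as in the main() above, s1 and s2 compare equal as pointers, so the second call silently clobbers the first result and both lines print the 1000000 string.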

The two usual solutions to this problem are

  • return newly-allocated memory that the caller must free(), or
  • have the caller supply a buffer into which to write.

I tend to prefer the latter: it avoids the overhead of heap allocation in many cases, and reduces the risk of memory leaks from misuse.

I would write that as

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

bool uint32_to_binarystr(char *binarystr, size_t len, uint32_t n) {
    if (len < 33) {
        // 32 bits + 1 terminating NUL char
        return false;
    } 
    ⋮
    return true;
}
#include <assert.h>
#include <stdio.h>

int main(void)
{
    char s1[33];
    char s2[33];
    assert(uint32_to_binarystr(s1, sizeof s1, 0));
    assert(uint32_to_binarystr(s2, sizeof s2, 1000000));
    printf("%s\n%s\n", s1, s2);
}

Passing the length available might seem unnecessary (who would supply a buffer that's too short?) but it does turn out to be a good idea in practice.


For the algorithm, I would approach this with a "moving bit" (for simplicity, I'll demonstrate the uint8_t version). If we start with our number abcdefgh and an initial mask 10000000, then we can decide whether the first digit is 0 or 1 using a bitwise AND operation n & mask. Then move to the next digit by shifting mask rightwards by one. When the bit runs off the end, mask becomes zero, and we know we're done.
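As a sketch, that 8-bit version might look like this (uint8_to_binarystr is a name made up for illustration, not part of the answer's final code):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "Moving bit" at 8 bits: start with mask 10000000 and shift it right
   until the set bit falls off the end (mask becomes zero). */
void uint8_to_binarystr(char dst[9], uint8_t n) {
    int i = 0;
    for (uint8_t mask = 0x80; mask; mask >>= 1) {
        dst[i++] = (n & mask) ? '1' : '0';
    }
    dst[8] = '\0';
}
```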

In code (and switching back to 32 bits), that looks like:

for (uint32_t mask = 0x80000000;  mask;  mask >>= 1) {
    if (n & mask) {
        /* we have a 1 bit */
    } else {
        /* we have a 0 bit */
    }
}

We can simplify the body of the loop, too, because C guarantees that '1' is one more than '0':

char digit = '0' + ((n & mask) != 0);

If we write all the characters like this, then there's no need for memset() at the beginning.

Putting this all together, I have a replacement which is worth studying:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

bool uint32_to_binarystr(char *buf, size_t len, uint32_t n) {
    if (len < 33) {
        // 32 bits + 1 terminating NUL char
        return false;
    }
    for (uint32_t mask = 0x80000000;  mask;  mask >>= 1) {
        bool bit_is_set = n & mask;
        *buf = '0' + bit_is_set;
        ++buf;
    }
    *buf = '\0';          /* add the terminator */
    return true;
}
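Assembled as one self-contained unit (the same logic as the listing above, with the headers it needs), for anyone who wants to try it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

bool uint32_to_binarystr(char *buf, size_t len, uint32_t n) {
    if (len < 33) {
        return false;   /* need 32 bits + 1 terminating NUL char */
    }
    for (uint32_t mask = 0x80000000;  mask;  mask >>= 1) {
        bool bit_is_set = n & mask;
        *buf = '0' + bit_is_set;
        ++buf;
    }
    *buf = '\0';
    return true;
}
```

Calling it twice with s1 and s2 as before now yields two independent strings, and a too-short buffer is rejected up front.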
  • *buf++ = '0' + ((n & mask) != 0); is really cryptic for beginners to read, though. I would split that expression up over several lines to increase readability. – Lundin yesterday
  • @Lundin, I tried to offset that by showing the intermediate step with if/else. But I've now edited to break up the expression. – Toby Speight yesterday
  • Turns out I was writing an answer with nearly the same solution as yours :) The bool seems to give slightly better machine code than if(n & mask) *buf = '1'; else *buf = '0';. – Lundin yesterday
  • What happens if len does not match the actual size of buf, especially if buf is smaller? Never trust input! – Glen Yates 18 hours ago
  • @Glen, it's a common idiom in C to pass a buffer and its length as a pair (cf. strncpy(), sprintf(), etc.). Here, we're trusting the programmer to supply the actual capacity, not trusting input. That's a big difference! – Toby Speight 17 hours ago

Your code has a possible bug, depending on your CPU/compiler.

You are treating UINT32_MAX (0xFFFFFFFF) as a special case; however, you can run into problems with any value greater than 0x80000000.

For example, if we let n be 0x80000001, then ...

        int expb2 = 0;
        ...
        while (n != 0) {
            while (1 << expb2 <= n) { expb2++; } 
            ...

we start increasing expb2, shifting 1 over by more and more bits until the result is greater than n.

But on a 32-bit architecture, since 1 << 31 is less than n, you increase expb2 to 32, and 1 << 32 overflows a 32-bit integer, which is undefined behaviour; in my experience, the result will be either 0 or 1. Since that is less than n, the loop continues, and the while (1 << expb2 <= n) { expb2++; } loop will spin forever.

Instead of trying to bit shift to find the ideal bit to start at, since you have a known type (uint32) with a known bit width (32), simply start at bit 31 and work your way down.

char *uint32_to_binarystr(uint32_t n) {
    static char binarystr[33];
    binarystr[32] = '\0';
    memset(binarystr, '0', 32);

    for(int expb2 = 31; expb2 >= 0; expb2--) {
        if (1 << expb2 <= n) {
            binarystr[31 - expb2] = '1';
            n -= 1 << expb2;
        }
    }
    return binarystr;
}

Since the maximum shift is 31, there is no need to special case UINT32_MAX.
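A quick self-contained check that the extremes now work: this is a sketch of the same down-counting loop, renamed uint32_to_binarystr_dc here, and using 1UL for the shift so that expb2 == 31 stays well-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Down-counting version: test each power of two from 2^31 down to 2^0.
   1UL keeps the shift well-defined even when expb2 == 31. */
char *uint32_to_binarystr_dc(uint32_t n) {
    static char binarystr[33];
    binarystr[32] = '\0';
    memset(binarystr, '0', 32);
    for (int expb2 = 31; expb2 >= 0; expb2--) {
        if ((uint32_t)(1UL << expb2) <= n) {
            binarystr[31 - expb2] = '1';
            n -= (uint32_t)(1UL << expb2);
        }
    }
    return binarystr;
}
```

With this, UINT32_MAX falls out of the same loop as every other value.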

  • Thanks for the informative feedback and debugging! The UINT32_MAX special case felt really "hacky" when I implemented it, so thanks again for the alternative algorithm. – Sean Xie yesterday
  • 1 << 31 is already signed-overflow UB (assuming 32-bit signed int; it could be a shift wider than the type width on a C implementation with 16-bit int). That's why it should be 1UL << whatever (@SeanXie). And normally you'd want to test whether that bit is set, like n & mask, not do an arithmetic compare. So +1 for pointing out that special-casing some inputs is super weird, but you could say more. Also, there were two separate static char binarystr[33]; buffers in different branches, and the function returned a different one in the special case! Declare it outside the if. – Peter Cordes 1 hour ago

As already said in other reviews, you should switch to caller allocation. And there's no need to initialize the string at all; just overwrite all of it.

As for the algorithm itself, or in general, go with readability and keep things simple. When we do so, we tend to get efficiency for free too. This seems like a good compromise between readability and performance:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

void uint32_to_binstr (char dst[33], uint32_t n)
{
  for(size_t i=0; i<32; i++)
  {
    bool one = n & (1u << i);
    size_t index = 32-i-1;
    dst[index] = '0' + one;
  }
  dst[32]='\0';
}

There's a rule of thumb to always keep (for) loops as simple as possible, as close to the ideal for(size_t i=0; i<n; i++) as possible. When you do this, the code tends to turn out both more readable and more efficient. (A bit of fiddling around with disassemblers showed me that this gave somewhat better machine code than the down-counting versions or the versions right-shifting a mask.)

Since I counted from 0 to 31, I need to compensate for that when determining the index used for storage - the above actually fills the array from the back to the front. The calculation 32-i-1 gives 31, 30, 29...

The bool one = n & (1u << i); is basically the same as if(n & (1u << i)) and will likely result in a branch too; there's probably no way around it without making the code really ugly.

Note that I used 1u for shifting, since 1 gives a signed int and shifting signed numbers is a very bad idea. In case of left shift we might shift data into the sign bit = undefined behavior bug. In case of right shift we don't know if we get arithmetic or logical shift = non-portable code and generally a bad idea.

Peeking at the resulting machine code (gcc x86 -O3):

uint32_to_binstr:
        lea     r9, [rdi-32]
        mov     rax, rdi
        mov     r8d, 1
.L2:
        mov     ecx, edi
        mov     edx, r8d
        sub     ecx, eax
        sal     edx, cl
        and     edx, esi
        cmp     edx, 1
        mov     edx, 48
        sbb     dl, -1
        sub     rax, 1
        mov     BYTE PTR [rax+32], dl
        cmp     r9, rax
        jne     .L2
        mov     BYTE PTR [rdi+32], 0
        ret

This is reasonably compact, without many branches.

Now since we decided to use a trivial, readable loop, we automatically get lots of advantages we hadn't even considered from the start. Suppose we want to modify the code to drop leading zeroes or just print a certain amount of digits? We can use pretty much the very same code, just swap out the hard-coded 32 for a parameter and then change the order of parameters:

void uint32_to_binstr (uint32_t n, size_t digits, char dst[digits+1])
{
  for(size_t i=0; i<digits; i++)
  {
    bool one = n & (1u << i);
    size_t index = digits-i-1; 
    dst[index] = '0' + one;
  }
  dst[digits]='\0';
}
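A usage sketch of that variable-width version (note the caller's buffer must hold digits + 1 bytes for the terminator; 1UL keeps the shift well-defined when i == 31, and digits is assumed to be at most 32):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same algorithm as above; dst must have room for digits + 1 bytes,
   and digits must be <= 32. */
void uint32_to_binstr(uint32_t n, size_t digits, char *dst)
{
  for(size_t i=0; i<digits; i++)
  {
    bool one = n & (1UL << i);
    size_t index = digits-i-1;
    dst[index] = '0' + one;
  }
  dst[digits]='\0';
}
```

For example, uint32_to_binstr(5, 8, buf) with char buf[9] yields "00000101": the low 8 bits only, leading zeros included.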
  • Interesting, I've always counted down for efficiency: if you count up, you have to load the value you're counting up to into a register every time through the loop for the comparison, whereas if you count down, you only load the value once and then decrement it each loop; the comparison against 0 to terminate the loop is basically free, with no need to load 0 into another register. But then, the last time I cared about efficiency at this low a level, my target processor only had 2 registers and an accumulator; maybe you have more registers to work with. – Glen Yates 18 hours ago
  • Variable-count shifting isn't that great. If you're going to start at the LSB and store backwards into the buffer, n >>= 1; is a lot more attractive. Then digit = '0' + (n & 1). Best of all, i isn't involved in anything except the array index anymore, so that's just a decrement towards 0, as @GlenYates pointed out, letting the loop branch be dec reg / jnz. Or not; to get the indexing right with a function-arg digits, GCC does the decrement before using it for a store, so it has to test after. godbolt.org/z/e68oo6cEG. But still significantly fewer instructions. – Peter Cordes 17 hours ago
  • Unless I disallow digits=0 as an input, I don't see a great way to use a canonical for() loop structure instead of --i at the top of the loop body. Otherwise, for (size_t i = digits - 1 ; i != 0 ; i--) could wrap to SIZE_MAX, which would be nice to avoid. It would make some sense to rule out digits=0 in your variable-digits version, though, to allow an asm do{}while loop structure without any check before entering the first iteration that stores the LSB at dst[digits-1]. – Peter Cordes 17 hours ago
  • Also, unsigned is allowed to be narrower than uint32_t. 1UL << count would be safely portable. (But usually worse than just shifting as a loop-carried dependency; most CPUs have efficient low-latency right-shift-by-1. Variable-count shift is only good if you're going to use it to start at the MSB, and then only because C doesn't have an easy rotate-left.) – Peter Cordes 17 hours ago
  • @GlenYates Down-counting as a way of manual optimization is a very old trick from back in the days when compilers were horrible. Modern compilers should be able to do that optimization for you. Various more exotic embedded-systems compilers may still struggle with it, but overall I'd say it's somewhat rare to find such bad compilers still out there. – Lundin 2 hours ago

Computers convert decimal to binary any time people code, so they probably have a very fast way to do it. Ben Eater, on YouTube, has a video about hardwiring a decimal-to-hex converter with discrete logic gates. I imagine a similar principle could apply here: if the computer has such hardware, there should be some way to use it.

If you have a good GPU, in theory using CUDA or something similar with an actual multiply instruction might be better if you were converting a lot of numbers. But at the same time, I see no reason for that here.

You could probably optimize by using assembly. When we write something like mov x2, #10, the assembler has to convert that 10 to 1010. I think it does so via a lookup table or hardware somehow. I am curious: I've always given computers hex values, as I feel it would take at least a few clock cycles to figure out what a base-10 representation of a number should be stored as.

Might it be faster to use %x? %x will give you the hexadecimal representation to work with, and then it is much easier to convert the output to binary one digit at a time: make a lookup table for every 4 bits. But I'm guessing the persistent-memory access will outweigh any gain.

FILE *fp = fopen("temporary.txt", "w");
fprintf(fp, "%x", 42);
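The 4-bit lookup idea could be sketched like this. uint32_to_binarystr_hex and the table are made-up names for illustration, and as guessed above, going through a hex string is unlikely to beat a direct loop:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: format the number as hex first, then expand each
   hex digit into 4 binary digits via a 16-entry lookup table. */
static const char *nibble_bits[16] = {
    "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
    "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
};

void uint32_to_binarystr_hex(char dst[33], uint32_t n) {
    char hex[9];
    snprintf(hex, sizeof hex, "%08" PRIx32, n);  /* 8 hex digits, zero-padded */
    for (int i = 0; i < 8; i++) {
        /* %x produces lowercase digits: '0'..'9' and 'a'..'f' */
        int v = (hex[i] <= '9') ? hex[i] - '0' : hex[i] - 'a' + 10;
        memcpy(dst + 4 * i, nibble_bits[v], 4);
    }
    dst[32] = '\0';
}
```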

From my amateur eyes, it seems like you took the right approach and tracked down the only bug. I'm just giving food for thought and thinking out loud a bit. For your purposes of learning the basics, I think you did great. If you'd like to learn more about the metal, consider trying to find a way to implement my idea; if you do, I'd love to hear about it.

  • I'm sorry your answer was deleted. I am not a moderator, but I believe that answers should be left to be up- or down-voted based on their merit, unless they are just completely off-topic or offensive. Having said that, your "gut feeling that there must be a way to use the computers converter. The computer clearly has one, probable hardwired" is incorrect. People code in decimal, yes, represented as a string no less; however, this must be converted to binary, and there is no shortcut. You would see this if you looked at the output of a compiler. – Glen Yates 19 hours ago
  • I also just wanted to say there is merit to looking and thinking 'outside the box' as you have suggested; otherwise we become stagnant. Yours is the way to innovation. – Glen Yates 18 hours ago
  • Thank you for your kind and informative replies. I am shocked to find there is no shortcut. I need to learn how to compile to assembly (probably just a gcc optional argument), as it will help me learn a lot. Also, it seems (based on your replies) that my using hex inputs is more efficient than decimal, right? – Abstract Approach 18 hours ago
  • "my using hex inputs is more efficient than decimal, right?" No; you have to remember that when you are typing in source code, you are not typing "hex" or "decimal" numbers, you are typing "string" representations of them. So when your program is compiled, these strings must be parsed and converted to the actual binary representations of the numbers. E.g. if you type decimal "31", it is ASCII chars 51 and 49; if you type hex "1F", it is ASCII chars 49 and 70. Both must be converted to binary 00011111, possibly with many more leading zeros depending on the size of int you are using. – Glen Yates 16 hours ago
  • Interesting. I was coming back to edit my reply, as obviously I know what the assembly code looks like, to ask what you mean by examining the source code. (It sounds like a fun yet daunting exercise to try to map assembly commands to their respective numerical representations, probably after converting to hex.) Now I am again surprised by your reply. It seems to imply that any time we use a number in our code, it must use several clock cycles of a core. There must be a way to avoid this, to fill the address with the appropriate binary. – Abstract Approach 15 hours ago
