Implementing realloc in C

Question

I have an assignment to implement realloc in C, This's my code please review it.

void    *my_realloc(void *ptr, size_t len)
{
    void    *real;

    real = malloc(len);
    memset(real, 0, len);
    if (real)
        memcpy(real, ptr, len);
    free(ptr);
    return (real);
}

I'd say that's not possible, since you cannot know the size of an allocated area by just providing the pointer. The assignment is pointless, unless you also override malloc and free so you can store allocated size, for instance in the first bytes of the block. But reimplementing realloc alone is impossible. — Jean-François Fabre, 20 hours ago
Did you read the realloc specification before starting to write code? That should have been your first step. — Eric Lippert, 17 hours ago
Does your assignment including reimplementing malloc, free and calloc? Is your realloc meant to work on a block returned by the platform's malloc? If so, the answers below tell you you're stuck! — PJTraill, 13 hours ago

Cody Gray · Answer 1 · 2016-12-28 19:17:51Z

The code leaves a bad first impression because of how it is formatted. In particular, the extra whitespace between the return type and the function name, and that between the variable's type and its name, looks odd.
Unless you are writing code that needs to be compatible with pre-C99 compilers, you should break the habit of declaring variables at the top of the block and initializing them later. Instead, postpone the declaration of variables until the point when they can be initialized (to the extent possible). This helps to minimize bugs caused by uninitialized variables.
I recommend always using curly braces to create block scope after if statements, even if you currently only have one statement there. This minimizes the potential for serious bugs when you later go back and add some additional logic.
The return value of malloc must be checked immediately after the call to malloc! If malloc fails, it will return a null pointer, and any attempt to use a null pointer results in undefined behavior. Therefore, you need to move the call to memset to after you check the result of malloc, otherwise you risk undefined behavior.
In fact, you do not need to call memset at all. If you check the C language specification for information on the realloc function, you will see that it does not initialize the additional memory in the case where the new size is larger than the old size.
You are breaking the realloc specification in several other ways, too. If your goal is to simulate/reinvent realloc, then you absolutely must ensure that your implementation behaves in exactly the same way as realloc.

In fact, realloc is really a complete memory management subsystem wrapped in a single function—it does everything that malloc and free do—and more—making its semantics very complicated and confusing. Here's a summary of the relevant requirements for realloc from the standard:
- If the requested block size is smaller than the original size of the block, realloc frees the unwanted memory at the end of the block and returns the input pointer unchanged.
- If the requested block size is larger than the original size of the block, realloc may allocate an expanded block at a new address and copy the contents of the original block into the new location. In this case, a pointer to the expanded block is returned, and the extended portion of the block is left uninitialized.
- If realloc cannot satisfy a request to expand a block, it returns a null pointer and does not free the original block. (realloc will always succeed when you request to shrink a block.)
- If the input pointer is null, then realloc behaves exactly as if you had called malloc(size), returning a pointer to the newly-allocated block of the requested size, or a null pointer if the request could not be satisfied.
- If the requested size is 0 and the input pointer is non-null, then realloc behaves exactly as if you had called free(ptr), and always returns a null pointer.
- If the input pointer is null and the requested size is 0, then the result is undefined!
Whew! To be correct, your code needs to implement all of these conditions, precisely as documented.

Conversely, if you are not simulating realloc (and that's perfectly fine—maybe you're reinventing the wheel because you want different behavior in your program), then you need to use a different name so that other programmers are not fooled into thinking that it is identical to realloc and won't expect it to behave in the same way.
In fact, as you can see from the list of requirements above, it is actually impossible to reinvent realloc as required by the specification, since you do not have insider knowledge about how the standard library implements memory management. You need to be able to determine the original size of the memory block pointed to by ptr, and there is no portable way to do so.

Why do you need this information? Two reasons. First, you need to know whether the caller is requesting to shrink or expand the memory block so that you can follow the correct semantics. Second, you need to know how many bytes to copy when you call memcpy—right now, you are copying past the end of the original buffer, which is undefined behavior!

The first problem could be solved by altering the semantics of our function so that it only grows the size of a block, renaming it something like GrowBlock or ExpandMemory to make it clear that it is not identical to realloc. Unfortunately, there would be no reliable way to assert this requirement that len is greater than or equal to the current size of the memory block pointed to by ptr within the body of the function, so documentation alone would have to suffice, which is a very weak guarantee. Even more unfortunately, this won't solve the second problem—we still have no way to correctly call memcpy!

Therefore, our only real option is to modify the function's signature to accept the original size of ptr as a parameter (in addition to the desired new size that you already accept as a parameter).

All of that considered, here's how I'd write it:

// This function is similar to realloc() and implements the same semantics,
// except that the caller must explicitly pass the original size of the
// memory block pointed to by 'ptr', as well as the desired size.
void *my_realloc(void *ptr, size_t originalLength, size_t newLength)
{
   // Note that because we cannot rely on implementation details of the standard library,
   // we can never grow a block in place like realloc() can. However, it is semantically
   // equivalent to allocate a new block of the appropriate size, copy the original data
   // into it, and return a pointer to that new block. In fact, realloc() is allowed to
   // do this, as well. So we lose a possible performance optimization (that is rarely
   // possible in practice anyway), but correctness is still ensured, and the caller
   // never need be the wiser.
   // 
   // Likewise, we cannot actually shrink a block of memory in-place, so we either
   // have to return the block unchanged (which is legal, because a block of memory
   // is always allowed to be *larger* than necessary), or allocate a new smaller
   // block, copy the portion of the original data that will fit, and return a
   // pointer to this new shrunken block. The latter would actually be slower,
   // so we'll avoid doing this extra work when possible in the current implementation.
   // (You might need to change this if memory pressure gets out of control.)

   if (newLength == 0)
   {
      free(ptr);
      return NULL;
   }
   else if (!ptr)
   {
      return malloc(newLength);
   }
   else if (newLength <= originalLength)
   {
      return ptr;
   }
   else
   {
      assert((ptr) && (newLength > originalLength));
      void *ptrNew = malloc(newLength);
      if (ptrNew)
      {
          memcpy(ptrNew, ptr, originalLength);
          free(ptr);
      }
      return ptrNew;
    }
}

Indeed, this looks quite complicated—and it it isn't even as complicated as a real implementation of realloc! It's a good lesson about why realloc is actually bad design. You should not write massive multi-purpose functions, and you certainly shouldn't implement an entire subsystem in a single function! Instead, break the important behaviors out into separate functions. In this case, you might have four separate functions that allocate a block (AllocateMemory), free a block (FreeMemory), expand a block (ExpandMemory), and shrink a block (ShrinkMemory). This division of labor makes the implementation(s) much easier to write, easier to reason about, easier to include error-checking, and therefore easier to maintain and less likely to contain bugs. I realize your assignment was to write realloc, but there is a broader (perhaps inadvertent) lesson here that you should not miss.

There's no need for the assert here, and the free(ptr) should go inside the if block. — David Hammen, 16 hours ago
Thanks, @David. I had overlooked the rule that realloc does not free the original block in the event of failure, even though it returns null. I almost certainly would have written it differently... Anyway, updated the answer. And of course there's no need for an assert. That's the whole point of an assert. You wouldn't use it when there was a realistic possibility that it might be triggered! I mainly used it here to show the logic at the point of entry into that block. — Cody Gray, 16 hours ago
@jamesqf: Not to my way of thinking, though it is sometimes worth starting a new scope. It does require you to unlearn the expectation that variables turn up at the top, rather than where they are needed. — PJTraill, 13 hours ago
Re #1, I'd be a little slow to criticise the formatting, as it could be that he has to conform to some conventions whose benefits show up in more code. But it does seem odd to position the * as part of the name, rather than the return type; I suspect it is a hangover from the habit of declaring a crazy bunch of variables of different types related to the same basic type. — PJTraill, 13 hours ago
@james I don't see how it makes the code difficult to read. C++ code is always written this way, and although there are things that make C++ code hard to read, variables declared at the point of initialization is decidedly not one of them. The only argument in favor of the C89 style seems to be that all of the variables that will be used in the function are gathered all in one place, but I don't see how that makes the code easier to read, since the important thing about a variable is what its value is—not what its name is. You still have to go hunting for its initialization. — Cody Gray, 13 hours ago

D. Jurcau · Answer 2 · 2016-12-28 09:10:17Z

You are checking if malloc succeeded before calling memcpy, but not before calling memset.

You are copying len bytes from ptr to the new memory location, but don't know how many bytes ptr actually points to. It probably points to less, this being a major reason why one would want to call realloc in the first place.

Neil · Answer 3 · 2016-12-28 12:15:08Z

Generally, malloc, realloc and free are all part of the same library. One of the things this allows is some 'behind the scenes' meta-data chicanery.

For example if you wanted to call malloc(16), the memory library might allocate 20 bytes of space, with the first 4 bytes containing the length of the allocation and then returning a pointer to 4 bytes past the start of the block.

When calling free, the library will subtract 4 bytes from the pointer passed in, to find the pointer to the length of the allocation.

realloc can also use this trick to find out how long the original allocation was.

So, in answer to your question, you can't /only/ implement realloc, you must implement malloc and free also.

Also, if you have an original K&R, I believe you would find some realloc source in there too.

abelenky · Answer 4 · 2016-12-28 17:20:18Z

Here are some simple annotations to highlight a major problem:

real = malloc(len);    // This may or may not succeed.  On failure, real is NULL
memset(real, 0, len);  // This uses real on the assumption that it succeeded!
if (real)              // Here, you acknowledge that it may have failed, even though you already used it!
    memcpy(real, ptr, len);

asked	yesterday
viewed	1380 times
active	today

current community

your communities

more stack exchange communities

Implementing realloc in C

4 Answers 4

Your Answer

Not the answer you're looking for? Browse other questions tagged c reinventing-the-wheel memory-management homework or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Implementing realloc in C

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged c reinventing-the-wheel memory-management homework or ask your own question.

Related

Hot Network Questions