Checking array size in C/C++ to avoid segmentation faults

Question

So it's well known that C does not have any array bounds checking when accessing memory. Nowadays, if you access myArray[7] after declaring it as int myArray[3], your program will typically get a segfault and crash, thanks to protected memory.

Now, if you have an argument in a function such as myFunc(int *yourArray), but you know you need at least 8 slots in the array, is it possible to check beforehand whether yourArray[7] is illegal, in order to throw a custom error:

"Sorry, yourArray is too small for this function. We need 8 ints of space."

rather than

"Segmentation fault."

Regarding the close vote: I believe this is on-topic both here and at SO. On SO, the answer would probably be that C/C++ does or does not have support for this. Here, the answer is about whether such a feature is conceptually possible, and if not, how you could achieve the same goal. – Ixrec, 14 hours ago
@Ixrec: other way around. C does not have support for it; C++ does. – MSalters, 14 hours ago

Basile Starynkevitch · Accepted Answer · 2015-08-19 20:30:44Z

Checking array bounds like you want is implementation specific, because buffer overflow is an example of undefined behavior (and this explains why UB can be really bad).

It is also an undecidable problem in general. You can easily show that statically finding (by static program analysis, e.g. of the C++ source code, without actually running the program) every buffer overflow is equivalent to the halting problem. Read also about Rice's theorem.

However, several (partial) practical tools exist (notably on Linux):

You could add assert or static_assert checks to your code, and/or runtime checks.
You might find and use a static code analyzer such as Frama-C (it currently works for C code).
You could customize your GCC compiler using MELT.
You should compile your code with all warnings and debug info, e.g. g++ -Wall -Wextra -g if using GCC.
You might run your program under valgrind, at least for tests.
You could use the address sanitizer, e.g. add -fsanitize=address to your compilation flags (when testing).
Notably in C (and sometimes in C++), it is a good convention to pass both the array pointer and its size (as e.g. snprintf(3) or strncmp(3) do). In C, you might also use a flexible array member in a struct and store the flexible array's size inside the struct.
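
As a rough illustration of the first and last suggestions in this list, here is a minimal sketch assuming C++11. The names lookupTable, kSlotsNeeded and eighthElement are hypothetical and do not come from the answer. It shows a compile-time check where the array definition is visible, and a debug-time check where only a pointer and a length are available.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimum number of ints the function needs.
constexpr std::size_t kSlotsNeeded = 8;

// Where the array itself is in scope, its size is known at compile time,
// so static_assert can reject a too-small definition before the program runs.
const int lookupTable[10] = {0, 1, 4, 9, 16, 25, 36, 49, 64, 81};
static_assert(sizeof(lookupTable) / sizeof(lookupTable[0]) >= kSlotsNeeded,
              "lookupTable must hold at least 8 ints");

// Where only a pointer arrives, the length has to be passed alongside it.
// assert() checks it in debug builds and is compiled out when NDEBUG is defined.
int eighthElement(const int* data, std::size_t len) {
    assert(data != nullptr && "data must not be null");
    assert(len >= kSlotsNeeded && "need at least 8 ints");
    return data[kSlotsNeeded - 1];
}
```

Calling eighthElement(lookupTable, 10) reads index 7 of the table. Note that the assert only verifies the length the caller claims, not the real size of the underlying storage.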

BTW, the pointer arithmetic abilities of C and C++ make finding buffer overflows even harder.

In C++11 you'd better avoid plain arrays and raw pointers, and use standard containers and smart pointers instead.
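
One possible reading of that advice, sketched under the assumption that exactly 8 ints are required: take a std::array<int, 8> by reference, so the size becomes part of the type. The function name eighthSlot is hypothetical.

```cpp
#include <array>

// The parameter type only accepts an array of exactly 8 ints, so a
// std::array<int, 3> argument is rejected by the compiler.
int eighthSlot(const std::array<int, 8>& yourArray) {
    // std::get<I> checks the index at compile time; std::get<8> would not compile.
    return std::get<7>(yourArray);
}
```

With this signature the "too small" case never reaches run time, at the cost of fixing the size in the interface.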

The word statically is key: The reduction to the halting problem only works if we want to rule out all programs that go out of bounds, but no more, and without running the program. If run-time checks are permitted, the problem is almost trivial, it just has runtime overhead (and a quite large one, for a naive solution). Likewise, it is easy to reject all programs that go out of bounds as well as some that don't (the hard part is not rejecting practically useful ones). – delnan yesterday

The only thing I'd add: In C, always pass the array's length along with the array itself (a la strcpy vs strncpy), for the same reason you'd use a container instead of a raw array in C++. – Ixrec yesterday

Interesting case about the halting problem! I never thought about that but you're right in the case of compile-time issues. – CJxD yesterday

@Ixrec: You know strcpy vs strncpy is an atrociously bad example? strncpy does not copy strings, it copies a maximum of n non-0 bytes from a source-string and 0-pads to n bytes. And anyway, iff you provide a buffer length, you must either be ok with truncation (often very much not the case), or you need a way to signal failure. – Deduplicator yesterday

@Ixrec: strlcpy is considered a "safer for some scenarios" version of strcpy, strncpy is just a very specialized completely different tool. Though yes, there are many who just don't know that, which makes it an even worse example. – Deduplicator yesterday

Jerry Coffin · Answer 2 · 2015-08-19 23:21:33Z

The answer is really fairly simple: if you want safety, use something that actually provides it--and that's not C, and not raw C-style arrays.

Without departing too far from the basic style of C and raw arrays, you can use C++ and a std::vector with [i] replaced by .at(i), and get bounds checking.

Using std::vector instead makes most of the problems with arrays easy. You can check the current size of the vector with its .size() member function. Most of the time, you don't need to do that though, because when you want to add something to it you just use its .push_back() member function.
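
A small sketch of what that can look like, assuming C++11. The function name describeSlot and the message wording are made up for illustration. std::vector::at throws std::out_of_range on a bad index, which can be turned into a custom error.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Returns a description of values[index], or a custom message if the
// index is out of range instead of invoking undefined behavior.
std::string describeSlot(const std::vector<int>& values, std::size_t index) {
    try {
        return "value: " + std::to_string(values.at(index));
    } catch (const std::out_of_range&) {
        return "Sorry, the vector is too small: need index " +
               std::to_string(index) + ", size is " +
               std::to_string(values.size());
    }
}
```

For a three-element vector, describeSlot(v, 7) returns the custom message rather than crashing. The same access written as v[7] would be unchecked.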

At least in theory, you can do most of the same things in C, but doing so gets relatively ugly. Although it's not terribly difficult to define a wrapper that (for example) puts a pointer and a current allocation size into a struct, you have to define functions to do all the manipulation on it, and even then you have to live with the fact that existing code won't know how to use it or deal with it. I've done this a few times, and if you need it badly enough you can make it work--but I long ago decided it just wasn't worth the pain.
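
To give a feel for the kind of wrapper described here, the following is a rough sketch written in a deliberately C-like style (compiled as C++ for consistency with the other examples). The type IntArray and the functions intArrayInit, intArrayGet, intArraySet and intArrayFree are all hypothetical names, not an existing library.

```cpp
#include <cstddef>
#include <cstdlib>

// A pointer bundled with the number of ints it points to.
struct IntArray {
    int* data;
    std::size_t len;
};

// Allocates len zero-initialized ints; returns false on allocation failure.
bool intArrayInit(IntArray* a, std::size_t len) {
    a->data = static_cast<int*>(std::calloc(len, sizeof(int)));
    a->len = (a->data != nullptr) ? len : 0;
    return a->data != nullptr;
}

// Bounds-checked read: writes the element to *out and returns true,
// or returns false if i is out of range.
bool intArrayGet(const IntArray* a, std::size_t i, int* out) {
    if (i >= a->len) return false;
    *out = a->data[i];
    return true;
}

// Bounds-checked write: returns false if i is out of range.
bool intArraySet(IntArray* a, std::size_t i, int value) {
    if (i >= a->len) return false;
    a->data[i] = value;
    return true;
}

// Releases the storage and resets the wrapper to an empty state.
void intArrayFree(IntArray* a) {
    std::free(a->data);
    a->data = nullptr;
    a->len = 0;
}
```

Every access has to go through these functions for the check to mean anything, which is the ugliness the answer refers to: code that takes a plain int* cannot benefit from it.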

Rufflewind · Answer 3 · 2015-08-19 23:46:08Z

A function that receives a pointer does not know the length of the corresponding array. You must pass the length in explicitly as a parameter:

void myFunc(int *yourArray, size_t yourArrayLen)

Once you've done that, throwing an error is trivial.
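
For instance, in C++ the check could throw an exception. This is only a sketch: the return type is changed to int so the example has an observable result, and the constant kNeeded is a hypothetical name.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical minimum number of ints this function needs.
constexpr std::size_t kNeeded = 8;

// Validates the caller-supplied length before touching yourArray[7].
int myFunc(const int* yourArray, std::size_t yourArrayLen) {
    if (yourArray == nullptr || yourArrayLen < kNeeded) {
        throw std::invalid_argument(
            "Sorry, yourArray is too small for this function. "
            "We need 8 ints of space.");
    }
    return yourArray[kNeeded - 1];
}
```

A caller can catch std::invalid_argument and report the message instead of the process dying with a segmentation fault.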

Of course, this still leaves the possibility that your caller might give you the wrong length. You can't really prevent that without either:

implementing a custom data type to store arrays and then making sure the length stays in-sync with the true length at all times using encapsulation, or
allowing static arrays only, e.g.
```
void myFunc(int (*yourArray)[8]);
```
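
A short sketch of how such a pointer-to-array parameter behaves. The function name fillEight is hypothetical, and it returns the element count only to make the effect visible.

```cpp
#include <cstddef>

// The parameter is a pointer to an array of exactly 8 ints, so the size
// is part of the type and sizeof(*yourArray) covers the whole array.
std::size_t fillEight(int (*yourArray)[8]) {
    const std::size_t count = sizeof(*yourArray) / sizeof((*yourArray)[0]);
    for (std::size_t i = 0; i < count; ++i) {
        (*yourArray)[i] = static_cast<int>(i);
    }
    return count;
}
```

The caller passes the address of the array, as in fillEight(&arr) for int arr[8]. Passing the address of an int[3] does not compile, because int (*)[3] does not convert to int (*)[8].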

dan04 · Answer 4 · 2015-08-19 20:58:51Z

There is no way in C(++) to get the length of an array from a pointer to its first element. (There are platform-specific functions like _msize in MSVCRT, but that only works on malloced pointers.)

What's typically done when passing arrays to functions is to pass the length along with the pointer so that bounds-checking can be done at runtime.

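#include <stdio.h>

#define LEN 8  /* hypothetical size chosen by the caller, for illustration */

void myFunc(int* yourArray, int length)
{
    /* Reject arrays that are too small before touching yourArray[7]. */
    if (length < 8)
    {
        puts("Sorry, yourArray is too small for this function. We need 8 ints of space.");
        return;
    }

    // ...
}

void caller()
{
    int arr[LEN];
    myFunc(arr, LEN);
}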

I cannot find any guarantee that malloc and the rest won't ever provide more space than requested. – Deduplicator yesterday

It's in fact common due to rounding, but you have to realize that such extra space (when present) is always uninitialized. – MSalters 14 hours ago

ddyer · Answer 5 · 2015-08-19 21:39:59Z

Use a custom wrapper for malloc (or write your own allocator) that keeps additional information about the blocks it allocates. The one I use adds a few "guard bytes" to every allocation, embeds the length of the allocation at a[-1], and checks the guard bytes and other things upon deallocation.
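
The answer does not show its wrapper, so the following is only a guess at the general idea, not the author's code. It is a minimal sketch in which guardedAlloc, guardedSize, guardedFree, BlockHeader, kGuardByte and kGuardSize are all hypothetical names. It stores the requested size in a header in front of the user block and places guard bytes after it.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <limits>
#include <new>

constexpr unsigned char kGuardByte = 0xA5;  // arbitrary fill pattern
constexpr std::size_t kGuardSize = 8;       // guard bytes after each block

// Header placed in front of the user block; aligned so that the user
// pointer that follows it keeps malloc's alignment guarantee.
struct alignas(std::max_align_t) BlockHeader {
    std::size_t userSize;
};

// Allocates size bytes plus header and trailing guard bytes.
void* guardedAlloc(std::size_t size) {
    const std::size_t extra = sizeof(BlockHeader) + kGuardSize;
    if (size > std::numeric_limits<std::size_t>::max() - extra) return nullptr;
    auto* raw = static_cast<unsigned char*>(std::malloc(size + extra));
    if (raw == nullptr) return nullptr;
    ::new (static_cast<void*>(raw)) BlockHeader{size};
    unsigned char* user = raw + sizeof(BlockHeader);
    std::memset(user + size, kGuardByte, kGuardSize);
    return user;
}

// Returns the size originally requested for a block from guardedAlloc.
std::size_t guardedSize(const void* user) {
    const auto* bytes = static_cast<const unsigned char*>(user);
    return reinterpret_cast<const BlockHeader*>(bytes - sizeof(BlockHeader))->userSize;
}

// Frees the block; returns false if the guard bytes were overwritten.
bool guardedFree(void* user) {
    if (user == nullptr) return true;
    auto* bytes = static_cast<unsigned char*>(user);
    const std::size_t size = guardedSize(user);
    bool intact = true;
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (bytes[size + i] != kGuardByte) intact = false;
    }
    std::free(bytes - sizeof(BlockHeader));
    return intact;
}
```

A function handed a pointer from guardedAlloc can compare guardedSize(p) against 8 * sizeof(int) before indexing. This only works for pointers that came directly from guardedAlloc, and the guard check detects an overrun after the fact rather than preventing it.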

That will only help if no array ever uses only part of a memory block... – Deduplicator yesterday

If you are in the habit of allocating chunks of memory then using parts of it in ad hoc ways that lose contact with the original pointer, you need better methodology. – ddyer 18 hours ago

Zaibis · Answer 6 · 2015-08-20 07:25:31Z

You could simply use the sizeof operator, which returns the size of the array in bytes, so you could write a condition like:

int arr[5];
if (sizeof(arr) < sizeof(int) * 8)
{
    // something went wrong
}
else
{
    foo(arr);
}

If you take an array as the parameter instead of a pointer (given that you create the interface), you could perform this check even inside the function.

You would declare the prototype like this:

void foo (int arr[]);

That's simply wrong, arrays are never function parameters. What syntactically looks like an array parameter is just a bog-standard pointer-parameter in disguise. – Deduplicator 16 hours ago

@Deduplicator: I can look it up for you in the evening. I KNOW that at least the standard for plain C works this way, and the array as parameter is used to ensure that array decaying is still possible, which would not be the case for a pointer. This results in the possibility of even using sizeof on the parameter itself. Before downvoting me, why not just write a test program and see for yourself? (Hopefully you are not testing in MSVC, which isn't even plain C.) – Zaibis 16 hours ago

@Deduplicator: On the fly, even the plain C definition of main supports my claim: int main(int argc, char *argv[]) { /* ... */ } which supports exactly what I stated.... So where does your claim come from that I'm wrong?! – Zaibis 15 hours ago

An exactly equivalent definition is: int main(int argc, char **argv) { /* ... */ } – Deduplicator 8 hours ago

Arrays decay into pointers when you use them as function arguments and a few other operations. This code is incorrect and presents bad habits. – Snowman 4 hours ago
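
The point made in these comments can be checked with a short sketch, assuming C++11. The function names parameterIsPointer and elementCount are hypothetical. A parameter written as int arr[] has type int*, so sizeof on it would give the size of a pointer. Only a reference to an array keeps the element count.

```cpp
#include <cstddef>
#include <type_traits>

// The declared parameter type "int arr[]" is adjusted to "int*".
bool parameterIsPointer(int arr[]) {
    (void)arr;  // only the parameter's type is inspected
    return std::is_same<decltype(arr), int*>::value;
}

// A reference to an array does not decay, so N is deduced from the argument.
template <std::size_t N>
std::size_t elementCount(int (&)[N]) {
    return N;
}
```

For int arr[5], parameterIsPointer(arr) returns true and elementCount(arr) returns 5, while sizeof(arr) in the caller's scope is still 5 * sizeof(int).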
