I implemented a basic dynamic array using C++.

When deleting an element and shifting the others left, the last element ends up duplicated (the original value still exists after the copy). I want to avoid leaking memory by deallocating that last slot of the array. Currently, I do this by shrinking once after a number of deletions. Are there better solutions?

Finally, I am interested in some feedback on this implementation.

#include <iostream>
#include <stdexcept>

using namespace std;


class DynamicArray {
protected:
    int size;
    int* arr;
    int length;

    void grow() {
        copyToNewSize(size * 2);
    }

    void shrink() {
        copyToNewSize(size / 2);
    }

    void copyToNewSize(int newSize) {
        int* temp = new int[newSize];
        if (newSize > size)
            for (int i = 0; i < size; i++) {
                temp[i] = arr[i];
            }
        else
            for (int i = 0; i < newSize; i++) {
                temp[i] = arr[i];
            }
        delete[] arr;
        size = newSize;
        arr = temp;
    }


    void checkIndex(int index) {
        if (index >= length || index < 0)
            throw "Index is out of the array range!";
    }

public:
    DynamicArray(int startingSize) {
        size = startingSize;
        arr = new int[size];
        length = 0;
    }

    ~DynamicArray() {
        delete[] arr;
    }

    int push(int value) {
        length++;
        if (length == (size + 1))
            grow();
        arr[length - 1] = value;
        return length;
    }

    int pop() {
        length--;
        int value = arr[length];
        // memory leaking , delete arr[length]
        if (length <= size / 2)
            shrink();
        return value;
    }

    int insert(int index, int value) {
        checkIndex(index);
        length++;
        if (length == (size + 1))
            grow();
        for (int i = length - 1; i > index; i--) {
            arr[i] = arr[i - 1];
        }
        arr[index] = value;
        return length;
    }

    int remove(int index) {
        checkIndex(index);
        int value = arr[index];
        length--;
        for (int i = index; i < length; i++)
            arr[i] = arr[i + 1];
        // memory leaking , delete arr[length]
        if (length <= (size / 2))
            shrink();
        return value;
    }

    int get(int index) {
        checkIndex(index);
        return arr[index];
    }

    void set(int index, int value) {
        checkIndex(index);
        arr[index] = value;
    }

    int search(int value) {
        for (int i = 0; i < length; i++) {
            if (arr[i] == value)
                return i;
        }
        return -1;
    }

    int getLength() {
        return length;
    }
};
Comments:

  • Do you have any specific reason for not using std::vector<int> or one of the other standard containers? – Johan du Toit
  • @JohanduToit Yes, learning to code. – Swedgin

First of all, there is a major problem with your class: it does not respect the rule of three/five/zero:

  • it has a non-trivial destructor that frees memory allocated in the constructor

  • it neither defines nor deletes the copy constructor and the assignment operator

Let us look at the following code:

      DynamicArray dynarr(16);
      // initialize values in dynarr
      if (...) {
          // opens a nested block
          DynamicArray dynarr2 = dynarr;    // copy construct the new DynamicArray (1)
          // use it
      }
      // dynarr2 goes out of scope (2)
    

At (1), the compiler-generated copy constructor will blindly copy the members, leaving dynarr and dynarr2 sharing the same inner int array.

At (2), the destructor for dynarr2 will delete the shared inner array: ZAP! dynarr is left with a dangling pointer, and its own destructor will later delete that array a second time. Undefined behaviour is guaranteed from there.
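
A minimal sketch of the missing copy operations, reusing the member names from the question (growth logic omitted for brevity; the copy assignment uses the copy-and-swap idiom):

```cpp
#include <algorithm>  // std::copy, std::swap

class DynamicArray {
    int size;
    int* arr;
    int length;
public:
    DynamicArray(int startingSize)
        : size(startingSize), arr(new int[startingSize]), length(0) {}

    // Copy constructor: the copy gets its own allocation.
    DynamicArray(const DynamicArray& other)
        : size(other.size), arr(new int[other.size]), length(other.length) {
        std::copy(other.arr, other.arr + other.length, arr);
    }

    // Copy assignment via copy-and-swap: the parameter is taken by value,
    // so the copy constructor does the allocation, and the previous array
    // is released when `other` is destroyed.
    DynamicArray& operator=(DynamicArray other) {
        std::swap(size, other.size);
        std::swap(arr, other.arr);
        std::swap(length, other.length);
        return *this;
    }

    ~DynamicArray() { delete[] arr; }

    int push(int value) { arr[length] = value; return ++length; } // no grow(): sketch only
    int get(int index) const { return arr[index]; }
    void set(int index, int value) { arr[index] = value; }
};
```

With this in place, the example above is safe: dynarr2 destroys its own copy, and dynarr's array survives.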

There is another possible error in size management. You consistently halve size when removing elements and double it when adding. Fine, except that if you let size reach 0 by removing the last elements of a DynamicArray, 0 * 2 is still 0, so the size will be stuck at 0 forever! You must handle that corner case by preventing the size from reaching 0 in shrink.

What follow are only minor possible improvements.

  • Your code uses using namespace std;

    This is not an error, because it is allowed by the language, and it is even common in many tutorials: it is a convenient trick that lets you use all the goodies from the standard library without the std:: prefix. But it imports a large number of unneeded symbols into the current namespace, somewhat defeating the point of namespaces. In production code, best practice is to never write using namespace std and to import only the relevant symbols. This is only a cosmetic improvement.

  • in checkIndex, you use throw "Index is out of the array range!";

    This actually throws a pointer to a string literal. Here again, it is not an error, since the language allows it, but best practice recommends throwing only objects of (subclasses of) std::exception. Users of your class will probably not expect to have to catch a char pointer. You'd better use std::out_of_range or std::runtime_error if you think it is not worth declaring a custom exception class.

  • You compare size and newSize in copyToNewSize to know how many elements should be copied. But in fact you only need to copy length elements, since slots beyond length hold no live data. This one is only a tiny optimization.
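
The exception point above can be sketched like this (a standalone version of checkIndex, taking length as a parameter so it compiles on its own):

```cpp
#include <stdexcept>
#include <string>

// Sketch: throw std::out_of_range instead of a char pointer, with a
// message that says which index failed and what the valid range was.
void checkIndex(int index, int length) {
    if (index >= length || index < 0)
        throw std::out_of_range("index " + std::to_string(index) +
                                " is outside [0, " + std::to_string(length) + ")");
}
```

Callers can then catch `const std::exception&` and call `what()` like for any other standard-library error.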


For your initial question, there is nothing wrong with shrinking only after a number of removals: a shrink involves a reallocation and a copy of the elements, so it is an expensive operation. For that reason, you should even consider not shrinking below a threshold (for example 8 elements). As a bonus, that directly prevents size from reaching 0, and you could even make this minimal size an optional parameter of your constructor.
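
That threshold idea fits in a small helper (MIN_SIZE and shrunkSize are names invented for this sketch, not from the original code):

```cpp
// Sketch: compute the shrunken size, but never let it drop below a floor.
// This also fixes the "stuck at 0" corner case, since the result can
// never reach 0.
const int MIN_SIZE = 8;

int shrunkSize(int size) {
    int newSize = size / 2;
    return (newSize < MIN_SIZE) ? MIN_SIZE : newSize;
}
```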

  1. Consider using size_t instead of int. The actual type of size_t is platform-dependent; a common mistake is to assume size_t is the same as unsigned int, which can lead to problems, particularly as 64-bit architectures become more prevalent.

  2. It is generally bad practice to repeat code that does the same thing. Consider changing copyToNewSize to something like this:

void copyToNewSize(size_t newSize) {   // std::min requires <algorithm>
    int* temp = new int[newSize];
    size_t copySize = std::min(size, newSize);
    for (size_t i = 0; i < copySize; i++) {
       temp[i] = arr[i];
    }
    delete[] arr;
    size = newSize;
    arr = temp;
}
  3. I would suggest that you rename size to allocatedSize to avoid confusion between size and length.
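
Point 1 can be demonstrated directly (narrowingLoses is a name invented for this sketch; on a typical 64-bit platform, size_t is 64 bits while unsigned int is 32):

```cpp
#include <cstddef>

// Returns true when narrowing a size_t to unsigned int loses information,
// i.e. when the "size_t is just unsigned int" assumption breaks.
bool narrowingLoses() {
    std::size_t big = static_cast<std::size_t>(-1);          // SIZE_MAX
    unsigned int truncated = static_cast<unsigned int>(big); // high bits dropped
    return big != truncated;
}
```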

Don't talk like a pirate

I would prefer the name array (another reason not to use using namespace std;) to arr. And I'd prefer a more descriptive name, like data or numbers, to either.

Don't waste work

    int push(int value) {
        length++;
        if (length == (size + 1))
            grow();
        arr[length - 1] = value;
        return length;
    }

You add and subtract three times. Consider

    int push(int value) {
        if (length == size) {
            grow();
        }

        data[length] = value;
        length++;
        return length;
    }

That adds once (the increment).

In general, I would avoid the statement form of control structures (an if or loop body without braces). It can lead to odd, hard-to-find bugs. Worse, you use it inconsistently, which makes the code harder to read.
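
A classic illustration of how a brace-less if goes wrong when a line is added later (popLike and the counters are names invented for this example):

```cpp
// The counters record which calls actually happen.
int shrinkCalls = 0;
int logCalls = 0;
void shrink() { ++shrinkCalls; }
void logShrink() { ++logCalls; }

void popLike(int length, int size) {
    if (length <= size / 2)
        shrink();
        logShrink();  // indentation suggests it is guarded, but it ALWAYS runs
}
```

With braces from the start, adding the logging line could not have changed the control flow silently.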

Fragile code

        if (length <= size / 2)
            shrink();

and

    void shrink() {
        copyToNewSize(size / 2);
    }

Combined, these are fragile: to work properly, size / 2 must stay identical in both places. Consider

        size_t bound = size / 2;
        if (length <= bound) {
            copyToNewSize(bound);
        }

Now it has to be the same in both places. Another alternative

    void shrinkIfNecessary() {
        size_t bound = size / 2;
        if (length <= bound) {
            copyToNewSize(bound);
        }
    }

Then replace the if and call to shrink with shrinkIfNecessary.

In general, you should try to avoid parallel logic. If you need the same logic in two places, try to abstract it out into at least a variable (e.g. bound here) or a function (e.g. shrinkIfNecessary).

It's natural to think that grow should have a shrink. But here, the shrinking operation is triggered differently. They aren't mirror images of each other. You have to grow any time you exceed the bounds. You don't need to shrink every time. So it actually makes sense to have grow and shrinkIfNecessary instead.

You also might consider moving the 2 to a constant, e.g. GROWTH_FACTOR. But that will work fine even if inconsistent between growing and shrinking, so it's less important.

std::copy

        if (newSize > size)
            for (int i = 0; i < size; i++) {
                temp[i] = arr[i];
            }
        else
            for (int i = 0; i < newSize; i++) {
                temp[i] = arr[i];
            }

In this, you copy manually. But you don't need to do that. C++ has std::copy. So you could just say

        std::copy(arr, arr + std::min(size, newSize), temp);

The std::min was already posted here.

Note that std::copy requires #include <algorithm>.

This would get around the entire question of int vs. size_t vs. auto as regards to the loop index variables. No more loops means no more loop indexes.

The C++ way

C++ also has std::vector, which would get around the entire class.

The C way

If you prefer to reinvent the wheel, you might consider whether you want to go all the way and return to C memory management. Why C? Because C has realloc, which can make arrays smaller or larger. When making an array smaller, it will usually reuse the same memory and free just what is at the end. C++ doesn't have that capability (except in that C++ has access to the C routines). Of course, if you do that, you must stop using new and delete in favor of calloc/malloc and free: realloc is only valid on memory obtained from the malloc family, never from new[].

The realloc function also handles the copying for you if necessary. But it won't always be necessary, because realloc will usually reuse the existing array when making it smaller. And it will sometimes use the existing array when making it bigger as well.

I wouldn't be surprised if, under the hood, some std::vector implementations used the C approach or even went straight to assembly code. They wouldn't need to do either, but it would allow for a more optimized experience.
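
A sketch of that C-style resizing (resizeBuffer is a name invented here; note again that realloc only works on memory from malloc/calloc):

```cpp
#include <cstddef>
#include <cstdlib>   // std::malloc, std::realloc, std::free

// Shrinks or grows a malloc'd int buffer, reusing the block in place
// when the allocator can.
int* resizeBuffer(int* buf, std::size_t newCount) {
    void* p = std::realloc(buf, newCount * sizeof(int));
    if (p == nullptr)   // on failure, realloc leaves buf valid and untouched;
        return buf;     // a real implementation would report the error
    return static_cast<int*>(p);
}
```

Elements that fit in the new size are preserved across the call, whether or not the block moved.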

Duplicates

When deleting an element and shifting the others, the last element will be twice repeated (the original still exists after copying).

In one sense, this is harmless: as long as you track the length correctly, the stale duplicate past the end is never read, so you'd never know it is there.

If you are concerned about leaving traces in your program, note that realloc can make an array smaller efficiently enough that you could call it on every pop or remove. However, you'd still likely prefer to resize less frequently, because at least some of the time it would cause the array to be copied.

You might also consider manually clearing the entry. E.g. in remove:

        data[length] = 0;

That's the kind of thing you'd do if the data were sensitive for some reason.
