Sign up ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I am studying C++ and I wrote the following Merge Sort algorithm. I need some comments from skilled C++ programmers to understand where it can be improved, in terms of execution time, memory usage and "style". I am aware this question is rather generic but really, any comment is appreciated, even more if well motivated.

Is it ok to return a new vector by value every time the recursive function is called?

template<typename T>
vector<T> mergeSort(vector<T> const &v, unsigned long s, unsigned long e){
    if (e-s == 0) return vector<T>{}; //return empty vector
    else if (e-s == 1) return vector<T>{v[s]}; //return vector with single element
    else{
        unsigned long mid = (e+s)/2;
        vector<T> v1 = mergeSort(v, s, mid);
        vector<T> v2 = mergeSort(v, mid, e);
        vector<T> vr (e-s); // allocate the space needed

        typename vector<T>::iterator p1 = v1.begin();
        typename vector<T>::iterator p2 = v2.begin();
        typename vector<T>::iterator pr = vr.begin();

        while (p1 != v1.end() && p2 != v2.end()) *pr++ = (*p1 < *p2) ? *p1++ : *p2++;
        while (p1 != v1.end()) *pr++ = *p1++;
        while (p2 != v2.end()) *pr++ = *p2++;

        return vr;
    }
}

int main(int argc, const char * argv[]) {
    vector<int> v {2, 7, -1, 4, 2, 6, -10};    
    vector<int> vsorted = mergeSort(v, 0, v.size());
    return 0;
}
share|improve this question

migrated from stackoverflow.com Feb 4 at 13:44

This question came from our site for professional and enthusiast programmers.

    
Memory Usage and Execution Time are usually exclusive. Using more memory generally speeds up execution (e.g. loop unrolling), while using less memory slows down execution (e.g. data compression, bit twiddling). – Thomas Matthews Feb 4 at 0:14

3 Answers 3

Is it ok to return a new vector by value every time the recursive function is called?

Returning the vector by value is no problem, thanks to the return value optimization.

However, you do have a potential performance problem here:

vector<T> vr (e-s); // allocate the space needed

This means each iteration of your algorithm will allocate memory, as well as deallocate the memory returned from recursive calls. It would be more efficient to avoid most of these allocations if you can.

share|improve this answer
    
In this case NRVO slightly different same affect. – Loki Astari Feb 4 at 22:21

A single most important property of a merge sort is stability: equal elements maintain their relative order. This

    *pr++ = (*p1 < *p2) ? *p1++ : *p2++;

is a very typical mistake. In case *p1 == *p2, the element will be merged from the second array. Stability is lost. Not a problem for integers, of course, but keep in mind. The fix is simple:

    *pr++ = (*p2 < *p1) ? *p2++ : *p1++;
share|improve this answer
1  
Or the first expression with <= – Loki Astari Feb 4 at 22:23

The only thing I would change is the interface.

Most algorithms in C++ use iterators. This abstracts the container type.

template<typename I>
vector<T> mergeSort(I begin, I eend);

In a couple of places you use copies when you can use move.

while (p1 != v1.end() && p2 != v2.end())
{
    *pr++ = std::move((*p1 < *p2) ? *p1++ : *p2++);
         // ^^^^^^^^^ Move here
}

Also your temporary array you create all the objects.

// Allocate all the elements and calls the constructor on
// each T member (even though it could be expensive).
// Note int is not expensive but not everything is that cheap.
vector<T> vr(e-s);

You can use resize to allocate the space.
Then move construct the elements into the vector;

vector<T> vr;
vr.reserve(e-s);

// STUFF

pr.push_back(std::move((*p1 < *p2) ? *p1++ : *p2++));

Also I prefer all sub statements to be wrapped in '{}'. This prevents problems with macros that are desguised as functions.

An attempt

// Merge in place.
template<typename I>
void mergeSort(I begin, I end)
{
    auto const  distance = std::distance(begin, end);

    if (distance <= 1)
    {
        return;
    }

    I midPoint = begin;
    std::advance(begin, distance/2);

    mergeSort(begin, midPoint);
    mergeSort(midPoint, end);

    std::vector<T> tmp;
    tmp.reserve(distance);

    I lower = begin;
    I upper = midPoint;
    while(lower != midPoint && upper != end)
    {
        tmp.push_back(std::move((*lower <= *upper) ? *lower++ : *upper++));
    }

    while(lower != midPoint)
    {
        tmp.push_back(std::move(*lower++));
    }
    while(upper != end)
    {
        tmp.push_back(std::move(*upper++));
    }

    // Put sorted data back in the original.
    std::move(std::begin(tmp), std::end(tmp), begin);
}
share|improve this answer
1  
The interface is missing a comparator argument for orderings besides ascending. The variable distance should be std::iterator_traits<I>::difference_type const (or auto const) as its size is dependent on the target platform. The else after return is unnecessary. The iterator midPoint could be initialized using std::next and should be const-qualified. The underlying value type T is not defined, so you'll need to extract the value type from the iterator. You could also use std::inplace_merge and not have to worry about writing your own merge after the recursive calls. – Snowhawk04 Feb 6 at 1:02
    
@Snowhawk04: All good points. – Loki Astari Feb 6 at 1:17

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.