Is this an efficient way to implement Merge Sort in C++?

Question

I am studying C++ and I wrote the following Merge Sort algorithm. I need some comments from skilled C++ programmers to understand where it can be improved, in terms of execution time, memory usage and "style". I am aware this question is rather generic but really, any comment is appreciated, even more if well motivated.

Is it ok to return a new vector by value every time the recursive function is called?

template<typename T>
vector<T> mergeSort(vector<T> const &v, unsigned long s, unsigned long e){
    if (e-s == 0) return vector<T>{}; //return empty vector
    else if (e-s == 1) return vector<T>{v[s]}; //return vector with single element
    else{
        unsigned long mid = (e+s)/2;
        vector<T> v1 = mergeSort(v, s, mid);
        vector<T> v2 = mergeSort(v, mid, e);
        vector<T> vr (e-s); // allocate the space needed

        typename vector<T>::iterator p1 = v1.begin();
        typename vector<T>::iterator p2 = v2.begin();
        typename vector<T>::iterator pr = vr.begin();

        while (p1 != v1.end() && p2 != v2.end()) *pr++ = (*p1 < *p2) ? *p1++ : *p2++;
        while (p1 != v1.end()) *pr++ = *p1++;
        while (p2 != v2.end()) *pr++ = *p2++;

        return vr;
    }
}

int main(int argc, const char * argv[]) {
    vector<int> v {2, 7, -1, 4, 2, 6, -10};    
    vector<int> vsorted = mergeSort(v, 0, v.size());
    return 0;
}

Memory Usage and Execution Time are usually exclusive. Using more memory generally speeds up execution (e.g. loop unrolling), while using less memory slows down execution (e.g. data compression, bit twiddling). — Thomas Matthews, Feb 4 at 0:14

John Zwinck · Answer 1 · 2015-02-04 00:29:12Z

Is it ok to return a new vector by value every time the recursive function is called?

Returning the vector by value is no problem, thanks to the return value optimization.

However, you do have a potential performance problem here:

vector<T> vr (e-s); // allocate the space needed

This means each iteration of your algorithm will allocate memory, as well as deallocate the memory returned from recursive calls. It would be more efficient to avoid most of these allocations if you can.

In this case NRVO slightly different same affect. – Loki Astari Feb 4 at 22:21 — Loki Astari, Feb 4 at 22:21

vnp · Answer 2 · 2015-02-04 20:38:15Z

up vote 3 down vote

A single most important property of a merge sort is stability: equal elements maintain their relative order. This

    *pr++ = (*p1 < *p2) ? *p1++ : *p2++;

is a very typical mistake. In case *p1 == *p2, the element will be merged from the second array. Stability is lost. Not a problem for integers, of course, but keep in mind. The fix is simple:

    *pr++ = (*p2 < *p1) ? *p2++ : *p1++;

answered Feb 4 at 20:38

vnp
16.7k848

1

Or the first expression with <= – Loki Astari Feb 4 at 22:23

add a comment |

Loki Astari · Answer 3 · 2015-02-06 01:17:15Z

The only thing I would change is the interface.

Most algorithms in C++ use iterators. This abstracts the container type.

template<typename I>
vector<T> mergeSort(I begin, I eend);

In a couple of places you use copies when you can use move.

while (p1 != v1.end() && p2 != v2.end())
{
    *pr++ = std::move((*p1 < *p2) ? *p1++ : *p2++);
         // ^^^^^^^^^ Move here
}

Also your temporary array you create all the objects.

// Allocate all the elements and calls the constructor on
// each T member (even though it could be expensive).
// Note int is not expensive but not everything is that cheap.
vector<T> vr(e-s);

You can use resize to allocate the space.
Then move construct the elements into the vector;

vector<T> vr;
vr.reserve(e-s);

// STUFF

pr.push_back(std::move((*p1 < *p2) ? *p1++ : *p2++));

Also I prefer all sub statements to be wrapped in '{}'. This prevents problems with macros that are desguised as functions.

An attempt

// Merge in place.
template<typename I>
void mergeSort(I begin, I end)
{
    auto const  distance = std::distance(begin, end);

    if (distance <= 1)
    {
        return;
    }

    I midPoint = begin;
    std::advance(begin, distance/2);

    mergeSort(begin, midPoint);
    mergeSort(midPoint, end);

    std::vector<T> tmp;
    tmp.reserve(distance);

    I lower = begin;
    I upper = midPoint;
    while(lower != midPoint && upper != end)
    {
        tmp.push_back(std::move((*lower <= *upper) ? *lower++ : *upper++));
    }

    while(lower != midPoint)
    {
        tmp.push_back(std::move(*lower++));
    }
    while(upper != end)
    {
        tmp.push_back(std::move(*upper++));
    }

    // Put sorted data back in the original.
    std::move(std::begin(tmp), std::end(tmp), begin);
}

The interface is missing a comparator argument for orderings besides ascending. The variable distance should be std::iterator_traits<I>::difference_type const (or auto const) as its size is dependent on the target platform. The else after return is unnecessary. The iterator midPoint could be initialized using std::next and should be const-qualified. The underlying value type T is not defined, so you'll need to extract the value type from the iterator. You could also use std::inplace_merge and not have to worry about writing your own merge after the recursive calls. — Snowhawk04, Feb 6 at 1:02

asked	8 months ago
viewed	156 times
active	8 months ago

current community

your communities

more stack exchange communities

Is this an efficient way to implement Merge Sort in C++?

migrated from stackoverflow.com Feb 4 at 13:44

3 Answers 3

Your Answer

Hot Network Questions

current community

your communities

more stack exchange communities

Is this an efficient way to implement Merge Sort in C++?

migrated from stackoverflow.com Feb 4 at 13:44

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Related

Hot Network Questions