Code Review Stack Exchange is a question and answer site for peer programmer code reviews.

I'm looking for comments on the design, correctness and performance (not so much style) of a dynamic work queue for OpenMP worker threads.

I have an algorithm that can be thought of in terms of some number of work items (how many is not known beforehand), each of which is executed and may then spawn two child work items. I start with a root work item and continue until there are no more work items, then terminate. Work items that are not in an ancestor-descendant relationship are independent (and can therefore be run in parallel).

I want to use OpenMP to parallelize the work by having threads pop work from a queue. The difficult bits are:

  • waiting for work to become available in an efficient way
  • terminating once there are no work items left and no new ones can be created, but not before

I'm most concerned about the correctness of the termination condition and of my use of the OpenMP functions.

In the version below, the work items are dummies, but the idea is that you could drop in arbitrary work (as long as mutually non-ancestral work is independent).

#include <mutex>
#include <condition_variable>
#include <stack>

#include <omp.h>

// Dummy nlogn algorithm work item
class Work {
public:
    Work() : n() {}
    Work(int n) : n(n) {}

    // Returns true iff child work created
    bool go(Work& childA, Work& childB) {
        if (n > 1) {
            childA = Work(n - 1);
            childB = Work(n - 1);
            return true;
        }
        return false;
    }

    int n;
};

int main() {
    Work rootWorkItem(20);

    int numWaiting = 0;
    std::mutex m;
    std::condition_variable cv;
    std::stack<Work> workStack;

    workStack.push(rootWorkItem);

#pragma omp parallel
    while (true) {
        std::unique_lock<std::mutex> lock(m);

        if (workStack.empty()) {
            // Register as waiting for work
            numWaiting++;

            // If everybody is now waiting, it means that no one has more work to push
            if (numWaiting >= omp_get_num_threads()) {
                cv.notify_all();
                break;
            }

            // Wait for one of two things:
            //  a) more work has been pushed to the stack
            //  b) everything's done
            while (true) {
                cv.wait(lock);
                if ((numWaiting >= omp_get_num_threads()) || !workStack.empty()) break;
            }

            // If it was b), exit
            if (numWaiting >= omp_get_num_threads()) break;

            // Otherwise, tell everybody that we're no longer waiting
            numWaiting--;
        }

        // Pop a work item to start executing
        Work work = workStack.top();
        workStack.pop();

        lock.unlock();

        // Execute work and, if it spawned children, push them
        Work childA, childB;
        if (work.go(childA, childB)) {
            // Push child work items and tell somebody to get working on them
            lock.lock();
            workStack.push(childA);
            workStack.push(childB);
            cv.notify_one();
            lock.unlock();
        }
    }

    return 0;
}
How attached are you to using OMP? – Dannnno Feb 6 at 1:19
@Dannnno Is there a reason not to? Manually maintaining a thread pool is a lot of extra work. – alcedine Feb 6 at 7:32
Look at Threading Building Blocks (TBB), or, if you're wedded to OpenMP, use OpenMP tasks, which can do exactly what you want without you having to create your own implementation. – Jim Cownie Feb 8 at 15:44
