Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

Possible Duplicate:
Optimizing function that searches for binary patterns

I have two functions that calculate whether or not a given coordinate (x, y) is a diagonal or anti-diagonal of another coordinate that is a power of 2. In other words if (x+/-k, y+/-k) == (cx, cy) where cx or cy is a power of 2, then the binary representation of (x+/-k, y+/-k) follows known patterns.

  • In case of major diagonals (\), the binary representation could contain any number of 1s any number of consecutive 1s (could be none) and has to end with at least one 0.
  • In case of minor diagonals (/), the binary representation of the product contains at most two 1s (in any order, consecutive or not).

These functions are called on very large numbers that have around 5,000,000 bits and have become the most expensive call path. They end up taking 60% of the algorithm time and desperately need to be optimized.

Here are the functions.

/// <summary>
/// A look up array to match binary patterns for the numbers 0 to 255 inclusive.
/// Checks for the numbers: 000, 001, 003, 007, 015, 031, 063, 127, 255.
/// </summary>
public static readonly bool [] LookupArrayDiagonalMajorMoreOnesPossible = new bool [256];

/// <summary>
/// A look up array to match binary patterns for the numbers 0 to 255 inclusive.
/// Checks for the numbers: 000, 001, 003, 007, 015, 031, 063, 127, 255.
/// </summary>
public static readonly bool [] LookupArrayDiagonalMajorIsValidByteBeforeEncounter = new bool [256];

/// <summary>
/// A look up array to match binary patterns for the numbers 0 to 255 inclusive.
/// Checks for the numbers: 000, 001, 003, 007, 015, 031, 063, 127, 255.
/// </summary>
public static readonly bool [] LookupArrayDiagonalMajorIsValidByteAfterEncounter = new bool [256];

/// <summary>
/// A look up array of bit counts for the numbers 0 to 255 inclusive.
/// Declared static for performance.
/// </summary>
public static readonly byte [] LookupArrayBitCount = new byte []
{
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
};

/// <summary>
/// Static constructor.
/// </summary>
static Global ()
{
    LookupArrayDiagonalMajorMoreOnesPossible = new bool [256];
    for (int i=0; i < LookupArrayDiagonalMajorMoreOnesPossible.Length; i++)
        LookupArrayDiagonalMajorMoreOnesPossible [i] = false;
    LookupArrayDiagonalMajorMoreOnesPossible [000 /* 00000000 */] =
    LookupArrayDiagonalMajorMoreOnesPossible [001 /* 00000001 */] =
    LookupArrayDiagonalMajorMoreOnesPossible [003 /* 00000011 */] =
    LookupArrayDiagonalMajorMoreOnesPossible [007 /* 00000111 */] =
    LookupArrayDiagonalMajorMoreOnesPossible [015 /* 00001111 */] =
    LookupArrayDiagonalMajorMoreOnesPossible [031 /* 00011111 */] =
    LookupArrayDiagonalMajorMoreOnesPossible [063 /* 00111111 */] =
    LookupArrayDiagonalMajorMoreOnesPossible [127 /* 01111111 */] =
    LookupArrayDiagonalMajorMoreOnesPossible [255 /* 11111111 */] =
    true;

    LookupArrayDiagonalMajorIsValidByteBeforeEncounter = new bool [256];
    for (int i=0; i < LookupArrayDiagonalMajorIsValidByteBeforeEncounter.Length; i++)
        LookupArrayDiagonalMajorIsValidByteBeforeEncounter [i] = false;
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [000 /* 00000000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [001 /* 00000001 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [002 /* 00000010 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [003 /* 00000011 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [004 /* 00000100 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [006 /* 00000110 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [007 /* 00000111 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [008 /* 00001000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [012 /* 00001100 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [014 /* 00001110 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [015 /* 00001111 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [016 /* 00010000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [024 /* 00011000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [028 /* 00011100 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [030 /* 00011110 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [031 /* 00011111 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [032 /* 00100000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [048 /* 00110000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [056 /* 00111000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [060 /* 00111100 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [062 /* 00111110 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [063 /* 00111111 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [064 /* 01000000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [096 /* 01100000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [112 /* 01110000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [120 /* 01111000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [124 /* 01111100 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [126 /* 01111110 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [127 /* 01111111 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [128 /* 10000000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [192 /* 11000000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [224 /* 11100000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [240 /* 11110000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [248 /* 11111000 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [252 /* 11111100 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [254 /* 11111110 */] =
    LookupArrayDiagonalMajorIsValidByteBeforeEncounter [255 /* 11111111 */] =
    true;

    LookupArrayDiagonalMajorIsValidByteAfterEncounter = new bool [256];
    for (int i=0; i < LookupArrayDiagonalMajorIsValidByteAfterEncounter.Length; i++)
        LookupArrayDiagonalMajorIsValidByteAfterEncounter [i] = false;
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [000 /* 00000000 */] =
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [128 /* 10000000 */] =
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [192 /* 11000000 */] =
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [224 /* 11100000 */] =
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [240 /* 11110000 */] =
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [248 /* 11111000 */] =
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [252 /* 11111100 */] =
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [254 /* 11111110 */] =
    LookupArrayDiagonalMajorIsValidByteAfterEncounter [255 /* 11111111 */] =
    true;
}

/// <summary>
/// Checks to see if this cell lies on a major diagonal of a power of 2.
/// The binary representation could contain any number of consecutive 1s (could be none) and has to end with at least one 0.
/// ^[0]*[1]*[0]+$ denotes the regular expression of the binary pattern we are looking for.
/// </summary>
/// <param name="bytes">The expected number should be [Absolute(cell.X - cell.Y)].ToByteArray().</param>
/// <returns>[True] if [cell] lies on a power of 2 major diagonal. [False] otherwise.</returns>
public static bool IsDiagonalMajorToPowerOfTwo (byte [] bytes)
{
    bool oneEncountered = false;
    bool moreOnesPossible = true;

    // The last bit should always be 0.
    if ((((int) bytes [0]) & 1) == 1)
        return (false);

    for (int i=(bytes.Length - 1); i >= 0; i--)
    {
        if (oneEncountered)
        {
            if (moreOnesPossible)
            {
                if (LookupArrayDiagonalMajorIsValidByteAfterEncounter [bytes [i]])
                {
                    if (bytes [i] == 0)
                    {
                        moreOnesPossible = false;
                    }
                    else
                    {
                        moreOnesPossible = LookupArrayDiagonalMajorMoreOnesPossible [bytes [i]];
                    }
                }
                else
                {
                    return (false);
                }
            }
            else
            {
                if (bytes [i] > 0)
                {
                    return (false);
                }
            }
        }
        else
        {
            if (LookupArrayDiagonalMajorIsValidByteBeforeEncounter [bytes [i]])
            {
                moreOnesPossible = LookupArrayDiagonalMajorMoreOnesPossible [bytes [i]];
            }
            else
            {
                return (false);
            }

            oneEncountered = (bytes [i] > 0);
        }
    }

    return (true);
}

/// <summary>
/// Checks to see if this cell lies on a minor diagonal of a power of 2.
/// The condition is met if the binary representation of the product contains
/// at most two 1s.
/// </summary>
public static bool IsDiagonalMinorToPowerOfTwoParallel (System.Numerics.BigInteger number)
{
    int sum = 0;
    byte [] bytes = null;

    // 00000000 && 00000001
    if ((number == 0) || (number == 1))
        return (true);

    // Contains only one 1.
    if (number.IsPowerOfTwo)
        return (false);

    bytes = number.ToByteArray();

    System.Threading.Tasks.Parallel.For<int>
    (
        0,
        bytes.Length,
        options, // MaxDegreeOfParallelism = Environment.ProcessorCount.
        () => 0, // Initial value for [sumLocal].
        (int i, System.Threading.Tasks.ParallelLoopState state, int sumLocal) =>
        {
            sumLocal += LookupArrayBitCount [bytes [i]];

            if (sumLocal > 2)
            {
                state.Break();
            }

            return (sumLocal);
        },
        (int sumLocal) =>
        {
            System.Threading.Interlocked.Add(ref sum, sumLocal);
        }
    );

    return (sum <= 2);
}

I've done all I can to optimize these functions and am now at a loss for ideas. By the way, I've also cached the byte [] for the BigInteger which is not reflected here.

NOTE: I'm not even sure if these binary patterns can be detected using some other more efficient algorithm. For those interested in context, I wrote these functions based on joriki's answer here.

share|improve this question

marked as duplicate by codesparkle, James Khoury, Brian Reichle, Corbin, Jeff Vanzella Aug 30 '12 at 22:55

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

3 Answers 3

I suggest you exploit the possibilities for parallel processing.

First of all, your methods for the major and minor diagonals can execute simultaneously. Use a cancellation token to bail out of IsDiagonalMinorToPowerOfTwo if IsDiagonalMajorToPowerOfTwo has already returned true and vice versa.

Second, in IsDiagonalMinorToPowerOfTwo you could divide the bytes array into good-sized chunks (you'll have to experiment to find the optimum size) and process the chunks in parallel, using another cancellation token to bail out of all the parallel tasks if sum becomes greater than 2. The chunks should not be new copies of portions of the array; that would be inefficient. Instead, designate your chunks with starting and ending indexes in the array you already have. Also, you'll have to be careful to increment sum in a thread-safe way. Interlocked.Add will do the trick.

Third, in IsDiagonalMajorToPowerOfTwo if you find that you're spending a lot of your time verifying that the remaining digits are zeroes, you could make that check in parallel chunks.

If you want help setting up the parallel tasks, let me know.

BTW, do you have a bug in the IsDiagonalMajorToPowerOfTwo? It seems to me that a number whose binary representation is, for example, 0000001100000011 will return true. I think you only want to make that test for 00000001 through 11111111 on the first byte. Is that right?

share|improve this answer
    
Larry, first of all, thank you for the thorough review, a brilliant suggestion and pointing out the potential bug. There was an error in my question text which I have edited with a strikeout in case that misled you. I will parallelize the Minor function, implement a look-up array for Major and update the code shortly. I would appreciate your guidance on parallelizing the Major function once I post the update. –  Raheel Khan Aug 21 '12 at 20:04
    
The function is now completely error free. I'd appreciate your help with parallelizing the IsDiagonalMajorToPowerOfTwo function. Even a verbal concept will do. I cant seem to visualize how to approach this one. –  Raheel Khan Aug 24 '12 at 4:36
    
I'm glad I've been able to help a little. I will give you specific ideas on IsDiagonalMajorToPowerOfTwo by tomorrow morning. Sorry it has taken me so long to get back to you! –  Larry Spencer Aug 29 '12 at 17:06
    
Currently, when you're verifying that all the remaining bytes are 0, you loop around to test if (bytes[i] > 0) { return (false); } once for each byte after oneEncountered has been set to true and moreOnesPossible has been set to false. You could replace that byte-by-byte looping with parallel processing on the remaining bytes once those conditions have been met. Just replace the if test with a parallel statement and then break. Does that give enough of an idea? –  Larry Spencer Aug 29 '12 at 17:36
    
Thank you. I do understand what you mean. However, this function gets called with numbers that are more likely to return false. In other words, out of 5,000,000 bits, it is likely that the (oneEncountered=true and moreOnesPossible=false) condition would be hit quite late. We will ALWAYS have enough cores so maximum parellization is beneficial rather than detrimental. But parallel processing the entire array is what I fail to conceptualize. IsDiagonalMinorToPowerOfTwoParallel was simple but IsDiagonalMajorToPowerOfTwoParallel has too many conditions. –  Raheel Khan Aug 30 '12 at 2:46

A few comments:

  1. I suggest you provide code that compiles. Namely, provide the class' declaration (Global. right?)
  2. To demonstrate the main idea, I suggest you provide a few examples for input and expected output. Even better - provide unit tests so that after refactoring is done we all know that the functionality still works as expected.
  3. Nesting: the IsDiagonalMajorToPowerOfTwo method has too much nested blocks, and the flow is not very clear.

Now, for performance, perhaps you could change:

System.Threading.Interlocked.Add(ref sum, sumLocal);

to:

if (System.Threading.Interlocked.Add(ref sum, sumLocal) > 2) return false;

So you don't have to wait for all (other) tasks to be completed once you know that you have sum > 2.

Update

After another review, I think that you should partition the byte array so that each task takes more than a single byte (say, 1000 bytes). IMO it should significantly improve performance (less context switches).

A brief example to elaborate that:

const int segmentSize = 1000;
bytes = number.ToByteArray();
List<ArraySegment<byte>> segments = new List<ArraySegment<byte>>();
var segmentsCount = bytes.Length / segmentSize;
for (var k = 0; k < segmentsCount; ++k)
{
    var offset = k * segmentSize;
    int count;
    if (k < segmentsCount - 1)
        count = segmentSize;
    else // k is (segmentsCount - 1)
        count = bytes.Length - ( k * (segmentsCount - 1) );
    ArraySegment<byte> segment = new ArraySegment<byte>(bytes, offset, count);
    segments.Add(segment);
}

Now do the Parallel.For on each segment, and inside each segment, do a simple (not parallel) loop. Note that I didn't compile the code, just wrote it in a text editor.

share|improve this answer
    
Thank you Ron. Your suggestion for early breaking definitely helped. I'm not sure I understand your update though. How do you propose I partition? Do you mean performing Parallel.For on a portion of the byte array only and then move on to the next? Wouldn't that increase threading overhead? –  Raheel Khan Aug 30 '12 at 2:19

In your comments, you said IsDiagonalMajorToPowerOfTwoParallel will probably encounter its 1's late in the byte array. In that case, there's probably not much to gain by parallelizing the few remaining bytes of 0's. Maybe IsDiagonalMajorToPowerOfTwoParallel is not amenable to parallelizing.

Also, I agree with Ron Klein: You should partition the array so each task processes more than one byte. Otherwise, you're unlikely to see any performance gain.

share|improve this answer

Not the answer you're looking for? Browse other questions tagged or ask your own question.