How to speed up image processing algorithm

Question

I have written the next algorithm (for Android/NDK) to apply levels to a bitmap. The problem is that is really very slow, on a fast device such as the SGSIII can take up to 4 seconds for a 8MP image. And on devices with ARMv6 takes ages (over 10 seconds). I know that can be speed up with NEON instructions, but I don't have such knowledge. Is there any way to optimize it?

void applyLevels(unsigned int *rgb, const unsigned int width, const unsigned int height, const float exposure, const float brightness, const float contrast, const float saturation)
{
    float R, G, B;

    unsigned int pixelIndex = 0;

    float exposureFactor   = powf(2.0f, exposure);
    float brightnessFactor = brightness / 10.0f;
    float contrastFactor   = contrast > 0.0f ? contrast : 0.0f;
    float saturationFactor = 1.0f - saturation;

    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            const int pixelValue = buffer[pixelIndex];

            R = ((pixelValue & 0xff0000) >> 16) / 255.0f;
            G = ((pixelValue & 0xff00) >> 8) / 255.0f;
            B = (pixelValue & 0xff) / 255.0f;

            // Clamp values

            R = R > 1.0f ? 1.0f : R < 0.0f ? 0.0f : R;
            G = G > 1.0f ? 1.0f : G < 0.0f ? 0.0f : G;
            B = B > 1.0f ? 1.0f : B < 0.0f ? 0.0f : B;

            // Exposure

            R *= exposureFactor;
            G *= exposureFactor;
            B *= exposureFactor;

            // Contrast

            R = (((R - 0.5f) * contrastFactor) + 0.5f);
            G = (((G - 0.5f) * contrastFactor) + 0.5f);
            B = (((B - 0.5f) * contrastFactor) + 0.5f);

            // Saturation

            float gray = (R * 0.3f) + (G * 0.59f) + (B * 0.11f);
            R = (gray * saturationFactor) + R * saturation;
            G = (gray * saturationFactor) + G * saturation;
            B = (gray * saturationFactor) + B * saturation;

            // Brightness

            R += brightnessFactor;
            G += brightnessFactor;
            B += brightnessFactor;

            // Clamp values

            R = R > 1.0f ? 1.0f : R < 0.0f ? 0.0f : R;
            G = G > 1.0f ? 1.0f : G < 0.0f ? 0.0f : G;
            B = B > 1.0f ? 1.0f : B < 0.0f ? 0.0f : B;

            // Store new pixel value

            R *= 255.0f;
            G *= 255.0f;
            B *= 255.0f;

            buffer[pixelIndex] = ((int)R << 16) | ((int)G << 8) | (int)B;

            pixelIndex++;
        }
    }
}

Josay · Answer 1 · 2013-07-18 20:08:50Z

Depending on your compiler, these advices might or might not have an effect. As a rule of thumb, let's try to give the compiler and the reader as many hints as possible and let's make things as simple as it can be.

unsigned int pixelIndex = 0;
for (int y = 0; y < height; y++)
{
    for (int x = 0; x < width; x++)
    {
        some_function(pixelIndex);
        pixelIndex++;
    }
}

could be written with a single-loop, thus saving some comparisons and the update of 3 different variables :

const unsigned int limit = height*width; // your compiler should notice that this is const and does not need being re-evaluated at each iteration but let's give him a bigger hint.
for (unsigned int pixelIndex = 0; pixelIndex < limit; pixelIndex++)
{
    some_function(pixelIndex);
}

You could define exposureFactor, brightnessFactor, contrastFactor and saturationFactor, gray as constants.

You could declare and define R, G, B inside the loop to limit their scope.

You could compute gray * saturationFactor only once if you were storing the result.

You could also try to limit the number of accesses to R,G,B by packing the evaluation. The drawback is that you might lose readibility by doing do.

At the end, my version of the code is

void applyLevels(unsigned int *rgb, const unsigned int width, const unsigned int height, const float exposure, const float brightness, const float contrast, const float saturation)
{
    const float exposureFactor   = powf(2.0f, exposure);
    const float brightnessFactor = brightness / 10.0f;
    const float contrastFactor   = contrast > 0.0f ? contrast : 0.0f;
    const float saturationFactor = 1.0f - saturation;

    const unsigned int limit = height*width;
    for (unsigned int pixelIndex = 0; pixelIndex < limit; pixelIndex++)
    {
        const int pixelValue = buffer[pixelIndex];

        float R = ((pixelValue & 0xff0000) >> 16) / 255.0f;
        float G = ((pixelValue & 0xff00) >> 8) / 255.0f;
        float B = (pixelValue & 0xff) / 255.0f;

        // Clamp values + exposure factor + contrast
        R = (((R > 1.0f ? exposureFactor : R < 0.0f ? 0.0f : exposureFactor*R) - 0.5f) * contrastFactor) + 0.5f;
        G = (((G > 1.0f ? exposureFactor : G < 0.0f ? 0.0f : exposureFactor*G) - 0.5f) * contrastFactor) + 0.5f;
        B = (((B > 1.0f ? exposureFactor : B < 0.0f ? 0.0f : exposureFactor*B) - 0.5f) * contrastFactor) + 0.5f;

        // Saturation + Brightness
        const float graySaturFactorPlusBrightness = brightnessFactor + ((R * 0.3f) + (G * 0.59f) + (B * 0.11f)) * saturationFactor;
        R = graySaturFactorPlusBrightness + R * saturation;
        G = graySaturFactorPlusBrightness + G * saturation;
        B = graySaturFactorPlusBrightness + B * saturation;

        // Clamp values
        R = R > 1.0f ? 255.0f : R < 0.0f ? 0.0f : 255.0f*R;
        G = G > 1.0f ? 255.0f : G < 0.0f ? 0.0f : 255.0f*G;
        B = B > 1.0f ? 255.0f : B < 0.0f ? 0.0f : 255.0f*B;

        buffer[pixelIndex] = ((int)R << 16) | ((int)G << 8) | (int)B;
    }
}

I am convinced it is slightly better than yours from a performance point of view but I'm not sure you'll be able to notice any actual difference in the run time.

asked	5 months ago
viewed	58 times
active	5 months ago

Explore our sites

How to speed up image processing algorithm

1 Answer

Your Answer

Not the answer you're looking for? Browse other questions tagged optimization algorithm performance or ask your own question.

Hot Network Questions

Explore our sites

How to speed up image processing algorithm

1 Answer

Your Answer

Sign up or login

Post as a guest

Not the answer you're looking for? Browse other questions tagged optimization algorithm performance or ask your own question.

Related

Hot Network Questions