I've been trying to optimize this piece of code:
void detect_optimized(int width, int height, int threshold)
{
    int x, y, z;
    int tmp;
    for (y = 1; y < width-1; y++)
        for (x = 1; x < height-1; x++)
            for (z = 0; z < 3; z++)
            {
                tmp = mask_product(mask,a,x,y,z);
                if (tmp>255)
                    tmp = 255;
                if (tmp<threshold)
                    tmp = 0;
                c[x][y][z] = 255-tmp;
            }
    return;
}
So far I've tried "blocking" and a few other things, but I can't seem to get it to run any faster.
Blocking resulted in:
for(yy = 1; yy<height-1; yy+=4){
    for(xx = 1; xx<width -1; xx+=4){
        for (y = yy; y < 4+yy; y++){
            for (x = xx; x < 4+xx; x++){
                for (z = 0; z < 3; z++)
                {
                    tmp = mask_product(mask,a,x,y,z);
                    if (tmp>255)
                        tmp = 255;
                    if (tmp<threshold)
                        tmp = 0;
                    c[x][y][z] = 255-tmp;
                }
            }
        }
    }
}
This ran at the same speed as the original program.
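For reference, here is a minimal sketch of how the blocking attempt could be made bounds-safe. As posted, the 4x4 tile can run past the image edge whenever the interior size is not a multiple of 4, and the blocked loops pair yy with height and xx with width, the opposite of the first version. The sketch clamps each tile to the interior and keeps y as the innermost pixel loop, since c[x][y][z] is contiguous in y in C's row-major layout. The values of MAX_ROW and MAX_COL, the byte typedef, the global arrays, and the names clamp_pixel, detect_reference and detect_blocked are all assumptions made so that the example compiles on its own. The mask_product here is a compact stand-in with the same arithmetic as the one in the question. The sketch only shows that the results match, not that it runs faster.

```c
#include <assert.h>

/* Assumed definitions: the question does not show these, so the sizes
   and types below are placeholders for illustration only. */
#define MAX_ROW 64
#define MAX_COL 64
#define NUM_COLORS 3
#define BLOCK 4 /* tile edge, matching the 4 used in the question */

typedef unsigned char byte;

int mask[3][3];
byte a[MAX_ROW][MAX_COL][NUM_COLORS];
byte c[MAX_ROW][MAX_COL][NUM_COLORS];

/* Compact stand-in for mask_product: sums m[i][j] * bitmap[x-1+i][y-1+j][z],
   which is the same set of nine products as in the question. */
int mask_product(int m[3][3], byte bitmap[MAX_ROW][MAX_COL][NUM_COLORS],
                 int x, int y, int z)
{
    int sum = 0;
    for (int j = 0; j < 3; j++)
        for (int i = 0; i < 3; i++)
            sum += m[i][j] * bitmap[x - 1 + i][y - 1 + j][z];
    return sum;
}

/* Hypothetical helper: the saturate, threshold and invert steps of the loop body. */
byte clamp_pixel(int v, int threshold)
{
    if (v > 255)
        v = 255;
    if (v < threshold)
        v = 0;
    return (byte)(255 - v);
}

/* Loop nest in the same order as the first version, kept as a reference. */
void detect_reference(int width, int height, int threshold)
{
    for (int y = 1; y < width - 1; y++)
        for (int x = 1; x < height - 1; x++)
            for (int z = 0; z < 3; z++)
                c[x][y][z] = clamp_pixel(mask_product(mask, a, x, y, z), threshold);
}

/* Blocked variant: x (rows) is bounded by height and y (columns) by width,
   as in the first version, and every tile is clamped to the interior. */
void detect_blocked(int width, int height, int threshold)
{
    assert(height <= MAX_ROW && width <= MAX_COL);
    for (int xx = 1; xx < height - 1; xx += BLOCK) {
        int x_end = (xx + BLOCK < height - 1) ? xx + BLOCK : height - 1;
        for (int yy = 1; yy < width - 1; yy += BLOCK) {
            int y_end = (yy + BLOCK < width - 1) ? yy + BLOCK : width - 1;
            for (int x = xx; x < x_end; x++)
                for (int y = yy; y < y_end; y++) /* y innermost: contiguous in memory */
                    for (int z = 0; z < 3; z++)
                        c[x][y][z] = clamp_pixel(mask_product(mask, a, x, y, z), threshold);
        }
    }
}
```

Calling detect_reference and detect_blocked on the same input and comparing c afterwards is a quick way to confirm that the tiling did not change the output before timing anything.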
Any suggestions would be great.
mask_product cannot be changed, but here is its code:
int mask_product(int m[3][3], byte bitmap[MAX_ROW][MAX_COL][NUM_COLORS], int x, int y, int z)
{
    int tmp[9];
    int i, sum;
    // ADDED THIS LINE (sum = 0) TO FIX THE BUG
    sum = 0;
    tmp[0] = m[0][0]*bitmap[x-1][y-1][z];
    tmp[1] = m[1][0]*bitmap[x][y-1][z];
    tmp[2] = m[2][0]*bitmap[x+1][y-1][z];
    tmp[3] = m[0][1]*bitmap[x-1][y][z];
    tmp[4] = m[1][1]*bitmap[x][y][z];
    tmp[5] = m[2][1]*bitmap[x+1][y][z];
    tmp[6] = m[0][2]*bitmap[x-1][y+1][z];
    tmp[7] = m[1][2]*bitmap[x][y+1][z];
    tmp[8] = m[2][2]*bitmap[x+1][y+1][z];
    for (i=0; i<9; i++)
        sum = sum + tmp[i];
    return sum;
}
The indentation of tmp = 255; indicates that you want that if to be wrapped into the other one's body. Is that correct? If so, then you are missing braces. If not, then the indentation should be fixed. – Nobody yesterday
Is mask_product some standard function that you may not change? If not, then please provide its code as well. – Nobody yesterday