I haven't tested this in any way whatsoever, but you could use a 32-bit integer as a buffer to store the bytes as you process them. Then, any multiples of 8 bytes that end up in the buffer can be passed on to wherever they are needed.
The first byte that comes in you process and push all 12 bits into the 32-bit integer - increment a counter by 12. You now have 12 bits in the buffer variable.
Then while the number of bits in the buffer is at least 8 bits you grab the upper-most 8 bits and pass them on to wherever they are going. Decrement the counter by 8 for each iteration. You have now sent 1 byte and you have 4 bits left in the buffer.
The next byte that comes in you push the new 12 bits into the buffer. The counter, increased by 12, is now 16.
16 is greater than or equal to 8, so send out the upper 8 bits of the 16 and decrement the counter by 8.
8 is greater than or equal to 8, so send out the upper 8 bits of the 8 and decrement the counter by 8.
Counter is now 0, so nothing else to send. Back to the beginning again.
Something like:
void processByte(uint8_t in) {
static uint32_t buffer = 0;
static uint8_t validBits = 0;
buffer <<= 12; // Make room for 12 new bits.
buffer |= (first processed bit << 11);
buffer |= (second processed bit << 10);
buffer |= ((in & 0x80) << 2);
buffer |= (third processed bit << 7);
buffer |= ((in & 0x70) << 1);
buffer |= (fourth processed bit << 4);
buffer |= (in & 0x0F);
validBits += 12;
while (validBits >= 8)
validBits -= 8;
executeByte((buffer >> validBits) & 0xFF);
}
}