I have written an implementation of the memcpy function, and I want to know how good it is and whether it gets the most out of the processor's capabilities.
The reason is that I am writing my own operating system (parts of which will, of course, be reviewed here on Code Review), and it is supposed to work efficiently on both x86_64 and x86 machines.
#include <stddef.h>
#include <stdint.h>

void *memcpy(void *dest, const void *src, size_t n)
{
    if (n == 0)
        return dest;
#if defined(__x86_64__) || defined(_M_X64)
    size_t i = 0;
    if (n >= 8)
        while (i < n / 8 * 8) {
            *(uint64_t *)(dest + i) = *(uint64_t *)(src + i);
            i += 8;
        }
    if (n - i >= 4) {
        *(uint32_t *)(dest + i) = *(uint32_t *)(src + i);
        i += 4;
    }
    if (n - i >= 2) {
        *(uint16_t *)(dest + i) = *(uint16_t *)(src + i);
        i += 2;
    }
    if (n - i >= 1)
        *(uint8_t *)(dest + i) = *(uint8_t *)(src + i);
#elif defined(__i386) || defined(_M_IX86)
    size_t i = 0;
    if (n >= 4)
        while (i < n / 4 * 4) {
            *(uint32_t *)(dest + i) = *(uint32_t *)(src + i);
            i += 4;
        }
    if (n - i >= 2) {
        *(uint16_t *)(dest + i) = *(uint16_t *)(src + i);
        i += 2;
    }
    if (n - i >= 1)
        *(uint8_t *)(dest + i) = *(uint8_t *)(src + i);
#endif
    return dest;
}
I tested it aggressively, using memcmp() to check that the data is copied correctly and valgrind to check for memory leaks, and it passed all the tests. I haven't posted the testing code because I don't want it to be reviewed.
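A check of that kind can be sketched roughly as follows. This is only an illustration, not the actual test code: the names my_memcpy, copy_fn and count_copy_failures are hypothetical, and my_memcpy is a byte-wise stand-in that would be replaced by the implementation under test. The sketch copies every length from 0 to 64 at every combination of small source and destination offsets, then uses memcmp() to compare the whole destination buffer, guard bytes included, against an expected image.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the function under test: a plain byte-wise copy.
   Swap in the real implementation (under a non-clashing name) to test it. */
static void *my_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dest;
}

typedef void *(*copy_fn)(void *, const void *, size_t);

enum {
    MAX_LEN = 64,  /* longest copy exercised */
    MAX_OFF = 8,   /* offsets 0..7 cover every alignment of an 8-byte word */
    GUARD   = 8,   /* sentinel bytes before and after the copied region */
    BUF     = GUARD + MAX_OFF + MAX_LEN + GUARD
};

/* Returns how many (length, source offset, destination offset) combinations
   produced a wrong destination buffer or a wrong return value. */
static size_t count_copy_failures(copy_fn copy)
{
    unsigned char src[BUF], dst[BUF], expect[BUF];
    size_t failures = 0;

    /* Non-repeating-looking pattern so shifted copies are detected. */
    for (size_t i = 0; i < BUF; i++)
        src[i] = (unsigned char)(i * 31u + 7u);

    for (size_t n = 0; n <= MAX_LEN; n++)
        for (size_t s_off = 0; s_off < MAX_OFF; s_off++)
            for (size_t d_off = 0; d_off < MAX_OFF; d_off++) {
                /* Fill both buffers with a sentinel, then build the expected
                   result byte by byte, independently of the code under test. */
                memset(dst, 0xAA, BUF);
                memset(expect, 0xAA, BUF);
                for (size_t i = 0; i < n; i++)
                    expect[GUARD + d_off + i] = src[GUARD + s_off + i];

                void *ret = copy(dst + GUARD + d_off, src + GUARD + s_off, n);

                /* Comparing the whole buffer also catches writes that land
                   before or after the requested range. */
                if (ret != (void *)(dst + GUARD + d_off) ||
                    memcmp(dst, expect, BUF) != 0)
                    failures++;
            }
    return failures;
}
```

For a correct implementation, count_copy_failures(my_memcpy) should return 0. Comparing the guard bytes as well as the copied range matters here, because a word-wise copy that overshoots by a few bytes would otherwise go unnoticed.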
[…] memmove to me.
src + n remains constant. Please double check if this is indeed the right code.
src + n should be src + i on every line except the last. This code is broken. Did you test it?
[…] n is 1? You will write to dst[-4] at some point.
[…] n > 1024. Did you write some basic unit tests and a benchmark against memcpy from string.h?