I want to implement length-prefixed strings in C (not null-terminated), with some idiosyncrasies: malloc is prohibited; memory efficiency is important; each string (except for literal-backed ones) is modifiable and can change its length, but has a fixed allocated length (set at declaration, maximum 255); and the C code is produced by a code generator. I want this to work on as many architectures and C compilers as possible.
Suppose I do the following:
typedef struct PLS {
    uint8_t maxlen;
    uint8_t curlen;
    char buf[1]; // actual allocated size is in maxlen
} PLS;
Then I could declare, say
struct {
    ...
    PLS s1; // real allocated size: 100
    char s1buf[99];
    PLS s2; // real allocated size: 64
    char s2buf[63];
    ...
};
// initialization:
mystruct.s1.maxlen=100;
mystruct.s1.curlen=0;
mystruct.s2.maxlen=64;
mystruct.s2.curlen=0;
// PLS library:
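void PLS_copy(PLS *to, const PLS *from) {
    int tc = from->curlen; // chars to copy
    if (tc > to->maxlen) tc = to->maxlen; // truncate to the destination capacity
    memcpy(to->buf, from->buf, tc);
    to->curlen = (uint8_t)tc; // record the new length
}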
Does this look ok? Any pitfalls? Would some compiler give warnings?
I believe that data structure alignment should not add padding between s1 and s1buf, but even if it did, this should still work, right?
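The padding assumption can at least be checked per compiler rather than taken on faith. Below is a minimal sketch, assuming a hypothetical container `struct demo` that mirrors the generated struct above; `pls_layout_is_contiguous` is a name made up for this illustration. It only reports whether the layout happens to be contiguous on the compiler at hand. It does not make indexing `buf` past element 0 well-defined.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct PLS {
    uint8_t maxlen;
    uint8_t curlen;
    char buf[1];
} PLS;

/* Hypothetical container mirroring the generated struct in the question. */
struct demo {
    PLS s1;
    char s1buf[99];
    PLS s2;
    char s2buf[63];
};

/* Returns 1 if each spill-over buffer starts exactly where its PLS ends,
   i.e. the compiler inserted no padding between them; 0 otherwise. */
int pls_layout_is_contiguous(void)
{
    return offsetof(struct demo, s1buf) == offsetof(struct demo, s1) + sizeof(PLS)
        && offsetof(struct demo, s2buf) == offsetof(struct demo, s2) + sizeof(PLS);
}
```

A generator could emit a call to such a check in a start-up self-test, so that a port to a compiler with a different layout fails loudly instead of silently.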
For literals, if I want to avoid a duplication of allocated chars, things seem trickier (remember, though, that code is machine-generated). Would this be objectionable?
// static declaration of literal "hello" (5 chars) trailing null byte wasted
char lit1[] = "\x05\x05hello";
// can the above be cast to PLS? (I won't modify it, I swear)
PLS *lit1Pls = (PLS *)lit1; // smells funny, though...
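One way to sidestep the cast entirely is to treat every string, literal or not, as a plain array of unsigned char with a two-byte header. This is only a sketch of an alternative representation, not the PLS struct above: the layout (byte 0 = capacity, byte 1 = current length), the constant `PLS_HDR`, and the function names `pls_len` and `pls_assign` are all assumptions made for this illustration. Giving the literal an explicit array size drops the trailing null byte (valid in C, not in C++), and three-digit octal escapes avoid the risk of a hex escape swallowing a following character such as 'a' to 'f'.

```c
#include <string.h>

/* Assumed layout: byte 0 = capacity, byte 1 = current length, bytes 2.. = characters. */
enum { PLS_HDR = 2 };

/* Read-only literal "hello": the explicit size leaves no room for the trailing NUL. */
static const unsigned char lit1[PLS_HDR + 5] = "\005\005hello";

/* Current length of a string in this representation. */
unsigned pls_len(const unsigned char *s)
{
    return s[1];
}

/* Copy 'from' into 'to', truncating to the capacity stored in to[0]. */
void pls_assign(unsigned char *to, const unsigned char *from)
{
    unsigned n = from[1];
    if (n > to[0])
        n = to[0];
    memmove(to + PLS_HDR, from + PLS_HDR, n);
    to[1] = (unsigned char)n;
}
```

A modifiable string with capacity 100 would then be declared as `unsigned char s1[PLS_HDR + 100] = { 100, 0 };` and filled with `pls_assign(s1, lit1);`. Since everything is accessed as unsigned char, no struct cast or padding assumption is involved.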
Accessing buf beyond its declared size of 1 causes undefined behaviour. You may not be aware of it, but the C standard is very strict about array bounds. The length-1-and-read-off-the-end hack that was "popular" in the 1980s was never legal, and modern compilers will break the code. Instead, you could use a flexible array member. See here or browse that tag for other examples.
PLS_copy() could return int instead of void to indicate whether the copy was completely successful.
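To make both suggestions concrete, here is a rough sketch that combines a flexible array member with a copy function that reports truncation. Several things here are assumptions for illustration only: the `PLS_STORAGE` macro (a union that reserves static storage without malloc), the helper `PLS_set`, and the return convention (0 = everything fit, 1 = truncated). Note also that standard C does not allow a struct with a flexible array member, or a union containing one, to be a member of another struct or an array element, so embedding these in a larger generated struct would depend on a compiler extension.

```c
#include <stdint.h>
#include <string.h>

typedef struct PLS {
    uint8_t maxlen;
    uint8_t curlen;
    char buf[];          /* flexible array member (C99) */
} PLS;

/* Hypothetical helper: reserve room for 'cap' characters without malloc
   and initialise maxlen = cap, curlen = 0. */
#define PLS_STORAGE(name, cap) \
    union { PLS pls; unsigned char raw[sizeof(PLS) + (cap)]; } name = { { (cap), 0 } }

/* Store n bytes from src; returns 0 if all fit, 1 if truncated. */
int PLS_set(PLS *to, const char *src, size_t n)
{
    int truncated = 0;
    if (n > to->maxlen) { n = to->maxlen; truncated = 1; }
    memcpy(to->buf, src, n);
    to->curlen = (uint8_t)n;
    return truncated;
}

/* Copy 'from' into 'to'; returns 0 if all fit, 1 if truncated. */
int PLS_copy(PLS *to, const PLS *from)
{
    size_t n = from->curlen;
    int truncated = 0;
    if (n > to->maxlen) { n = to->maxlen; truncated = 1; }
    memmove(to->buf, from->buf, n);   /* memmove tolerates to == from */
    to->curlen = (uint8_t)n;
    return truncated;
}
```

With `PLS_STORAGE(s1, 100);` at file or block scope, the string is passed around as `&s1.pls`, and a caller can decide whether a truncated copy is an error or acceptable.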