
I defined a variable-length user-defined data type in PostgreSQL, following the docs (http://www.postgresql.org/docs/9.0/static/xtypes.html).

C definition:

typedef struct MyType {
    char    vl_len_[4];
    char    data[1];
} mytype;

CREATE TYPE statements

CREATE TYPE mytype;
CREATE FUNCTION mytype_in(cstring) RETURNS mytype AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_out(mytype) RETURNS cstring AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_recv(internal) RETURNS mytype AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_send(mytype) RETURNS bytea AS 'mytype' LANGUAGE C IMMUTABLE STRICT;

CREATE TYPE mytype (
 internallength = VARIABLE,
 input = mytype_in,
 output = mytype_out,
 receive = mytype_recv,
 send = mytype_send,
 alignment = int4,
 storage = plain
);

I also defined the functions in C, and all of this works well. However, since my data could be very long, I changed the storage from plain to external or extended. Then it outputs wrong results. Is there some TOAST function I need to use in my C functions?

For example:

I have an operator to merge two values as follows:

PG_FUNCTION_INFO_V1(mytype_add);

Datum
mytype_add(PG_FUNCTION_ARGS)
{
    mytype *anno1 = (mytype *) PG_GETARG_POINTER(0);
    mytype *anno2 = (mytype *) PG_GETARG_POINTER(1);
    mytype    *result;
    int     newsize;

    newsize = VARSIZE(anno1) + VARSIZE(anno2) - VARHDRSZ;
    result = (mytype *) palloc(newsize);
    SET_VARSIZE(result, newsize);
    memcpy(result->data, anno1->data, VARSIZE(anno1) - VARHDRSZ);
    memcpy((result->data + VARSIZE(anno1) - VARHDRSZ), anno2->data, VARSIZE(anno2) - VARHDRSZ);

    PG_RETURN_POINTER(result);
}

The values in anno1->data (12 bytes, 3 integers) are 10, -1, -1, and the values in anno2->data are 20, -1, -1.

So the values in result->data (24 bytes) should be 10, -1, -1, 20, -1, -1.

If I set the storage to plain, I get the correct result above. If I set the storage to external, the output is totally wrong: -256, -1, 1317887 ...

I would be very grateful for any hint. I have spent many hours on this.

2
  • Sounds like you're failing to deTOAST the tuple, but you've provided nowhere near enough information to really say. Feb 7, 2014 at 7:07
  • @CraigRinger I added an example, could you have a look at it? Thanks very much
    – ssttddo
    Feb 7, 2014 at 7:32

2 Answers

0

You are failing to deTOAST the input Datum. So you're concatenating a compressed form, or possibly a pointer to out-of-line storage, rather than the raw data.

I think you need to use PG_GETARG_VARLENA_P(0) to ensure the datum is detoasted before working with it. I have not worked directly with TOAST and varlena types much, though.

It's not clear to me why you're declaring your own type with an identical structure to struct varlena, rather than just using Datum and the underlying struct varlena for variable-length datums. Start with:

struct varlena *anno1 = PG_GETARG_VARLENA_P(0);
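One plausible way the reported values could arise can be reproduced outside the server. The following is a minimal standalone sketch, not PostgreSQL code. It assumes the stored value carried a 1-byte short header (which PostgreSQL may use for small values when the storage is not plain) and that integers are little-endian. The helper names read_le32, write_le32 and misread_int are hypothetical, and the header bytes are simply left as zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode 4 bytes as a little-endian signed 32-bit integer. */
static int32_t read_le32(const unsigned char *p)
{
    uint32_t u = (uint32_t) p[0] |
                 ((uint32_t) p[1] << 8) |
                 ((uint32_t) p[2] << 16) |
                 ((uint32_t) p[3] << 24);
    int32_t v;

    memcpy(&v, &u, sizeof v);
    return v;
}

/* Encode v into 4 bytes, little-endian. */
static void write_le32(unsigned char *p, int32_t v)
{
    uint32_t u;

    memcpy(&u, &v, sizeof u);
    p[0] = (unsigned char) (u & 0xff);
    p[1] = (unsigned char) ((u >> 8) & 0xff);
    p[2] = (unsigned char) ((u >> 16) & 0xff);
    p[3] = (unsigned char) ((u >> 24) & 0xff);
}

/*
 * Model a stored value: hdr_len header bytes followed by three ints.
 * Then read payload slot idx the way the question's code does, i.e.
 * assuming the payload always starts 4 bytes in.
 */
static int32_t misread_int(size_t hdr_len, int32_t a, int32_t b, int32_t c,
                           size_t idx)
{
    unsigned char buf[4 + 12] = {0};    /* large enough for either header */

    assert(hdr_len == 1 || hdr_len == 4);
    assert(idx < 3);

    write_le32(buf + hdr_len, a);
    write_le32(buf + hdr_len + 4, b);
    write_le32(buf + hdr_len + 8, c);

    return read_le32(buf + 4 + 4 * idx);
}
```

With a 4-byte header the first slot reads back as 10. With a 1-byte header the same read is shifted by three bytes and gives -256 followed by -1, which matches the first two values in the question. The third value in a real server would depend on whatever bytes follow the datum.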

On a side note, why are you trying to re-implement intarray (badly, i.e. using char arrays)? Please read this relevant article, and this one.

4
  • Thanks for the reply!! So I should use PG_GETARG_VARLENA_P(0) to replace PG_GETARG_POINTER(0)? I am new to PostgreSQL, which is why I followed the user-defined type docs to do this.
    – ssttddo
    Feb 7, 2014 at 8:23
  • @ssttddo Yes; see src/include/fmgr.h around line 243 Feb 7, 2014 at 8:24
  • @ssttddo ... and please stop redefining varlena with your own type like that. Just typedef it if you must have your own name. Feb 7, 2014 at 8:29
  • Yes. varlena is exactly what I need. Thanks a million.
    – ssttddo
    Feb 7, 2014 at 8:43
0

One more thing to add: the main difference with a variable-length type is the stored layout, which begins with a header that records the length of the value.

Therefore, when writing the input function, you need to reserve a 4-byte header in front of your data and use SET_VARSIZE(PTR, len) to set it. Note that len is the total size, header included.

Conversely, when reading an argument, use PG_GETARG_VARLENA_P(n). The detoasted result also starts with a 4-byte length header. VARSIZE(PTR) returns the total size including that header, and VARSIZE_ANY_EXHDR(PTR) returns the length of the payload alone.

To summarize with sample code, assume we want to store an unknown number of Complex structs:

typedef struct Complex 
{
    double      x;
    double      y;
} Complex;

After parsing the input string, suppose we find that we need to store n structs. Allocate memory for the header plus the payload:

struct varlena *result = (struct varlena *) palloc(VARHDRSZ + n * sizeof(Complex));

As stated in the documentation, the first 4 bytes hold the length, and that length is the total size including the header itself:

SET_VARSIZE(result, VARHDRSZ + n * sizeof(Complex));
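The size bookkeeping is easy to get wrong, so here is a small standalone sketch of it. It is only a model: MODEL_VARHDRSZ is a hypothetical stand-in for PostgreSQL's VARHDRSZ (4 bytes), and total_size_for and count_from_total are made-up helper names, not server APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's VARHDRSZ (a 4-byte length word). */
#define MODEL_VARHDRSZ 4

typedef struct Complex
{
    double      x;
    double      y;
} Complex;

/* Total bytes to allocate and to record in the header for n elements. */
static size_t total_size_for(size_t n)
{
    return MODEL_VARHDRSZ + n * sizeof(Complex);
}

/* Recover the element count from a recorded total size. */
static size_t count_from_total(size_t total)
{
    assert(total >= MODEL_VARHDRSZ);
    return (total - MODEL_VARHDRSZ) / sizeof(Complex);
}
```

If the header is set to the payload size alone, the count recovered on the reading side comes out one element short, because the header bytes are subtracted from a total that never included them.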

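The remaining bytes hold the values. Use VARDATA to locate them rather than computing the offset by hand. Note that the payload starts 4 bytes into the allocation, so it is 4-byte aligned but not necessarily 8-byte aligned; on platforms that are strict about double alignment, copy the values with memcpy or add padding after the header:

Complex *a = (Complex *) VARDATA(result);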
for (int i = 0; i < n; i++) {
    a[i].x = input[i];
    a[i].y = input[i];
}

Finally, return the value from the input function:

PG_RETURN_POINTER(result);

To retrieve the data, use:

struct varlena *b = PG_GETARG_VARLENA_P(0);

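As stated above, the result again begins with a 4-byte length header. The output function could look like this (StringInfo comes from lib/stringinfo.h; appending to it avoids overwriting the result on every iteration):

Complex *c = (Complex *) VARDATA(b);
int n = VARSIZE_ANY_EXHDR(b) / sizeof(Complex);
StringInfoData buf;

initStringInfo(&buf);
for (int i = 0; i < n; i++) {
    /* append each element instead of replacing the previous one */
    appendStringInfo(&buf, "(%g;%g)", c[i].x, c[i].y);
}
PG_RETURN_CSTRING(buf.data);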
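The formatting loop can be tried outside the server with plain C. This is a rough sketch under the assumption that a fixed-size caller-supplied buffer is acceptable; format_complex_array is a hypothetical helper name, and inside PostgreSQL you would more likely use StringInfo or psprintf instead of snprintf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct Complex
{
    double      x;
    double      y;
} Complex;

/*
 * Append "(x;y)" for each of the n elements to out, which has room for
 * cap bytes including the terminating NUL. Returns the number of
 * characters written, or -1 if the buffer is too small.
 */
static int format_complex_array(const Complex *c, size_t n,
                                char *out, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < n; i++)
    {
        /* Write after what is already there, never over it. */
        int w = snprintf(out + used, cap - used, "(%g;%g)", c[i].x, c[i].y);

        if (w < 0 || (size_t) w >= cap - used)
            return -1;
        used += (size_t) w;
    }
    return (int) used;
}
```

The point of the sketch is the running offset: each element is written after the previous ones, so the final string contains every element rather than only the last.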

I haven't tested this exact code, only something similar, but it should work. Comments and corrections are welcome. This is also for my own reference.
