Python ctypes - Setting c_char array when string has embedded null?

Question

I'm using ctypes bit fields to dissect tightly packed binary data. I stuff a record's worth of data into a union as a string, then pull out key fields as integers.

This works great when there are no nulls in the buffer, but any embedded null causes ctypes to truncate the string.

Example:

from ctypes import *

class H(BigEndianStructure):
    _fields_ = [ ('f1', c_int, 8),
                 ('f2', c_int, 8),
                 ('f3', c_int, 8),
                 ('f4', c_int, 2)
                 # ...
                 ]

class U(Union):
    _fields_ = [ ('fld', H),
                 ('buf', c_char * 6)
                 ]

# With no nulls, works as expected...
u1 = U()
u1.buf='abcabc'
print '{} {} {} (expect: 97 98 99)'.format(u1.fld.f1, u1.fld.f2, u1.fld.f3)

# Embedded null breaks it...  This prints '97 0 0', NOT '97 0 99'
u2 = U()
u2.buf='a\x00cabc'
print '{} {} {} (expect: 97 0 99)'.format(u2.fld.f1, u2.fld.f2, u2.fld.f3)

Browsing the ctypes source, I see two methods to set a char array, CharArray_set_value() and CharArray_set_raw(). It appears that CharArray_set_raw() will handle nulls properly whereas CharArray_set_value() will not.

But I can't figure out how to invoke the raw version... It looks like a property, so I'd expect something like:

u1.buf.raw = 'abcabc'

but that yields:

AttributeError: 'str' object has no attribute 'raw'

Any guidance appreciated. (Including a completely different approach!)

(Note: I need to process thousands of records per second, so efficiency is critical. Using a list comprehension to stuff a byte array into the structure works, but it's 100x slower.)

Mark Tolonen · Answer 1 · 2014-10-11 05:08:17Z

Unfortunately, a c_char*6 field is handled as a nul-terminated string. Switching to c_byte*6 avoids the truncation, but you lose the convenience of initializing with strings:

from ctypes import *

class H(BigEndianStructure):
    _fields_ = [ ('f1', c_int, 8),
                 ('f2', c_int, 8),
                 ('f3', c_int, 8),
                 ('f4', c_int, 2)
                 # ...
                 ]

class U(Union):
    _fields_ = [ ('fld', H),
                 ('buf', c_byte * 6)
                 ]

u1 = U()
u1.buf=(c_byte*6)(97,98,99,97,98,99)
print '{} {} {} (expect: 97 98 99)'.format(u1.fld.f1, u1.fld.f2, u1.fld.f3)

u2 = U()
u2.buf=(c_byte*6)(97,0,99,97,98,99)
print '{} {} {} (expect: 97 0 99)'.format(u2.fld.f1, u2.fld.f2, u2.fld.f3)

Output:

97 98 99 (expect: 97 98 99)
97 0 99 (expect: 97 0 99)

Thanks, Mark. This works, but the CPU overhead of marshaling the bytes from the string into the byte array makes the approach too slow for my application. (My trials were about 100x slower than ctypes' memcpy()). — Simian, Oct 13 '14 at 17:20
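
Since the sticking point is copy speed, one option that keeps the c_char field is to bypass the field setter and copy the record straight into the union's memory with ctypes.memmove. This is only a sketch, assuming each record is a byte string no longer than the union. load_record is a hypothetical helper name, not part of ctypes, and the code uses bytes literals so that it should run on both Python 2.7 and Python 3:

```python
from __future__ import print_function
from ctypes import (BigEndianStructure, Union, addressof, c_char, c_int,
                    memmove, sizeof)

class H(BigEndianStructure):
    _fields_ = [('f1', c_int, 8),
                ('f2', c_int, 8),
                ('f3', c_int, 8),
                ('f4', c_int, 2)]

class U(Union):
    _fields_ = [('fld', H),
                ('buf', c_char * 6)]

def load_record(u, data):
    """Hypothetical helper: copy raw bytes into u, embedded nulls included."""
    if len(data) > sizeof(u):
        raise ValueError('record is larger than the union')
    # memmove copies exactly len(data) bytes and never looks for a terminator.
    memmove(addressof(u), data, len(data))

u = U()
load_record(u, b'a\x00cabc')
print(u.fld.f1, u.fld.f2, u.fld.f3)  # 97 0 99
```

Reading u.buf afterwards still stops at the first null, because the field getter is unchanged. Only the write path is bypassed here.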

pixelbrei · Answer 2 · 2015-12-14 14:57:25Z

You can also create the raw-string array outside of your struct/union:

mystring = (c_char * 6).from_buffer(u2)
print mystring.raw

This way you don't have any overhead for conversion. I wonder why a (c_char * 6) behaves differently when used alone vs. used in a Structure/Union...
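
The same kind of view can also be used for writing, which is what the question asks for. A minimal sketch, assuming the H and U definitions from the question (repeated so the snippet is self-contained) and bytes literals so that it should run on Python 2.7 and Python 3. The name view is just illustrative:

```python
from __future__ import print_function
from ctypes import BigEndianStructure, Union, c_char, c_int

class H(BigEndianStructure):
    _fields_ = [('f1', c_int, 8),
                ('f2', c_int, 8),
                ('f3', c_int, 8),
                ('f4', c_int, 2)]

class U(Union):
    _fields_ = [('fld', H),
                ('buf', c_char * 6)]

u2 = U()

# A standalone c_char array that shares memory with u2 (no copy is made).
view = (c_char * 6).from_buffer(u2)

# Assigning to .raw copies every byte, including the embedded null.
view.raw = b'a\x00cabc'

print(u2.fld.f1, u2.fld.f2, u2.fld.f3)  # 97 0 99
print(repr(view.raw))                   # all six bytes
print(repr(u2.buf))                     # field access still stops at the null
```

The view keeps a reference to u2, so it stays valid for as long as the view itself is alive.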

For convenience (usually), the CField descriptors for c_char and c_wchar arrays are special-cased in PyCField_FromDesc (in Modules/_ctypes/cfield.c) to convert to and from native Python strings using s_get / s_set and U_get / U_set. – eryksun Dec 11 '15 at 14:17
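
The difference described in this comment can be seen with a small experiment. This is only an illustrative sketch: S is a made-up structure, and the variable names are arbitrary. A standalone c_char array exposes both .value and .raw, whereas reading the same kind of array through a structure field hands back a plain Python byte string, which is why .raw is not available there:

```python
from __future__ import print_function
from ctypes import Structure, c_char

# Standalone array: the array object itself is accessible.
arr = (c_char * 6)()
arr.raw = b'a\x00cabc'
standalone_raw = arr.raw      # all six bytes
standalone_value = arr.value  # stops at the first null

# Hypothetical structure with the same array as a field.
class S(Structure):
    _fields_ = [('buf', c_char * 6)]

s = S()
s.buf = b'a\x00cabc'          # setter treats the data as nul-terminated
field_value = s.buf           # getter returns a native byte string

print(repr(standalone_raw), repr(standalone_value))
print(repr(field_value), hasattr(field_value, 'raw'))
```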

Don't use from_address for this since the resulting array doesn't own a reference on the source buffer, u2. This is a recipe for segfault disaster. Use (c_char * 6).from_buffer(u2). – eryksun Dec 11 '15 at 14:23

In cases such as this I prefer to make the field name private (e.g. _buf) and use a public property. – eryksun Dec 11 '15 at 14:24
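
One way this suggestion might look in practice is sketched below. It is an interpretation of the comment, not code from the commenter: the helper name _view is hypothetical, and the property simply routes reads and writes through a from_buffer view so that embedded nulls survive in both directions.

```python
from __future__ import print_function
from ctypes import BigEndianStructure, Union, c_char, c_int

class H(BigEndianStructure):
    _fields_ = [('f1', c_int, 8),
                ('f2', c_int, 8),
                ('f3', c_int, 8),
                ('f4', c_int, 2)]

class U(Union):
    _fields_ = [('fld', H),
                ('_buf', c_char * 6)]   # private: its accessors truncate at null

    def _view(self):
        # Hypothetical helper: a c_char array sharing this union's memory.
        return (c_char * 6).from_buffer(self, U._buf.offset)

    @property
    def buf(self):
        return self._view().raw

    @buf.setter
    def buf(self, data):
        # Raises ValueError if data is longer than 6 bytes; shorter data
        # overwrites only the leading bytes.
        self._view().raw = data

u = U()
u.buf = b'a\x00cabc'
print(u.fld.f1, u.fld.f2, u.fld.f3)  # 97 0 99
print(repr(u.buf))
```

Callers keep the original u.buf = ... syntax from the question, while the truncating field accessors stay hidden behind _buf.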
