I've been porting a Python package that uses libsvm onto some production servers and ran into a strange segmentation fault, which I traced to a ctypes function pointer. I'm trying to determine where the ctypes wrapper fails and whether this is a distro-specific problem.
The system I am running this on is a very clean virtual machine with almost nothing installed:

    Solaris 5.11 amd64 pentium_pro+mmx pentium_pro pentium+mmx pentium i486 i386 i86
    Python 2.7.2
Now for the problem description and how I narrowed it down to ctypes. In libsvm you can specify the print function by passing a void (*print_func)(const char *) pointer into the svm_set_print_string_function function. With a NULL pointer, the default is to print to stdout. The interesting part is that the Python wrapper for libsvm (which works fine on a variety of other systems) creates such a function pointer when quiet mode (no printing) is requested, via the following:
    PRINT_STRING_FUN = CFUNCTYPE(None, c_char_p)
    def print_null(s):
        return

    if argv[i] == "-q":
        self.print_func = PRINT_STRING_FUN(print_null)
    libsvm.svm_set_print_string_function(self.print_func)
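To check the callback machinery without libsvm in the picture, the same prototype can be exercised on its own. This is a minimal sketch, not the wrapper's code: `QuietHolder`, `record`, and `received` are hypothetical names, and `record` replaces `print_null` only so that the call is observable. As far as I understand ctypes, calling the function pointer object from Python goes through the generated C thunk and back into the Python callable.

```python
from ctypes import CFUNCTYPE, c_char_p

# Same prototype as in the libsvm wrapper: void (*)(const char *)
PRINT_STRING_FUN = CFUNCTYPE(None, c_char_p)

received = []


def record(s):
    # Hypothetical stand-in for print_null: store the argument instead of
    # discarding it, so we can tell the callback really ran.
    received.append(s)


class QuietHolder(object):
    """Hypothetical holder that mirrors self.print_func in the wrapper."""

    def __init__(self):
        # The callback object must stay referenced for as long as C code
        # might call it; storing it on an instance is one way to do that.
        self.print_func = PRINT_STRING_FUN(record)


holder = QuietHolder()

# Invoke the ctypes function pointer directly from Python.
holder.print_func(b"hello from ctypes")
```

If this already crashes on the affected machine, the problem is in the ctypes callback support itself rather than in how libsvm uses the pointer.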
When I set quiet mode, libsvm accepts the function pointer, but the call to svm_train hangs for a few seconds and then segfaults. I also tried creating a function pointer that takes a void * argument and casting it to the const char * function pointer type, with the same results, which suggests the problem is not in the conversion from const char * to a PyStringObject.
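For reference, the void * experiment looked roughly like the sketch below. The names `VOID_PTR_FUN`, `record_raw`, and `seen` are made up for illustration, and the pointer is invoked from Python here instead of being handed to libsvm. With a c_void_p parameter the callback receives a plain integer address, so ctypes does no string conversion on the way in.

```python
from ctypes import CFUNCTYPE, c_char_p, c_void_p, cast, string_at

PRINT_STRING_FUN = CFUNCTYPE(None, c_char_p)  # void (*)(const char *)
VOID_PTR_FUN = CFUNCTYPE(None, c_void_p)      # void (*)(void *)

seen = []


def record_raw(ptr):
    # ptr is an integer address (or None for NULL). Reading the bytes is
    # done explicitly here, only to make the call observable.
    seen.append(string_at(ptr) if ptr else None)


# Keep a reference to the original callback object while it is in use.
raw_cb = VOID_PTR_FUN(record_raw)

# Reinterpret the same function pointer under the const char * prototype.
as_print_func = cast(raw_cb, PRINT_STRING_FUN)

as_print_func(b"probe")
```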
Then I finally just wrote a C++ function that sets the function pointer to a no-op inside the library itself:
    void print_null(const char *) {}

    void svm_set_print_null() {
        svm_set_print_string_function(&print_null);
    }
which worked as expected, with no segmentation faults. This leads me to think that ctypes is failing at some internal point of the function pointer conversion. Looking through the ctypes source files hasn't revealed anything obvious to me, though I haven't worked much with ctypes directly, so it's difficult to narrow down where the bug might be.
I can use my library addition for now, but if I want to process the output silently I would need to be able to pass a real function pointer into libsvm. It also doesn't give me peace of mind about stability if I have to implement such workarounds without knowing the true root cause of the problem.
Has anyone else had problems with libsvm print functions on Solaris, or specifically with ctypes function pointers in Python on Solaris? I couldn't find anything online about either problem. I'm planning on playing around with library calls and writing some small test libraries to find the exact boundaries of the failure, but someone else's input might save me a day or two of debugging.
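One boundary test that needs no custom library is to let the C library's qsort call back into Python, which exercises the same "C code invokes a ctypes function pointer" path as libsvm does. This is only a sketch under assumptions: it assumes a POSIX system where `find_library("c")` (or the fallback to the already loaded process image) exposes `qsort`, and the names `compare`, `cmp_cb`, and `calls` are hypothetical.

```python
from ctypes import (CDLL, CFUNCTYPE, POINTER, c_int, c_size_t, c_void_p,
                    sizeof)
from ctypes.util import find_library

# Assumption: the C library can be located this way on the target system.
# If find_library returns None, CDLL(None) falls back to the symbols that
# are already loaded into the process on POSIX platforms.
libc = CDLL(find_library("c"))

# int (*)(const int *, const int *)
CMP_FUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))

calls = []


def compare(a, b):
    # Record every invocation so we can confirm C really called us.
    calls.append((a[0], b[0]))
    return (a[0] > b[0]) - (a[0] < b[0])


# Keep a reference to the callback for the duration of the qsort call.
cmp_cb = CMP_FUNC(compare)

libc.qsort.argtypes = [c_void_p, c_size_t, c_size_t, CMP_FUNC]
libc.qsort.restype = None

values = (c_int * 5)(5, 1, 7, 33, 99)
libc.qsort(values, len(values), sizeof(c_int), cmp_cb)
result = list(values)
```

If this sorts correctly while the libsvm call still crashes, the fault is more likely tied to how or when libsvm invokes the pointer than to ctypes callbacks in general.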
UPDATE
The problem is reproducible on the 32-bit version of Solaris 5.11 as well.
self.print_func) but does that object continue to exist after svm_set_print_string_function is called? – Mark Tolonen Aug 11 '12 at 4:31