
I'm trying to use the code from https://stackoverflow.com/a/15390953/378594 to convert a numpy array into a shared-memory array and back. I run the following:

shared_array = shmarray.ndarray_to_shm(my_numpy_array)

and then pass the shared_array as one of the arguments in the argument list for a multiprocessing pool:

pool.map(my_function, list_of_args_arrays)

where list_of_args_arrays contains my shared array along with other arguments.

This results in the following error:

PicklingError: Can't pickle <class 'multiprocessing.sharedctypes.c_double_Array_<array size>'>: attribute lookup multiprocessing.sharedctypes.c_double_Array_<array size> failed

where <array size> is the flattened size of my numpy array.

Has something changed in numpy or ctypes since that answer was written?

Further details:

I only need read access to the shared data; the processes will not modify it.

The function that calls the pool is a method of a class; the class is instantiated, and the method is called, from a main.py file.


Apparently, when using multiprocessing.Pool, all arguments are pickled, so there was no point in using multiprocessing.Array with it. Changing the code to use a list of multiprocessing.Process objects instead (which can receive shared arrays at creation time) did the trick.
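A minimal sketch of that Process-based approach (the names here are illustrative, not from the original code): a shared ctypes array cannot be pickled through Pool.map, but it can be handed to multiprocessing.Process objects as an argument at creation time:

```python
import multiprocessing as mp
import numpy as np

def worker(shared_arr, i, queue):
    # Wrap the shared buffer as a numpy array (no copy is made).
    arr = np.frombuffer(shared_arr.get_obj())
    queue.put((i, float(arr.sum())))

def main():
    shared_arr = mp.Array('d', 100)               # 100 shared doubles
    np.frombuffer(shared_arr.get_obj())[:] = 1.0  # fill from the parent
    queue = mp.Queue()
    procs = [mp.Process(target=worker, args=(shared_arr, i, queue))
             for i in range(4)]
    for p in procs:
        p.start()
    results = dict(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results

if __name__ == '__main__':
    print(main())
```

Passing the shared array through `args=` works because multiprocessing has special support for handing shared ctypes objects to child processes at creation, which is exactly what plain pickling through a pool lacks.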


I think you are overcomplicating things: there is no need to pickle the arrays (especially if they are read-only).

You just need to keep them accessible through a global variable:

(Known to work on Linux, but it may not work on Windows, where processes are spawned rather than forked.)

import multiprocessing as mp

import numpy as np

class si:
    # Module-level holder for the shared (read-only) arrays;
    # forked worker processes inherit it from the parent.
    arrs = None

def summer(i):
    return si.arrs[i].sum()

def main():
    si.arrs = [np.zeros(100) for _ in range(1000)]
    pool = mp.Pool(16)
    res = pool.map(summer, range(1000))
    print(res)

if __name__ == '__main__':
    main()

If your arrays need to be both read and written, you need to use this instead: Is shared readonly data copied to different processes for Python multiprocessing?
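For completeness, here is one sketch of a read-write setup (the names are illustrative): pass a synchronized multiprocessing.Array to the pool via initializer/initargs so each worker stores it in a global; multiprocessing supports pickling shared ctypes objects through this channel, so it should also work with Windows-style spawning:

```python
import multiprocessing as mp
import numpy as np

_shared = None  # set in each worker by init_worker

def init_worker(shared_arr):
    global _shared
    _shared = shared_arr

def scale(i):
    # Read and write the shared buffer; hold the array's lock while writing.
    arr = np.frombuffer(_shared.get_obj())
    with _shared.get_lock():
        arr[i] *= 2.0
        return float(arr[i])

def main():
    shared_arr = mp.Array('d', 8)
    np.frombuffer(shared_arr.get_obj())[:] = np.arange(8.0)
    with mp.Pool(2, initializer=init_worker, initargs=(shared_arr,)) as pool:
        res = pool.map(scale, range(8))
    # The parent process sees the workers' writes too.
    parent_view = np.frombuffer(shared_arr.get_obj()).tolist()
    return res, parent_view

if __name__ == '__main__':
    print(main())
```

Since each worker writes a distinct element, the lock is not strictly needed here; it is shown because overlapping writes would require it.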

  • This looks good. Can you add how it's supposed to look if the file is imported and there is no if __name__ == '__main__': and main()? What is the important element here: that the global variable and the function summer are in the same scope, or the definition of the pool and the global variables?
    – Uri
    Apr 30, 2013 at 16:13
  • The important thing is to have the global resource initialized before mp.Pool(). E.g. main() and the '__main__' guard could be in another file (say sumfile.py); just in main(), instead of si.arrs it will be sumfile.si.arrs=[...], and instead of pool.map(summer, ...) it'll be pool.map(sumfile.summer, ...)
    – sega_sai
    Apr 30, 2013 at 16:33
  • Strange: I'm updating the global variables stored in the class, and this update is done before calling Pool(), but the spawned processes still behave as though I did not change the global variables at all; they see the default value (i.e. None), even though the value is indeed changed in the main program's process...
    – Uri
    Apr 30, 2013 at 22:25
  • Also take a look at docs.python.org/2/library/multiprocessing.html#windows. It seems to contradict your example.
    – Uri
    Apr 30, 2013 at 22:27
  • I guess if you are using Windows that may not work (its process spawning is different), but I have never had Windows to try it.
    – sega_sai
    May 1, 2013 at 0:25
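A small sketch of the timing point made in these comments (assuming a platform that supports fork, e.g. Linux; names are illustrative): forked workers only see the values that globals had at the moment the pool was created:

```python
import multiprocessing as mp

class si:
    value = None  # global shared via fork, as in the answer above

def get_value(_):
    return si.value

def demo():
    ctx = mp.get_context('fork')  # assumption: fork is available (not Windows)
    early_pool = ctx.Pool(2)      # workers forked while si.value is still None
    si.value = 42
    late_pool = ctx.Pool(2)       # workers forked after si.value was set
    early = early_pool.map(get_value, range(2))
    late = late_pool.map(get_value, range(2))
    for pool in (early_pool, late_pool):
        pool.close()
        pool.join()
    return early, late

if __name__ == '__main__':
    print(demo())
```

Under Windows-style spawning, the module is re-imported in each worker, so both pools would see the default None; that matches the behavior reported in the comments.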
