15

What are the advantages and disadvantages of storing Python objects in a numpy.array with dtype='o' versus using list (or list of list, etc., in higher dimensions)?

Are numpy arrays more efficient in this case? (It seems that they cannot avoid the indirection, but might be more efficient in the multidimensional case.)

  • 1
    I personally have never come across compelling uses for NumPy arrays of objects. It would be interesting to see if someone can come up with a convincing example. (+1) – NPE Apr 11 '13 at 10:08
  • Possible duplicate of stackoverflow.com/questions/6141853/… – tiago Apr 11 '13 at 12:29
  • 2
    @NPE I mostly agree, other than as a convenient way to do (some) mathematical operations with arrays of Fraction objects (or the like), without resorting to half a dozen nested zip's and map's. – Jaime Apr 11 '13 at 14:08
  • @NPE NodePy uses NumPy arrays of objects (sympy expressions) to do some exact analysis of numerical methods for ODEs. – jorgeca Apr 11 '13 at 17:54
10

Slicing works differently with NumPy arrays. The NumPy docs devote a lengthy page on the topic. To highlight some points:

  • NumPy slices can slice through multiple dimensions
  • All arrays generated by NumPy basic slicing are always views of the original array, while slices of lists are shallow copies.
  • You can assign a scalar into a NumPy slice.
  • You can insert and delete items in a list by assigning a sequence of different length to a slice, whereas NumPy would raise an error.

Demo:

>>> a = np.arange(4, dtype=object).reshape((2,2))
>>> a
array([[0, 1],
       [2, 3]], dtype=object)
>>> a[:,0]             #multidimensional slicing
array([0, 2], dtype=object)
>>> b = a[:,0]
>>> b[:] = True        #can assign scalar
>>> a                  #contents of a changed because b is a view to a
array([[True, 1],
       [True, 3]], dtype=object)    

Also, NumPy arrays provide convenient mathematical operations with arrays of objects that support them (e.g. fraction.Fraction).

  • Also, you can use numpy helpers like flat, flatten, nditer – Neil G Apr 12 '13 at 6:01
0

Numpy is less memory use than a Python list. Also Numpy is faster and more convenient than a list.

For example: if you want to add two lists in Python you have to do looping over all element in the list. On the other hand in Numpy you just add them.

# adding two lists in python
sum = []
l1 = [1, 2, 3]
l2 = [2, 3, 4]
for i in range(len(l1)):
   print sum.append(l1[i]+l2[i])
# adding in numpy
a1 = np.arange(3)
a2 = np.arange(3)
sum2 = a1+a2
print sum2
  • I wasn't asking about arrays of numbers. I was asking about arrays of objects. And I don't know that Python lists need more memory or are slower. After all, they're both just contiguous arrays underneath. – Neil G Apr 20 '17 at 1:53

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Not the answer you're looking for? Browse other questions tagged or ask your own question.