NumPy
arrays are great for both performance and easy use (easier slicing, indexing than lists).
I try to construct a data container out of a NumPy structured array
instead of dict
of NumPy arrays
. The problem is the performance is much worse. About 2.5 times as bad using homogeneous data and about 32 times for heterogeneous data (I'm talking about NumPy
datatypes).
Is there a way to speed the structured array's up? I tried changing the memoryorder from 'c' to 'f' but this didn't have any affect.
Here's my profiling code:
import time
import numpy as np
NP_SIZE = 100000
N_REP = 100
np_homo = np.zeros(NP_SIZE, dtype=[('a', np.double), ('b', np.double)], order='c')
np_hetro = np.zeros(NP_SIZE, dtype=[('a', np.double), ('b', np.int32)], order='c')
dict_homo = {'a': np.zeros(NP_SIZE), 'b': np.zeros(NP_SIZE)}
dict_hetro = {'a': np.zeros(NP_SIZE), 'b': np.zeros(NP_SIZE, np.int32)}
t0 = time.time()
for i in range(N_REP):
np_homo['a'] += i
t1 = time.time()
for i in range(N_REP):
np_hetro['a'] += i
t2 = time.time()
for i in range(N_REP):
dict_homo['a'] += i
t3 = time.time()
for i in range(N_REP):
dict_hetro['a'] += i
t4 = time.time()
print('Homogeneous Numpy struct array took {:.4f}s'.format(t1 - t0))
print('Hetoregeneous Numpy struct array took {:.4f}s'.format(t2 - t1))
print('Homogeneous Dict of numpy arrays took {:.4f}s'.format(t3 - t2))
print('Hetoregeneous Dict of numpy arrays took {:.4f}s'.format(t4 - t3))
Edit: Forgot to put my timing numbers:
Homogenious Numpy struct array took 0.0101s
Hetoregenious Numpy struct array took 0.1367s
Homogenious Dict of numpy arrays took 0.0042s
Hetoregenious Dict of numpy arrays took 0.0042s
Edit2: I added some additional test case with the timit module:
import numpy as np
import timeit
NP_SIZE = 1000000
def time(data, txt, n_rep=1000):
def intern():
data['a'] += 1
time = timeit.timeit(intern, number=n_rep)
print('{} {:.4f}'.format(txt, time))
np_homo = np.zeros(NP_SIZE, dtype=[('a', np.double), ('b', np.double)], order='c')
np_hetro = np.zeros(NP_SIZE, dtype=[('a', np.double), ('b', np.int32)], order='c')
dict_homo = {'a': np.zeros(NP_SIZE), 'b': np.zeros(NP_SIZE)}
dict_hetro = {'a': np.zeros(NP_SIZE), 'b': np.zeros(NP_SIZE, np.int32)}
time(np_homo, 'Homogeneous Numpy struct array')
time(np_hetro, 'Hetoregeneous Numpy struct array')
time(dict_homo, 'Homogeneous Dict of numpy arrays')
time(dict_hetro, 'Hetoregeneous Dict of numpy arrays')
results in:
Homogeneous Numpy struct array 0.7989
Hetoregeneous Numpy struct array 13.5253
Homogeneous Dict of numpy arrays 0.3750
Hetoregeneous Dict of numpy arrays 0.3744
The ratios between the runs seem reasonably stable. Using both methods and a different size of the array.
For the offcase it matters: python: 3.4 NumPy: 1.9.2
np_homo
vs.np_hetero
, maybe it has to do with alignment, becausenp.int64
as the second dtype isn't so slow.