Python: how to store a numpy multidimensional array in PyTables?

Question

How can I store a multidimensional NumPy array in an HDF5 file using PyTables?

From what I can tell, I can't put an array field in a PyTables table.

I also need to store some information about this array and be able to do mathematical computations on it.

Any suggestions?

Honestly, if you're storing a lot of just straight up ND arrays, you're better off with h5py instead of pytables. It's as simple as f.create_dataset('name', data=x) where x is your numpy array and f is the open hdf file. Doing the same thing in pytables is possible, but considerably more difficult. — Joe Kington, Jan 12 '12 at 22:16
I thought of that, but pytables has some features (tables.Expr) to do calculations directly on the arrays. Can I have that with h5py? — scripts, Jan 12 '12 at 22:22
@scripts - Not in the compressed, accelerated way that pytables does. (Or at least not that I know of, anyway.) pytables will also give you lots of nice querying abilities. h5py is better suited to straight-up storage and slicing of on-disk arrays (and is more pythonic, i.m.o., too). Not to plug my own answer too much, but my thoughts on the tradeoff between the two are here: stackoverflow.com/questions/7883646/… — Joe Kington, Jan 12 '12 at 22:34
Thanks for the info, Joe Kington. For my case pytables is better suited because of its powerful querying techniques. — scripts, Jan 12 '12 at 22:43

Joe Kington · Accepted Answer · 2012-01-12 22:53:58Z

There may be a simpler way, but this is how you'd go about doing it, as far as I know:

import numpy as np
import tables

# Generate some data
x = np.random.random((100, 100, 100))

# Store "x" in a chunked array...
# (PyTables 3.x method names; in 2.x these were openFile and createCArray)
f = tables.open_file('test.hdf', 'w')
atom = tables.Atom.from_dtype(x.dtype)
ds = f.create_carray(f.root, 'somename', atom, x.shape)
ds[:] = x
f.close()
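
To check that the data landed on disk as expected, the file can be reopened and the node sliced like a NumPy array. The following is only a sketch: it uses the PyTables 3.x names `open_file` and `create_carray`, a smaller array, and a temporary directory (my assumption, so that nothing is left in the working directory).

```python
import os
import tempfile

import numpy as np
import tables

# Assumed scratch location; the answer writes 'test.hdf' in the working directory.
path = os.path.join(tempfile.mkdtemp(), 'test.hdf')

x = np.random.random((20, 20, 20))

# Write the array into a chunked array node, as in the answer.
with tables.open_file(path, 'w') as f:
    atom = tables.Atom.from_dtype(x.dtype)
    ds = f.create_carray(f.root, 'somename', atom, x.shape)
    ds[:] = x

# Reopen read-only; slicing a node returns a NumPy array
# holding just the requested region.
with tables.open_file(path, 'r') as f:
    node = f.root.somename
    stored_shape = node.shape
    block = node[5:10, :, 0]   # partial read
    full = node[:]             # whole array in memory
```

Using the file as a context manager closes it even if the write fails partway, which is why the explicit `f.close()` is not needed here.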

If you want to specify the compression to use, have a look at tables.Filters. E.g.

import numpy as np
import tables

# Generate some data
x = np.random.random((100, 100, 100))

# Store "x" in a chunked array with level 5 BLOSC compression...
# (PyTables 3.x method names; in 2.x these were openFile and createCArray)
f = tables.open_file('test.hdf', 'w')
atom = tables.Atom.from_dtype(x.dtype)
filters = tables.Filters(complib='blosc', complevel=5)
ds = f.create_carray(f.root, 'somename', atom, x.shape, filters=filters)
ds[:] = x
f.close()

There's probably a simpler way for a lot of this... I haven't used pytables for anything other than table-like data in a long while.
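
The question also asks about keeping some information with the array and doing computations on it. A rough sketch of both, assuming PyTables 3.x: metadata goes on the node's `attrs`, and `tables.Expr` evaluates an expression over the on-disk array and writes the result to a second node. The attribute names, values, node name `scaled`, and the expression are all made up for illustration.

```python
import os
import tempfile

import numpy as np
import tables

# Assumed scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'expr_demo.hdf')
x = np.random.random((20, 20, 20))

with tables.open_file(path, 'w') as f:
    atom = tables.Atom.from_dtype(x.dtype)
    ds = f.create_carray(f.root, 'somename', atom, x.shape)
    ds[:] = x

    # Attach metadata to the array node (hypothetical attribute names).
    ds.attrs.units = 'metres'
    ds.attrs.sample_rate = 44100

    # On-disk output node with the same shape and type as the input.
    out = f.create_carray(f.root, 'scaled', atom, x.shape)

    # Evaluate 2*a + 1 over the stored array, writing into 'out'.
    expr = tables.Expr('2 * a + 1', uservars={'a': ds})
    expr.set_output(out)
    expr.eval()

# Read everything back to confirm it was stored.
with tables.open_file(path, 'r') as f:
    units = str(f.root.somename.attrs.units)
    sample_rate = int(f.root.somename.attrs.sample_rate)
    scaled = f.root.scaled[:]
```

Passing the operands through `uservars` keeps the expression independent of whatever names happen to be in the calling scope.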

Note that this can now be done much more straightforwardly using the create_array method on file objects, as described in the section 'Creating new array objects' at pytables.github.io/usersguide/tutorials.html — Ben Allison, Oct 2 '14 at 15:52
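
A sketch of what that shorter form might look like, assuming PyTables 3.0 or later. `create_array` takes the data directly but makes a plain, non-chunked array, so for compression the example falls back to `create_carray` with its `obj=` argument. The node names `plain` and `compressed` and the temporary path are my own choices.

```python
import os
import tempfile

import numpy as np
import tables

# Assumed scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'simple.hdf')
x = np.random.random((20, 20, 20))

with tables.open_file(path, 'w') as f:
    # Plain array: shape and type are taken from x.
    f.create_array(f.root, 'plain', x)

    # Chunked, compressed array built from the same data via obj=.
    filters = tables.Filters(complib='blosc', complevel=5)
    f.create_carray(f.root, 'compressed', obj=x, filters=filters)

# Read both nodes back, along with the compressor recorded on the second.
with tables.open_file(path, 'r') as f:
    plain = f.root.plain[:]
    compressed = f.root.compressed[:]
    complib = f.root.compressed.filters.complib
```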

asked	4 years ago
viewed	8839 times
active	9 months ago
