Creating pandas.DataFrame from numpy.array

Here's what I'm working with:

import numpy as np
import pandas as pd
np.__version__  # '1.8.1'
pd.__version__  # '0.14.1-107-g381a289'

Here is some fake data:

foo = np.arange(5)
bar = np.random.randn(30).reshape((5, 3, 2))

I want to get foo and bar into a pd.DataFrame. To my surprise, the following doesn't work (even though foo.shape[0] == bar.shape[0]):

df = pd.DataFrame.from_dict(dict(foo=foo, bar=bar))
# Exception: Data must be 1-dimensional

This does work:

df = pd.DataFrame.from_dict(dict(foo=foo.tolist(), bar=bar.tolist()))
df['bar'] = df['bar'].apply(np.array)

This roundabout method of converting my array to a nested list and then converting it back to an array via apply is bothersome. Is there a more straightforward way that I'm just not aware of?
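One way to skip the list round trip, sketched here under the assumption that an object-dtype column is acceptable, is to pre-allocate a 1-D object array and put each 2-D slice of bar into it before building the frame. The name bar_cells is hypothetical and not part of the original code.

```python
import numpy as np
import pandas as pd

foo = np.arange(5)
bar = np.random.randn(30).reshape((5, 3, 2))

# Build a 1-D object array whose elements are the (3, 2) slices of bar.
# Filling element by element stops NumPy from treating the slices as
# extra dimensions of a single array.
bar_cells = np.empty(len(bar), dtype=object)
for i, block in enumerate(bar):
    bar_cells[i] = block

# Both columns are now 1-dimensional, so the constructor accepts them.
df = pd.DataFrame({'foo': foo, 'bar': bar_cells})
```

Each cell of df['bar'] is then a NumPy array of shape (3, 2), and the column dtype is object, so vectorized pandas operations on that column are not available.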

asked Aug 1 '14 at 1:08

drammock

DataFrames aren't really meant to have lists of lists as elements. Is your working df really what you want? – DSM Aug 1 '14 at 1:20

@DSM: what I want is a DataFrame with one column of integers and one column of 2-d numpy arrays. – drammock Aug 1 '14 at 1:22

I don't think that's a supported use case for DataFrames; while you can cram nonscalar data into a cell, there's not much you can do with it after that. You'll have a column dtype of object, which is slow to begin with, and you can't really do any fast aggregation ops, so you'll have to fall back to relatively slow apply ops. Depending on preference you might be more interested in using a MultiIndex or a Panel instead of this approach. – DSM Aug 1 '14 at 1:34

Speed is not a huge issue for me; I'm typically working with 2k to 40k rows. pd.Panel is not a viable option because of the mixed dimensionality of the columns; I've only shown two columns here, but I have many more with 1-, 2-, or 3-dimensional data in each cell. I guess the lesson here is that I can't think of pandas DataFrames as being good for the same use cases as data.frames in R. – drammock Aug 1 '14 at 1:43
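For reference, the MultiIndex alternative mentioned in the comments might look roughly like the sketch below. It stores every scalar of bar in a long Series labelled by (foo, row, col); the level names and the variable long_bar are assumptions made for illustration, and the approach only covers a single multi-dimensional column, not the mixed-dimensionality case described above.

```python
import numpy as np
import pandas as pd

foo = np.arange(5)
bar = np.random.randn(30).reshape((5, 3, 2))

# One index entry per scalar in bar. from_product varies the last level
# fastest, which matches the C-order layout produced by bar.ravel().
index = pd.MultiIndex.from_product(
    [foo, range(bar.shape[1]), range(bar.shape[2])],
    names=['foo', 'row', 'col'],
)
long_bar = pd.Series(bar.ravel(), index=index, name='bar')

# Recover the 2-D block belonging to foo == 2.
block = long_bar.loc[2].unstack('col').values

# Aggregations stay vectorized, e.g. the mean of each block.
block_means = long_bar.groupby(level='foo').mean()
```

The trade-off is that the data stays in a numeric dtype and supports fast groupby operations, at the cost of reshaping whenever a whole array is needed.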

asked	9 months ago
viewed	69 times