Here's what I'm working with:

import numpy as np
import pandas as pd
np.__version__  # '1.8.1'
pd.__version__  # '0.14.1-107-g381a289'

Here is some fake data:

foo = np.arange(5)
bar = np.random.randn(30).reshape((5, 3, 2))

I want to get foo and bar into a pd.DataFrame. To my surprise, the following doesn't work (even though foo.shape[0] == bar.shape[0]):

df = pd.DataFrame.from_dict(dict(foo=foo, bar=bar))
# Exception: Data must be 1-dimensional

This does work:

df = pd.DataFrame.from_dict(dict(foo=foo.tolist(), bar=bar.tolist()))
df['bar'] = df['bar'].apply(np.array)

This roundabout method of converting my array to a nested list and then converting it back to an array via apply is bothersome. Is there a more straightforward way that I'm just not aware of?
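One possible shortcut, offered as a sketch rather than a confirmed pandas idiom: build a 1-d object array whose cells are the (3, 2) slices of bar, and hand that to the constructor. This skips the tolist()/apply round trip. The name bar_cells is made up for illustration, and this was not checked against the pandas 0.14 build quoted above.

```python
import numpy as np
import pandas as pd

foo = np.arange(5)
bar = np.random.randn(30).reshape((5, 3, 2))

# Pre-allocate a 1-d object array and fill it one cell at a time, so each
# cell holds one (3, 2) slice of bar (a view, not a copy).
bar_cells = np.empty(len(bar), dtype=object)
for i, block in enumerate(bar):
    bar_cells[i] = block

# The object array is 1-dimensional, so it is accepted as a column.
df = pd.DataFrame({'foo': foo, 'bar': bar_cells})
```

The resulting bar column has dtype object, so the usual caveats about slow element-wise operations on such columns still apply.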

DataFrames aren't really meant to have lists of lists as elements. Is your working df really what you want? –  DSM Aug 1 '14 at 1:20
    
@DSM: what I want is a DataFrame with one column of integers and one column of 2-d numpy arrays. –  drammock Aug 1 '14 at 1:22
    
I don't think that's a supported use case for DataFrames; while you can cram nonscalar data into a cell, there's not much you can do with it after that. You'll have a column dtype of object, which is slow to begin with, and you can't really do any fast aggregation ops, so you'll have to fall back to relatively slow apply ops. Depending on preference you might be more interested in using a MultiIndex or a Panel instead of this approach. –  DSM Aug 1 '14 at 1:34
    
Speed is not a huge issue for me; I'm typically working with 2k to 40k rows. pd.Panel is not a viable option because of the mixed dimensionality of the columns; I've only shown two columns here, but I have many more with 1-, 2-, or 3-dimensional data in each cell. I guess the lesson here is that I can't think of pandas DataFrames as being good for the same use cases as data.frames in R. –  drammock Aug 1 '14 at 1:43
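For reference, the MultiIndex alternative DSM mentions could look roughly like the sketch below. It flattens bar into a long Series with one scalar per row, so aggregations stay vectorized. The level names 'row' and 'col' and the variable names are assumptions for illustration, and this only covers a single array-valued column, not the mixed-dimensionality case described above.

```python
import numpy as np
import pandas as pd

foo = np.arange(5)
bar = np.random.randn(30).reshape((5, 3, 2))

# One index level per array axis; from_product varies the last level
# fastest, which matches the C-order layout of bar.ravel().
index = pd.MultiIndex.from_product(
    [foo, range(bar.shape[1]), range(bar.shape[2])],
    names=['foo', 'row', 'col'],
)
bar_long = pd.Series(bar.ravel(), index=index, name='bar')

# Aggregate over each foo value without falling back to apply.
per_foo_mean = bar_long.groupby(level='foo').mean()

# Recover the 2-d block that belongs to foo == 2.
block = bar_long.loc[2].unstack('col').values
```

The trade-off is that each 2-d block has to be rebuilt with unstack when it is needed as an array, instead of being read directly from a cell.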
