How to convert a pandas DataFrame into a NumPy array

Question

I have a simple data set with a class label, stored as "mydata.csv":

GA_ID   PN_ID   PC_ID   MBP_ID  GR_ID   AP_ID   class
0.033   6.652   6.681   0.194   0.874   3.177     0
0.034   9.039   6.224   0.194   1.137   3.177     0
0.035   10.936  10.304  1.015   0.911   4.9       1
0.022   10.11   9.603   1.374   0.848   4.566     1

I use the code below to convert this data into a NumPy array so that I can use it for predictions and machine learning modeling, but because of the header row it raises "ValueError: could not convert string to float: ". When I remove the header from the file, this method works well for me:

import numpy as np
#from sklearn import metrics
#from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Works only when the file has no header row
raw_data = open("/home/me/Desktop/scklearn/data.csv")
dataset = np.loadtxt(raw_data, delimiter=",")
X = dataset[:, 0:6]  # the six feature columns (0:5 would drop AP_ID)
y = dataset[:, 6]    # the class label

I also tried to skip the header like this, but the same error occurs:

dataset = np.loadtxt(raw_data, delimiter=",")[1:]
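The slice `[1:]` is applied only after `np.loadtxt` has already tried to parse every line, header included, which is why the error persists. A minimal sketch of skipping the header during parsing with the `skiprows` parameter follows. It assumes the file is comma-delimited with exactly one header row, and it uses an in-memory copy of the sample data (`csv_text`, a name introduced here) in place of the real file.

```python
import io
import numpy as np

# Assumption: comma-delimited data with one header row. This in-memory
# string stands in for the real file from the question.
csv_text = """GA_ID,PN_ID,PC_ID,MBP_ID,GR_ID,AP_ID,class
0.033,6.652,6.681,0.194,0.874,3.177,0
0.034,9.039,6.224,0.194,1.137,3.177,0
0.035,10.936,10.304,1.015,0.911,4.9,1
0.022,10.11,9.603,1.374,0.848,4.566,1
"""

# Read the header line separately so the column names are not lost.
header_list = csv_text.splitlines()[0].split(",")

# skiprows=1 tells loadtxt to skip the header before parsing any numbers.
dataset = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

X = dataset[:, :-1]  # all feature columns
y = dataset[:, -1]   # last column: class label (parsed as floats)
```

With a real file, the path can be passed to `np.loadtxt` directly instead of the `io.StringIO` object.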

Then I moved to pandas and was able to import the data this way:

raw_data = pandas.read_csv("/home/me/Desktop/scklearn/data.csv")

But here I got stuck again: when I tried to convert this into a NumPy array, it showed the same error as before.

Is there any method available in pandas that can save the headers as a list:

header_list = ('GA_ID','PN_ID','PC_ID' ,'MBP_ID' ,'GR_ID' , 'AP_ID','class')

and use the last column as the class label and the remaining part (1:4, 0:5) as a NumPy array for model building?

I have written some code to get the column list:

import pandas

raw_data = pandas.read_csv("/home/me/Desktop/scklearn/data.csv")
# columns is an attribute (an Index), not a method, so it takes no parentheses
clm_list = []
for clm in raw_data.columns:
    clm_list.append(clm)
print(clm_list)  # prints the column list

It's unclear what your real problem is here. pandas DataFrames are compatible with the sklearn interfaces. Also, if you don't want to write the header to a CSV from pandas, then you can pass the param header=None to to_csv – EdChum, Apr 7 '15 at 10:51
@EdChum Yes, this is true. Actually, my problem is: 1) if I pass header=None, then after modeling or during feature selection, how would I know the headers, since I skipped them when opening the file? And 2) how can I use the given example data directly from pandas with scikit-learn, in the form X = (data without header and class label) and y = (class label for predictions)? – jax, Apr 7 '15 at 10:55
Well, you can do all of this in pandas just fine. Like I said, the sklearn interfaces are compatible with pandas DataFrames – EdChum, Apr 7 '15 at 11:00
@EdChum Hi, thanks for the reply. I have solved my problem and written some code, which I have posted as an answer. This code works well for me. Thanks – jax, Apr 7 '15 at 11:47
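Both points raised in the comments (keeping the header names, and splitting the data into X and y) can be handled in pandas alone. The following is a rough sketch, not necessarily what the commenters had in mind. It assumes the label column is named `class` as in the sample, uses an in-memory copy of the data (`csv_text`, a name introduced here), and relies on `DataFrame.to_numpy()`, which needs pandas 0.24 or later (`.values` is the older equivalent).

```python
import io
import pandas as pd

# Assumption: comma-delimited data with a header row. This in-memory
# string stands in for the real file from the question.
csv_text = """GA_ID,PN_ID,PC_ID,MBP_ID,GR_ID,AP_ID,class
0.033,6.652,6.681,0.194,0.874,3.177,0
0.034,9.039,6.224,0.194,1.137,3.177,0
0.035,10.936,10.304,1.015,0.911,4.9,1
0.022,10.11,9.603,1.374,0.848,4.566,1
"""

df = pd.read_csv(io.StringIO(csv_text))  # the header row becomes df.columns

X = df.drop(columns="class")  # features, still labeled by column name
y = df["class"]               # class label as a Series

feature_names = X.columns.tolist()  # headers stay available after the split

# Plain NumPy arrays, if a library or function requires them.
X_array = X.to_numpy()
y_array = y.to_numpy()
```

Because `X` keeps its column labels, a selected column index can be mapped back to a name through `feature_names` at any later step.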

jax · Answer 1 · 2015-04-07 11:44:28Z

After reading a lot, I finally achieved what I wanted and successfully used the data with scikit-learn. The code to convert CSV data into a scikit-learn-compatible form is given below. Thanks.

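import pandas as pd

r = pd.read_csv("/home/zebrafish/Desktop/ex.csv")
print(r.values)  # the whole table as a NumPy array

# Collect the column names (the header) into a list
clm_list = []
for column in r.columns:
    clm_list.append(column)

# All columns except the last are features; the last one is the class label
X = r[clm_list[0:len(clm_list)-1]].values
y = r[clm_list[len(clm_list)-1]].values

print(clm_list)
print(X)
print(y)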

The output of this code is exactly what I want:

['GA_ID', 'PN_ID', 'PC_ID', 'MBP_ID', 'GR_ID', 'AP_ID', 'class']

[[  0.033   6.652   6.681   0.194   0.874   3.177]
 [  0.034   9.039   6.224   0.194   1.137   3.177]
 [  0.035  10.936  10.304   1.015   0.911   4.9  ]
 [  0.022  10.11    9.603   1.374   0.848   4.566]]

[0 0 1 1]

You can simplify your column list creation to just this: clm_list = list(r) – EdChum Apr 7 '15 at 11:59

thanks it works great – jax Apr 8 '15 at 13:00

I just copied your code. It ran my Scikit program. Thanks. – Chakra Oct 5 '15 at 11:07
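Combining the `list(r)` suggestion from the comments with positional slicing gives a more compact version of the answer's code. This is only a sketch: it assumes the class label is always the last column, and it uses an in-memory copy of the sample data (`csv_text`, a name introduced here) in place of the real file.

```python
import io
import pandas as pd

# Assumption: comma-delimited data with a header row and the class label
# in the last column. This string stands in for the real file.
csv_text = """GA_ID,PN_ID,PC_ID,MBP_ID,GR_ID,AP_ID,class
0.033,6.652,6.681,0.194,0.874,3.177,0
0.034,9.039,6.224,0.194,1.137,3.177,0
0.035,10.936,10.304,1.015,0.911,4.9,1
0.022,10.11,9.603,1.374,0.848,4.566,1
"""

r = pd.read_csv(io.StringIO(csv_text))

clm_list = list(r)  # iterating over a DataFrame yields its column names

# iloc selects by position, so no column names are needed for the split.
X = r.iloc[:, :-1].values  # every column except the last
y = r.iloc[:, -1].values   # the last column only

print(clm_list)
print(X)
print(y)
```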
