I have a simple data set with a class label, stored as "mydata.csv":
GA_ID PN_ID PC_ID MBP_ID GR_ID AP_ID class
0.033 6.652 6.681 0.194 0.874 3.177 0
0.034 9.039 6.224 0.194 1.137 3.177 0
0.035 10.936 10.304 1.015 0.911 4.9 1
0.022 10.11 9.603 1.374 0.848 4.566 1
I use the code below to convert this data into a NumPy array so that I can use it for predictions and machine learning modeling, but because of the header row it raises "ValueError: could not convert string to float: ". When I remove the header from the file, this method works fine:
import numpy as np
#from sklearn import metrics
#from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
raw_data = open("/home/me/Desktop/scklearn/data.csv")
dataset = np.loadtxt(raw_data, delimiter=",")
X = dataset[:,0:5]
y = dataset[:,6]
I also tried to skip the header by slicing the result, but an error occurs:
dataset = np.loadtxt(raw_data, delimiter=",")[1:]
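The slice `[1:]` is applied only after `np.loadtxt` has parsed every line, so the header still reaches the float conversion. A minimal sketch of skipping it at read time with the `skiprows` argument follows. It assumes the file is comma-separated with exactly one header row; the in-memory `csv_text` is a hypothetical stand-in for the real file. Note that `X` here takes all six feature columns, whereas the `0:5` slice above takes only the first five.

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for the CSV file
# (assumed: comma-separated, exactly one header row).
csv_text = """GA_ID,PN_ID,PC_ID,MBP_ID,GR_ID,AP_ID,class
0.033,6.652,6.681,0.194,0.874,3.177,0
0.034,9.039,6.224,0.194,1.137,3.177,0
0.035,10.936,10.304,1.015,0.911,4.9,1
0.022,10.11,9.603,1.374,0.848,4.566,1
"""

# skiprows=1 drops the header line before any float conversion happens.
dataset = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

X = dataset[:, :-1]  # every column except the last (the six features)
y = dataset[:, -1]   # last column (the class label)

print(X.shape)
print(y)
```

With a real file, the path can be passed to `np.loadtxt` in place of the `io.StringIO` object.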
Then I moved to pandas and was able to import the data this way:
raw_data = pandas.read_csv("/home/me/Desktop/scklearn/data.csv")
But here I got stuck again: when I tried to convert this into a NumPy array, it showed an error like the previous one.
Is there any method available in pandas that can save the headers as a list:
header_list = ('GA_ID','PN_ID','PC_ID' ,'MBP_ID' ,'GR_ID' , 'AP_ID','class')
and take the last column as the class label and the remaining part (1:4, 0:5) as a NumPy array for model building?
I have written some code to get the column list:
import pandas
clm_list = []
raw_data = pandas.read_csv("/home/me/Desktop/scklearn/data.csv")
clms = raw_data.columns  # columns is an attribute, not a method
for clm in clms:
    clm_list.append(clm)
print(clm_list)  # produces the column list
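For comparison, here is a rough sketch of what I am after in pandas: the headers as a list, the last column as the label, and the rest as a NumPy array. It assumes a comma-separated file with one header row and a pandas version that provides `DataFrame.to_numpy()`; the in-memory `csv_text` is a hypothetical stand-in for the real file.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file
# (assumed: comma-separated, exactly one header row).
csv_text = """GA_ID,PN_ID,PC_ID,MBP_ID,GR_ID,AP_ID,class
0.033,6.652,6.681,0.194,0.874,3.177,0
0.034,9.039,6.224,0.194,1.137,3.177,0
0.035,10.936,10.304,1.015,0.911,4.9,1
0.022,10.11,9.603,1.374,0.848,4.566,1
"""

# read_csv treats the first row as the header by default.
df = pd.read_csv(io.StringIO(csv_text))

header_list = df.columns.tolist()  # column names as a plain Python list
X = df.iloc[:, :-1].to_numpy()     # all columns except the last, as a NumPy array
y = df.iloc[:, -1].to_numpy()      # last column as the class label

print(header_list)
print(X.shape, y)
```

Selecting by position with `iloc` keeps the split independent of the column names, so it still works if the headers change.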