I am trying to plot a boxplot for a column in several csv files (without the header row of course), but running into some confusion around tuples, lists and arrays. Here is what I have so far
#!/usr/bin/env python
import csv
from numpy import *
import pylab as p
import matplotlib
#open one file, until boxplot-ing works
f = csv.reader (open('2-node.csv'))
#get all the columns in the file
timeStamp,elapsed,label,responseCode,responseMessage,threadName,dataType,success,bytes,Latency = zip(*f)
#Make list out of elapsed to pop the 1st element -- the header
elapsed_list = list(elapsed)
elapsed_list.pop(0)
#Turn list back to a tuple
elapsed = tuple(elapsed_list)
#Turn list to an numpy array
elapsed_array = array(elapsed_list)
#Elapsed Column statically entered into an array
data_array = ([4631, 3641, 1902, 1937, 1745, 8937] )
print data_array #prints in this format: ([xx,xx,xx,xx]), .__class__ is list ... ?
print elapsed #prints in this format: ('xx','xx','xx','xx'), .__class__ is tuple
print elapsed_list # #print in this format: ['xx', 'xx', 'xx', 'xx', 'xx'], .__class__ is list
print elapsed_array #prints in this format: ['xx' 'xx' 'xx' 'xx' 'xx'] -- notice no commas, .__class__ is numpy.ndarray
p.boxplot (data_array) #works
p.boxplot (elapsed) # does not work, error below
p.boxplit (elapsed_list) #does not work
p.boxplot (elapsed_array) #does not work
p.show()
For boxplots, the 1st argument is an "an array or a sequence of vectors", so I would think elapsed_array
would work ... ? But yet data_array
, a "list," works ... but elapsed_list` a "list" does not ... ? Is there a better way to do this ... ?
I am fairly new to python, and I would like to understand the what about the differences among a tuple, list, and numpy-array prevents this boxplot from working.
Example error message is:
Traceback (most recent call last):
File "../pullcol.py", line 32, in <module>
p.boxplot (elapsed_list)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/pyplot.py", line 1962, in boxplot
ret = ax.boxplot(x, notch, sym, vert, whis, positions, widths, patch_artist, bootstrap)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/axes.py", line 5383, in boxplot
q1, med, q3 = mlab.prctile(d,[25,50,75])
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/mlab.py", line 946, in prctile
return _interpolate(values[ai],values[bi],frac)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/mlab.py", line 920, in _interpolate
return a + (b - a)*fraction
TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'numpy.ndarray'