I have a large numpy.ndarray that I want to plot, where the x axis shows the values in the array and the y axis shows how often each value appears in the array. To be clear, I don't care about the order of the data in the array or whether that order gets scrambled; I just want to take the numbers, bin them, and then plot them.
Steps I have so far that I want to do, each in a separate cell of my Jupyter notebook:
1. Open/read my array (it's 1024x1024, so quite large): done
2. Convert array into list: done
3. Spit out null values in array: currently not working
4. Bin data to count values: really lost here
5. Scatter plot, trimmed vs count: this part will be fine once the previous two work, matplotlib and I get along
    import numpy as np
    import matplotlib.pyplot as plt

    scidata = ...  # np array of data that's 1024x1024
    lsci = []
    for r in range(1024):
        scilist = scidata[r,:].tolist()
        lsci.extend(scilist)
    trimmed = lsci
    for item in lsci:
        if 12.58 <= i== 12.59: #the null value I don't want is in this range
            r.remove(item)
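For comparison, here is a minimal sketch of the flatten-and-filter steps done with a NumPy boolean mask instead of removing items from a list. The synthetic `scidata` is an assumption standing in for the real array, and the 12.58 to 12.59 null range is taken from the comment in the code above.

```python
import numpy as np

# Stand-in for the real 1024x1024 array (assumption: the real data is loaded elsewhere).
rng = np.random.default_rng(0)
scidata = rng.normal(loc=50.0, scale=5.0, size=(1024, 1024))
scidata[::8, ::8] = 12.585  # plant some "null" values inside the unwanted range

# Flatten to 1-D; the order of the values does not matter for counting.
flat = scidata.ravel()

# Boolean mask that is True wherever a value falls in the null range.
is_null = (flat >= 12.58) & (flat <= 12.59)

# Keep everything that is not a null value.
trimmed = flat[~is_null]
```

The mask is computed for the whole array at once, so nothing is removed from a sequence while it is being iterated over.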
I'm sorry, I wish I had more, but this is where things get dicey for me and I'm kinda ashamed to post what I've tried and failed at because most are dead ends. The only real solution I've thought of is binning the data... but that won't work for a scatter plot because the length of the two lists won't be the same and a histogram isn't what I want as my final product anyway. So is there another approach I can be using for this that I'm unaware of? (I feel like there's some chunk of coding knowledge that I just never learned- surely I'm not the first person to want to do this.) Thanks!
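On the mismatched lengths: `np.histogram` returns n counts and n+1 bin edges, so plotting the counts against the bin centers gives two arrays of equal length that a scatter plot will accept. A minimal sketch, assuming synthetic data in place of the trimmed values and an arbitrary choice of 100 bins:

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the trimmed data (assumption: the real values come from the array).
rng = np.random.default_rng(0)
trimmed = rng.normal(loc=50.0, scale=5.0, size=1024 * 1024)

# counts has 100 entries; edges has 101 (the boundaries of the bins).
counts, edges = np.histogram(trimmed, bins=100)

# Midpoint of each bin, so x and y have the same length.
centers = 0.5 * (edges[:-1] + edges[1:])

fig, ax = plt.subplots()
ax.scatter(centers, counts, s=10)
ax.set_xlabel("value")
ax.set_ylabel("count")
```

In a Jupyter notebook the figure should render at the end of the cell; in a plain script a `plt.show()` call would be needed.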
Edit: Sorry, all my code isn't showing up as code even though I put four spaces...
Use matplotlib.pyplot.hist(yourarray.ravel()) with the optional arguments bins and range to define the number of bins and the range of the data – dnalow Sep 14 '16 at 16:39
Use CTRL+K to convert it to code format – Nick Braunagel Sep 14 '16 at 17:47
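The `hist` suggestion from the first comment could look roughly like this. The synthetic array, the bin count of 50, and the range of 30 to 70 are all assumptions chosen for illustration; values outside `range` are ignored, which is one way to leave a known null value out of the counts.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the real 1024x1024 array (assumption).
rng = np.random.default_rng(0)
yourarray = rng.normal(loc=50.0, scale=5.0, size=(1024, 1024))

fig, ax = plt.subplots()

# hist returns the per-bin counts, the bin edges, and the drawn bar patches.
counts, edges, patches = ax.hist(yourarray.ravel(), bins=50, range=(30.0, 70.0))
ax.set_xlabel("value")
ax.set_ylabel("count")
```

The returned `counts` and `edges` can also be reused for a scatter plot instead of the bars.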