I have a large numpy.ndarray that I want to plot, where the x axis shows the values in the array and the y axis shows how often each value appears in the array. To be clear, I don't care about the order of the data in the array or whether that order gets scrambled; I just want to take the numbers, bin them, and then plot them.

The steps I have in mind so far, each in a separate cell of my Jupyter notebook:

  • Open/read my array (it's 1024x1024, so quite large)- step done

  • Convert array into list- done

  • Spit out null values in array... currently not working

  • Bin data to count values... really lost here

  • Scatter plot- trimmed vs count- this part will be fine once the previous two work, matplotlib and I get along

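    import numpy as np
    import matplotlib.pyplot as plt

    # scidata: np array of data that's 1024x1024 (already read in, see the first step)

    # flatten the array into one list, row by row
    lsci = []
    for r in range(1024):
        scilist = scidata[r, :].tolist()
        lsci.extend(scilist)

    # copy the list so that removing items does not disturb the loop below
    trimmed = list(lsci)
    for item in lsci:
        if 12.58 <= item <= 12.59:  # the null value I don't want is in this range
            trimmed.remove(item)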

I'm sorry, I wish I had more, but this is where things get dicey for me and I'm kinda ashamed to post what I've tried and failed at because most are dead ends. The only real solution I've thought of is binning the data... but that won't work for a scatter plot because the length of the two lists won't be the same and a histogram isn't what I want as my final product anyway. So is there another approach I can be using for this that I'm unaware of? (I feel like there's some chunk of coding knowledge that I just never learned- surely I'm not the first person to want to do this.) Thanks!

Edit: Sorry, all my code isn't showing up as code even though I put four spaces...

Are you actually asking how to create a histogram? If so, try matplotlib.pyplot.hist(yourarray.ravel()) and use the optional arguments bins and range to define the number of bins and the range of the data. – dnalow Sep 14 '16 at 16:39
    
FYI - you can highlight all of your code text and press CTRL+K to convert it to code format – Nick Braunagel Sep 14 '16 at 17:47

The same thing Nick Braunagel proposed, but without pandas:

import numpy as np
import matplotlib.pyplot as plt

a = np.random.randint(10, size=100)   # or use yourarray.ravel() here to make it flat

num, bins, _ = plt.hist(a)
plt.show()

or

num, bins = np.histogram(a)
plt.bar(bins[:-1], num, width=np.diff(bins), align='edge')   # bins holds the bin edges, so start each bar at its left edge
plt.show()
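
If the end product should be a scatter plot rather than bars, one possible approach is to filter the null range with a boolean mask and then plot the bin centers against the counts, which gives two arrays of equal length. The following is only a sketch under assumptions: the random `scidata` is a stand-in for the real array, the planted 12.585 entries merely imitate the null value from the question, and 100 bins is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in data (assumption): replace with the real 1024x1024 array.
rng = np.random.default_rng(0)
scidata = rng.normal(loc=10.0, scale=2.0, size=(1024, 1024))
scidata[::7, ::5] = 12.585  # plant some fake "null" entries for the demo

flat = scidata.ravel()  # 1-D view of the data, no Python loop or list needed

# Boolean mask: True where the value is NOT inside the null range.
keep = ~((flat >= 12.58) & (flat <= 12.59))
trimmed = flat[keep]

# Count the values per bin; the number of bins is an arbitrary choice.
counts, edges = np.histogram(trimmed, bins=100)
centers = 0.5 * (edges[:-1] + edges[1:])  # one x value per count

plt.scatter(centers, counts)
plt.xlabel("value")
plt.ylabel("count")
plt.show()
```

Because `np.histogram` returns one more edge than counts, averaging neighbouring edges is what makes the x and y arrays the same length.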

'Binning' is definitely a histogram feature but I get the impression you want a simple pivot table. How about:

  1. Remove undesired value
  2. Convert your numpy array to dataframe
  3. Create pivot table from dataframe
  4. Plot results

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

a = np.random.randint(10, size=100)   # array([3, 0, 3, 8, 1, 9, 1, 8,...])
exclude_value = 3  # change as required
a_new = [item for item in a if item != exclude_value] # new list without exclude value

df = pd.DataFrame(a_new).pivot_table(columns=0, aggfunc='size')

x = df.index.values
y = df.values
plt.bar(x,y)

plt.xticks(x)
plt.show()

OUTPUT:

(note how one value has been excluded, in this case 3)
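
If the value to exclude is a float that sits in a small range (like the 12.58 to 12.59 range in the question) rather than one exact integer, a range test may be more robust than `!=`. The counting step can also be done without pandas via `np.unique`. This is a rough sketch with a small made-up array, not the real data:

```python
import numpy as np

# Stand-in data (assumption): small float array with a non-NaN "null" marker.
a = np.array([1.5, 2.0, 12.585, 2.0, 3.25, 12.585, 1.5, 2.0])

# Keep only the entries outside the null range instead of testing one exact value.
a_new = a[(a < 12.58) | (a > 12.59)]

# Distinct values (sorted ascending) and how often each one occurs.
values, counts = np.unique(a_new, return_counts=True)
print(values)  # the distinct values: 1.5, 2.0, 3.25
print(counts)  # their counts: 2, 3, 1
```

`values` and `counts` have the same length, so they can take the place of `x` and `y` in the `plt.bar(x, y)` call above, or be passed to `plt.scatter`.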

I think this approach is what I want to do, but two things: first, the null value isn't actually a NaN in this array because of reasons I don't want to get into (but is set at 12.85, let's say, instead of a NaN). Hence my attempt to filter this number out, but that filtering step isn't working. Second, for the pandas step, I get an error "pivot_table() got an unexpected keyword argument 'columns'." It's really not clear to me from searching for this error what can be prompting that. – kb3hts Sep 15 '16 at 14:20
    
Please see the update. Let me know if you still see the pandas error. – Nick Braunagel Sep 15 '16 at 14:48
