
So I have a large NumPy array that takes the following form:

data = [[2456447.64798471, 4, 15.717, 0.007, 5, 17.308, 0.019, 6, 13.965, 0.006],
        [2456447.6482855, 4, 15.768, 0.018, 5, 17.347, 0.024, 6, 14.001, 0.023],
        [2456447.648575, 4, 15.824, 0.02, 5, 17.383, 0.024, 6, 14.055, 0.023]]

I want to create a sub array that looks like this:

[[4, 15.717, 5, 17.308, 6, 13.965], 
 [4, 15.768, 5, 17.347, 6, 14.001],
 [4, 15.824, 5, 17.383, 6, 14.055]]

Basically I want to drop the first column, and then, starting at the 4th column, drop every 3rd column, keeping everything else. I tried to figure out how to approach this with something like data[1:6:?], but I didn't understand how to step through the array and get only the columns that I wanted.

Also, I need this to scale to an array that extends horizontally, so I don't want to hard-code the column indices.


2 Answers

up vote 2 down vote accepted

This will do the trick. It loops over every row and skips each column whose index is a multiple of 3, so it works for any number of rows and columns.

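subArray = []
for row in data:
    newRow = []  # start a fresh output row for each input row
    for i in range(len(row)):  # range works where Python 2's xrange is unavailable
        if i % 3 == 0:
            continue  # skip columns 0, 3, 6, ...
        newRow.append(row[i])
    subArray.append(newRow)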
1  
If you need this to be scalable and not statically sized, let me know and I can change the code. –  Stephan Jun 28 '13 at 17:09
    
Yeah I was trying to make something scalable because my actual data array is much longer both horizontally and vertically. If you have a scalable solution that would be awesome! –  sTr8_Struggin Jun 28 '13 at 18:07
1  
@sTr8_Struggin DONE! –  Stephan Jun 28 '13 at 18:15
    
Thanks for the help so far. I ran your code and I got the following error: if (row.index(element) % 3 == 0): AttributeError: 'numpy.ndarray' object has no attribute 'index' –  sTr8_Struggin Jun 28 '13 at 18:24
    
@sTr8_Struggin That's because your data is actually a NumPy array, which has no index method the way a list does. Try my new code. –  Stephan Jun 28 '13 at 18:33
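
Since data is a NumPy array, the same "skip every column whose index is a multiple of 3" rule can probably be applied without a Python loop. A minimal sketch, assuming data is a 2-D array shaped like the one in the question; the name sub_array is just for illustration and is not from the answer above:

```python
import numpy as np

# Sample data copied from the question.
data = np.array([
    [2456447.64798471, 4, 15.717, 0.007, 5, 17.308, 0.019, 6, 13.965, 0.006],
    [2456447.6482855, 4, 15.768, 0.018, 5, 17.347, 0.024, 6, 14.001, 0.023],
    [2456447.648575, 4, 15.824, 0.02, 5, 17.383, 0.024, 6, 14.055, 0.023],
])

# np.s_[::3] builds the slice 0, 3, 6, ...; np.delete returns a copy of
# data with those columns (axis=1) removed. The original array is unchanged.
sub_array = np.delete(data, np.s_[::3], axis=1)

print(sub_array)
```

Because the slice has no fixed end, this should keep working if the array gains more columns or rows.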

You could do this:

>>> data[:, [1, 2, 4, 5, 7, 8]]
array([[  4.   ,  15.717,   5.   ,  17.308,   6.   ,  13.965],
       [  4.   ,  15.768,   5.   ,  17.347,   6.   ,  14.001],
       [  4.   ,  15.824,   5.   ,  17.383,   6.   ,  14.055]])
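
The index list above is hard-coded for ten columns. One way to make it scale with the width of the array is to compute the list from the column count. This is a sketch under the assumption that the pattern "drop every column whose index is a multiple of 3" holds for the wider array too; the names cols, keep, and sub_array are illustrative only:

```python
import numpy as np

# Sample data copied from the question.
data = np.array([
    [2456447.64798471, 4, 15.717, 0.007, 5, 17.308, 0.019, 6, 13.965, 0.006],
    [2456447.6482855, 4, 15.768, 0.018, 5, 17.347, 0.024, 6, 14.001, 0.023],
    [2456447.648575, 4, 15.824, 0.02, 5, 17.383, 0.024, 6, 14.055, 0.023],
])

# All column indices: 0, 1, ..., data.shape[1] - 1.
cols = np.arange(data.shape[1])

# Keep only the indices that are not multiples of 3.
keep = cols[cols % 3 != 0]

# Integer-array indexing on the column axis, same as the hard-coded list.
sub_array = data[:, keep]

print(keep)
print(sub_array)
```

For the ten-column sample, keep comes out as the same indices used in the hard-coded version, so the result should match the output shown above.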
    
what is going on here, slice notation with a list? and why is there a comma after the colon? –  Stephan Jun 28 '13 at 17:43
    
@Stephan, since data is a NumPy array, it can be accessed with integer indexing. The list can be another array too. The docs for it are here. The comma separates slicing along the axes. So data[1:2, :] would select the first row and all the columns, while data[:, 1:2] would select all the rows and the first column. –  Eric Workman Jun 28 '13 at 17:52
    
are you sure data[1:2] doesn't get the second item? I thought slice was 0 indexed so data[0:1] would get the first item –  Stephan Jun 28 '13 at 18:09
    
First as in directly after zeroth. I was following the notation in the original question. Sorry I wasn't clear. –  Eric Workman Jun 28 '13 at 18:51
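
To make the zero-based indexing discussed in these comments concrete, here is a small sketch on a made-up 3x4 array (the array a and the variable names are hypothetical, not from the question):

```python
import numpy as np

# A small illustrative array:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
a = np.arange(12).reshape(3, 4)

# The comma separates the row selection from the column selection.
row_at_index_1 = a[1:2, :]   # the row at index 1, all columns
col_at_index_1 = a[:, 1:2]   # all rows, the column at index 1
row_at_index_0 = a[0:1, :]   # the row at index 0 (the very first row)

print(row_at_index_1)
print(col_at_index_1)
print(row_at_index_0)
```

Slices are zero-indexed, so 1:2 picks the item at index 1 (the second one counting from one), and 0:1 picks the item at index 0. Using a slice rather than a bare integer keeps the result two-dimensional.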
