
Good morning experts,

I have a numpy array a that contains integers, and a list holding the unique values of that array, sorted in a special order. I want to build another array that contains, for each element of a, the index of its value in that list.

# a: numpy array with integer values
# size_x and size_y: dimensions of a
# index_list: the unique values of a, sorted in a special order
# b: new array holding the index values

for i in xrange(0,size_x):
     for j in xrange(0,size_y):                    
         b[i][j]=index_list.index(a[i][j])

This works, but it takes a long time. Is there a faster way to do it?

Many thanks for your help

German

Use numpy sorting, fancy indexing, or whatever you need (for example, np.unique with its optional return values). It should be much, much faster than the dictionary-based method as well. – seberg Sep 11 '12 at 10:16

2 Answers

up vote 2 down vote accepted

The slow part is the lookup

index_list.index(a[i][j])

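list.index scans the list from the start on every call, whereas a dictionary lookup takes constant time on average. It will be much quicker to use a Python dictionary for this task, i.e. rather than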

index_list = [ item_0, item_1, item_2, ...]

use

index_dict = { item_0:0,  item_1:1, item_2:2, ...}

Which can be created using:

index_dict = dict( (item, i) for i, item in enumerate(index_list) )
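
As a rough illustration of how the dictionary slots into the original loop, here is a minimal sketch. The sample values of a and index_list are made up for the example, and range is used instead of xrange so that it also runs under Python 3.

```python
import numpy as np

# Hypothetical sample data: index_list holds the unique values of a
# in some custom order.
a = np.array([[5, 3, 0],
              [1, 5, 3]])
index_list = [3, 0, 5, 1]

# Build the value -> position mapping once.
index_dict = dict((item, i) for i, item in enumerate(index_list))

# Same double loop as in the question, but each lookup is a dict access
# instead of a linear scan with index_list.index(...).
b = np.empty(a.shape, dtype=int)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        b[i, j] = index_dict[a[i, j]]

print(b)
```

The loop itself is still pure Python, so this only removes the cost of the repeated list scans.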
Hayden, many thanks for your help. It works very well compared to my code, with a big difference in run time. Best regards, German – gerocampo Sep 11 '12 at 9:53

I didn't try it, but as this is pure numpy, it should be much faster than a dictionary-based approach:

# note that the code will use the next higher value if a value is
# missing from index_list.
new_vals, old_index = np.unique(index_list, return_index=True)

# use searchsorted to find the index:
b_new_index = np.searchsorted(new_vals, a)

# And the original index:
b = old_index[b_new_index]

Alternatively, you could simply fill any holes in index_list.


Edit: I changed the code, since the original version was quite simply wrong (or at least very limited).
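
To make the np.unique / np.searchsorted steps concrete, here is a small worked sketch. The sample a and index_list are invented for illustration, and every value of a is assumed to be present in index_list. The result is cross-checked against the plain list.index approach from the question.

```python
import numpy as np

# Hypothetical sample data: non-consecutive values in a custom order.
a = np.array([[5, 3, 0],
              [1, 5, 3]])
index_list = [3, 0, 5, 1]

# Sorted unique values, plus the position of each in the original list.
new_vals, old_index = np.unique(index_list, return_index=True)

# Position of every element of a within the sorted values.
b_new_index = np.searchsorted(new_vals, a)

# Translate back to positions in the original (custom-ordered) list.
b = old_index[b_new_index]

# Reference result using the slow list.index lookup.
expected = [[index_list.index(v) for v in row] for row in a.tolist()]

print(b)
```

No Python-level loop over the elements of a is needed here, which is where the speed-up over the dictionary version would be expected to come from.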

I have a problem with this: it works for consecutive values in index_list (e.g. [0,1,2,3]), but when I have non-consecutive values (e.g. [0,1,3,5]) it doesn't work. Please have a look: import numpy as np a = np.random.random((11, 13))*100 a=a.astype(int) list_colors=np.unique(a) print "list colors",list_colors new_vals = np.argsort(list_colors) print "new vals",new_vals b = new_vals[a] – gerocampo Sep 13 '12 at 12:14
Have reworked to use searchsorted... the old code would have needed to fill the holes, yeah... – seberg Sep 13 '12 at 12:46
