I have a table of values stored in a list of lists like:

A = [   [a[1],b[1],c[1]],
        [a[2],b[2],c[2]],
        ...

        [a[m],b[m],c[m]]]

with
a[i] < b[i]
b[i] < a[i+1]
0 < c[i] < 1 

and a numpy array such as:

 X = [x[1], x[2], ..., x[n]]

I need to create an array

 Y = [y[1], y[2], ..., y[n]]

where each value of Y will correspond to

for i in [1,2, ..., n]:
  for k in [1,2, ..., m]:
     if a[k] <  x[i] < b[k]:
         y[i] = c[k]
     else:
         y[i] = 1 

Please note that X and Y have the same length (n), but A has a different length (m). Each y[i] takes the value c[k] from the third column of A for the k (if any) that satisfies a[k] < x[i] < b[k]; if no such k exists, y[i] = 1 (for k = 1,2,... m and for i = 1,2,... n).
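
As a plain-Python reference for the rule described above, here is a minimal sketch. The function name lookup_reference is hypothetical, and the early break assumes the intervals do not overlap (which follows from a[i] < b[i] < a[i+1]). It is slow for large n and m, but useful for checking faster versions against the question's sample data.

```python
def lookup_reference(A, X):
    """Return Y with Y[i] = c[k] if a[k] < X[i] < b[k] for some row k of A, else 1."""
    Y = []
    for x in X:
        y = 1.0  # default when x lies strictly inside no interval
        for a_k, b_k, c_k in A:
            if a_k < x < b_k:
                y = c_k
                break  # assumption: intervals do not overlap, so stop at the first hit
        Y.append(y)
    return Y


# Sample data from the question
a = [10, 20, 30, 40, 50, 60, 70, 80, 90]
b = [11, 21, 31, 41, 51, 61, 71, 81, 91]
c = [0.917, 0.572, 0.993, 0.131, 0.44, 0.252, 0.005, 0.375, 0.341]
A = [list(row) for row in zip(a, b, c)]
X = [1, 4, 10.2, 20.5, 25, 32, 41.3, 50.5, 73]

print(lookup_reference(A, X))
```

Setting the default first and overwriting it only on a match avoids the problem of an else branch resetting y[i] on every non-matching k.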

In the actual cases I am working on, n = 6789 and m = 6172.

I could do the check using nested "for" loops, but that is really slow. What is the fastest way to accomplish this? And what if X and Y were 2D numpy arrays?

SAMPLE DATA:

a = [10, 20, 30, 40, 50, 60, 70, 80, 90]
b = [11, 21, 31, 41, 51, 61, 71, 81, 91]
c = [ 0.917,  0.572,  0.993 ,  0.131,  0.44, 0.252 ,  0.005,  0.375,  0.341]

A = [[d,e,f] for d,e,f in zip(a,b,c)]

X = [1, 4, 10.2, 20.5, 25, 32, 41.3, 50.5, 73]

EXPECTED RESULTS:

Y = [1, 1, 0.917, 0.572, 1, 1, 1, 0.44, 1]
    
Why would you do zip([1,2, ..., n],[1,2, ..., m])? It seems likely that that doesn't do what you think it does. – user2357112 May 28 '15 at 18:42
    
@user2357112: you are indeed correct, I have updated the question. thanks. – jorgehumberto May 28 '15 at 18:47
    
The new version still looks wrong. Each y[i] value gets overwritten over and over. – user2357112 May 28 '15 at 18:47
@jorgehumberto Would the posted solution work for you? – Divakar Jun 3 '15 at 18:54
@Divakar: Perfect, thanks! took 3.5 s to create the array (when expanded X and Y to a 2D array), instead of the few minutes it would take when I iterated over all elements. – jorgehumberto Jun 8 '15 at 19:08
Accepted answer:

Approach #1: Using brute-force comparison with broadcasting -

import numpy as np

# Convert to numpy arrays
A_arr = np.array(A)
X_arr = np.array(X)

# Mask that represents "if a[k] <  x[i] < b[k]:" for all i,k
mask = (A_arr[:,None,0]<X_arr) & (X_arr<A_arr[:,None,1])

# Get row (A) and column (X) indices where the mask is True,
# i.e. where the conditionals were satisfied
R,C = np.where(mask)

# Setup float output array and set values in it from third column of A 
# for the rows of A whose interval contains each matched X element
Y = np.ones(X_arr.shape)
Y[C] = A_arr[R,2]
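
To make the shapes in the broadcasting idea explicit, here is a self-contained sketch. The names lookup_broadcast, lo, hi, rows and cols are hypothetical and chosen only for illustration. The mask has shape (m, n), so its first axis indexes rows of A and its second axis indexes elements of X. For m = 6172 and n = 6789 that is roughly 42 million booleans, which is the memory cost of the brute-force comparison.

```python
import numpy as np


def lookup_broadcast(A, X):
    """Brute-force interval lookup via broadcasting (sketch for 1D X)."""
    A_arr = np.asarray(A, dtype=float)   # shape (m, 3): columns a, b, c
    X_arr = np.asarray(X, dtype=float)   # shape (n,)

    lo = A_arr[:, 0, None]               # shape (m, 1), lower bounds a[k]
    hi = A_arr[:, 1, None]               # shape (m, 1), upper bounds b[k]

    # mask[k, i] is True where a[k] < x[i] < b[k]; shape (m, n)
    mask = (lo < X_arr) & (X_arr < hi)

    # rows index into A, cols index into X
    rows, cols = np.nonzero(mask)

    # Float output so the fractional c values are not truncated
    Y = np.ones(X_arr.shape)
    Y[cols] = A_arr[rows, 2]
    return Y
```

Forcing a float dtype matters when X holds only integers: an integer output array would silently truncate values such as 0.917 to 0.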

Approach #2: Based on binning with np.searchsorted -

import numpy as np

# Convert A to 2D numpy array
A_arr = np.asarray(A)

# Setup intervals for binning later on 
intv = A_arr[:,:2].ravel()

# Perform binning & get interval & grouped indices for each X 
intv_idx = np.searchsorted(intv, X, side='right')
grp_intv_idx = intv_idx // 2

# Get mask of valid indices, i.e. X elements are within grouped intervals
mask = (intv_idx % 2) == 1

# Setup output array 
Y = np.ones(len(X))

# Extract col-3 elements with grouped indices and valid ones from mask
Y[mask] = A_arr[:,2][grp_intv_idx[mask]]

# Remove (set to 1's) elements that fall exactly on bin boundaries
# (np.isin supersedes np.in1d, which newer NumPy releases deprecate)
Y[np.isin(X,intv)] = 1

Please note that if you need the output as a list, you can convert the numpy array to a list with a call like this - Y.tolist().
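
For the 2D case raised in the question, the binning idea appears to carry over unchanged, since np.searchsorted accepts an array of any shape for the values to insert and returns indices of the same shape. The sketch below is one possible way to write it, not necessarily how the original code was extended. The names lookup_searchsorted, edges and inside are hypothetical, and it assumes the rows of A are sorted so that a[1] < b[1] < a[2] < b[2] < ...

```python
import numpy as np


def lookup_searchsorted(A, X):
    """Interval lookup via np.searchsorted; X may be 1D or 2D (sketch)."""
    A_arr = np.asarray(A, dtype=float)
    X_arr = np.asarray(X, dtype=float)

    # Flattened edges a1, b1, a2, b2, ... (assumed strictly increasing)
    edges = A_arr[:, :2].ravel()

    # Insertion index for every element of X; same shape as X_arr
    idx = np.searchsorted(edges, X_arr, side='right')

    # An odd index means a[k] <= x < b[k]; drop values sitting exactly on an edge
    inside = (idx % 2 == 1) & ~np.isin(X_arr, edges)

    Y = np.ones(X_arr.shape)
    Y[inside] = A_arr[idx[inside] // 2, 2]
    return Y


# Usage with the question's sample table and a small 2D X
a = [10, 20, 30, 40, 50, 60, 70, 80, 90]
b = [11, 21, 31, 41, 51, 61, 71, 81, 91]
c = [0.917, 0.572, 0.993, 0.131, 0.44, 0.252, 0.005, 0.375, 0.341]
A_sample = [list(row) for row in zip(a, b, c)]
X2 = [[1, 10.2, 20.5],
      [50.5, 30, 73]]
Y2 = lookup_searchsorted(A_sample, X2)
```

Unlike the broadcasting version, this never builds an m-by-n mask, so memory use grows with the size of X and A separately rather than with their product.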


Sample run -

In [480]: A
Out[480]: 
[[139.0, 355.0, 0.5047342078960846],
 [419.0, 476.0, 0.3593886192040009],
 [580.0, 733.0, 0.3137694021600973]]

In [481]: X
Out[481]: [555, 689, 387, 617, 151, 149, 452]

In [482]: Y
Out[482]: 
array([ 1.        ,  0.3137694 ,  1.        ,  0.3137694 ,  0.50473421,
        0.50473421,  0.35938862])
    
Perfect, thanks! – jorgehumberto Jun 4 '15 at 21:21

With 1-d arrays, it's not too bad:

a,b,c = np.array(A).T
mask = (a<x) & (x<b)
y = np.ones_like(x)
y[mask] = c[mask]

If x and y are higher-dimensional, then your A matrix will also need to be bigger. The basic concept works the same, though.

    
I am sorry, you made me realize that explanation was incorrect. A has length "m", but X and Y have length "n". Y can take any value in the third "column" of A. I have updated the question to make it clearer. – jorgehumberto May 28 '15 at 18:37
