Build array from other array and table of values (Python)

Question

I have a table of values stored in a list of lists, like:

A = [   [a[1],b[1],c[1]],
        [a[2],b[2],c[2]],
        ...

        [a[m],b[m],c[m]]]

with
a[i] < b[i]
b[i] < a[i+1]
0 < c[i] < 1

and a numpy array such as:

 X = [x[1], x[2], ..., x[n]]

I need to create an array

 Y = [y[1], y[2], ..., y[n]]

where each value of Y will correspond to

for i in [1,2, ..., n]:
  y[i] = 1
  for k in [1,2, ..., m]:
     if a[k] <  x[i] < b[k]:
         y[i] = c[k]

Please note that X and Y have the same length n, while A has a different length m. Each y[i] can take any value from the third column of A (c[k] for k = 1, 2, ..., m), provided that a[k] < x[i] < b[k] holds for that k; otherwise y[i] = 1.

In the actual cases I am working on, n = 6789 and m = 6172.

I could do the check with nested "for" loops, but that is really slow. What is the fastest way to accomplish this? What if X and Y were 2D numpy arrays?

SAMPLE DATA:

a = [10, 20, 30, 40, 50, 60, 70, 80, 90]
b = [11, 21, 31, 41, 51, 61, 71, 81, 91]
c = [ 0.917,  0.572,  0.993 ,  0.131,  0.44, 0.252 ,  0.005,  0.375,  0.341]

A = [[d,e,f] for d,e,f in zip(a,b,c)]

X = [1, 4, 10.2, 20.5, 25, 32, 41.3, 50.5, 73]

EXPECTED RESULTS:

Y = [1, 1, 0.917, 0.572, 1, 1, 1, 0.44, 1]

Why would you do zip([1,2, ..., n],[1,2, ..., m])? It seems likely that that doesn't do what you think it does. — user2357112, May 28 '15 at 18:42
@user2357112: you are indeed correct, I have updated the question. thanks. — jorgehumberto, May 28 '15 at 18:47
The new version still looks wrong. Each y[i] value gets overwritten over and over. — user2357112, May 28 '15 at 18:47
@Divakar: Perfect, thanks! took 3.5 s to create the array (when expanded X and Y to a 2D array), instead of the few minutes it would take when I iterated over all elements. — jorgehumberto, Jun 8 '15 at 19:08

Divakar · Accepted Answer · 2015-06-02 20:06:44Z

Approach #1: Using brute-force comparison with broadcasting -

import numpy as np

# Convert to numpy arrays
A_arr = np.array(A)
X_arr = np.array(X)

# Mask that represents "if a[k] <  x[i] < b[k]:" for all i,k
mask = (A_arr[:,None,0]<X_arr) & (X_arr<A_arr[:,None,1])

# Get indices where the mask is True, i.e. the conditionals were satisfied:
# R indexes the rows of A (k), C indexes the elements of X (i)
R,C = np.where(mask)

# Setup a float output array of ones and set values in it from the third
# column of A for the indices where the conditionals were satisfied
Y = np.ones(X_arr.shape)
Y[C] = A_arr[R,2]
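
Here is a self-contained sketch of the broadcasting idea, run on the sample data from the question. The names `rows` and `cols` are only illustrative: `rows` indexes the intervals in A and `cols` indexes the elements of X, so the third column of A has to be indexed with `rows`. Keep in mind that the mask holds m*n booleans (about 42 million for n = 6789 and m = 6172), so memory use grows with both sizes.

```python
import numpy as np

# Sample data from the question
a = [10, 20, 30, 40, 50, 60, 70, 80, 90]
b = [11, 21, 31, 41, 51, 61, 71, 81, 91]
c = [0.917, 0.572, 0.993, 0.131, 0.44, 0.252, 0.005, 0.375, 0.341]
X = [1, 4, 10.2, 20.5, 25, 32, 41.3, 50.5, 73]

A_arr = np.array(list(zip(a, b, c)))   # shape (m, 3)
X_arr = np.asarray(X, dtype=float)     # shape (n,)

# mask[k, i] is True when a[k] < x[i] < b[k]; shape (m, n)
mask = (A_arr[:, None, 0] < X_arr) & (X_arr < A_arr[:, None, 1])

# rows index the intervals in A, cols index the elements of X
rows, cols = np.where(mask)

# Float output with the default value 1, then fill in the matches
Y = np.ones(X_arr.shape)
Y[cols] = A_arr[rows, 2]
print(Y.tolist())
```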

Approach #2: Based on binning with np.searchsorted -

import numpy as np

# Convert A to 2D numpy array
A_arr = np.asarray(A)

# Setup intervals for binning later on 
intv = A_arr[:,:2].ravel()

# Perform binning & get interval & grouped indices for each X 
intv_idx = np.searchsorted(intv, X, side='right')
grp_intv_idx = np.floor(intv_idx/2).astype(int)

# Get mask of valid indices, i.e. X elements are within grouped intervals
mask = np.fmod(intv_idx,2)==1

# Setup output array 
Y = np.ones(len(X))

# Extract col-3 elements with grouped indices and valid ones from mask
Y[mask] = A_arr[:,2][grp_intv_idx[mask]]

# Remove (set to 1's) elements that fall exactly on bin boundaries
Y[np.in1d(X,intv)] = 1
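
To see why the odd/even test works and why the boundary step is needed, here is a small sketch with two made-up intervals, (10, 11) and (20, 21). With side='right', an odd insertion index means a[k] <= x < b[k], so a value sitting exactly on a[k] still has to be filtered out. I use np.isin here in place of np.in1d; for 1-D input the two should give the same result.

```python
import numpy as np

# Two hypothetical intervals, (10, 11) and (20, 21), flattened to [a1, b1, a2, b2]
intv = np.array([10, 11, 20, 21])
x = np.array([9, 10, 10.5, 11, 15, 20.5])

idx = np.searchsorted(intv, x, side='right')
inside_or_left_edge = idx % 2 == 1      # odd index: a[k] <= x < b[k]
on_boundary = np.isin(x, intv)          # x equals some a[k] or b[k]
strictly_inside = inside_or_left_edge & ~on_boundary

print(idx.tolist())                  # [0, 1, 1, 2, 2, 3]
print(inside_or_left_edge.tolist())  # [False, True, True, False, False, True]
print(strictly_inside.tolist())      # [False, False, True, False, False, True]
```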

Please note that if you need the output as a list, you can convert the numpy array to a list with a call like this - Y.tolist().

Sample run -

In [480]: A
Out[480]: 
[[139.0, 355.0, 0.5047342078960846],
 [419.0, 476.0, 0.3593886192040009],
 [580.0, 733.0, 0.3137694021600973]]

In [481]: X
Out[481]: [555, 689, 387, 617, 151, 149, 452]

In [482]: Y
Out[482]: 
array([ 1.        ,  0.3137694 ,  1.        ,  0.3137694 ,  0.50473421,
        0.50473421,  0.35938862])
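
For the 2D case asked about in the question, the np.searchsorted approach seems to carry over without changes, since np.searchsorted returns indices with the same shape as the values passed in. Below is a minimal sketch under that assumption. `lookup_intervals` and its `default` parameter are hypothetical names, the interval endpoints are borrowed from the sample run, and the third-column values are made-up round numbers. It also assumes the rows of A are sorted and non-overlapping, as stated in the question.

```python
import numpy as np

def lookup_intervals(A, X, default=1.0):
    """Hypothetical helper: y = c[k] where a[k] < x < b[k], else default.

    Assumes the rows of A are sorted and non-overlapping
    (a[k] < b[k] < a[k+1]). X may have any number of dimensions.
    """
    A_arr = np.asarray(A, dtype=float)
    X_arr = np.asarray(X, dtype=float)

    intv = A_arr[:, :2].ravel()                        # [a1, b1, a2, b2, ...]
    idx = np.searchsorted(intv, X_arr, side='right')   # same shape as X_arr
    inside = (idx % 2 == 1) & ~np.isin(X_arr, intv)    # strict inequalities only

    Y = np.full(X_arr.shape, default, dtype=float)
    Y[inside] = A_arr[idx[inside] // 2, 2]
    return Y

# Illustrative table: endpoints from the sample run, made-up c values
A = [[139.0, 355.0, 0.5], [419.0, 476.0, 0.25], [580.0, 733.0, 0.75]]
X2d = np.array([[555, 689, 387],
                [617, 151, 419]])   # 419 sits exactly on a boundary

Y2d = lookup_intervals(A, X2d)
print(Y2d.tolist())   # [[1.0, 0.75, 1.0], [0.75, 0.5, 1.0]]
```

The work is dominated by the binary search, so it should scale roughly as n*log(m) instead of building an m-by-n mask.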

perimosocordiae · Answer 2 · 2015-05-28 17:22:28Z

With 1-d arrays, it's not too bad:

a,b,c = np.array(A).T
mask = (a<x) & (x<b)
y = np.ones_like(x)
y[mask] = c[mask]

If x and y are higher-dimensional, then your A matrix will also need to be bigger. The basic concept works the same, though.

I am sorry, you made me realize that explanation was incorrect. A has length "m", but X and Y have length "n". Y can take any value in the third "column" of A. I have updated the question to make it clearer. – jorgehumberto May 28 '15 at 18:37
