Get NumPy Array Indices in Array B for Unique Values in Array A, for Values Present in Both Arrays, Aligned with Array A

Question

I have two NumPy arrays:

A = asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = asarray(['2', '4', '8', '16', '32'])

I want a function that takes A, B as parameters, and returns the index in B for each value in A, aligned with A, as efficiently as possible.

These are the outputs for the test case above:

indices = [1, 1, 0, 2, 2, 2, 2, 2, 3, 4, 3, 3, 4]

I've tried exploring in1d(), where(), and nonzero() with no luck. Any help is much appreciated.

Edit: Arrays are strings.

rtrwalker · Accepted Answer · 2013-07-12 00:58:10Z

I'm not sure how efficient this is but it works:

import numpy as np
A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = np.asarray(['2', '4', '8', '16', '32'])
idx_of_a_in_b = np.argmax(A[np.newaxis, :] == B[:, np.newaxis], axis=0)
print(idx_of_a_in_b)

from which I get:

[1 1 0 2 2 2 2 2 3 4 3 3 4]
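One caveat worth sketching for the broadcast comparison: np.argmax returns 0 for a column with no True entry, so a value of A that is missing from B silently maps to index 0. The snippet below is only an illustration under assumptions: the value '64' and the -1 sentinel are made up for the example and are not part of the original question. Note also that the intermediate boolean matrix has len(B) * len(A) entries.

```python
import numpy as np

# '64' is a hypothetical value that does not occur in B
A = np.asarray(['4', '2', '64'])
B = np.asarray(['2', '4', '8', '16', '32'])

# Boolean matrix of shape (len(B), len(A)): matches[i, j] is True where B[i] == A[j]
matches = A[np.newaxis, :] == B[:, np.newaxis]

# argmax picks the first True per column, but yields 0 when a column is all False
idx = np.argmax(matches, axis=0)

# Flag columns without any match and mark them with -1 (an arbitrary sentinel)
present = matches.any(axis=0)
idx_checked = np.where(present, idx, -1)

print(idx)          # [1 0 0]
print(idx_checked)  # [ 1  0 -1]
```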

This seems to be the one! Thanks! – Will Jul 16 '13 at 20:21

Note: this solution is quadratic in terms of the input size, which is not ideal. – Eelco Hoogendoorn Apr 2 at 15:44

Ophion · Answer 2 · 2013-07-10 22:17:44Z

You can also do:

>>> np.digitize(A,B)-1
array([1, 1, 0, 2, 2, 2, 2, 2, 3, 4, 3, 3, 4])

According to the docs, you should be able to pass right=True and skip the minus-one step. This does not work for me, likely due to a version issue, as I do not have NumPy 1.7.

I'm not sure what you are doing with this, but a simple and very fast way to do it is:
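For reference, a small sketch of the digitize route on numeric data, assuming NumPy 1.7 or later (where the right keyword exists) and a sorted B. With right=True, a value equal to a bin edge maps to that edge's own index, so no subtraction is needed. The names A_num and B_num are introduced here only for illustration; as noted in the comments, digitize does not accept the string arrays from the question.

```python
import numpy as np

# Integer versions of the arrays from the question (digitize needs numeric input)
A_num = np.asarray([4, 4, 2, 8, 8, 8, 8, 8, 16, 32, 16, 16, 32])
B_num = np.asarray([2, 4, 8, 16, 32])  # must be sorted (monotonic)

# Default (right=False): bins[i-1] <= x < bins[i], so exact matches land one too high
idx_minus = np.digitize(A_num, B_num) - 1

# right=True: bins[i-1] < x <= bins[i], so exact matches land on their own index
idx_right = np.digitize(A_num, B_num, right=True)

print(idx_right)  # [1 1 0 2 2 2 2 2 3 4 3 3 4]
```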

>>> A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
>>> B,indices=np.unique(A,return_inverse=True)
>>> B
array(['16', '2', '32', '4', '8'],
      dtype='|S2')
>>> indices
array([3, 3, 1, 4, 4, 4, 4, 4, 0, 2, 0, 0, 2])

>>> B[indices]
array(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'],
      dtype='|S2')

The order will be different, but this can be changed if needed.
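One possible way to change the order, sketched under the assumption (confirmed in the comments) that B holds exactly the unique values of A: np.unique returns its values sorted, so sorting B with argsort gives the permutation that maps positions in the sorted array back to positions in B. The names uniq, inverse, and order are just for this example.

```python
import numpy as np

A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = np.asarray(['2', '4', '8', '16', '32'])

# uniq is sorted (lexicographically for strings); uniq[inverse] reconstructs A
uniq, inverse = np.unique(A, return_inverse=True)

# order[k] is the position in B of the k-th smallest value, so B[order] is sorted
order = np.argsort(B)

# This only works if B contains exactly the unique values of A
assert np.array_equal(B[order], uniq)

# Translate indices into the sorted array to indices into B
idx_of_a_in_b = order[inverse]
print(idx_of_a_in_b)  # [1 1 0 2 2 2 2 2 3 4 3 3 4]
```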

edited Jul 10 '13 at 22:17

answered Jul 10 '13 at 16:34

You are implicitly relying on B being sorted. – Jaime Jul 10 '13 at 17:04

But other than that, which is easily solved, e.g. as in my answer, this is faster than np.searchsorted, so +1. – Jaime Jul 10 '13 at 17:08

Let me further complicate matters by saying A and B are arrays of strings :( Apparently digitize() doesn't like that. – Will Jul 10 '13 at 21:03

Is B always the unique array of A? – Ophion Jul 10 '13 at 22:10

Actually, yes. B is always the unique of A. – Will Jul 10 '13 at 22:16

ovgolovin · Answer 3 · 2013-07-10 09:56:45Z

For such things it is important to make lookups in B as fast as possible. A dictionary provides O(1) average lookup time. So, first of all, let us construct this dictionary:

>>> indices = dict((value,index) for index,value in enumerate(B))
>>> indices
{8: 2, 16: 3, 2: 0, 4: 1, 32: 4}

And then just go through A and find corresponding indices:

>>> [indices[item] for item in A]
[1, 1, 0, 2, 2, 2, 2, 2, 3, 4, 3, 3, 4]
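The same idea can be wrapped in a function that works on the string arrays from the question and returns a NumPy array. This is a minimal sketch, not a tuned implementation: the function name indices_via_dict is made up for the example, and a value of A that is missing from B will raise a KeyError.

```python
import numpy as np

def indices_via_dict(A, B):
    """Return, for each item of A, its index in B, using a dict lookup."""
    # tolist() converts NumPy scalars to plain Python objects before hashing
    lookup = {value: index for index, value in enumerate(B.tolist())}
    # Build the result directly as an integer array of known length
    return np.fromiter((lookup[item] for item in A.tolist()),
                       dtype=np.intp, count=len(A))

A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = np.asarray(['2', '4', '8', '16', '32'])
result = indices_via_dict(A, B)
print(result)  # [1 1 0 2 2 2 2 2 3 4 3 3 4]
```

Building the dict is O(len(B)) and the lookups are O(len(A)) on average, at the cost of looping in Python rather than in compiled NumPy code.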

Thanks, this is great. But, is there any way to do it in NumPy-C-happy-land? {dict: comprehension} seems a bit faster as well if we went with this route. Is there no nice NumPy way to do it without having to pass a dict around? – Will Jul 10 '13 at 10:13

@Will If B is large, it's important to have O(1) lookup complexity. I'm not familiar with NumPy, but a cursory search didn't yield any references to dict analogs in NumPy. If B is small, it may be faster to do everything inside NumPy. If so, wait for other answers; maybe someone will be able to come up with an all-in-NumPy solution. – ovgolovin Jul 10 '13 at 10:20

Jaime · Answer 4 · 2013-07-10 16:58:45Z

I think you can do it with np.searchsorted:

>>> import numpy as np
>>> A = np.asarray([4, 4, 2, 8, 8, 8, 8, 8, 16, 32, 16, 16, 32])
>>> B = np.asarray([2, 8, 4, 32, 16])
>>> sort_b = np.argsort(B)
>>> idx_of_a_in_sorted_b = np.searchsorted(B, A, sorter=sort_b)
>>> idx_of_a_in_b = np.take(sort_b, idx_of_a_in_sorted_b)
>>> idx_of_a_in_b
array([2, 2, 0, 1, 1, 1, 1, 1, 4, 3, 4, 4, 3], dtype=int64)

Note that B is scrambled from your version, thus the different output. If some of the items in A are not in B (which you could check with np.all(np.in1d(A, B))) then the return indices for those values will be crap, and you may even get an IndexError from the last line (if the largest value in A is missing from B).
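A hedged sketch of how the missing-value case could be guarded against, using the string arrays from the question: clip the searchsorted positions so the final indexing cannot go out of range, then compare the looked-up values against A to detect misses. The function name indices_in, the -1 sentinel, and the extra test values '64' and '9' are assumptions made for this example only.

```python
import numpy as np

def indices_in(B, A):
    """Index in B of each item of A, or -1 where the item is not in B."""
    sorter = np.argsort(B)
    # Insertion positions in the sorted view of B; may equal len(B) for large values
    pos = np.searchsorted(B, A, sorter=sorter)
    # Keep positions in range so the fancy indexing below cannot raise IndexError
    pos = np.clip(pos, 0, len(B) - 1)
    idx = sorter[pos]
    # A position is only a real match if the value found there equals the query
    return np.where(B[idx] == A, idx, -1)

B = np.asarray(['2', '8', '4', '32', '16'])           # scrambled, as in the answer
A = np.asarray(['4', '4', '2', '8', '16', '32', '64', '9'])  # '64' and '9' are not in B
result = indices_in(B, A)
print(result)  # [ 2  2  0  1  4  3 -1 -1]
```

On recent NumPy versions np.isin is the preferred spelling of the np.in1d membership check mentioned above.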

Eelco Hoogendoorn · Answer 5 · 2016-04-02 20:48:46Z

The numpy_indexed package (disclaimer: I am its author) implements a solution along the same lines as Jaime's, but with a nice interface, tests, and a lot of related useful functionality:

import numpy_indexed as npi
print(npi.indices(B, A))

edited Apr 2 at 20:48

answered Apr 2 at 15:26

You keep posting almost identical answers pointing at your utility, not being clear about your affiliation to the linked repo. To keep them from getting flagged as spam, you should take the steps described in: How can I link to an external resource in a community-friendly way? – Mogsdad Apr 2 at 15:36

Thanks for the heads-up, but are you sure these linked conditions apply? This isn't a 'product or website' I am linking, but rather an open-source project. Mentioning my authorship under those circumstances feels more like self-promotion than useful information. – Eelco Hoogendoorn Apr 2 at 15:46

Based on similar feedback I have decided to add a disclaimer; thanks again. – Eelco Hoogendoorn Apr 2 at 20:48
