How can I refactor Python NumPy code? [closed]

Question

Closed. This question needs details or clarity. It is not currently accepting answers.

Closed 3 years ago.

def arr_func(arr,selected_pixels_list): 

        rows = 2 
        m = 0 
        n = 0 
        i =0

        #Calculate the number of pixels selected 
        length_of_the_list = len(selected_pixels_list) 
        length_of_the_list = int(length_of_the_list/4)*4 
        cols = int(length_of_the_list/2) 
        result_arr = np.zeros((rows,cols)) 

        while(i<length_of_the_list): 
            result_arr[m,n] = arr[selected_pixels_list[i]] 
            result_arr[m,n+1] = arr[selected_pixels_list[i+1]] 
            result_arr[m+1,n] = arr[selected_pixels_list[i+2]] 
            result_arr[m+1,n+1] = arr[selected_pixels_list[i+3]] 

            i = i+4 
            m = 0 
            n = n+2 

        return result_arr 

import numpy as np

selected_pixel_data = np.load("coordinates.npy")
arr_data = np.load("arr.npy") 
response = arr_func(arr_data, selected_pixel_data)
print(response)

I tried using a for loop, but that does not really refactor the code.

for i in range(0,len(selected_pixels_list),4):
    n=i//2
    result_arr[m,n] = arr[selected_pixels_list[i]] 
    result_arr[m,n+1] = arr[selected_pixels_list[i+1]] 
    result_arr[m+1,n] = arr[selected_pixels_list[i+2]] 
    result_arr[m+1,n+1] = arr[selected_pixels_list[i+3]]

For selected_pixel_data:
shape = (597616, 2)
dtype = int32

For arr_data:
shape = (1064, 590)
dtype = float64

Here arr_data is an array of data and selected_pixel_data holds coordinates into it. The function arr_func creates a new array from the values at the selected coordinates. Is there a way to make this code more efficient?

What does the code do? The title should state the purpose of the application. Thanks. – ggorlen, Commented Apr 19, 2022 at 19:36
Arbitrarily assigning the height as half of the length is a very bad idea. One argument is a numpy.array but the other is a nested list, which is bad practice; both should be arrays, and then you can get the resolution of the second argument by simply reading its shape attribute... – Ξένη Γήινος, Commented Apr 20, 2022 at 12:51

Reinderien · Accepted Answer · 2022-04-19 22:32:15Z

Add PEP 484 type hints.

int(length/4)*4 can just use floor division, as in length//4*4.

Use np.empty instead of np.zeros, since every element of result_arr is overwritten anyway.

Don't surround (i<length_of_the_list) in parens.
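Applied together, and keeping your while loop for now, those four points might look something like the sketch below. The name arr_func_tidied is made up for illustration, and the sketch assumes selected_pixels is an (N, 2) integer array of (row, column) pairs, so each pair is converted to a tuple before indexing.

```python
import numpy as np


def arr_func_tidied(arr: np.ndarray, selected_pixels: np.ndarray) -> np.ndarray:
    # Hypothetical name; same loop structure as the original function.
    # Round the count down to a multiple of 4 with floor division.
    n_selected = len(selected_pixels) // 4 * 4
    # Every element is assigned below, so zero-filling is unnecessary.
    result_arr = np.empty((2, n_selected // 2))

    i = 0
    n = 0
    while i < n_selected:
        # Assumption: each entry is a (row, column) pair, so tuple(...)
        # turns it into a two-dimensional index into arr.
        result_arr[0, n] = arr[tuple(selected_pixels[i])]
        result_arr[0, n + 1] = arr[tuple(selected_pixels[i + 1])]
        result_arr[1, n] = arr[tuple(selected_pixels[i + 2])]
        result_arr[1, n + 1] = arr[tuple(selected_pixels[i + 3])]
        i += 4
        n += 2

    return result_arr
```

This is still an element-by-element loop, so it tidies the code without making it meaningfully faster.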

The first layer of simplification is identifying the repeating patterns in your indices and rewriting your loop such that it does index math and only one inner assignment:

def arr_func_sequential(arr: np.ndarray, selected_pixels: np.ndarray) -> np.ndarray:
    n_selected = len(selected_pixels) // 4 * 4
    result_arr = np.empty((2, n_selected // 2))

    for i in range(0, n_selected, 4):
        n = i // 2
        for d in range(4):
            j, k = selected_pixels[i + d]
            result_arr[d // 2, n + d % 2] = arr[j, k]

    return result_arr

But this is not nearly enough. You should do vectorised indexing. For your stated shapes, this should be equivalent:

def arr_func_vectorised(arr: np.ndarray, selected_pixels: np.ndarray) -> np.ndarray:
    flat = arr[selected_pixels[:, 0], selected_pixels[:, 1]]
    return flat.reshape((-1, 2, 2)).swapaxes(0, 1).reshape((2, -1))
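If the reshape/swapaxes chain looks opaque, it may help to trace it on a tiny input. The sketch below uses np.arange(12) as a stand-in for the looked-up values (three groups of four); the intermediate names blocks and by_row are only for illustration.

```python
import numpy as np

# Stand-in for the values fetched by fancy indexing: three groups of four.
flat = np.arange(12)

# Axes are (group, row, col): each group of four becomes a 2x2 block.
blocks = flat.reshape((-1, 2, 2))

# Axes become (row, group, col): rows of all blocks are gathered together.
by_row = blocks.swapaxes(0, 1)

# Merge (group, col) into one axis, placing the 2x2 blocks side by side.
result = by_row.reshape((2, -1))

print(result)
```

The first row of the result holds 0, 1, 4, 5, 8, 9 and the second holds 2, 3, 6, 7, 10, 11, which is the same placement the loop produces: each group of four fills a 2x2 block, and the blocks are laid out left to right.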

Including basic regression tests, this looks like:

import numpy as np
from numpy.random import default_rng


def arr_func_sequential(arr: np.ndarray, selected_pixels: np.ndarray) -> np.ndarray:
    n_selected = len(selected_pixels) // 4 * 4
    result_arr = np.empty((2, n_selected // 2))

    for i in range(0, n_selected, 4):
        n = i // 2
        for d in range(4):
            j, k = selected_pixels[i + d]
            result_arr[d // 2, n + d % 2] = arr[j, k]

    return result_arr


def arr_func_vectorised(arr: np.ndarray, selected_pixels: np.ndarray) -> np.ndarray:
    flat = arr[selected_pixels[:, 0], selected_pixels[:, 1]]
    return flat.reshape((-1, 2, 2)).swapaxes(0, 1).reshape((2, -1))


def test() -> None:
    rand = default_rng(seed=0)
    arr_size = 1_064, 590
    arr_data = rand.random(size=arr_size, dtype=np.float64)
    selected_pixel_data = rand.integers(arr_size, size=(597_616, 2), dtype=np.int32)

    for method in (arr_func_sequential, arr_func_vectorised):
        response = method(arr_data, selected_pixel_data)

        assert response.shape == (2, 298_808)
        assert np.isclose(6.867644672947648e-07, response.min())
        assert np.isclose(0.9999967667212489, response.max())
        assert np.isclose(0.4996104177145426, response.mean())
        assert np.allclose(response[:, 0], np.array((0.12815171, 0.05691355)))
        assert np.allclose(response[:, -1], np.array((0.27512743, 0.60253044)))


if __name__ == '__main__':
    test()
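One difference to be aware of: the loop versions silently drop trailing coordinates when the count is not a multiple of four, whereas reshape((-1, 2, 2)) raises a ValueError in that case. Your stated shape is a multiple of four, so it does not matter here. If you do want the truncating behaviour, a variant along these lines might work; the function name is invented for this sketch.

```python
import numpy as np


def arr_func_vectorised_truncating(
    arr: np.ndarray, selected_pixels: np.ndarray,
) -> np.ndarray:
    # Hypothetical variant: keep only whole groups of four, as the loop does.
    n_selected = len(selected_pixels) // 4 * 4
    # Transposing the (n, 2) slice gives one array of rows and one of columns.
    rows, cols = selected_pixels[:n_selected].T
    flat = arr[rows, cols]
    return flat.reshape((-1, 2, 2)).swapaxes(0, 1).reshape((2, -1))
```

For inputs whose length is already a multiple of four, this should return the same array as the vectorised function above.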
