
I have a large 2D array that I would like to declare once and occasionally change only some values, depending on a parameter, without traversing the whole array.

To build this array, I have subclassed the numpy ndarray class with dtype=object and assigned a function to the elements I want to change, e.g.:

def f(parameter):
    return parameter**2

for i in range(np.shape(A)[0]):
    for j in range(np.shape(A)[1]):
        A[i, j] = 1.
    A[i, i] = f  # assign the callable after the constants so it is not overwritten

I have then overridden the __getitem__ method so that it returns the result of calling the stored function with the given parameters if the value is callable, and the value itself otherwise.

    def __getitem__(self, key):
        value = super(myclass, self).__getitem__(key)
        if callable(value):
            return value(*self.args)
        else:
            return value

where self.args was previously stored on the instance of myclass.
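A minimal, self-contained sketch of the approach described above; the class name MyArray and the constructor are assumptions for illustration, not the asker's actual code:

```python
import numpy as np

# Hypothetical sketch of the described subclass; names are illustrative.
class MyArray(np.ndarray):
    def __new__(cls, input_array, args=()):
        # Store the data as dtype=object so cells can hold callables.
        obj = np.asarray(input_array, dtype=object).view(cls)
        obj.args = args
        return obj

    def __array_finalize__(self, obj):
        # Propagate args through views and slices.
        if obj is not None:
            self.args = getattr(obj, 'args', ())

    def __getitem__(self, key):
        value = super(MyArray, self).__getitem__(key)
        if callable(value):
            return value(*self.args)
        return value

def f(parameter):
    return parameter**2

A = MyArray(np.ones((3, 3)), args=(5,))
A[0, 0] = f
print(A[0, 0], A[0, 1])   # the callable cell evaluates to 25, the rest stays 1.0
```

Note that every element access goes through a Python-level `callable` check, which is exactly the overhead the comments below warn about.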

However, I need to work with float arrays in the end, and I can't simply convert this array to a dtype=float array with this technique. I also tried numpy views, which do not work for dtype=object either.

Do you have a better alternative? Should I override the view method rather than __getitem__?

Edit: I may have to use Cython in the future, so if you have a solution involving e.g. C pointers, I am interested.

  • It is an interesting approach, but I'm not sure that numpy arrays are suited for it. In general, when you work with numpy you use vectorized operations on full arrays or slices, not element-by-element access. Subclassing ndarray the way you do, you essentially lose all the advantages of fast numpy operations. You might be better off creating your own class from scratch and saving everything in pure Python structures (lists etc.); performance-wise it will be comparable. Why do you really need lazy evaluation? You can change only some elements efficiently with fancy indexing.
    – rth
    Commented Jun 18, 2015 at 19:35
  • 1
    Do you only have a single function f? With constant arguments? Commented Jun 18, 2015 at 21:05
  • 2
    Are you familiar with scipy.sparse? The dok format is a dictionary, with the (i,j) tuple as keys. That and lil (list of lists) are the 2 fastest ways of accessing/changing selected items.
    – hpaulj
    Commented Jun 19, 2015 at 6:34
  • 1
    @hpaulj: dok is very interesting. However, I cannot use it with dtype=object, as in the example I showed above: github.com/scipy/scipy/issues/2528
    – Damlatien
    Commented Jun 19, 2015 at 7:46
  • 1
    @rth: The reason I need lazy evaluation rather than accessing the array by key (even efficiently) is that each assignment might involve a different kind of indices. In the example above I only set the diagonal to be variable, but I could, for instance, also have assigned a row (or something more complicated) to another function g.
    – Damlatien
    Commented Jun 19, 2015 at 7:53
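The dok idea from the comments can be approximated with a plain Python dict keyed by (i, j) tuples. This is a hedged sketch of that idea (the class and its names are illustrative, not scipy's actual dok implementation), extended to hold callables for the lazy entries:

```python
import numpy as np

# Illustrative sketch: a dict keyed by (i, j), in the spirit of the dok
# format, storing callables only for the entries that depend on a parameter.
class LazyDict(object):
    def __init__(self, shape, fill=1.0):
        self.shape = shape
        self.fill = fill
        self.entries = {}          # (i, j) -> callable

    def __setitem__(self, key, func):
        self.entries[key] = func

    def to_array(self, *args):
        # Materialize a plain float array, evaluating only the stored callables.
        out = np.full(self.shape, self.fill)
        for (i, j), func in self.entries.items():
            out[i, j] = func(*args)
        return out

ld = LazyDict((3, 3))
for i in range(3):
    ld[i, i] = lambda p: p**2   # variable diagonal
print(ld.to_array(3.0))         # ones everywhere, 9.0 on the diagonal
```

Materializing to a native float array at the end sidesteps the dtype=object limitation mentioned above.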

1 Answer


In this case, it does not make sense to bind a transformation function to every index of your array.

Instead, a more efficient approach is to define each transformation as a function together with the subset of the array it applies to. Here is a basic implementation:

import numpy as np

class LazyEvaluation(object):
    def __init__(self):
        self.transforms = []

    def add_transform(self, function, selection=slice(None), args=None):
        # avoid a shared mutable default argument
        self.transforms.append((function, selection, args or {}))

    def __call__(self, x):
        y = x.copy()
        for function, selection, args in self.transforms:
            y[selection] = function(y[selection], **args)
        return y

that can be used as follows:

x = np.ones((6, 6)) * 2

le = LazyEvaluation()
le.add_transform(lambda x: 0, ([3], [0]))                  # equivalent to x[3, 0]
le.add_transform(lambda x: x**2, (slice(4), slice(4, 6)))  # equivalent to x[:4, 4:6]
le.add_transform(lambda x: -1, np.diag_indices(x.shape[0]))  # set the diagonal
result = le(x)
print(result)

which prints,

[[-1.  2.  2.  2.  4.  4.]
 [ 2. -1.  2.  2.  4.  4.]
 [ 2.  2. -1.  2.  4.  4.]
 [ 0.  2.  2. -1.  4.  4.]
 [ 2.  2.  2.  2. -1.  2.]
 [ 2.  2.  2.  2.  2. -1.]]

This way you can easily support all of numpy's advanced indexing (element-by-element access, slicing, fancy indexing, etc.) while keeping your data in an array with a native dtype (float, int, etc.), which is much more efficient than using dtype=object.
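For the parameter-dependent case from the question, the args dict is stored by reference, so the same transform list can be re-evaluated with a new parameter without rebuilding anything. A small sketch (the class is repeated so it runs on its own, and the keyword name p is an arbitrary choice):

```python
import numpy as np

# The LazyEvaluation class from the answer, repeated for self-containment.
class LazyEvaluation(object):
    def __init__(self):
        self.transforms = []

    def add_transform(self, function, selection=slice(None), args=None):
        self.transforms.append((function, selection, args or {}))

    def __call__(self, x):
        y = x.copy()
        for function, selection, args in self.transforms:
            y[selection] = function(y[selection], **args)
        return y

x = np.ones((4, 4))
params = {'p': 2.0}                 # kept by reference inside the transform
le = LazyEvaluation()
le.add_transform(lambda v, p: p**2, np.diag_indices(4), args=params)

y1 = le(x)          # diagonal is 2.0**2 == 4.0
params['p'] = 3.0   # change the parameter without touching the array
y2 = le(x)          # diagonal is now 9.0
print(y1[0, 0], y2[0, 0])
```

Mutating a shared dict works here but is implicit; passing a fresh args dict per evaluation would be the more explicit design.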

  • Thanks, I have implemented this, basically as a subclass of dict, and it works pretty much the way I want. However, just out of curiosity, I would like to know whether it is possible to realize something similar in C/C++ with pointers in a more elegant way. For example, one could declare a table of (float) pointers, and at declaration time each pointer would point either to zero or to the result of a function, so that I can update the matrix just by calling one (or several) functions.
    – Damlatien
    Commented Jun 25, 2015 at 14:42
  • But to do that, each call of a function would have to be bound to one particular pointer, and I am not sure that is doable. I hope I was clear enough.
    – Damlatien
    Commented Jun 25, 2015 at 14:42
