I have what I thought would be a simple task in numpy, but I'm having trouble.

I have a function which takes an index in the array and returns the value that belongs at that index. I would like to, efficiently, write the values into a numpy array.

I have found numpy.fromfunction, but it doesn't behave remotely like the documentation suggests. It seems to "vectorise" the function, which means that instead of passing the actual indices it passes a numpy array of indices:

import math
from math import pi
import numpy

# A, wf and len are assumed to be defined elsewhere.
def vsin(i):
    return float(round(A * math.sin((2 * pi * wf) * i)))

numpy.fromfunction(vsin, (len,), dtype=numpy.int16)
# TypeError: only length-1 arrays can be converted to Python scalars

(Inspecting i in a debugger shows that it is a numpy.ndarray instance, not a scalar index.)

So, if we try to use numpy's vectorised sin function:

def vsin(i):
    return (A * numpy.sin((2 * pi * wf) * i)).astype(numpy.int16)

numpy.fromfunction(vsin, (len,), dtype=numpy.int16)

We don't get a TypeError, but if len > 2**15 we get discontinuities chopping across our oscillator, because numpy is using int16_t to represent the index!
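The behaviour described above can be checked with a small probe. This is a minimal sketch: `probe` and `seen` are hypothetical names introduced only for illustration, and the length 8 is arbitrary. It records what fromfunction actually hands to the callback.

```python
import numpy as np

seen = {}  # hypothetical scratch dict, used only to capture the argument

def probe(i):
    # Record what fromfunction passes in, then return it unchanged.
    seen["type"] = type(i)
    seen["dtype"] = i.dtype
    return i

# The dtype argument here is applied to the index array, not to the result.
np.fromfunction(probe, (8,), dtype=np.int16)

print(seen["type"], seen["dtype"])  # the whole index array, typed as int16
print(np.iinfo(np.int16).max)       # 32767, the largest index int16 can hold
```

So the callback is called once with the full array of indices, and with dtype=numpy.int16 those indices cannot go past 32767.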

The point here isn't about sin in particular: I want to be able to write arbitrary python functions like this (whether a numpy vectorised version exists or not) and be able to run them inside a tight C loop (rather than a roundabout python one), and not have to worry about integer wraparound.

Do I really have to write my own cython extension in order to be able to do this? Doesn't numpy have support for running python functions once per item in an array, with access to the index?

It doesn't have to be a creation function: I can use numpy.empty (or indeed, reuse an existing array from somewhere else.) So a vectorised transformation function would also do.

  • FYI, for now I am just running a python loop, which isn't too slow for the small arrays I'm dealing with initially. Commented May 30, 2015 at 14:38
  • Why aren't you using something like vsin(np.arange(1000)) or vsin(np.linspace(0, 4, 100))? Look at the fromfunction code. All it does is vsin(*np.indices((len,))). If indices does not produce the right i values, don't use it. Commented May 30, 2015 at 16:42

1 Answer

I think the issue of integer wraparound is unrelated to numpy's vectorized sin implementation, and it does not depend on whether the loop runs in Python or C.

If you use a 2-byte signed integer and try to generate an array of integer values ranging from 0 to above 32767, the values wrap around rather than raising an error. The array will look like:

[0, 1, 2, ... , 32767, -32768, -32767, ...]

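The simplest solution, assuming memory is not too tight, is to use a wider integer type for the index array that fromfunction generates, so the indices do not wrap around in the first place. The dtype argument sets the type of the indices passed to vsin, not the type of the result, and int32 can represent indices up to 2**31 - 1 (about 2.1 billion):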

numpy.fromfunction(vsin, (len,), dtype=numpy.int32)
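As a rough sketch of that fix, the example below keeps the samples as int16 while using a wider index type. The values of `A`, `wf` and `n` are assumptions chosen only for illustration (the question does not give them), and int64 is used for the indices, though int32 would do equally well at this length.

```python
import numpy as np

# Hypothetical values, not taken from the question.
A = 1000.0    # amplitude, well inside the int16 range
wf = 0.001    # frequency factor
n = 40_000    # array length, deliberately larger than 2**15

def vsin(i):
    # i is the whole index array; round before casting to avoid truncation.
    return np.round(A * np.sin((2 * np.pi * wf) * i)).astype(np.int16)

# dtype sets the type of the indices; the sample type comes from vsin itself.
samples = np.fromfunction(vsin, (n,), dtype=np.int64)

# Calling the function on an explicit index array gives the same result.
same = np.array_equal(samples, vsin(np.arange(n, dtype=np.int64)))

# Largest jump between neighbouring samples; a wrapped index would
# typically show up here as a much larger step.
max_step = int(np.abs(np.diff(samples.astype(np.int32))).max())
print(samples.dtype, same, max_step)
```

The array is longer than INT16_MAX but every stored value still fits in int16, which matches the case described in the question.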

numpy is optimized to work fast on arrays by passing the whole array around between vectorized functions. I think in general the numpy tools are inconvenient for trying to run scalar functions once per array element.
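For a function that only accepts scalars, numpy does offer per-element helpers, sketched below with the same kind of hypothetical constants (`A`, `wf`, `n` and `scalar_sin` are illustrative names, not from the question). Note that numpy.vectorize is documented as a convenience wrapper that is essentially a Python-level loop, so neither approach gives a tight C loop.

```python
import math
import numpy as np

# Hypothetical values, not taken from the question.
A = 1000.0
wf = 0.001
n = 40_000

def scalar_sin(i):
    # Plain Python function of a single integer index.
    return round(A * math.sin((2 * math.pi * wf) * i))

# np.vectorize calls scalar_sin once per element; otypes fixes the output type.
via_vectorize = np.vectorize(scalar_sin, otypes=[np.int16])(np.arange(n))

# np.fromiter fills a preallocated int16 array from a Python generator.
via_fromiter = np.fromiter(
    (scalar_sin(i) for i in range(n)), dtype=np.int16, count=n
)

print(np.array_equal(via_vectorize, via_fromiter))
```

In both cases the index is an ordinary Python int, so it cannot wrap around, but the per-element call overhead remains.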

  • Thanks for your answer. I am not trying to represent any values bigger than INT16_MAX in my array, but the array will be longer than INT16_MAX. With Cython, I can write a function which uses a C iterator rather than a python one, which speeds up jobs like this. I assumed numpy would have something similar built in, but I was apparently wrong. Commented Jun 6, 2015 at 15:45
