I am rather new to Python and am using the pandas library to work with data frames. I wrote a function, "elevate_power", that takes a data frame with one column of floating-point values (say x) and a degree (say lambda), and returns a data frame in which each column contains a power of the original column (for example, with degree 3 the output columns are x, x^2, x^3).

The problem is that when the degree is above 30, I get an overflow error. Is there a way around this problem?

I am not particularly worried about precision, so I would not mind losing some.

However (and this is important), I need the output to be of type float, because I then call some NumPy routines that give me errors if I change the type.

I have tried several tricks. For example, I tried using decimal inside the function, but then I cannot convert the values back to floats, which is a problem because I get errors when I call the dot product and linear algebra solvers from NumPy.

Any suggestions will be greatly appreciated.

This is the test code (which I ran with a low degree value so it won't crash):

import pandas as pd
import numpy as np

def elevate_power(column, degree):
    df = pd.DataFrame(column)
    dfbase = df
    if degree > 0:
        for power in range(2, degree + 1):
            # first we'll give the column a name:
            name = 'power_' + str(power)
            df[name] = 0
            df[name] = dfbase.apply(lambda x: x**power, axis=1)
    return df

test = pd.Series([1., 2., 3.])
test2 = pd.DataFrame(test)
degree = 5
print(elevate_power(test2, degree))
print(np.dot(test2['power_2'], test2['power_3']))

The printout is:

   0  power_2  power_3  power_4  power_5
0  1        1        1        1        1
1  2        4        8       16       32
2  3        9       27       81      243

276.0
  • For the overflow, I recommend using the identity a^b = exp(b * ln(a)). Compute the matrix b * ln(a) first, and take the elementwise exponential later, only if you really have to. Commented Mar 18, 2016 at 14:52
  • Use np.vander instead of your loop. Commented Mar 18, 2016 at 16:46
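
The log-space idea from the first comment can be sketched as follows. This is only an illustration under stated assumptions: the helper name `log_powers` is hypothetical, and every input value is assumed to be strictly positive, since the logarithm is undefined otherwise.

```python
import numpy as np
import pandas as pd

def log_powers(values, degree):
    """Return a float DataFrame holding log(x**p) = p * log(x) for p = 1..degree.

    Hypothetical helper; assumes every value is strictly positive.
    """
    log_x = np.log(np.asarray(values, dtype=float))
    powers = np.arange(1, degree + 1)
    # Outer product: row i, column j holds powers[j] * log(values[i]).
    log_table = np.outer(log_x, powers)
    names = ["power_" + str(p) for p in powers]
    return pd.DataFrame(log_table, columns=names)

log_df = log_powers([1.0, 2.0, 3.0], 5)
# Exponentiate only when the actual powers are needed.
recovered = np.exp(log_df)
```

The log table grows linearly with the degree instead of exponentially, so it stays well within float range. Whether a later computation can work directly on the logs depends on that computation.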

1 Answer

How about:

import pandas as pd
import numpy as np
series = [1., 2., 3.]
degree = 5

a = pd.DataFrame({"power_" + str(power): np.power(series, power) for power in range(1, degree+1)})
print(a)
print(a.dtypes)

This results in float columns for me.
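
One possible reason an overflow error can disappear with `np.power` is that NumPy's float64 arithmetic does not raise on overflow. It returns `inf` (normally with a RuntimeWarning), whereas plain Python float exponentiation raises `OverflowError`. The sketch below only illustrates that difference; it does not claim to reproduce the exact error in the question, and the exponent 700 is an arbitrary value chosen to exceed the float64 range.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])

# float64 tops out near 1.8e308; 3.0**30 is far below that limit.
float_max = np.finfo(np.float64).max
moderate = np.power(values, 30)

# Plain Python float exponentiation raises OverflowError past the limit ...
base = 3.0
try:
    base ** 700
    python_raised = False
except OverflowError:
    python_raised = True

# ... whereas NumPy returns inf (normally with a RuntimeWarning, silenced here).
with np.errstate(over="ignore"):
    huge = np.power(values, 700)

# Worth checking before handing the matrix to a linear algebra solver.
all_finite = bool(np.isfinite(huge).all())
```

Since `inf` entries tend to cause trouble later in dot products and solvers, a check such as `np.isfinite(...).all()` may be worth running on the result.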


6 Comments

Thanks, that works perfectly well! I can increase the degree and there is no overflow error. Actually, I don't really understand why the overflow error disappeared this way, but I am happy it works. Thank you very much.
One last question, though: is it possible to do it in a loop? I noticed that the columns of the dataframe are ordered power_1, power_10, power_11, ..., power_2, power_21 instead of power_1, power_2, power_3, ..., power_10. That means I would need to do some kind of sort now, because I need them in the order in which they were created.
Well, you can get away with str(power).zfill(2).
Are you doing this to fit polynomials of degree n to your data? There is np.vander. Watch out for this. This will construct your matrix....
The .zfill(2) did the trick. Thanks again! Yes, I'm fitting polynomials. And I just looked up the function vander you mentioned. This is indeed what I am doing. Thanks for pointing that out.
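
For reference, the two suggestions from the comments (`np.vander` and zero-padded column names) could be combined roughly as follows. This is a sketch rather than the answerer's code: the function name `power_table` is hypothetical, and the padding width is derived from the degree instead of being fixed at 2.

```python
import numpy as np
import pandas as pd

def power_table(values, degree):
    """Return float64 columns x**1 .. x**degree, in increasing order of power."""
    x = np.asarray(values, dtype=float)
    # increasing=True yields columns x**0, x**1, ..., x**degree; drop x**0.
    matrix = np.vander(x, degree + 1, increasing=True)[:, 1:]
    # Zero-pad the power so the names also sort correctly as strings.
    width = len(str(degree))
    names = ["power_" + str(p).zfill(width) for p in range(1, degree + 1)]
    return pd.DataFrame(matrix, columns=names)

table = power_table([1.0, 2.0, 3.0], 12)
dot_2_3 = np.dot(table["power_02"], table["power_03"])
```

Passing an explicit `columns` list keeps the columns in the order given, so no sorting step is needed afterwards.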