Tagged Questions
1
vote
1answer
25 views
Random Forest interpretation in scikit-learn
I am using sklearn.ensemble.RandomForestRegressor to fit a random forest regressor on a dataset. Now that I have the results, is it possible to interpret this in some format where I can then ...
1
vote
0answers
54 views
Normalization in multiple-linear regression
I have a data set for which I would like to build a multiple linear regression model. In order to compare different independent variables, I normalize them by their standard deviation. I used ...
0
votes
0answers
64 views
improve document classification method
I have a program to predict whether a news article is about a certain topic.
There are two main scripts:
1) bow_train.py - generates a wordlist and a model and stores them in two files (arab.model ...
-4
votes
0answers
77 views
Chapman Richards - Python [closed]
I need to learn how I can fit a Chapman Richards model (non linear regression):
y = b0 * (1 - exp(-b1 * x1))**(1 / (1 - b2))
using Numpy/Scipy.
Start parameters are for example b0=31, b1=0.1, ...
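A minimal sketch of how such a fit might look with scipy.optimize.curve_fit. The data here is synthetic, and the b2 value is an assumption made only for illustration, since the excerpt is cut off before giving it. The bounds keep b2 below 1 so the exponent 1 / (1 - b2) stays finite.

```python
import numpy as np
from scipy.optimize import curve_fit

def chapman_richards(x, b0, b1, b2):
    # y = b0 * (1 - exp(-b1 * x)) ** (1 / (1 - b2)), as in the question
    return b0 * (1.0 - np.exp(-b1 * x)) ** (1.0 / (1.0 - b2))

# Synthetic data for illustration only; b2 = 0.5 is an assumed value.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 60.0, 40)
true_params = (31.0, 0.1, 0.5)
y = chapman_richards(x, *true_params) + rng.normal(scale=0.05, size=x.size)

# Starting guesses near b0=31, b1=0.1; the b2 guess is hypothetical.
p0 = (30.0, 0.08, 0.4)
# Bounds keep b1 positive and b2 < 1 so the model stays well defined.
bounds = ([0.0, 1e-6, -5.0], [100.0, 1.0, 0.99])
fitted, cov = curve_fit(chapman_richards, x, y, p0=p0, bounds=bounds)
print(fitted)
```

With real data the fit can be sensitive to the starting values, so it may be worth trying a few different p0 choices.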
1
vote
1answer
38 views
error - Multiple Regression OLS class
I am having a problem fitting the following equation with the ols class:
y = b0 + b1x1 + b2x2
The code:
xs = numpy.loadtxt('teste.csv', skiprows=1, dtype=float, delimiter=';',
usecols=(0,1))
y = log(xs[:,0])
x ...
3
votes
1answer
115 views
Time series prediction using support vector regression
I've been trying to implement a time series prediction tool using support vector regression in Python. I use the SVR module from scikit-learn for non-linear support vector regression. But I have ...
0
votes
1answer
72 views
python pandas OLS.predict, what is the correct signature?
I have a pandas OLS model,
mid_lag_lead_df_model
-------------------------Summary of Regression Analysis-------------------------
Formula: Y ~ <1> + <2> + <3> +
Number ...
3
votes
1answer
123 views
Stepwise Regression in Python
How do I perform stepwise regression in Python? There are methods for OLS in SciPy, but I am not able to do stepwise regression. Any help would be appreciated. Thanks.
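One possible starting point is a simple forward selection loop built on numpy.linalg.lstsq. This is only a sketch: it adds the column that most reduces the residual sum of squares and stops when the relative improvement drops below a threshold. The names rss, forward_select, and min_improvement are hypothetical helpers, and the stopping rule is an assumption (classical stepwise procedures usually use F-tests or p-values instead).

```python
import numpy as np

def rss(X, y, cols):
    # Residual sum of squares of an OLS fit using an intercept plus `cols`.
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def forward_select(X, y, min_improvement=0.05):
    # Greedily add the predictor that lowers the RSS the most; stop when
    # the best candidate improves the RSS by less than `min_improvement`
    # (as a fraction of the current RSS).
    selected = []
    remaining = list(range(X.shape[1]))
    current = rss(X, y, selected)
    while remaining:
        best_rss, best_col = min((rss(X, y, selected + [c]), c) for c in remaining)
        if current - best_rss < min_improvement * current:
            break
        selected.append(best_col)
        remaining.remove(best_col)
        current = best_rss
    return selected

# Synthetic example: only columns 0 and 3 actually drive y.
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=200)
selected = forward_select(X, y)
print(selected)
```

A backward elimination pass could be added in the same style by removing the column whose deletion increases the RSS the least.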
0
votes
2answers
93 views
regression coefficient using numpy
I'm trying to find the regression coefficients in a multiple linear regression. I'm using the numpy module for this. I have dependent and independent values. What I've tried is given below:
import numpy as ...
4
votes
1answer
163 views
Cumulative OLS with Python Pandas
I am using Pandas 0.8.1, and at the moment I can't change the version. If a newer version will help the problem below, please note it in a comment rather than an answer. Also, this is for a research ...
3
votes
2answers
123 views
How to predict a continuous value (time) from text documents? [closed]
I have about 3000 text documents which are related to a duration of time when the document was "interesting". So let's say document 1 has 300 lines of text with content, which led to a duration of ...
2
votes
2answers
166 views
Ranking a list of emails' Priority
I am trying to produce a simple email ranking program (something like a priority inbox) in Python. Based on the frequency of emails received from senders, so for example have a training set of say ...
2
votes
1answer
140 views
Support Vector - / Logistic - regression: do you have benchmark results for the boston housing data?
I was going to test my implementation of the sklearn support vector regression package by running it on the boston housing prices dataset that ships with sklearn (sklearn.datasets.load_boston).
After ...
2
votes
1answer
179 views
Multiple Logistic Regression in Python
I have a data set as such:
0,1,0,1,1,0,0,1,5
1,1,0,1,1,0,0,1,3
1,1,0,0,1,0,0,1,1
0,1,1,0,1,1,0,0,4
I'm looking for a way to run logistic regression in python which uses several discrete values (0 ...
0
votes
1answer
65 views
fmin_bfgs does not complete
I'm trying to minimize a function with lots of parameters (a little over 7000) using fmin_bfgs() or fmin_l_bfgs_b(). When I enter the command
opt_pars = fmin_l_bfgs_b(obj_f, pars, approx_grad=1)
...
0
votes
1answer
66 views
fmin with really poor performance. fmin_bfgs with precision loss. Minimization does not fit well
Currently I'm working on an implementation of logistic regression. Nothing really complex, just working with a simple dataset (Andrew Ng's house buying prediction). Here is what I'm doing:
My Cost ...
0
votes
0answers
68 views
I need a model that can predict based on multiple variables. How do I get started? [closed]
I have a problem where I have to predict a variable X that is dependent on several other variables a,b,c,d... I have the data containing the values of these variables a,b,c,d.. and also X up to a ...
0
votes
3answers
956 views
Multiple linear regression in python without fitting the origin?
I found this chunk of code on http://rosettacode.org/wiki/Multiple_regression#Python, which does a multiple linear regression in python. Print b in the following code gives you the coefficients of x1, ...
2
votes
1answer
490 views
How is Elastic Net used?
This is a beginner question on regularization with regression. Most information about Elastic Net and Lasso Regression online replicates the information from Wikipedia or the original 2005 paper by ...
7
votes
1answer
408 views
Distinguishing overfitting vs good prediction
These are questions on how to calculate & reduce overfitting in machine learning. I think many new to machine learning will have the same questions, so I tried to be clear with my examples and ...
1
vote
0answers
355 views
python scipy using fmin_bfgs for logistic regression
I use the formula below as my hypothesis:
And the formula below as the cost function:
So the objective function I am trying to minimize is:
And the gradient is:
the csv file is formatted like:
...
5
votes
1answer
2k views
Multivariate (polynomial) best fit curve in python?
How do you calculate a best fit line in python, and then plot it on a scatterplot in matplotlib?
I calculate the linear best-fit line using Ordinary Least Squares Regression as follows:
from ...
1
vote
1answer
170 views
Lossy Polynomial Regression using numpy
The issue at hand is that I need a mathematical method to model the sign of a set of x,y values. Specifically, I know there are methods to use polynomial regression, however, if I only care about the ...
1
vote
1answer
265 views
Collapse a Pandas multiindex or run OLS regression on a multiindexed dataframe
I used pivot to reshape my data and now have a column multiindex. I want the resulting columns to be the X variables in a simple OLS regression. The Y's are another series with the same row index.
...
1
vote
4answers
309 views
linearRegression() returns list within list (sklearn)
I'm doing multivariate linear regression in Python (sklearn), but for some reason, the coefficients are not correctly returned as a list. Instead, a list IN A LIST is returned:
from sklearn import ...
0
votes
2answers
643 views
Least-Squares Regression of Matrices with Numpy
If this has been answered somewhere I couldn't find, feel free to forum slap me.
I'm looking to calculate least squares linear regression from an N by M matrix and a set of known, ground-truth ...
2
votes
1answer
219 views
Winsorize data in Pandas for Python
I am trying to run a Winsorized regression in Pandas for Python. The very helpful user manual offers this example code:
winz = rets.copy()
std_1year = rolling_std(rets, 250, min_periods=20)
...
0
votes
2answers
601 views
Support Vector Regression with High Dimensional Output using python's libsvm
I would like to ask if anyone has an idea or example of how to do support vector regression in Python with high dimensional output (more than one) using a Python binding of libsvm? I checked the ...
3
votes
2answers
1k views
Multivariate polynomial regression with numpy
I have many samples (y_i, (a_i, b_i, c_i)) where y is presumed to vary as a polynomial in a,b,c up to a certain degree. For example for a given set of data and degree 2 I might produce the model
y ...
2
votes
1answer
199 views
How can I zero the intercept in a multivariate regression using python?
Is it possible in python/scipy/numpy to zero the intercept of a multivariate regression? I couldn't find it in the OLS recipe (http://www.scipy.org/Cookbook/OLS).
I'd prefer not to have to use ...
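One way this is commonly approached with plain NumPy is to leave the column of ones out of the design matrix, which forces the fit through the origin. The sketch below uses numpy.linalg.lstsq on synthetic data; the variable names are illustrative and not taken from the OLS recipe.

```python
import numpy as np

# Synthetic data with no constant term, for illustration only.
rng = np.random.default_rng(2)
X = rng.normal(size=(40, 2))
true_coef = np.array([2.0, -1.0])
y = X @ true_coef

# No column of ones is appended, so the intercept is fixed at zero.
coef_origin, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef_origin)
```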
0
votes
1answer
445 views
automatically stop scipy.optimize.fmin_bfgs after n function calls (not BFGS iterations!)
I use logistic regression with scipy.optimize.fmin_bfgs for minimizing the cost function. The cost function stays constant for my particular data set and BFGS does not converge, so I want to apply ...
7
votes
2answers
747 views
correct usage of scipy.optimize.fmin_bfgs
I am playing around with logistic regression in Python. I have implemented a version where the minimization of the cost function is done via gradient descent, and now I'd like to use the BFGS ...
6
votes
2answers
1k views
Python Pandas: how to turn a DataFrame with “factors” into a design matrix for linear regression?
If memory serves me, in R there is a data type called factor which when used within a DataFrame can be automatically unpacked into the necessary columns of a regression design matrix. For example, a ...
1
vote
2answers
261 views
simulate data from regression line in python
If I have a regression line and an r squared is there a simple numpy (or some other python library) command to randomly draw, say, y values for an x that are consistent with the regression? The same ...
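A rough sketch of one way to do this, assuming normally distributed residuals: since r squared is the share of variance explained by the line, the noise variance can be set to var(fitted) * (1 - r²) / r². The function name simulate_from_line and its parameters are hypothetical.

```python
import numpy as np

def simulate_from_line(x, slope, intercept, r_squared, rng):
    # Draw y values around the line so that, in expectation, the
    # regression of y on x has the requested r squared.
    x = np.asarray(x, dtype=float)
    fitted = intercept + slope * x
    # r^2 = var(fitted) / (var(fitted) + noise_var), solved for noise_var
    noise_var = np.var(fitted) * (1.0 - r_squared) / r_squared
    return fitted + rng.normal(scale=np.sqrt(noise_var), size=x.shape)

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 10.0, size=20000)
y = simulate_from_line(x, slope=2.0, intercept=1.0, r_squared=0.6, rng=rng)

# The realized r squared should land close to the requested value.
realized_r2 = np.corrcoef(x, y)[0, 1] ** 2
print(realized_r2)
```

With only a handful of x values the realized r squared will vary noticeably from draw to draw.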
1
vote
1answer
129 views
simulating data from regression line in python
If I have a regression line and an r squared is there a simple numpy (or some other library) command to randomly draw, say, y values for an x that are consistent with the regression? The same way you ...
2
votes
3answers
1k views
Orthogonal regression fitting in scipy least squares method
The leastsq method in the scipy library fits a curve to some data. This method assumes that the Y values in the data depend on some X argument, and it calculates the minimal distance between the curve and the data ...
1
vote
1answer
594 views
Python scikits - Buffer has wrong number of dimensions (expected 1, got 2)
I am trying this code snippet. I am using scikits.learn 0.8.1
from scikits.learn import linear_model
import numpy as np
num_rows = 10000
X = np.zeros([num_rows,2])
y = np.zeros([num_rows,1])
# assume ...
2
votes
1answer
864 views
Fitting 3D points of an arc to a circle (regression in Python)
I am relatively new to python. My problem is as follows
I have a set of noisy data points (x,y,z) on an arbitrary plane that forms a 2d arc.
I would like a best fit circle through these points and ...
7
votes
4answers
2k views
Weighted logistic regression in Python
I'm looking for a good implementation for logistic regression (not regularized) in Python. I'm looking for a package that can also get weights for each vector. Can anyone suggest a good implementation ...
0
votes
1answer
222 views
Multilinear Regression using OLS in Python not working with my own data [duplicate]
Possible Duplicate:
Python Multiple Linear Regression using OLS code with specific data?
Alright, I'm working with ols.py from scipy.org. When I input my own variables and try to initiate ...
3
votes
3answers
4k views
Python Multiple Linear Regression using OLS code with specific data?
I am using the ols.py code downloaded from http://www.scipy.org/Cookbook/OLS [the download is in the first paragraph with the bold OLS] but I need to understand rather than using random data for the ols ...
1
vote
1answer
589 views
Regression confidence using SVMs in python
I'm using regression SVMs in python and I am wondering if there is any way to get a "confidence-measure" value for its predictions.
Previously, when using SVMs for binary classification, I was able ...
3
votes
2answers
174 views
regression test dealing with hard coded path
I need to extend some Python code which has plenty of hard-coded paths.
In order not to mess everything up, I want to create unit tests for the code before my modifications: they will serve as non-regression ...
1
vote
1answer
783 views
Python or SQL Logistic Regression
Given time-series data, I want to find the best fitting logarithmic curve. What are good libraries for doing this in either Python or SQL?
Edit: Specifically, what I'm looking for is a library that ...
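If the target curve has the form y = a * ln(t) + b, one simple option in Python is to run an ordinary linear fit against ln(t) with numpy.polyfit. This is a sketch on synthetic data; the functional form is an assumption, since the excerpt does not state it.

```python
import numpy as np

# Synthetic series following y = 2.5 * ln(t) + 1.0, for illustration only.
t = np.arange(1, 51, dtype=float)
y = 2.5 * np.log(t) + 1.0

# A degree-1 polynomial in ln(t) gives the slope a and the offset b.
a, b = np.polyfit(np.log(t), y, 1)
print(a, b)
```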
1
vote
3answers
4k views
Multi-variate regression using NumPy in Python?
Is it possible to perform multi-variate regression in Python using NumPy?
The documentation here suggests that it is, but I cannot find any more details on the topic.
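For what it's worth, a regression with several predictors can be sketched with numpy.linalg.lstsq by stacking the predictors into a design matrix and prepending a column of ones for the intercept. The data below is synthetic and the variable names are illustrative only.

```python
import numpy as np

# Synthetic data: three predictors plus a constant term.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
true_coef = np.array([1.5, -2.0, 0.5])
true_intercept = 4.0
y = X @ true_coef + true_intercept

# Prepend a column of ones so the first fitted value is the intercept.
A = np.column_stack([np.ones(len(X)), X])
beta, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
intercept, coef = beta[0], beta[1:]
print(intercept, coef)
```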
1
vote
2answers
1k views
scipy linregress function erroneous standard error return?
I have a weird situation where scipy.stats.linregress seems to be returning an incorrect standard error:
>>> from scipy import stats
>>> x = [5.05, 6.75, 3.21, 2.66]
>>> y = ...
4
votes
2answers
9k views
Multiple regression in Python
I am currently using scipy's linregress function for single regression. I am unable to find if the same library, or another, is able to do multiple regression, that is, one dependent variable and more ...