Cholesky Decomposition for Linear Regression #1

Open
danielhanchen opened this issue Aug 28, 2018 · 1 comment
danielhanchen commented Aug 28, 2018

I tried N = 1,000,000 and P = 100, and Cholesky blew me away!!! It took a whopping 400 ms (YES, milliseconds) to fit.

Goodness. PyTorch / Numba SVD takes at least 2 seconds!
The issue with Cholesky is STABILITY. If a matrix is near singular, the results will be horrible. So, we need to do the following:

  1. Add cholesky_solve.
  2. Add a regularization default that is NOT 0, but 0.0001 or so, to enforce stability (Ridge Regression theory says XTX + alpha*I is always invertible for alpha > 0).
  3. Use cholesky_solve to build cholesky_stats for Confidence and Prediction Intervals, etc.
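As a rough sketch of steps 1 and 2 combined, here is a pure-NumPy version; the `cholesky_solve` name and the `alpha=1e-4` default mirror the plan above, not shipped HyperLearn code:

```python
import numpy as np

def cholesky_solve(X, y, alpha=1e-4):
    # Ridge-regularized normal equations: (X^T X + alpha*I) theta = X^T y.
    # The alpha > 0 ridge term keeps X^T X positive definite (invertible).
    XtX = X.T @ X
    XtX[np.diag_indices_from(XtX)] += alpha
    L = np.linalg.cholesky(XtX)         # lower-triangular factor, XtX = L L^T
    z = np.linalg.solve(L, X.T @ y)     # forward solve: L z = X^T y
    return np.linalg.solve(L.T, z)      # back solve: L^T theta = z

# Quick sanity check on a well-conditioned problem
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
theta = np.arange(1.0, 6.0)
theta_hat = cholesky_solve(X, X @ theta)
```

With a well-conditioned X the ridge bias from alpha=1e-4 is negligible and the true coefficients are recovered almost exactly.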
@danielhanchen danielhanchen self-assigned this Aug 28, 2018

danielhanchen commented Aug 29, 2018

I've started solving this problem: solve the least squares problem X * theta_hat = y using Cholesky Decomposition.

|  Method   |   Operations    | Factor (of np^2) |
|-----------|-----------------|------------------|
| Cholesky  |   1/3 * np^2    |       1/3        |
|    QR     |   p^3/3 + np^2  |    1 + p/3n      |
|    SVD    |   p^3   + np^2  |     1 + p/n      |

NOTE: HyperLearn's implementation of Cholesky Solve uses L2 Regularization to enforce stability.
Cholesky is known to fail on ill-conditioned problems, so adding L2 penalties helps it.

Note, the algorithm in this implementation is as follows:

    alpha = dtype(X).decimal    [1e-6 for float32]
    while the Cholesky factorization fails {
        alpha *= 10
    }
    solve cholesky ( XTX + alpha*identity )
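A minimal NumPy sketch of that retry loop; the per-dtype starting values for alpha are assumptions based on the note above, not confirmed HyperLearn defaults:

```python
import numpy as np

def cholesky_solve_stable(X, y):
    # Assumed dtype-based starting penalty: 1e-6 for float32, 1e-12 for float64
    alpha = 1e-6 if X.dtype == np.float32 else 1e-12
    XtX = X.T @ X
    Xty = X.T @ y
    I = np.eye(XtX.shape[0], dtype=XtX.dtype)
    while True:
        try:
            # Fails with LinAlgError if XtX + alpha*I is not positive definite
            L = np.linalg.cholesky(XtX + alpha * I)
            break
        except np.linalg.LinAlgError:
            alpha *= 10  # escalate the ridge penalty and retry
    z = np.linalg.solve(L, Xty)      # forward solve
    return np.linalg.solve(L.T, z)   # back solve
```

Even a rank-deficient X (e.g. a duplicated column) gets solved here, because the loop keeps growing alpha until the factorization succeeds.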

If the MSE (Mean Squared Error) is abnormally high, it might be better to solve using stabler but
slower methods like qr_solve, svd_solve or lstsq.
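One hedged way to wire up that fallback; `mse_tol` is a hypothetical threshold, and `numpy.linalg.lstsq` stands in for the planned lstsq path:

```python
import numpy as np

def solve_with_fallback(X, y, alpha=1e-4, mse_tol=1e-6):
    # Fast path: ridge-regularized Cholesky solve of the normal equations
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    L = np.linalg.cholesky(XtX)
    theta = np.linalg.solve(L.T, np.linalg.solve(L, X.T @ y))
    # If the fit is abnormally bad, fall back to the slower but stabler lstsq
    if np.mean((X @ theta - y) ** 2) > mse_tol:
        theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 4))
theta = np.array([1.0, -2.0, 3.0, 0.5])
theta_hat = solve_with_fallback(X, X @ theta)
```

On a well-conditioned problem the Cholesky path already fits well, so the lstsq branch never triggers.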

https://www.quora.com/Is-it-better-to-do-QR-Cholesky-or-SVD-for-solving-least-squares-estimate-and-why