scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

We run doctests on a CI environment with many packages installed:

https://github.com/dask/dask/blob/a87236041ff223363698f65c5b7c153415a41259/.github/workflows/additional.yml#L80:L104

So we probably don't need to skip as many as we do (grep for # doctest: +SKIP).
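For readers unfamiliar with the directive, here is a minimal sketch of what `# doctest: +SKIP` does, using only the standard-library `doctest` module. The names `add` and `count_examples` are hypothetical and are not taken from dask; the helper simply counts how many examples in a docstring carry the skip flag, which is roughly what the suggested grep would surface.

```python
import doctest


def add(a, b):
    """Return the sum of two numbers.

    >>> add(1, 2)
    3
    >>> add(1, 2)  # doctest: +SKIP
    'never checked'
    """
    return a + b


def count_examples(func):
    """Return (total examples, examples marked +SKIP) for func's docstring."""
    parser = doctest.DocTestParser()
    examples = parser.get_examples(func.__doc__)
    # Example.options maps option flags to True/False for that example.
    skipped = [ex for ex in examples if ex.options.get(doctest.SKIP)]
    return len(examples), len(skipped)
```

The second example has a deliberately wrong expected output. It would fail if it were run, so the skip flag is the only thing keeping the docstring green. Removing such flags is only safe when the CI environment really has the packages the example needs.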

Bug/Feature Request Description

In [1]: import featuretools as ft

In [2]: es = ft.demo.load_mock_customer(return_entityset=True)

In [3]: import pandas as pd

The problem I want to use auto-sklearn on is a time-series. Can we modify sklearn to include cv with time series?
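To make the request concrete, here is a minimal plain-Python sketch of an expanding-window split, the idea behind time-series cross-validation: every test fold lies strictly after its training window. The function name `expanding_window_splits` is hypothetical and is not part of auto-sklearn. scikit-learn itself ships a comparable splitter, `sklearn.model_selection.TimeSeriesSplit`, though whether auto-sklearn accepts it is exactly what the question asks.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) pairs in time order.

    Each test fold comes strictly after its training window, so the model
    never sees future observations while training.
    """
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")
    fold_size = n_samples // (n_splits + 1)
    if fold_size < 1:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = k * fold_size
        # The last fold absorbs any leftover samples.
        test_end = n_samples if k == n_splits else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))
```

With 10 samples and 3 splits, the training windows grow from 2 to 4 to 6 observations while each test fold starts where training ends.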

We should refactor the default settings in _config.

Currently, _config has two things related to default settings:

default "inhabitants" of key scitypes, within and outside sktime
default parameter settings for each individual concrete estimator

I would go and locate these within the boundaries of their "natural concern":

default parameter settings as an inspectable attrib

Hi! I was a bit surprised by the name of CVSplit. The name suggests that it is responsible for cross-validation, but the documentation reveals that it only trains and validates on one split. That doesn't match my understanding of cross-validation. It could [technically](https://en.m.wikipedia.org/wiki/Cross-va

When running TabularPredictor.fit(), I encounter a BrokenPipeError for some reason.
What is causing this?
Could it be due to an OOM error?

Fitting model: XGBoost ...
-34.1179 = Validation root_mean_squared_error score
10.58s = Training runtime
0.03s = Validation runtime
Fitting model: NeuralNetMXNet ...
-34.2849 = Validation root_mean_squared_error score
43.63s =

Description

I'm the creator and only maintainer of the project at the moment. I'm working on adding new features, so I would like to leave this issue open for newcomers who want to contribute to the project.

Basically, I wrote the CLI using argparse since it is already part of the standard library. However, I'm starting to rethin

What's wrong?

In issues #422/#423, users brought up that it's not clear from the error messages that you must fit most models before converting them.

We suspect that conversion could also work for an untrained KNN model, but in general (e.g., with RandomForests) it won't.

We need help documenting this, and also generating proper error messages. (You can see an example of an unhelpful error mess
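As a starting point for discussion, here is a minimal sketch of a fit check that fails early with a descriptive message. It relies on the scikit-learn convention that learned attributes end with an underscore (for example `coef_`). Every name here (`NotFittedModelError`, `ensure_fitted`, `TinyMeanModel`) is hypothetical and is not taken from the project's code base.

```python
class NotFittedModelError(ValueError):
    """Raised when a model is converted before it has been fitted."""


def ensure_fitted(model):
    """Return the model unchanged, or raise if it has no learned attributes.

    Assumption: learned state lives in public instance attributes whose
    names end with an underscore, as scikit-learn estimators do.
    """
    fitted_attrs = [
        name for name in vars(model)
        if name.endswith("_") and not name.startswith("_")
    ]
    if not fitted_attrs:
        raise NotFittedModelError(
            f"{type(model).__name__} has not been fitted yet. "
            "Call fit(...) on the model before converting it."
        )
    return model


class TinyMeanModel:
    """Hypothetical stand-in for an estimator, used only for illustration."""

    def __init__(self):
        self.scale = 1.0

    def fit(self, values):
        self.mean_ = sum(values) / len(values)  # learned attribute
        return self
```

A conversion entry point could call such a check first, so users see "call fit before converting" rather than an unrelated attribute error from deeper in the conversion code.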

Support Series.between

I think it could be useful, when one wants to plot only a single class (e.g. class 1), to have an option that produces consistent plots for both plot_cumulative_gain and plot_roc.

At the moment, only plot_roc supports such an option.

Thanks a lot

Our xgboost models use the 'binary:logistic' objective function; however, the m2cgen-converted version of the models returns raw scores instead of the transformed scores.

This is fine as long as the user knows this is happening! I didn't, so it took a while to figure out what was going on. I'm wondering if perhaps a useful warning could be raised for users to alert them of this issue? A warning
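For context, the 'binary:logistic' objective applies the logistic (sigmoid) function to the raw margin to obtain a probability. The sketch below shows how a user might apply that transform themselves to a raw score. The function name `sigmoid` and the `raw_score` argument are illustrative only; this is not m2cgen's API.

```python
import math


def sigmoid(raw_score):
    """Map a raw margin score to a probability between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive
    # argument, which would overflow.
    if raw_score >= 0:
        return 1.0 / (1.0 + math.exp(-raw_score))
    exp_score = math.exp(raw_score)
    return exp_score / (1.0 + exp_score)
```

A raw score of 0 corresponds to a probability of 0.5, so thresholding raw scores at 0 matches thresholding probabilities at 0.5.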

Details in discussion mljar/mljar-supervised#421

As pointed out in scikit-learn-contrib/metric-learn#307 (comment), the current example for SCML_Supervised is in the weakly supervised setting.

Here are 1,554 public repositories matching this topic...

apachecn / AiLearning

donnemartin / data-science-ipython-notebooks

dask / dask

EpistasisLab / tpot

Yorko / mlcourse.ai

alteryx / featuretools

automl / auto-sklearn

ml-tooling / best-of-ml-python

alan-turing-institute / sktime

skorch-dev / skorch

awslabs / autogluon

DistrictDataLabs / yellowbrick

nidhaloff / igel

TarrySingh / Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials

biolab / orange3

microsoft / hummingbird

mars-project / mars

reiinakano / scikit-plot

BayesWitnesses / m2cgen

juliangaal / python-cheat-sheet

ClimbsRocks / auto_ml

mljar / mljar-supervised

modAL-python / modAL

reiinakano / xcessiv

scikit-learn-contrib / metric-learn

lensacom / sparkit-learn

databricks / spark-sklearn

nok / sklearn-porter

jrieke / traingenerator

robertmartin8 / MachineLearningStocks
