Data Science

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

These examples take quite a long time to run, and they frequently cause our documentation CI to fail due to timeouts. It'd be nice to speed them up a little bit.

To contributors: if you want to work on an example, first have a look at the example, and if you think you're comfortable working on it, please mention which one you're working on.

../examples/model_selection/plot_randomized

Currently, the funnel report percentage is calculated using:
The number at a given funnel step /
Sum(everything in the funnel)
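
As a rough illustration of the calculation described above, here is a minimal Python sketch. The step names and most counts are hypothetical; they are picked only so that one step has 900 entries and the whole funnel sums to 11900. The helper `share_of_total` is also an illustrative name, not part of any reporting tool.

```python
# Hypothetical funnel counts (assumption): only the 900 for
# "Discussed Pricing" and the 11900 total are meant to echo the report.
funnel = {
    "Visited Site": 10000,
    "Signed Up": 1000,
    "Discussed Pricing": 900,
}


def share_of_total(counts):
    """Divide each step's count by the sum of every step in the funnel."""
    total = sum(counts.values())
    return {step: n / total for step, n in counts.items()}


shares = share_of_total(funnel)
for step, share in shares.items():
    print(f"{step}: {share:.1%}")
```

Because every step is divided by the same grand total, the percentages always sum to 100%, regardless of how the steps relate to each other.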

Example from blog:

Here, the Discussed Pricing (900) gets divided by 11900 (sum of all ev

When a user starts a Serve cluster with one set of options (HTTP options, checkpoint path) and then connects to it with a different set of options, we should either update it or error out.

Summary

Aesthetically trivial, yet I've spotted a discrepancy with font sizes in our tooltip (front-end + back-end screenshots below).
I believe sections #1 and #2 should have the same font size?

![image](https://user-images.githubusercontent.com/27242399/139825179-4d62e3

Currently max_epochs defaults to 1000:

If both max_epochs and max_steps aren't specified, max_epochs will default to 1000. To enable infinite training, set max_epochs = -1.
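
The documented rule can be sketched in plain Python. This is not the library's actual implementation: `resolve_max_epochs` is a hypothetical helper, `None` stands in for "not specified", and the branch for "only max_steps given" is an assumption added to make the sketch complete.

```python
def resolve_max_epochs(max_epochs=None, max_steps=None):
    """Hypothetical helper mirroring the documented default (not library code)."""
    if max_epochs is None and max_steps is None:
        # Documented behavior: neither limit given, so fall back to 1000.
        return 1000
    if max_epochs is None:
        # Assumption: when only max_steps is given, epochs are left unbounded.
        return -1
    return max_epochs


def trains_forever(max_epochs):
    """Per the quoted docs, -1 means there is no epoch limit."""
    return max_epochs == -1


print(resolve_max_epochs())               # neither limit specified
print(trains_forever(resolve_max_epochs(max_epochs=-1)))
```

Under this rule, a run with no limits configured stops at the default epoch count, and only an explicit `-1` requests unbounded training.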

As a user, though, I would expect that if I don't specify a specific ending point, the training would continue indefinitely. In my own experiments, when the training cut off at 999 epochs, I was confused, and googling t

In recent versions (can't say from exactly when), there seems to be an off-by-one error in dcc.DatePickerRange. I set max_date_allowed = datetime.today().date(), but in the calendar, yesterday is the maximum date allowed. I see it in my apps, and it is also present in the first example on the DatePickerRange documentation page.

E

@Carreau

In #13206 there is a test that seems to always be skipped. Once that is merged, we could delete it.

Originally posted by @Carreau in ipython/ipython#13206 (comment)

In gensim/models/fasttext.py:

    model = FastText(
        vector_size=m.dim,
        window=m.ws,
        epochs=m.epoch,
        negative=m.neg,
        # FIXME: these next 2 lines read in unsupported FB FT modes (loss=3 softmax or loss=4 onevsall,
        # or model=3 supervi

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor. This fails when it tries to load data from my compressed files.

Here are 22,706 public repositories matching this topic...

keras-team / keras

scikit-learn / scikit-learn

apache / superset

GokuMohandas / MadeWithML

microsoft / ML-For-Beginners

CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

donnemartin / data-science-ipython-notebooks

explosion / spaCy

eriklindernoren / ML-From-Scratch

ray-project / ray

academic / awesome-datascience

eugeneyan / applied-ml

streamlit / streamlit

PyTorchLightning / pytorch-lightning

plotly / dash

AMAI-GmbH / AI-Expert-Roadmap

ipython / ipython

matplotlib / matplotlib

fastai / fastbook

virgili0 / Virgilio

RaRe-Technologies / gensim

afshinea / stanford-cs-229-machine-learning

bharathgs / Awesome-pytorch-list

microsoft / recommenders

d2l-ai / d2l-en

rasbt / python-machine-learning-book

hangtwenty / dive-into-machine-learning

allenai / allennlp

microsoft / nni

0xnr / awesome-bigdata
