Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
Here are 21,554 public repositories matching this topic...
Is your feature request related to a problem? Please describe.
As of a couple of months ago, the Elasticsearch organization has made the official Python elasticsearch plugin incompatible with Amazon-supported OpenSearch. If you fire up Superset using the current Helm chart and attempt to connect to a recently deployed AWS "Elasticsearch" - which is now an Apache 2.0-licensed OpenSearch - you wi…
From a slack message:
Hi, I observed that if you deploy a deployment with more replicas than the available resources can support, Serve keeps trying to allocate them while waiting for the autoscaler.
(pid=125021) 2021-09-07 20:52:42,899 INFO http_state.py:75 -- Starting HTTP proxy with name 'pfaUeM:SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-node:192.168.1.13-0' on node 'node:192.168.1.13-0' listening on '12…
Summary
If you use a slider in the sidebar with a long description text, the slider value and the description text overlap. See screenshot:
Steps to reproduce
Code snippet:
import streamlit as st
topn_ranking = st.sidebar.s…
🚀 Feature
lr_find needs unique temporary checkpoint filenames.
Motivation
I'm running a number of experiments in parallel that save to the same folder, so they all share the same trainer.default_root_dir. Since the temporary checkpoints all get the same directory and filename, the runs overwrite each other's checkpoints.
Pitch
lr_find temporary checkpoints should have unique filenames.
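A minimal sketch of the pitch, using only the standard library (the helper name `unique_checkpoint_path` is illustrative, not part of Lightning): combine the fixed prefix with a uuid4 suffix so parallel runs sharing a default_root_dir cannot collide.

```python
import os
import uuid

def unique_checkpoint_path(root_dir: str, prefix: str = "lr_find_temp_model") -> str:
    """Build a checkpoint filename that cannot collide across parallel runs.

    `prefix` mirrors the kind of fixed name that causes the clashes; the
    uuid4 hex suffix makes each generated path unique.
    """
    return os.path.join(root_dir, f"{prefix}_{uuid.uuid4().hex}.ckpt")

# Two runs sharing the same root_dir now get distinct checkpoint files.
a = unique_checkpoint_path("/tmp/experiments")
b = unique_checkpoint_path("/tmp/experiments")
assert a != b
```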
In recent versions (I can't say exactly since when), there seems to be an off-by-one error in dcc.DatePickerRange. I set max_date_allowed = datetime.today().date(), but in the calendar, yesterday is the maximum selectable date. I see it in my apps, and it is also present in the first example on the DatePickerRange documentation page.
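The root cause isn't shown in the report, but the symptom matches an exclusive upper-bound comparison. A hypothetical sketch (these function names are illustrative, not Dash internals) of how an exclusive versus an inclusive check produces exactly this off-by-one:

```python
from datetime import date, timedelta

def selectable_exclusive(d: date, max_date_allowed: date) -> bool:
    # Buggy pattern: treats the bound as exclusive, so "today" is rejected.
    return d < max_date_allowed

def selectable_inclusive(d: date, max_date_allowed: date) -> bool:
    # Expected behaviour: the bound itself is selectable.
    return d <= max_date_allowed

today = date(2021, 9, 20)
yesterday = today - timedelta(days=1)

assert not selectable_exclusive(today, today)  # symptom: today is blocked
assert selectable_exclusive(yesterday, today)  # yesterday still works
assert selectable_inclusive(today, today)      # inclusive check fixes it
```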
Minor, non-breaking issue found during review of #13094.
If the path of the active virtualenv is a prefix of another virtualenv's path, IPython started from the second one will not emit any warning.
Example:
virtualenv aaa
virtualenv aaaa
. aaaa/bin/activate
python -m pip install ipython
. aaa/bin/activate
aaaa/bin/ipython
Expected behavior after executing aaaa/bin/ipython:
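The bug boils down to a raw string-prefix check on paths. A minimal sketch (illustrative names, not IPython's actual code) of the false positive and a path-component-aware fix:

```python
import os

def is_same_env_naive(active: str, interpreter_prefix: str) -> bool:
    # Buggy pattern: plain string prefix check, so "aaa" matches inside "aaaa".
    return interpreter_prefix.startswith(active)

def is_same_env_safe(active: str, interpreter_prefix: str) -> bool:
    # Compare whole path components instead of raw characters.
    a = os.path.normpath(active)
    p = os.path.normpath(interpreter_prefix)
    return p == a or p.startswith(a + os.sep)

assert is_same_env_naive("/envs/aaa", "/envs/aaaa")     # false positive
assert not is_same_env_safe("/envs/aaa", "/envs/aaaa")  # correctly distinct
assert is_same_env_safe("/envs/aaa", "/envs/aaa/bin")   # real subpath still matches
```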
Bug summary
The only way (that I am aware of) to control the linewidth of hatches is through an rc parameter, but temporarily modifying the parameter with plt.rc_context has no effect.
Code for reproduction
import matplotlib.pyplot as plt
plt.figure().subplots().bar([0, 1], [1, 2], hatch=["/", "."], fc="r")
with plt.rc_context({"hatch.linewidth": 5}):
    plt.…
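As a possible workaround until rc_context behaves as expected here, one can set the rc parameter globally before drawing and restore it afterwards. A sketch assuming the headless Agg backend and a hypothetical output filename:

```python
import os

import matplotlib
matplotlib.use("Agg")  # headless backend so the example runs anywhere
import matplotlib.pyplot as plt

# Workaround sketch: set the rc parameter globally before drawing, then
# restore it, instead of relying on plt.rc_context around the draw call.
old = plt.rcParams["hatch.linewidth"]
plt.rcParams["hatch.linewidth"] = 5
try:
    ax = plt.figure().subplots()
    ax.bar([0, 1], [1, 2], hatch="/", fc="r")
    plt.savefig("hatch_demo.png")  # hypothetical filename for the demo
finally:
    plt.rcParams["hatch.linewidth"] = old

assert os.path.exists("hatch_demo.png")
```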
Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor, which fails when it tries to load data from my compressed files.
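One way the predict command could support this, sketched with the standard library only (`smart_open` is a hypothetical helper, not AllenNLP API): dispatch on the .gz extension and route compressed inputs through gzip.

```python
import gzip
import os
import tempfile

def smart_open(path: str, mode: str = "rt"):
    """Open plain or gzip-compressed text files transparently.

    Sketch of the requested behaviour: compressed inputs go through
    gzip.open instead of the plain built-in open.
    """
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)

# Round-trip demo with a temporary gzipped JSON-lines file.
tmp = tempfile.NamedTemporaryFile(suffix=".jsonl.gz", delete=False)
tmp.close()
with gzip.open(tmp.name, "wt") as f:
    f.write('{"sentence": "hello"}\n')
with smart_open(tmp.name) as f:
    lines = f.readlines()
os.unlink(tmp.name)
```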

Describe the issue linked to the documentation
The "20 newsgroups text" dataset can be accessed within scikit-learn through its dataset-loading functions. The dataset contains some text which is considered culturally insensitive.
Suggest a potential alternative/fix
Add a section in the dataset documentation, possibly above the "Recommendation" section called "Data Considerations".
https://