Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
Here are 17,510 public repositories matching this topic...
Most functions in scipy.linalg (e.g. svd, qr, eig, eigh, pinv, pinv2 ...) have a default kwarg check_finite=True that we typically leave at its default value in scikit-learn.
As we already validate the input data for most estimators in scikit-learn, this check is redundant and can cause significant overhead, especially at predict / transform time. We should probably a
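The redundancy described above can be illustrated with a minimal sketch. It assumes NumPy and SciPy are installed; the helper name validated_svd is hypothetical and is not part of scikit-learn. The input is checked for NaN and infinity once, and SciPy's own finiteness check is then skipped through the check_finite keyword.

```python
import numpy as np
from scipy import linalg


def validated_svd(X):
    """Validate the input once, then skip SciPy's duplicate finiteness check.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    # Single up-front validation, similar in spirit to estimator input checks.
    if not np.isfinite(X).all():
        raise ValueError("Input contains NaN or infinity.")
    # check_finite=False avoids scanning the array a second time.
    return linalg.svd(X, full_matrices=False, check_finite=False)


X = np.array([[3.0, 0.0], [0.0, 4.0]])
U, s, Vt = validated_svd(X)
```

Skipping the check is only safe when the caller has already validated the data; with check_finite=False, SciPy documents that non-finite input may lead to crashes or non-termination.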
Currently, in the Columns section of the Explore left data panel, the user sees either
1) the actual column name, when the column has no label, e.g. job_intr_dataengn (column name, no label)
or 2) the label of the column, e.g. yt_codingtuts360 label (label of yt_codingtuts360)
from the view.
When the user hovers over the text, the tooltip shows exactly the same information.
Change: display the actual column name in the tooltip when the column has a label.
Hi, is there an easy way to get the memory occupied by some object ref?
E.g. ray.sizeof(ray.put(object))
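Until something like the requested call exists, a rough estimate can be obtained with the standard library alone. The sketch below is only an approximation: ray.sizeof is the hypothetical API from the question, Ray uses its own serialization for the object store, and the helper name approx_serialized_size is made up for illustration, so the real footprint of an object ref may differ.

```python
import pickle
import sys


def approx_serialized_size(obj):
    """Return the number of bytes in the pickled form of obj.

    Hypothetical helper: a stand-in estimate, not Ray's own accounting.
    """
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


payload = list(range(1000))
size_bytes = approx_serialized_size(payload)  # size of the serialized object
shallow_bytes = sys.getsizeof(payload)        # container only, excludes the elements
```

The pickled size is usually closer to what a store would hold than sys.getsizeof, which reports only the shallow size of the container.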
Travis is not going to automatically offer the free tier for all open source projects; we likely want to migrate away from Travis.
Setting up GitHub Actions to replace Travis would be a welcome contribution.
In recent versions (can't say from exactly when), there seems to be an off-by-one error in dcc.DatePickerRange. I set max_date_allowed = datetime.today().date(), but in the calendar, yesterday is the maximum date allowed. I see it in my apps, and it is also present in the first example on the DatePickerRange documentation page.
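One possible workaround, assuming the picker currently treats max_date_allowed as an exclusive bound (an assumption based on the behavior described above, not something the report confirms), is to pass the day after the last date that should be selectable. The helper name below is hypothetical.

```python
from datetime import date, timedelta


def exclusive_upper_bound(last_selectable: date) -> date:
    """Return the day after last_selectable.

    Hypothetical helper: only useful if the component excludes the
    bound itself, which is an assumption.
    """
    return last_selectable + timedelta(days=1)


# Value that could be passed as max_date_allowed so that today stays selectable.
max_date_allowed = exclusive_upper_bound(date.today())
```

If the off-by-one behavior is fixed upstream, this adjustment would make tomorrow selectable and should be removed.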
When you run a Streamlit app, you get a few messages printed on the terminal. In some extreme cases, you'll see these:
We should flip the order of the following CLI messages to:
- Welcome to Streamlit (...)
- For bette
Similar to ModelCheckpoint(verbose=True), we could add a verbose_progress_bar trainer flag to print the logs to the screen after every epoch.
Not a high priority at all, but it'd be more sensible for such a tutorial/testing utility corpus to be implemented elsewhere (maybe under /test/ or some other data- or doc-related module) rather than in gensim.models.word2vec.
Originally posted by @gojomo in RaRe-Technologies/gensim#2939 (comment)
When setting train_parameters to False, we often may also want to disable dropout/batchnorm; in other words, to run the pretrained model in eval mode.
We've made a small modification to PretrainedTransformerEmbedder that allows specifying whether the token embedder should be forced into eval mode during the training phase.
Do you think this feature might be handy? Should I open a PR?
I'm using mxnet to do some work, but there is nothing when I search the mxnet trial and example.
The current PyTorch implementation ignores the argument split_f in the function train_batch_ch13, as shown below.
def train_batch_ch13(net, X, y, loss, trainer, devices):
    if isinstance(X, list):
        # Required for BERT Fine-tuning (to be covered later)
        X = [x.to(devices[0]) for x in X]
    else:
        X = X.to(devices[0])
    ...
Todo: Define the argument `

(e.g. for links and images), because some of these examples are now being rendered in the docs.
Added by @fchollet in requests for contributions.