Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 19,630 public repositories matching this topic...

jnothman commented May 12, 2021

We should use pkg_resources (or importlib.resources, if our minimum Python version is 3.7) instead of relying on __file__.

$ git grep '__file__' sklearn/
sklearn/__check_build/__init__.py:    local_dir = os.path.split(__file__)[0]
sklearn/datasets/_base.py:    module_path = dirname(__file__)
sklearn/datasets/_base.py:    module_path = dirname(__file__)
sklearn/datasets/_base.py:    
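
As a rough illustration of the suggestion above, the sketch below reads a bundled data file through importlib.resources rather than building a path from __file__. It assumes Python 3.9 or later (for `importlib.resources.files`), and the package name `_resource_demo_pkg`, the file `data.csv`, and the helper `read_package_text` are hypothetical names created only so the example runs on its own; nothing here is taken from scikit-learn.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def read_package_text(package: str, resource: str) -> str:
    """Read a text resource shipped inside `package` without using __file__."""
    return (
        importlib.resources.files(package)
        .joinpath(resource)
        .read_text(encoding="utf-8")
    )


# Build a throwaway package (hypothetical name) so the sketch is self-contained.
_tmp_root = tempfile.mkdtemp()
_pkg_dir = Path(_tmp_root) / "_resource_demo_pkg"
_pkg_dir.mkdir()
(_pkg_dir / "__init__.py").write_text("", encoding="utf-8")
(_pkg_dir / "data.csv").write_text("a,b\n1,2\n", encoding="utf-8")

# Make the temporary package importable.
sys.path.insert(0, _tmp_root)
importlib.invalidate_caches()

content = read_package_text("_resource_demo_pkg", "data.csv")
print(content)
```

The lookup goes through the import system, so the same call should also work when a package is not laid out as plain files on disk, which is the usual argument against `dirname(__file__)`.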

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated May 13, 2021
  • Python
dash
pytorch-lightning
kazhang commented Jun 24, 2021

🚀 Feature

Log non-matching keys when loading checkpoints in non-strict mode.

Motivation

When loading from an older checkpoint or partially initializing the model with pre-trained weights, we call the load_from_checkpoint API with strict=False, but we also want to know which keys are missing and which are unexpected.
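
To make "missing" and "unexpected" concrete, here is a minimal sketch that compares two sets of parameter names and logs the differences. It uses plain Python collections so it runs without PyTorch; the helper names `diff_keys` and `log_non_matching_keys` and the example key names are hypothetical, and this is not how PyTorch Lightning implements loading. In PyTorch itself, `torch.nn.Module.load_state_dict(..., strict=False)` returns an object exposing `missing_keys` and `unexpected_keys`, which carries the same information.

```python
import logging
from typing import Iterable, List, Tuple

logger = logging.getLogger(__name__)


def diff_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) key lists, each sorted for stable output."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_set - ckpt_set)      # model expects these, checkpoint lacks them
    unexpected = sorted(ckpt_set - model_set)   # checkpoint has these, model does not
    return missing, unexpected


def log_non_matching_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Log any non-matching keys as warnings and return them."""
    missing, unexpected = diff_keys(model_keys, checkpoint_keys)
    if missing:
        logger.warning("Missing keys (not found in checkpoint): %s", missing)
    if unexpected:
        logger.warning("Unexpected keys (not used by model): %s", unexpected)
    return missing, unexpected


# Hypothetical key names: the head was renamed between checkpoint and model.
model_keys = ["encoder.weight", "encoder.bias", "head.weight"]
checkpoint_keys = ["encoder.weight", "encoder.bias", "old_head.weight"]
missing, unexpected = log_non_matching_keys(model_keys, checkpoint_keys)
```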

Pitch

When [loading model states](https://github.com/PyTorchLightning/

juergspaak commented Jun 16, 2021

Calling plt.plot(np.ones(10), np.ones((10,0))) raises a ZeroDivisionError, which I found confusing.

Code for reproduction

import matplotlib.pyplot as plt
import numpy as np

plt.plot(np.ones(10), np.ones((10,0)))

This raises the error:

ZeroDivisionError: integer division or modulo by zero
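
Until the library reports this case more clearly, a caller could check the inputs before plotting. The sketch below is one possible guard, not matplotlib behaviour: `validate_plot_data` is a hypothetical helper that rejects arrays with no elements and names the offending shape in the message.

```python
import numpy as np


def validate_plot_data(x, y):
    """Return x and y as arrays, raising ValueError if either has no elements."""
    x_arr, y_arr = np.asarray(x), np.asarray(y)
    for name, arr in (("x", x_arr), ("y", y_arr)):
        if arr.size == 0:
            raise ValueError(
                f"{name} has shape {arr.shape} and contains no data to plot"
            )
    return x_arr, y_arr


# The inputs from the report: y has ten rows but zero columns.
try:
    validate_plot_data(np.ones(10), np.ones((10, 0)))
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```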

Expected outcome

I think however, it should either r

gensim
danieldeutsch commented Jun 2, 2021

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file itself and reads lines for the Predictor, so it fails when it tries to load data from my compressed files.
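
One way such input could be handled is to detect gzip data by its magic bytes and open the file accordingly, so the same line reader works for both plain and compressed files. This is a standard-library sketch only; `open_maybe_gzip` and `read_lines` are hypothetical helpers and not part of AllenNLP's API, and the file names and contents are made up for the demonstration.

```python
import gzip
import os
import tempfile
from typing import Iterator


def open_maybe_gzip(path: str):
    """Open `path` as UTF-8 text, decompressing if it starts with the gzip magic bytes."""
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def read_lines(path: str) -> Iterator[str]:
    """Yield lines without the trailing newline, for plain or gzipped files."""
    with open_maybe_gzip(path) as handle:
        for line in handle:
            yield line.rstrip("\n")


# Write the same two records as plain text and as gzip, then read both back.
payload = '{"text": "a"}\n{"text": "b"}\n'
with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "data.jsonl")
    gz_path = os.path.join(tmp, "data.jsonl.gz")
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write(payload)
    with gzip.open(gz_path, "wt", encoding="utf-8") as f:
        f.write(payload)
    txt_lines = list(read_lines(txt_path))
    gz_lines = list(read_lines(gz_path))
```

Checking the magic bytes rather than the file extension means a compressed file is still read correctly if it was renamed.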

nni