Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 17,370 public repositories matching this topic...

ogrisel commented Nov 13, 2020

Most functions in scipy.linalg (e.g. svd, qr, eig, eigh, pinv, pinv2 ...) have a default kwarg check_finite=True that we typically leave at its default value in scikit-learn.

As we already validate the input data for most estimators in scikit-learn, this check is redundant and can cause significant overhead, especially at predict / transform time. We should probably a
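A minimal sketch of the idea described above, assuming the input has already been validated once by the caller. The helper name `validated_svd` is hypothetical and is not scikit-learn's actual code; only `check_finite` is taken from the issue text.

```python
import numpy as np
from scipy import linalg


def validated_svd(X):
    """Validate X once, then skip SciPy's own finiteness check (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # Single up-front validation, standing in for an estimator's input checks.
    if not np.isfinite(X).all():
        raise ValueError("Input contains NaN or infinity.")
    # The data is known to be finite here, so the per-call check is redundant.
    return linalg.svd(X, full_matrices=False, check_finite=False)


X = np.array([[3.0, 0.0], [0.0, 4.0]])
U, s, Vt = validated_svd(X)
```

Passing check_finite=False is only safe when the finiteness of the input is guaranteed elsewhere; otherwise NaN or infinite values can reach the underlying LAPACK routines unchecked.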

superset
mistercrunch commented Feb 16, 2021

BigQuery error is hard to read.

Expected results

In Explore, when creating a bad expression (say DATE_TRUNC(column_that_dont_exist, DAY)) in BigQuery, the DatabaseError is shown as an UnknownError. In SQL Lab, DatabaseErrors are surfaced properly and make sure to use a monospace font so that the formatting is preserved. For most databases, the formatting doesn't matter much, but for BigQ

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Feb 18, 2021
  • Python
barakmich commented Feb 16, 2021

This is the comment as mentioned here https://github.com/ray-project/ray/pull/14122/files#diff-10f3fda5ddb0ff3dbb8f347dd7fc53101d2dd140585e72f2d55be831bd5455dbR134

What is the problem?

In most cases, a client object's lifetime matches its ID, but this isn't so with named actors. Performance can be improved by reverting this call to non-blocking.

Reproduction outline

Here's how na

dash
wjaskowski commented Dec 22, 2020

Summary

When a function has print('sth', file=sys.stderr) in the body I get:

InternalHashError: [Errno 2] No such file or directory: '<stderr>'

While caching the body of eval_models_on_all_data(), Streamlit encountered an object of type _io.TextIOWrapper, which it does not know how to hash.

Steps to reproduce

Code snippet:

@st.cache
def f():
   prin
pytorch-lightning
gensim
allenciox commented Jan 27, 2021

Unfortunately, I can't find any examples of what it does do, but it appears to take a file that is not a single JSON document and instead has a complete JSON object on every line of the file, with two keys: "text" and "label".
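To illustrate the format described above, here is a minimal standard-library sketch that parses one JSON object per line and pulls out the "text" and "label" keys. The function name `read_json_lines` and the sample records are assumptions made up for illustration; this is not the library's actual reader.

```python
import io
import json


def read_json_lines(lines):
    """Yield (text, label) pairs from an iterable of JSON-encoded lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)  # each line is a complete JSON object
        yield record["text"], record["label"]


# Hypothetical sample data: the file as a whole is not valid JSON,
# but every individual line is.
sample = io.StringIO(
    '{"text": "great movie", "label": "pos"}\n'
    '{"text": "dull plot", "label": "neg"}\n'
)
pairs = list(read_json_lines(sample))
```

Calling json.load on the whole file would fail for such input, because two top-level objects in a row are not one valid JSON document.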

The code showing that is in its _read method:
....
def _read(self, file_path):
    with open(cached_path(file_path), "r") as data_file:
        for line in