pydata
Here are 71 public repositories matching this topic...
Series.reindex
Implement Series.reindex.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.reindex.html
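As a rough illustration of what the requested method has to do: pandas documents `Series.reindex` as conforming a Series to a new index, placing NaN where the previous index had no value. The sketch below mimics that behavior on a plain dict. The function name `reindex_sketch` and the dict-based representation are assumptions for illustration only, not the project's implementation, and it assumes unique labels.

```python
from typing import Any, Dict, Hashable, Mapping, Sequence


def reindex_sketch(
    data: Mapping[Hashable, Any],
    new_index: Sequence[Hashable],
    fill_value: Any = float("nan"),
) -> Dict[Hashable, Any]:
    """Conform label -> value data to new_index (hypothetical helper).

    Labels present in data keep their value; labels missing from data
    receive fill_value. Labels not listed in new_index are dropped.
    """
    return {label: data.get(label, fill_value) for label in new_index}


series = {"a": 1, "b": 2, "c": 3}
result = reindex_sketch(series, ["b", "c", "d"], fill_value=0)
```

Here "a" is dropped because it is absent from the new index, and "d" is filled because it is absent from the original data.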
IEX has a free plan that offers 500,000 messages per month; the next cheapest paid plan gets you 5,000,000 per month.
Their API offers a way to retrieve only the adjusted close for historical data, which returns a df with date, close, and volume only. open, low, and high are dropped and not transmitted, so you won't be billed for them. This will save you 50% of your messages if you don't need those values.
When workers die or are halted, it can be useful to see when that worker was last seen by the scheduler. We should bubble this information up to the dashboard.
client.scheduler_info()
'tcp://172.17.0.2:43161': {'type': 'Worker',
'id': 'tcp://172.17.0.2:43161',
'host': '172.17.0.2',
'resources': {},
'local_directory': '/notebooks/dask-worker-space/worker-k540965f',
'name':
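A minimal sketch of the "last seen" idea, using only the standard library. It assumes the scheduler info is a dict with a `workers` mapping and that each worker entry carries a `last_seen` Unix timestamp; both names are assumptions here, since the output above is truncated before any such field appears. The helper `seconds_since_last_seen` is hypothetical.

```python
import time
from typing import Any, Dict, Mapping, Optional


def seconds_since_last_seen(
    info: Mapping[str, Any], now: Optional[float] = None
) -> Dict[str, Optional[float]]:
    """Map each worker address to the seconds elapsed since it was last seen.

    Assumes info["workers"][address]["last_seen"] is a Unix timestamp
    (an assumption, not confirmed by the truncated output above).
    Workers without that field map to None.
    """
    now = time.time() if now is None else now
    ages: Dict[str, Optional[float]] = {}
    for address, worker in info.get("workers", {}).items():
        last_seen = worker.get("last_seen")
        ages[address] = None if last_seen is None else now - last_seen
    return ages


# Hypothetical payload shaped like the output above, plus an assumed field.
sample_info = {
    "workers": {
        "tcp://172.17.0.2:43161": {
            "type": "Worker",
            "host": "172.17.0.2",
            "last_seen": 1000.0,  # assumed field name
        }
    }
}
ages = seconds_since_last_seen(sample_info, now=1012.5)
```

A dashboard could render these ages next to each worker address and flag workers whose age exceeds some threshold.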
It would be nice to add one or more tutorials that reproduce the Matrix Profile Top Ten paper. The accompanying data is available on their Google Sites page.
It might be best to make the individual top ten sections separate items (i.e., a sub-list) that roll up under one tutorial.
UPDATE FROM MAINTAINERS: ANYBODY WHO IS INTERESTED IN THIS ISSUE, PLEASE SEE THIS COMMENT FOR PROPOSED CODE.
Brief Description
The clean_names method does not work when an integer is used as a column name.
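One plausible way to tolerate non-string labels is to cast each label to `str` before applying any string operations. The sketch below is not the library's actual implementation; `clean_name` is a hypothetical helper and the normalization rules (lowercase, non-word runs replaced by underscores) are assumptions.

```python
import re
from typing import Hashable


def clean_name(name: Hashable) -> str:
    """Normalize a column label to snake_case (hypothetical helper).

    Casting to str first means integer labels such as 2019 no longer
    fail on string-only methods like .lower().
    """
    text = str(name).strip().lower()
    # Replace each run of non-word characters with a single underscore.
    text = re.sub(r"[^\w]+", "_", text)
    return text.strip("_")


columns = ["Countries To Play Cricket", 2019, " Test-Cricket "]
cleaned = [clean_name(column) for column in columns]
```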
Minimally Reproducible Code
rankings = {
"countries_to_play_cricket": [
"Indi
Machine learning with scikit-learn tutorial at PyData Chicago 2016
Updated Jul 17, 2019 - Jupyter Notebook
Notebooks for the Seattle PyData 2017 talk on Scattertext
Updated Dec 6, 2019 - HTML
Problem description
The distributed scheduler usually relies on knowledge about the size of the computation result and makes certain scheduling decisions based on it (e.g. work stealing). Our main data class, the MetaPartition, should implement a __sizeof__ which performs a deep size calculation (including data frames, indices, etc.) to give the scheduler the best chance on making the
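A minimal sketch of a deep `__sizeof__`, assuming the size estimate ultimately reaches `sys.getsizeof`, which calls an object's `__sizeof__`. The class `PartitionSketch` and its attributes are hypothetical stand-ins; a real MetaPartition would also have to account for data frames and indices, which this sketch does not attempt.

```python
import sys
from typing import Dict


class PartitionSketch:
    """Hypothetical container standing in for a partition-like object."""

    def __init__(self, label: str, payloads: Dict[str, bytes]) -> None:
        self.label = label
        self.payloads = payloads

    def __sizeof__(self) -> int:
        # Start from the shallow size of the instance itself.
        size = object.__sizeof__(self)
        # Add the objects this instance refers to (the "deep" part).
        size += sys.getsizeof(self.label)
        size += sys.getsizeof(self.payloads)
        for key, value in self.payloads.items():
            size += sys.getsizeof(key) + sys.getsizeof(value)
        return size


part = PartitionSketch("p0", {"table": b"x" * 10_000})
shallow = object.__sizeof__(part)
deep = sys.getsizeof(part)
```

Without the override, the reported size would reflect only the small instance header, regardless of how much data the payloads hold.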
In trying to write tests for #189, I'm finding it very difficult to add columns to existing tests, as in some cases, like the all_types table, the table is defined in a separate file from the tests and multiple tests try to write to the same table.
Additionally, our test suite doesn't prove that the data that are uploaded are the same as the data downloaded for all types.
We should consider m
Repo for my talk at the PyData Berlin 2017 conference
Updated Dec 1, 2019 - Jupyter Notebook
Consider looking into pandas.DataFrame.applymap() (it may or may not be usable in the presence of novel levels and missing values) and also into a data_algebra version of .transform().
A personally elaborated collection of data science notes written entirely in Jupyter notebooks.
Updated Sep 10, 2019 - Jupyter Notebook
Slides and notebooks for my tutorial at PyData London 2018
Updated Aug 11, 2019 - Jupyter Notebook
@matthewbrems and I presented "Recreating, Understanding, and Visualizing FiveThirtyEight's Elections Forecast" at PyData DC 2018
Updated Jul 15, 2019 - Jupyter Notebook
PyData 2017 workshop: build a clickbait detector with Python
Updated Aug 31, 2017 - Jupyter Notebook
Material for working alongside my workshop session at PyData Berlin 2018
Updated Nov 7, 2019 - Shell
This is the code and presentation for my PyData2017 talk "Reverse Image Search Using Out-of-the-box Machine Learning Libraries"
Updated Nov 26, 2019 - HTML
An example of how the LIME algorithm can be used to provide real-world insight into the decision processes of a 'black-box' machine learning algorithm, in this case a Random Forest regressor.
Updated Jul 23, 2019 - Jupyter Notebook
Deployment of PyData Parallel Tutorial on Kubernetes
Updated Apr 23, 2018 - Python
All the documents for PyDataBratislava
Updated Apr 30, 2019 - Jupyter Notebook
Social network analyses code examples for PyCon 2019 talk
Updated Oct 14, 2019 - Jupyter Notebook
Battle-hardened advice on efficient data loading for deep learning on videos.
Updated Oct 24, 2019 - Python
A friendly pandas wrapper with more composable grammar support.
Updated Mar 31, 2019 - Jupyter Notebook
The Dask documentation references a to_numeric method: https://docs.dask.org/en/latest/dataframe-api.html#dask.dataframe.DataFrame.astype
I can't seem to find where that code exists. Is to_numeric implemented in Dask?