data-cleaning

Great work, Sam!
janitor is wonderful.

By the way, here is a shortcut to get totals
for both rows and columns:

library(janitor)  # provides tabyl() and adorn_totals()

mtcars %>%
  tabyl(am, cyl) %>%
  adorn_totals(c("row", "col"))

    am  4 6  8 Total
     0  3 4 12    19
     1  8 3  2    13
 Total 11 7 14    32

So, an (easy) suggestion: also allow the keyword "both" as a parameter, as in:

adorn_totals("both")

or maybe simply:

adorn_totals()

Less coding, easier.
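
To make the suggestion concrete, here is a minimal sketch in Python (not janitor's R implementation) of how a totals helper could treat "both" as shorthand for c("row", "col"). The function name add_totals and its where parameter are hypothetical; the counts are the ones from the mtcars table above.

```python
def add_totals(table, where="both"):
    """Return a copy of a nested-dict crosstab with totals appended.

    `table` maps row label -> {column label -> count}.
    `where` may be "row", "col", "both", or a list of these
    (a hypothetical API, used here only for illustration).
    """
    targets = {where} if isinstance(where, str) else set(where)
    if "both" in targets:
        targets = {"row", "col"}
    unknown = targets - {"row", "col"}
    if unknown:
        raise ValueError(f"unknown totals target(s): {sorted(unknown)}")

    # Copy each row so the caller's table is left untouched.
    result = {label: dict(cells) for label, cells in table.items()}

    if "row" in targets:
        # A totals row holds the sum of each column.
        columns = list(next(iter(table.values())))
        result["Total"] = {c: sum(row[c] for row in table.values()) for c in columns}

    if "col" in targets:
        # A totals column holds the sum of each row (including the totals row, if any).
        for cells in result.values():
            cells["Total"] = sum(cells.values())

    return result


# Counts of mtcars cars by transmission (am) and number of cylinders (cyl).
counts = {0: {4: 3, 6: 4, 8: 12}, 1: {4: 8, 6: 3, 8: 2}}
both = add_totals(counts, "both")
print(both["Total"])
```

With this shape, "both" becomes the default, so the shortest call covers the most common case while "row" and "col" remain available on their own.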

Documentation problem

The current documentation demonstrates pandera usage with the pa.PandasDtype enum, which can look unfamiliar to newcomers, especially since pandera now supports Python types and NumPy scalar types. For example, see:

documentation example: https://pandera.readthedocs.io/en/stable/#quick-start
docstrings example: https://githu

Write unit test coverage for SafeDataset and SafeDataLoader, along with the functions in utils.py.
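
As a starting point, here is a minimal sketch of what such tests could look like, using the standard-library unittest module. SkippingDataset is a hypothetical stand-in that skips samples whose loading raises an exception; it is not the project's SafeDataset, and real tests would import the actual classes and the helpers in utils.py instead.

```python
import unittest


class SkippingDataset:
    """Hypothetical stand-in: wraps a sequence-like dataset and skips samples that fail to load."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __iter__(self):
        for index in range(len(self.dataset)):
            try:
                yield self.dataset[index]
            except Exception:
                # Treat any loading error as an unusable sample and move on.
                continue


class FlakyDataset:
    """Toy dataset whose odd indices raise, simulating corrupt samples."""

    def __len__(self):
        return 6

    def __getitem__(self, index):
        if index % 2:
            raise ValueError(f"corrupt sample {index}")
        return index


class SkippingDatasetTest(unittest.TestCase):
    def test_skips_failing_samples(self):
        self.assertEqual(list(SkippingDataset(FlakyDataset())), [0, 2, 4])

    def test_passes_through_clean_data(self):
        self.assertEqual(list(SkippingDataset([1, 2, 3])), [1, 2, 3])
```

Saved in a test module, these cases can be run with python -m unittest. A deliberately failing toy dataset keeps the tests fast and independent of any real data files.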

As detailed in:
https://github.com/marketplace/actions/run-circleci-artifacts-redirector?version=0.1.0

It is used to link from the PR to the docs rendered by CircleCI, for instance in scikit-learn or sphinx-gallery. It helps with reviewing PRs.

Context

Why do we add this issue?

Our goal is to make it easy to visualise data and to make those visualisations high quality and thus trustworthy. This also means limiting what users can do, so that they do not make mistakes.

Problem or idea

What is the cause?

Line charts are similar to scatter plots except that the measurement points are ordered by their x-axis va

In this lesson, the section on Dates and Numbers uses the simple date format syntax to output the date in a human-readable format. I think the lesson could benefit from an explanation of simple date format, or at least a reference/link to the Wiki page on [GREL Date Functions](https://github.com/OpenRefine/OpenRef

data-cleaning

Here are 670 public repositories matching this topic...

johnkerl / miller

justmarkham / DAT8

justmarkham / pandas-videos

cgnorthcutt / cleanlab

ironmussa / Optimus

data-forge / data-forge-ts

sfirke / janitor

pandera-dev / pandera

msamogh / nonechucks

data-cleaning / validate

jim-schwoebel / voicebook

dirty-cat / dirty_cat

ekstroem / dataMaid

ChrisMuir / refinr

ajaymache / data-analysis-using-python

akanz1 / klib

ironmussa / Bumblebee

HoloClean / HoloClean-Legacy-deprecated

akvo / akvo-lumen

msberends / clean

jim-schwoebel / allie

LoLei / redditcleaner

ropensci / taxa

scottythered / gratefuldata

ammsa / DTCleaner

sharmaroshan / Drugs-Recommendation-using-Reviews

dssg / pgdedupe

iam-mhaseeb / Skytrax-Data-Warehouse

ropensci / scrubr

LibraryCarpentry / lc-open-refine
