#data-cleansing

Here are 88 public repositories matching this topic...

Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It can also run data-cleaning scenarios built on these algorithms. Desbordante has a console version and an easy-to-use web application.

  • Updated May 24, 2022
  • C++

This is a binary classification problem related to Autistic Spectrum Disorder (ASD) screening in adults. Given some attributes of a person, my model predicts whether that person may have ASD, using several supervised learning techniques and a multi-layer perceptron.

  • Updated May 15, 2018
  • Jupyter Notebook

This project works with data collected from the donor database of the Blood Transfusion Service Center in Hsin-Chu City, Taiwan. About every three months, the center sends its blood transfusion service bus to a university in Hsin-Chu City to collect donated blood. The dataset, obtained from the UCI Machine Learning Repository, consists of a random sample of 748 donors. The task is to predict whether a blood donor will donate within a given time window. The work covers the full model-building process, from inspecting the dataset to using the tpot library to automate the machine learning pipeline.

  • Updated Nov 28, 2019
  • Python
