Quizzes & Assignment Solutions for the Google Data Analytics Professional Certificate on Coursera. Also includes a few resources on the side that I found helpful.

python data-science data google r sql dataviz excel decision-making coursera data-analytics data-analysis quiz data-cleansing assignment-solutions professional-certificates

Updated Apr 20, 2022

bakdata / dedupe

Java DSL for (online) deduplication

data-cleaning deduplication duplicate-detection data-cleansing duplicate-removal

Updated Sep 28, 2021
Java

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

data-science data-mining exploratory-data-analysis tabular-data feature-selection data-engineering feature-extraction data-analytics knowledge-discovery data-wrangling data-preprocessing feature-engineering spreadsheets data-exploration data-mining-algorithms data-cleaning data-profiling anomaly-detection data-cleansing correlations

Updated May 24, 2022
C++

AlexLamson / DataWrangler

Open

when aligning columns, word wrap isn't always off in the new tab

AlexLamson opened Jun 27, 2018

enhancement good first issue

AP-State-Skill-Development-Corporation / Data-Science-Using-Python-Internship-EB1

This repo was created for sharing the required/discussed files during the Online Internship training program on Data Science Using Python in May 2021

data-science machine-learning python3 data-visualisation data-analysis data-cleansing

Updated Jul 14, 2021
Jupyter Notebook

data-forge / data-forge-fs

This library contains the file system extensions to Data-Forge that allow it to directly read and write CSV and JSON files in Node.js

visualization nodejs javascript linq json data csv pandas data-visualization data-analysis data-wrangling data-management data-manipulation data-cleaning data-munging data-cleansing data-forge

Updated Feb 11, 2022
TypeScript

mtimjones / dataprocessing

Data cleanse, clustering with Vector Quantization and Adaptive Resonance Theory

data-science vector-quantization art1 data-cleansing

Updated Dec 10, 2017
C

LieseB-1746743 / data-cleaning

Data cleaning tool.

data-clustering data-cleaning data-profiling data-cleansing cleaning-data

Updated Apr 20, 2021
JavaScript

brunocampos01 / porto-seguro-safe-driver-prediction

Predict if a driver will file an insurance claim next year. (Kaggle Competition)

python challenge data-science machine-learning random-forest kaggle data-engineering dataset kaggle-competition xgboost data-cleansing porto-seguro insurance-claims

Updated Jan 11, 2022
Python

kbasu2016 / Autism-Detection-in-Adults

This is a binary classification problem related to Autistic Spectrum Disorder (ASD) screening in adults. Given some attributes of a person, my model can predict whether the person is likely to have ASD, using different supervised learning techniques and a multi-layer perceptron.

random-forest naive-bayes-classifier supervised-learning support-vector-machine data-wrangling decision-tree-classifier k-nearest-neighbours quadratic-discriminant-analysis linear-discriminant-analysis mlp-classifier data-cleansing logistic-regression-models

Updated May 15, 2018
Jupyter Notebook

aminkhod / TA--Course-ofData-mining--Fall-2018

Here are some implementations and uses of methods from the Topics on Data Mining course

data-science clustering machine-learning-algorithms cross-validation regression data-visualization feature-selection classification deep-learning-tutorial data-minig data-cleansing teaching-assistant

Updated Nov 13, 2019
Python

sontron / madis

Manipulating and Analyzing Data Interactively with Shiny

visualization data-mining r shiny interactive data-manipulation linear-models data-cleansing

Updated Mar 18, 2021
R

extremecode / stress-detection-in-social-networks

stress detection in social networks

social crawler data machine-learning twitter tweets sentiment-analysis stress detection media auc glmnet tf-idf doc2vec stemming stemming-algorithm livedata data-cleansing

Updated Nov 10, 2019
R

Data-Wrangling-with-JavaScript / Chapter-6

Code examples for Chapter 6 of Data Wrangling with JavaScript

nodejs javascript data-science node data-analysis preparation node-js data-wrangling data-preparation data-cleaning data-cleansing

Updated Apr 9, 2022
JavaScript

JoeRegnier / horkos

Data quality analysis and scoring system.

report-card data machine-learning natural-language-processing automation data-analysis scorecard data-quality data-cleansing data-scores

Updated Apr 22, 2022
Python

Ramanthan / Categorical_Feature_Binary_Variables_Encoding

Categorical Binary Feature encoding script

datascience data-visualisation feature-engineering encoding-model data-cleansing missing-value-treatment

Updated Mar 15, 2020
Jupyter Notebook

astradrel / ecommercedashboard

This project showcases a business dashboard for an E-Commerce company based on its 2008 - 2012 sales records

dashboard data-visualization business-intelligence data-analysis tableau data-cleansing

Updated Nov 2, 2021
Jupyter Notebook

vishaltyagi94 / Tennis-Aus-Open-Player-Stats

D3 visualizations displaying the attribute comparison between the winners of each year

visualization d3 data-science data-visualization data-engineering data-analysis tennis radar-chart d3js line-chart data-cleansing attribute-comparison

Updated Apr 18, 2019
HTML

jankubierecki / python-ds

Some practical examples for learning data science with Python

python big-data anaconda numpy scikit-learn jupyter-notebook pandas matplotlib data-cleansing

Updated Apr 15, 2019
Jupyter Notebook

siegstedt / predict_blood_donation

This project works with data collected from the donor database of the Blood Transfusion Service Center in Hsin-Chu City, Taiwan. About every three months, the center sends its blood transfusion service bus to a university in Hsin-Chu City to collect donated blood. The dataset, obtained from the UCI Machine Learning Repository, consists of a random sample of 748 donors. The task is to predict whether a blood donor will donate within a given time window. The work covers a full model-building process, from inspecting the dataset to using the tpot library to automate the machine learning pipeline.

machine-learning healthcare data-manipulation life-sciences data-cleansing