# data-pipelines
Here are 51 public repositories matching this topic...
MLeap: Deploy Spark Pipelines to Production
Updated Aug 25, 2020 - Scala
Dataform is a framework for managing SQL-based data operations in BigQuery, Snowflake, and Redshift
Updated Sep 11, 2020 - TypeScript
Relational data pipelines for the science lab
Topics: mysql, python, s3, databases, pipeline-framework, scientific-computing, cloud-computing, data-analysis, relational-databases, data-pipelines, workflow-management, datajoint, relational-algebra, relational-model
Updated May 26, 2020 - Python
The fastest way to access and manage datasets for PyTorch and TensorFlow. Easily build scalable data pipelines. https://activeloop.ai
Topics: training, data-science, machine-learning, deep-learning, tensorflow, pytorch, distributed, data-scientists, datasets, data-pipelines, training-data
Updated Sep 10, 2020 - Python
This is an Open Source PHP Reporting Framework which you can use to write perfect data reports or to construct awesome dashboards using PHP
Topics: php, framework, reporting, data-visualization, data-viz, data-analysis, reporting-engine, data-pipelines, report-generator, php-reports, mysql-reporting-tools, php-reporting-tools, data-pivot, data-summarization, reporting-tool
Updated Sep 11, 2020 - PHP
Spark-Transformers: Library for exporting Apache Spark MLlib models so they can be used in any Java application with no other dependencies.
Topics: java, export, machine-learning, scala, spark, apache-spark, machine-learning-algorithms, transformers, mllib, machine-learning-library, data-pipelines
Updated Dec 15, 2017 - Java
Example of an ETL Pipeline using Airflow
Updated Aug 30, 2017 - Python
ARAKAT - Big Data Analysis and Business Intelligence Application Development Platform
Topics: docker, distributed-systems, docker-swarm, business-intelligence, data-pipelines, big-data-analytics, predictive-maintenance, cloud-native-applications
Updated Sep 7, 2020 - Python
A Pachyderm deep learning tutorial for conference workshops
Topics: python, docker, kubernetes, data-science, machine-learning, deep-learning, containers, data-engineering, data-pipelines
Updated Aug 2, 2017 - Python
Framework for data processing
Updated Nov 10, 2019 - Python
Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation, validation and loading of data from S3 -> Redshift -> S3
Updated Sep 30, 2019 - Python
Provides an extensible solution for creating Data Processing Pipelines in F#.
Updated Apr 7, 2018 - F#
Framework to quickly build and maintain Smart Data Lakes
Topics: scala, spark, hive, hadoop, transform-data, data-lake, data-pipelines, comprehensive, deltalake, smart-data-lake
Updated Sep 8, 2020 - Scala
Using Apache Airflow to author, run and monitor complex data pipelines.
Updated Oct 24, 2018 - Jupyter Notebook
Practical use cases showing how to make your machine learning pipelines robust and reliable using Apache Airflow.
Updated Jul 27, 2020 - Python
Create production-ready Dataflow projects in a zap! ⚡
Updated Jan 2, 2020 - Python
aredier commented on Nov 3, 2019
Errors that occur on the server are not transmitted to the client; instead we get this generic error:
ValueError: the execution of the pipeline failed, see _deployment logs for traceback
which doesn't help and is getting on my nerves.
An example Pachyderm ML pipeline using Nervana Neon
Updated Mar 23, 2017 - Python
Building data processing pipelines for document processing with NLP using Apache NiFi and related services
Topics: nlp, elasticsearch, kibana, rest, data-integration, nifi, apache-nifi, data-pipelines, electronic-health-records
Updated Aug 3, 2020 - Jupyter Notebook
Easy-to-use in-app micro-ETL framework for building data processing pipelines.
Updated Oct 11, 2017 - C#
Source code for guide to run Apache Airflow on Kubernetes
Updated Apr 13, 2020 - Python
A framework for microservices
Topics: aws, workflow, devops, microservices, dashboard, flexible, deployment, etl, analytics, deploy, plugins, deployments, devops-tools, flexibility, data-pipelines, devops-services, pluggable, data-pipeline, devops-workflow, etl-framework
Updated Nov 5, 2018 - JavaScript
A framework for fast development of scalable data pipelines following a simple design pattern
Topics: python, data-science, data, machine-learning, data-mining, pipeline, pipelines, design-patterns, pipeline-framework, data-analytics, data-analysis, task-queue, reproducibility, data-processing, data-pipelines, pipeline-stages, data-abstraction
Updated Jul 29, 2020 - Python
A suite of tools written in PyRAF, Astropy, SciPy, and NumPy to process individual QuickReduced images into single stacked images using a set of "best practices" for ODI data.
Updated May 18, 2020 - Python
Project 5 - Data Engineering Nanodegree
Updated Jun 26, 2019 - Python
Quick way to deploy Airflow Multi-Node Cluster (a.k.a. Airflow Celery Executor Setup)
Updated Aug 6, 2020 - Python
Implemented a data warehouse and data lake on AWS, and data modeling with Postgres and Apache Cassandra. Also used Apache Airflow to create a data pipeline.
Updated Jul 2, 2020 - Jupyter Notebook
Updated Jul 28, 2020 - Python
Supplementary material for DOLAP 2019 submission
Updated Dec 17, 2018