Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
Updated Jun 16, 2023 - Python
DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics. DataOps applies to the entire data lifecycle from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations.
Redpanda Console is a developer-friendly UI for managing your Kafka/Redpanda workloads. Console gives you a simple, interactive approach for gaining visibility into your topics, masking data, managing consumer groups, and exploring real-time data with time-travel debugging.
The open standard for data logging
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
Kafka Docker for development. Kafka, Zookeeper, Schema Registry, Kafka-Connect, Landoop Tools, 20+ connectors
Cloud Native DataOps & AIOps Platform
Open-source data observability for analytics engineers.
Extract & Load with joy — CLI & version control for ELT without limitations. No more black box. Let your creativity flow.
Streaming reference architecture for ETL with Kafka and Kafka-Connect. Learn more at http://lenses.io about how we provide a unified solution to manage your connectors, the most advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
A list of tools for annotating data, managing annotations, etc.
Easy data pipelines for security teams.
DataOps for the Modern Data Warehouse on Microsoft Azure. https://aka.ms/mdw-dataops.
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
Build, run and manage your data pipelines with Python or SQL on any cloud
Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way