delta-io / delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
See what the GitHub community is most excited about today.
The Scala 3 compiler, also known as Dotty.
CMAK is a tool for managing Apache Kafka clusters
Spark: The Definitive Guide's Code Repository
Flink & Spark development scaffold. Make stream processing easier! The original intention of StreamX is to make the development of Flink easier. StreamX focuses on the management of development phases and tasks. Our ultimate goal is to build a one-stop big data solution integrating stream processing, batch processing, data warehouse and data l…
Apache Spark - A unified analytics engine for large-scale data processing
sbt, the interactive build tool
A fault tolerant, protocol-agnostic RPC system
Twitter-Server defines a template from which services at Twitter are built
A Scala library for writing HTTP apps.
The enterprise-grade behavioral data engine (web, mobile, server-side, webhooks), running cloud-natively on AWS and GCP
Protocol definition for generic messages.
Mirror of Apache Livy (Incubating)
Apache Spark Connector for Azure Cosmos DB
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
The pure asynchronous runtime for Scala
Modern Load Testing as Code
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Declarative, type-safe web endpoints library
Streaming reference architecture for ETL with Kafka and Kafka-Connect. See http://lenses.io for more on how we provide a unified solution to manage your connectors, the most advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.
Build highly concurrent, distributed, and resilient message-driven applications on the JVM
ZIO — A type-safe, composable library for async and concurrent programming in Scala
XML data source for Spark SQL and DataFrames
Manage your Kafka ACLs at scale