Databricks
- wherever there is data
- https://databricks.com
Repositories
-
koalas
Koalas: pandas API on Apache Spark
-
tech-talks
This repository contains the notebooks and presentations we use for our Databricks Tech Talks
-
spark-xml
XML data source for Spark SQL and DataFrames
-
spark-deep-learning
Deep Learning Pipelines for Apache Spark
-
databricks-cli
Command Line Interface for Databricks
-
containers
Sample base images for Databricks Container Services
-
jetty.project
Forked from eclipse/jetty.project
Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more
-
jarjar
Forked from shevek/jarjar
Jar Jar Links is a utility that makes it easy to repackage Java libraries and embed them into your own distribution.
-
LearningSparkV2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
-
containerregistry
Forked from google/containerregistry
A set of Python libraries and tools for interacting with a Docker Registry.
-
rules_docker
Forked from bazelbuild/rules_docker
Rules for building and handling Docker images with Bazel
-
subpar
Forked from google/subpar
Subpar is a utility for creating self-contained Python executables. It is designed to work well with Bazel.
-
scala-style-guide
Databricks Scala Coding Style Guide
-
Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
-
learning-spark
Example code from Learning Spark book
-
intellij-jsonnet
IntelliJ Jsonnet Plugin
-
tensorframes
[DEPRECATED] TensorFlow wrapper for DataFrames on Apache Spark
-
benchmarks
A place in which we publish scripts for reproducible benchmarks.
-
databricks-accelerators
Accelerate the use of Databricks for customers [public repo]
-
spark-sklearn (Archived)
(Deprecated) Scikit-learn integration package for Apache Spark
-
spark-redshift
Redshift data source for Apache Spark
-
pig-on-spark
Proof-of-concept implementation of Pig-on-Spark integrated at the logical node level