@databrickslabs

Databricks Labs

Labs projects to accelerate use cases on the Databricks Unified Analytics Platform

Pinned

  1. dolly Public

    Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

    Python 10.3k 1.1k

  2. dbx Public

    🧱 Databricks CLI eXtensions (aka dbx), a CLI tool for development and advanced management of Databricks workflows.

    Python 344 106

  3. mosaic Public

    An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.

    Scala 193 45

Repositories

  • overwatch Public

    Capture deep metrics on one or all assets within a Databricks workspace

  • tempo Public

    API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, sum, count, etc.), AS OF joins, downsampling, and interpolation

    Jupyter Notebook 262 43 24 7 Updated Jun 19, 2023
  • Scala 8 2 1 11 Updated Jun 19, 2023
  • dbx Public

    🧱 Databricks CLI eXtensions (aka dbx), a CLI tool for development and advanced management of Databricks workflows.

    Python 344 106 64 2 Updated Jun 19, 2023
  • mosaic Public

    An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.

    Scala 193 45 28 15 Updated Jun 18, 2023
  • feature-factory Public

    Accelerator to rapidly deploy customized features for your business

    Python 48 22 1 2 Updated Jun 15, 2023
  • dbldatagen Public

    Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) can generate large simulated or synthetic data sets for testing, POCs, and other uses in Databricks environments, including Delta Live Tables pipelines.

    Python 171 35 4 6 Updated Jun 15, 2023
  • dbignite Public
    Python 12 5 1 3 Updated Jun 15, 2023
  • arcuate Public

    Delta Sharing + MLflow for ML model & experiment exchange (arcuate delta - a fan shaped river delta)

    Python 19 1 0 7 Updated Jun 12, 2023
  • dlt-meta Public

    A metadata-driven, DLT-based framework for bronze/silver pipelines

    Python 39 13 2 1 Updated Jun 9, 2023
