apache-spark
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Here are 1,132 public repositories matching this topic...
What
Being able to take a data object (or a prefix, such as a partition) and get back the commit that added or modified it.
Why
This is valuable lineage information that is currently available in lakeFS but not easily exposed. Exposing it would mimic the behavior of git blame.
How
Given the lakeFS API already supports listing the log of commits for an object or prefix (
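
The lookup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lakeFS API: the `Commit` record, its `changed_paths` field, and the `blame` helper are hypothetical names, and the commit log is an in-memory list assumed to be ordered newest first.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    # Hypothetical stand-in for one commit-log entry; not the lakeFS schema.
    commit_id: str
    changed_paths: Tuple[str, ...]


def blame(log: Iterable[Commit], path_or_prefix: str) -> Optional[Commit]:
    """Return the newest commit that touched the given object or prefix.

    Assumes `log` is ordered newest first, like `git log`.
    """
    for commit in log:
        # A commit matches if any path it changed falls under the prefix.
        if any(p.startswith(path_or_prefix) for p in commit.changed_paths):
            return commit
    return None


# Illustrative data only: three commits, newest first.
log = [
    Commit("c3", ("tables/events/dt=2021-08-02/part-0.parquet",)),
    Commit("c2", ("tables/users/part-0.parquet",)),
    Commit("c1", ("tables/events/dt=2021-08-01/part-0.parquet",)),
]
```

With this sample log, `blame(log, "tables/events/")` returns the newest commit touching the table, while `blame(log, "tables/events/dt=2021-08-01/")` returns the commit that wrote that one partition. A path no commit touched yields `None`.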
This is to track implementation of the ML-Features: https://spark.apache.org/docs/latest/ml-features
Bucketizer has been implemented in dotnet/spark#378, but there are more features that should be implemented.
- Feature Extractors
  - TF-IDF
  - Word2Vec (dotnet/spark#491)
  - CountVectorizer (https://github.com/dotnet/spark/p
Created by Matei Zaharia
Released May 26, 2014
- Repository: apache/spark
- Website: spark.apache.org
- Wikipedia
MLflow Roadmap Item
This is an MLflow Roadmap item that has been prioritized by the MLflow maintainers. We're seeking help with the implementation of roadmap items tagged with the help wanted label.
For requirements clarifications and implementation questions, or to request a PR review, please tag @BenWilson2 in your communications related to this issue.
Proposal Summary
Includ