
Apache Spark


Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

Here are 6,618 public repositories matching this topic...

cube.js
leogodin217 commented Sep 17, 2021

Describe the bug
Using a time dimension on a runningTotal measure on Snowflake mixes quoted and unquoted columns in the query. This causes the query to fail, because Snowflake has specific rules about quoted columns. Specifically:

  • All unquoted column names are treated as upper case
  • Quoted column names are case sensitive.

So "date_from" <> date_from
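
The two rules above can be sketched in a few lines of Python. This is only an illustration of the case-folding behaviour described in the issue, not code from cube.js or Snowflake; the function name `resolve_identifier` is hypothetical.

```python
def resolve_identifier(name: str) -> str:
    """Return the name an identifier resolves to under the rules above.

    Hypothetical helper for illustration only.
    """
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Quoted identifier: case is preserved exactly as written.
        return name[1:-1]
    # Unquoted identifier: treated as upper case.
    return name.upper()


quoted = resolve_identifier('"date_from"')   # 'date_from'
unquoted = resolve_identifier('date_from')   # 'DATE_FROM'
print(quoted == unquoted)                    # False: the two names do not match
```

A query that defines a column as `"date_from"` and later refers to it as `date_from` is therefore looking for two different names.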

To Reproduce
Steps to reproduce

bug help wanted good first issue
flink-learning

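flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and related topics, plus case studies of large projects that run Flink in production (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column 《大数据实时计算引擎 Flink 实战与性能优化》 ("Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization").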

  • Updated May 8, 2022
  • Java

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated May 17, 2022
  • Jupyter Notebook
tdas commented May 12, 2022

Feature request

Overview

SBT tests currently run sequentially. It would be good to reduce the total test runtime by parallelizing the SBT tests.

Motivation

SBT tests are taking longer and longer. This is not scalable. While we have already split the tests for the various Scala versions into two CI builds in the repo, each build takes a long time. This is a burden for local testing as

enhancement good first issue
SynapseML
brunocous commented Sep 2, 2020

I have a simple regression task (using a LightGBMRegressor) where I want to penalize negative predictions more than positive ones. Is there a way to achieve this with the default regression LightGBM objectives (see https://lightgbm.readthedocs.io/en/latest/Parameters.html)? If not, is it somehow possible to define a custom regression objective (there are many examples for the default LightGBM model) and pass it in?
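
For illustration, here is a minimal sketch of what such an asymmetric objective could look like, written in the `(y_true, y_pred) -> (grad, hess)` shape that LightGBM's scikit-learn wrapper accepts for callable objectives. It assumes that "negative" refers to the sign of the residual (prediction minus label), which is one possible reading of the question; the function name `asymmetric_l2` and the `penalty` parameter are hypothetical. Whether SynapseML's LightGBMRegressor can take a function like this is not confirmed here.

```python
from typing import List, Sequence, Tuple


def asymmetric_l2(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    penalty: float = 10.0,
) -> Tuple[List[float], List[float]]:
    """Gradient and Hessian of a weighted squared error.

    Loss per sample: weight * (y_pred - y_true) ** 2, where weight is
    `penalty` for negative residuals and 1.0 otherwise (an assumption
    about what the question means by "negative").
    """
    grad: List[float] = []
    hess: List[float] = []
    for y, p in zip(y_true, y_pred):
        residual = p - y
        weight = penalty if residual < 0 else 1.0
        grad.append(2.0 * weight * residual)  # first derivative w.r.t. y_pred
        hess.append(2.0 * weight)             # second derivative w.r.t. y_pred
    return grad, hess


# A residual of -1 is pushed ten times harder than a residual of +1.
print(asymmetric_l2([2.0, 1.0], [1.0, 2.0]))
```

A pure-Python loop keeps the sketch dependency-free; with real training data the same arithmetic would normally be vectorized.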

wanshicheng commented Jun 23, 2021

Used Spark version
Spark Version: 2.4.4
Used Spark Job Server version
SJS version: v0.11.1

Deployed mode
client on Spark Standalone

Actual (wrong) behavior
I can't get the config when I post a job with 'sync=true'. This is what I got:
http://localhost:8090/jobs/ff99479b-e59c-4215-b17d-4058f8d97d25/config
{"status":"ERROR","result":"No such job ID ff99479b-e59c-4215-b17d-4058f8d97d25"

bug good first issue

Created by Matei Zaharia

Released May 26, 2014

Repository
apache/spark
Website
spark.apache.org
Wikipedia
Wikipedia

Related Topics

hadoop scala