
big-data

Here are 2,155 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Jul 24, 2020
  • Python
ClickHouse
yunchat commented Nov 13, 2018

Currently, inserts and queries share the same resource (the Max Process Count control). When queries run at high TPS, inserts fail with an error (“error: too many process”). I think separating the resources for inserts and queries would make sense, to ensure there are always enough resources for inserts. This would work like YARN, with inserts and queries using different resource quotas.
Or, as a simpler approach, can we set Ratio for Insert and
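
The separation requested in this issue can be illustrated with a minimal sketch. It is not ClickHouse code or configuration: the `SplitLimiter` class, its `insert_percent` parameter, and the slot counts are all hypothetical, chosen only to show how splitting one process limit into two quotas keeps inserts from being starved by queries.

```python
import threading


class SplitLimiter:
    """Hypothetical admission control with separate quotas for inserts and queries."""

    def __init__(self, max_processes: int, insert_percent: int) -> None:
        if max_processes < 2:
            raise ValueError("need at least one slot for each kind of work")
        if not 0 < insert_percent < 100:
            raise ValueError("insert_percent must be between 1 and 99")
        # Reserve a share of the total for inserts; both pools get at least one slot.
        insert_slots = max(1, max_processes * insert_percent // 100)
        insert_slots = min(insert_slots, max_processes - 1)
        self.capacity = {
            "insert": insert_slots,
            "query": max_processes - insert_slots,
        }
        self._slots = {
            kind: threading.BoundedSemaphore(n) for kind, n in self.capacity.items()
        }

    def try_acquire(self, kind: str) -> bool:
        # Non-blocking: returns False when this kind's quota is exhausted.
        return self._slots[kind].acquire(blocking=False)

    def release(self, kind: str) -> None:
        self._slots[kind].release()


limiter = SplitLimiter(max_processes=10, insert_percent=30)

# Try to admit eight queries against a query quota of seven slots.
granted = [limiter.try_acquire("query") for _ in range(8)]

# The query quota is exhausted, but the insert quota is untouched.
insert_admitted = limiter.try_acquire("insert")
```

With a single shared limit, the eighth query and the insert would compete for the same slots. Here the query pool rejects its own overflow and the insert is still admitted.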

Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Aug 27, 2020
  • Jupyter Notebook
elR1co commented Aug 18, 2020

Today, IMap.values() and IMap.values(Predicate) calls are blocking.

I would like to use IMap.values(Predicate) in a Jet pipeline. This is possible, but I need to declare it as nonCooperative, which will have an impact on the pipeline's scalability.

Would it be possible to have an async (non-blocking) version of these calls?

Thank you very much for all the hard work done!

vespa
