Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective (a minimal data-parallel usage sketch follows the list below).
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead
Paddle (PaddlePaddle) Distributed Training Examples: ResNet, BERT, GPT, MoE, DataParallel, ModelParallel, PipelineParallel, HybridParallel, AutoParallel, Zero Sharding, Recompute, GradientMerge, Offload, AMP, DGC, LocalSGD, Wide&Deep
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
Distributed Keras Engine, Make Keras faster with only one line of code.
Ternary Gradients to Reduce Communication in Distributed Deep Learning (TensorFlow); a brief sketch of the ternarization idea also follows the list below.
Orkhon: ML Inference Framework and Server Runtime
WIP. Veloce is a low-code, Ray-based parallelization library for efficient, heterogeneous machine learning computation.
Understanding the effects of data parallelism and sparsity on neural network training
OpenCL powered Merklization using BLAKE3
Dependence-Based Code Transformation for Coarse-Grained Parallelism
Development of Project HPGO | Hybrid Parallelism Global Orchestration
A decentralized and distributed framework for training DNNs
Batch Partitioning for Multi-PE Inference with TVM (2020)
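To show what a data-parallel training loop looks like with one of the libraries listed above, here is a minimal sketch using DeepSpeed. The toy model, batch shapes, and config values are illustrative assumptions rather than anything taken from the listed repositories, and the script assumes it is launched with DeepSpeed's launcher so that one process runs per GPU.

```python
# Minimal data-parallel training sketch with DeepSpeed.
# Assumptions: a toy linear model, synthetic data, illustrative config values.
import torch
import deepspeed

model = torch.nn.Linear(128, 10)  # stand-in for a large network

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    "zero_optimization": {"stage": 1},  # shard optimizer state across data-parallel ranks
}

# deepspeed.initialize wraps the model in a data-parallel engine and
# builds the optimizer described in the config.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

for step in range(10):
    x = torch.randn(8, 128).to(model_engine.device)
    y = torch.randint(0, 10, (8,)).to(model_engine.device)
    loss = torch.nn.functional.cross_entropy(model_engine(x), y)
    model_engine.backward(loss)   # gradients are averaged across ranks
    model_engine.step()           # optimizer step (and LR schedule, if configured)
```

Launched as, e.g., `deepspeed train.py`, each rank processes its own shard of the data and gradients are all-reduced after the backward pass, which is the essence of data parallelism.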
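The TernGrad entry above refers to compressing gradients to three levels before they are communicated between workers. Below is a rough sketch of that core idea only, assuming simple per-tensor scaling and omitting the paper's layer-wise handling and gradient clipping; the function name `ternarize` is made up for illustration.

```python
import torch

def ternarize(grad: torch.Tensor) -> torch.Tensor:
    """Stochastically quantize a gradient tensor to {-s, 0, +s}, s = max|grad|,
    so the result equals the original gradient in expectation (TernGrad-style sketch)."""
    s = grad.abs().max()
    if s == 0:
        return torch.zeros_like(grad)
    prob = grad.abs() / s            # keep probability proportional to |g|
    mask = torch.bernoulli(prob)     # random {0, 1} draws with that probability
    return s * torch.sign(grad) * mask

g = torch.randn(5)
print(g)
print(ternarize(g))  # each entry is -s, 0, or +s
```

Because the ternarized tensor is unbiased, workers can exchange roughly two bits per gradient element plus one scale value instead of full-precision floats, which is where the communication savings come from.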