Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.
Operating LLMs in production
A high-throughput and memory-efficient inference and serving engine for LLMs
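Much of the throughput gain in such engines comes from continuous batching: finished sequences are evicted from the running batch each decode step and immediately replaced by waiting requests, rather than waiting for the whole batch to drain. A minimal pure-Python sketch of the idea — illustrative only, not any engine's actual scheduler; the `Request` class and function names are invented for the example:

```python
from dataclasses import dataclass, field
from collections import deque

@dataclass
class Request:
    prompt: str
    max_tokens: int
    generated: list = field(default_factory=list)

def continuous_batching(requests, batch_size=2):
    """Run decode steps, refilling freed batch slots immediately.

    Returns the number of decode steps taken; with static batching
    the same workload would generally need more steps.
    """
    waiting = deque(requests)
    running = []
    steps = 0
    while waiting or running:
        # Admit new requests into any free batch slots each iteration.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        # One decode step: every running request emits one token.
        for req in running:
            req.generated.append(f"tok{len(req.generated)}")
        steps += 1
        # Evict finished requests so their slots free up next step.
        running = [r for r in running if len(r.generated) < r.max_tokens]
    return steps
```

With three requests needing 3, 1, and 2 tokens and a batch of 2, this finishes in 3 steps, whereas static batching (drain each batch fully before starting the next) would take 5.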
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
Ray Aviary - evaluate multiple LLMs easily
A high-performance ML model serving framework offering dynamic batching and CPU/GPU pipelines to make full use of your compute resources
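Dynamic batching typically means collecting incoming requests until either a maximum batch size is reached or a short wait deadline expires, then processing them together. A minimal sketch of that loop in pure Python — an assumption about the general technique, not this framework's actual API; `dynamic_batcher` and the `None` shutdown sentinel are invented for the example:

```python
import queue
import time

def dynamic_batcher(in_q, handle_batch, max_batch=8, max_wait_s=0.01):
    """Collect requests until the batch is full or max_wait_s has
    elapsed since the first item arrived, then process them together.
    A None item on the queue acts as a shutdown sentinel."""
    while True:
        first = in_q.get()
        if first is None:          # sentinel: shut down
            return
        batch = [first]
        deadline = time.monotonic() + max_wait_s
        while len(batch) < max_batch:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                item = in_q.get(timeout=remaining)
            except queue.Empty:
                break
            if item is None:       # sentinel seen mid-batch: flush and stop
                handle_batch(batch)
                return
            batch.append(item)
        handle_batch(batch)
```

Trading a few milliseconds of latency (`max_wait_s`) for larger batches is usually a net win on GPUs, where per-item cost drops sharply with batch size.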
This project aims to share the technical principles behind large language models along with hands-on practical experience.
A suite of hands-on training materials showing how to scale CV, NLP, and time-series forecasting workloads with Ray.
Your cross-cloud AI substrate
Deploy and Scale LLM-based applications
Ray and Anyscale for UC Berkeley AI Hackathon!
A collection of all available inference solutions for the LLMs
Hinglish chatbot powered by Azure Cognitive Services, Google Translate, and OpenAI