llm-inference
Here are 188 public repositories matching this topic...
Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ
Updated Dec 14, 2023 - Jupyter Notebook
Operating LLMs in production
Updated Dec 13, 2023 - Python
Reference implementation of the Mistral AI 7B v0.1 model.
Updated Dec 13, 2023 - Jupyter Notebook
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Updated Dec 14, 2023 - C++
Open-source ChatGPT-equivalent experience for both open- and closed-source LLMs, embedders, and vector databases. Supports unlimited documents, threads, and concurrent users, all managed in a very clean UI.
Updated Dec 14, 2023 - JavaScript
🔮 SuperDuperDB. Bring AI to your database; integrate, train and manage any AI models and APIs directly with your database and your data.
Updated Dec 14, 2023 - Python
Sparsity-aware deep learning inference runtime for CPUs
Updated Dec 14, 2023 - Python
This project aims to share the technical principles behind large models, along with hands-on practical experience.
Updated Dec 11, 2023 - Jupyter Notebook
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend for Generative Agents/Apps.
Updated Nov 27, 2023 - Jupyter Notebook
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Updated Dec 14, 2023 - C++
⚡ Build your chatbot within minutes on your favorite device; apply SOTA compression techniques to LLMs; run LLMs efficiently on Intel platforms ⚡
Updated Dec 14, 2023 - C++
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Updated Dec 10, 2023 - Jupyter Notebook
RayLLM - LLMs on Ray
Updated Dec 6, 2023 - Python
LLMFlows - Simple, Explicit and Transparent LLM Apps
Updated Oct 18, 2023 - Python
LLMs as Copilots for Theorem Proving in Lean
Updated Dec 13, 2023 - C++
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Updated Dec 14, 2023 - Python
The official repo of the Aquila2 series proposed by BAAI, including pretrained and chat large language models.
Updated Dec 14, 2023 - Python
irresponsible innovation. Try now at https://chat.dev/
Updated Dec 14, 2023 - Python
This is a suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray.
Updated Dec 4, 2023 - Jupyter Notebook