#

inference

Here are 981 public repositories matching this topic...

ColossalAI

hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

ai deep-learning hpc distributed-computing inference big-model large-scale data-parallelism model-parallelism pipeline-parallelism foundation-models heterogeneous-training

Updated Jul 31, 2023
Python

microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

machine-learning compression deep-learning gpu inference pytorch zero data-parallelism model-parallelism mixture-of-experts pipeline-parallelism billion-parameters trillion-parameters

Updated Jul 31, 2023
Python

google / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.

android c-plus-plus calculator machine-learning framework computer-vision deep-learning inference pipeline-framework stream-processing video-processing perception mobile-development audio-processing graph-framework graph-based mediapipe

Updated Jul 28, 2023
C++

whisper.cpp

ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++

inference transformer speech-recognition openai speech-to-text whisper

Updated Jul 30, 2023
C

Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Updated Jul 31, 2023
C++

aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

training aws data-science machine-learning reinforcement-learning deep-learning examples jupyter-notebook inference sagemaker mlops

Updated Jul 27, 2023
Jupyter Notebook

ts-pattern

gvergnaud / ts-pattern

🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.

javascript typescript matching pattern pattern-matching branching inference ts conditions type-inference exhaustive

Updated Jul 25, 2023
TypeScript

nebuly-ai / nebuly

The next-generation platform to monitor and optimize your AI costs in one place 🚀

machine-learning deep-learning neural-network compiler tensorflow gpu optimization pypi transformers inference pytorch computing quantization tensorrt edge-computing tvm onnx openvino huggingface

Updated Jun 18, 2023
Python

NVIDIA / TensorRT

NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.

deep-learning inference nvidia tensorrt

Updated Jul 22, 2023
C++

Linzaer / Ultra-Light-Fast-Generic-Face-Detector-1MB

💎1MB lightweight face detection model (1MB轻量级人脸检测模型)

arm inference face-detection mnn ncnn

Updated May 29, 2023
Python

jetson-inference

dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Updated Jul 27, 2023
C++

gcanti / io-ts

Runtime type system for IO decoding/encoding

typescript validation types runtime inference

Updated Jul 2, 2023
TypeScript

triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

machine-learning cloud deep-learning gpu inference edge datacenter

Updated Jul 30, 2023
Python

openvinotoolkit / openvino

OpenVINO™ Toolkit repository

performance deep-learning inference inference-engine openvino model-optimizer

Updated Jul 31, 2023
C++

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

inference pytorch transformer gpt model-serving mlops llm llmops llm-serving

Updated Jul 31, 2023
Python

Tencent / TNN

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile、desktop and server. TNN is distinguished by several outstanding features, including its cross-platform capability, high performance, model compression and code pruning. Based on ncnn and Rapidnet, TNN further strengthens the support and …

ocr deep-learning tensorflow inference pytorch tengine face-detection tensorrt mnn coreml ncnn openvino hairsegmentaion

Updated Jul 21, 2023
C++

NVIDIA-AI-IOT / torch2trt

An easy to use PyTorch to TensorRT converter

inference pytorch classification tensorrt jetson-tx2 jetson-xavier jetson-nano

Updated Apr 28, 2023
Python

guillaumekln / faster-whisper

Faster Whisper transcription with CTranslate2

deep-learning inference transformer speech-recognition openai speech-to-text quantization whisper

Updated Jul 28, 2023
Python

Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

python machine-learning privacy ai attack extraction inference artificial-intelligence evasion red-team poisoning adversarial-machine-learning blue-team adversarial-examples adversarial-attacks trusted-ai trustworthy-ai

Updated Jul 28, 2023
Python

huggingface / text-generation-inference

Large Language Model Text Generation Inference

nlp bloom deep-learning inference pytorch falcon transformer gpt starcoder

Updated Jul 31, 2023
Python

Improve this page

Add a description, image, and links to the inference topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the inference topic, visit your repo's landing page and select "manage topics."