OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieves information dynamically to do so.
Updated Aug 5, 2023 - Python
The official GitHub page for the survey paper "A Survey of Large Language Models".
Efficient fine-tuning of ChatGLM-6B with PEFT
Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan, Qwen)
A curated list of reinforcement learning with human feedback resources (continually updated)
A Doctor for your data
Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Xtreme1 - The Next GEN Platform for Multimodal Training Data. Supports 3D annotation, 3D segmentation, lidar-camera fusion annotation, image annotation, and RLHF tools.
Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
Cornucopia (聚宝盆): LLaMA models fine-tuned on Chinese financial knowledge; covers SFT, RLHF, GPU training and deployment, and more
Aligning Large Language Models with Human: A Survey
A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
Chain-of-Hindsight, a simpler and more effective alternative to RLHF
Implementation of Reinforcement Learning from Human Feedback (RLHF)