rlhf

Here are 17 public repositories matching this topic...

LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

python machine-learning ai nextjs discord-bot assistant language-model chatgpt rlhf

Updated Apr 9, 2023
Python

opendilab / awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

reinforcement-learning deep-learning deep-reinforcement-learning large-language-models human-feedback rlhf

Updated Apr 6, 2023

voidful / TextRL

Implementation of ChatGPT-style RLHF (Reinforcement Learning with Human Feedback) for any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

nlp reinforcement-learning pytorch nlg language-model gpt-2 gpt-3 controlled-nlg chatgpt rlhf

Updated Apr 2, 2023
Python

RUCAIBox / LLMSurvey

A collection of papers and resources related to Large Language Models.

natural-language-processing pre-training pre-trained-language-models in-context-learning large-language-models llm llms chain-of-thought chatgpt rlhf instruction-tuning

Updated Apr 8, 2023

xrsrke / instructGOOSE

Implementation of Reinforcement Learning from Human Feedback (RLHF)

reinforcement-learning chatgpt human-feedback rlhf instructgpt

Updated Apr 7, 2023
Jupyter Notebook

tomekkorbak / pretraining-with-human-feedback

Code accompanying the paper Pretraining Language Models with Human Preferences

reinforcement-learning gpt language-models ai-safety ai-alignment pretraining decision-transformers rlhf

Updated Mar 1, 2023
Python

csmile-1006 / PreferenceTransformer

Preference Transformer: Modeling Human Preferences using Transformers for RL (accepted at ICLR 2023)

robotics rl rlhf

Updated Mar 8, 2023
Python

cogment / cogment-verse

Library of Environments, Human Actor UIs and Agent implementation for Human In the Loop Learning & Reinforcement Learning

reinforcement-learning human-in-the-loop-learning cogment rlhf

Updated Apr 7, 2023
Python

arunprsh / ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO

A Practical Guide to Developing a Reliable FAQ Chatbot with Reinforcement Learning and Human Feedback using GPT-2 on AWS

aws reinforcement-learning chatbot transformers question-answering sagemaker gpt-2 gpt2 rlhf

Updated Feb 11, 2023
Jupyter Notebook

jianzhnie / open-chatgpt

The open-source implementation of ChatGPT and RLHF. Implementing a ChatGPT from scratch.

reinforcement-learning transformer llama gpt ppo a2c llm chatgpt rlhf stanford-alpaca

Updated Apr 9, 2023
Python

vicgalle / zero-shot-reward-models

Zero-Shot Reward Models with the trlx library

reinforcement-learning zero-shot llm rlhf reward-models trlx

Updated Mar 21, 2023
Python

jasonvanf / llama-trl

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

adapter transformer llama gpt lora ppo peft trl chatgpt rlhf

Updated Mar 30, 2023

AmirMotefaker / Create-your-own-ChatGPT

Create your own ChatGPT with Python

python machine-learning ai ml artificial-intelligence llm chatgpt chatgpt-api chatgpt3 rlhf large-language-model

Updated Apr 6, 2023
Jupyter Notebook

G-U-N / T2I-HumanFeedback

Implementations of Baseline Methods for Aligning Text2Img Diffusion Models with Human Feedback

text2img stable-diffusion human-feedback rlhf

Updated Apr 8, 2023

jianzhnie / awesome-open-chatgpt

The open-source implementation of ChatGPT and RLHF. An open-source alternative to ChatGPT.

gpt4 chatgpt rlhf instruct-gpt

Updated Apr 6, 2023

DaehanKim / EasyRLHF

EasyRLHF aims to provide an easy and minimal interface for training RLHF LMs, using off-the-shelf solutions and datasets

language-model rlhf

Updated Apr 3, 2023
Python

saschaschramm / tiny-chatgpt

Researching the reinforcement learning algorithm of ChatGPT

gae temporal-differencing-learning ppo chatgpt rlhf general-advantage-estimation

Updated Apr 7, 2023
Jupyter Notebook
