Reinforcement Learning from Human Feedback (RLHF)
#alignment
#rlhf
RLHF
RLHF aligns model outputs with human preference signals: human annotators compare candidate model responses, a reward model is trained on those comparisons, and the language model is then fine-tuned against the learned reward, typically with a policy-gradient method such as PPO.
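As a rough illustration (not taken from this page), the reward model in the standard RLHF recipe is commonly trained with a pairwise Bradley-Terry loss over human comparisons. The minimal PyTorch sketch below assumes scalar reward scores for a preferred ("chosen") and a dispreferred ("rejected") response; the function names and toy values are hypothetical.

```python
# Sketch of the pairwise preference loss used to train an RLHF reward model.
# Scores are assumed to come from a reward model that maps a (prompt, response)
# pair to a scalar; the loss pushes the chosen score above the rejected one.
import torch
import torch.nn.functional as F

def preference_loss(chosen_scores: torch.Tensor,
                    rejected_scores: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry / logistic loss: -log sigmoid(r_chosen - r_rejected)
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

# Toy usage with made-up scalar scores for a batch of three comparisons.
chosen = torch.tensor([1.2, 0.3, 2.0])
rejected = torch.tensor([0.5, 0.9, 1.1])
print(preference_loss(chosen, rejected))  # lower when chosen > rejected
```

Once the reward model is trained, the policy (the language model) is updated to increase the expected reward of its outputs, usually with a KL penalty that keeps it close to the original supervised model.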