InstructGPT - Training language models to follow instructions with human feedback - short review
3.4K views - 2 years ago
Training language models to follow instructions with human feedback
arxiv.org/abs/2203.02155
#gpt #instructgpt #rlhf #alignment #nlp #prompt #reinforcement learning #reinforcement learning from human feedback
Published on 1401/12/29 (Solar Hijri calendar).
Viewed 3,491 times.