Feedback

Reinforcement Learning with Human in the Loop & Human Feedback

Reinforcement learning with a human in the loop (Reinforcement Learning with Human in the Loop, HIL) and reinforcement learning with human feedback (Reinforcement…

7 months ago · 800

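[Part 5] Recent Progress in Papers on Emotion Support Conversation

Today I am sharing a paper on emotional dialogue published in KBS. Its main idea is to use the user's emotional feedback to guide strategy selection and to help generate supportive responses.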

7 months ago · 560