2022
Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning
Runze Liu, Fengshuo Bai, Yali Du, and Yaodong Yang
In Advances in Neural Information Processing Systems (NeurIPS), 2022