| 1 |
FoMo Rewards: Can we cast foundation models as reward functions? |
Proposes a general-purpose reward function built on pretrained models for interactive reinforcement learning tasks. |
reinforcement learning large language model foundation model |
|
|
| 2 |
Generalized Contrastive Divergence: Joint Training of Energy-Based Model and Diffusion Model through Inverse Reinforcement Learning |
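Proposes Generalized Contrastive Divergence (GCD), which jointly trains an energy-based model and a diffusion model through inverse reinforcement learning. |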
reinforcement learning inverse reinforcement learning |
|
|
| 3 |
Multi-Scale and Multi-Modal Contrastive Learning Network for Biomedical Time Series |
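Proposes MBSL, a multi-scale and multi-modal contrastive learning network that improves the robustness of representation learning for biomedical time series. |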
representation learning MAE contrastive learning |
|
|
| 4 |
Pearl: A Production-ready Reinforcement Learning Agent |
Pearl: a production-oriented reinforcement learning agent framework that addresses multiple challenges of real-world deployment. |
reinforcement learning |
|
|