| 16 |
Counterfactual Reward Model Training for Bias Mitigation in Multimodal Reinforcement Learning |
Proposes a counterfactual reward model to mitigate bias in multimodal reinforcement learning |
reinforcement learning RLHF representation learning |
|
|
| 17 |
Data-Efficient Symbolic Regression via Foundation Model Distillation |
Proposes the EQUATE framework for symbolic regression on small datasets |
distillation foundation model |
|
|
| 18 |
Adaptive Scaling of Policy Constraints for Offline Reinforcement Learning |
Proposes adaptive scaling of policy constraints to address hyperparameter tuning in offline reinforcement learning |
reinforcement learning offline RL offline reinforcement learning |
✅ |
|
| 19 |
Dynamics-Aligned Latent Imagination in Contextual World Models for Zero-Shot Generalization |
Proposes DALI to address environment adaptation in zero-shot generalization |
reinforcement learning world model dreamer |
|
|
| 20 |
Encouraging Good Processes Without the Need for Good Answers: Reinforcement Learning for LLM Agent Planning |
Proposes the RLTR framework to address the limited planning ability of LLM agents |
reinforcement learning large language model |
|
|
| 21 |
Learning Game-Playing Agents with Generative Code Optimization |
Proposes a generative code optimization method for learning game-playing agents |
reinforcement learning deep reinforcement learning large language model |
|
|
| 22 |
The Role of Teacher Calibration in Knowledge Distillation |
Proposes a teacher model calibration method to improve knowledge distillation |
distillation |
|
|
| 23 |
Reinforcement Learning for Search Tree Size Minimization in Constraint Programming: New Results on Scheduling Benchmarks |
A reinforcement-learning-based method for minimizing search tree size in constraint programming |
reinforcement learning |
|
|
| 24 |
Interestingness First Classifiers |
Proposes the EUREKA framework for building interesting classifiers |
Eureka large language model |
|
|
| 25 |
MicroLad: 2D-to-3D Microstructure Reconstruction and Generation via Latent Diffusion and Score Distillation |
Proposes MicroLad to address 3D microstructure reconstruction |
distillation |
|
|
| 26 |
PoolFlip: A Multi-Agent Reinforcement Learning Security Environment for Cyber Defense |
Proposes PoolFlip to address decision automation in cyber defense |
reinforcement learning |
|
|