| 1 | Categorical Policies: Multimodal Policy Learning and Exploration in Continuous Control | Proposes categorical policies to address multimodal exploration in continuous control | reinforcement learning, deep reinforcement learning, policy learning | | |
| 2 | Revisiting Diffusion Q-Learning: From Iterative Denoising to One-Step Action Generation | Proposes One-Step Flow Q-Learning to address the inefficient training and inference of DQL | reinforcement learning, offline reinforcement learning, diffusion policy | | |
| 3 | Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS | Proposes AIRL-S to unify reinforcement learning and search-based test-time scaling | reinforcement learning, inverse reinforcement learning, large language model | | |
| 4 | Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration | Proposes DARS to address depth and breadth exploration in RLVR | reinforcement learning, PPO, large language model | | |
| 5 | MuFlex: A Scalable, Physics-based Platform for Multi-Building Flexibility Analysis and Coordination | Proposes MuFlex to address multi-building flexibility coordination | reinforcement learning, SAC, penetration | ✅ | |
| 6 | Ethics-Aware Safe Reinforcement Learning for Rare-Event Risk Control in Interactive Urban Driving | Proposes an ethics-aware safe reinforcement learning framework for rare-event risk control in urban driving | reinforcement learning | | |
| 7 | Convergent Reinforcement Learning Algorithms for Stochastic Shortest Path Problem | Proposes convergent reinforcement learning algorithms for the stochastic shortest path problem | reinforcement learning | | |
| 8 | Reinforcement Learning-based Adaptive Path Selection for Programmable Networks | Proposes reinforcement-learning-based adaptive path selection to optimize programmable networks | reinforcement learning | | |
| 9 | MACTAS: Self-Attention-Based Module for Inter-Agent Communication in Multi-Agent Reinforcement Learning | Proposes a self-attention module to improve communication efficiency in multi-agent reinforcement learning | reinforcement learning | | |
| 10 | A Generalized Learning Framework for Self-Supervised Contrastive Learning | Proposes a generalized learning framework to address the constraint problem in self-supervised contrastive learning | contrastive learning | | |
| 11 | EventTSF: Event-Aware Non-Stationary Time Series Forecasting | Proposes EventTSF to address multimodal non-stationary time series forecasting | flow matching, multimodal | | |
| 12 | Formal Algorithms for Model Efficiency | Proposes the KMR framework to unify model-efficiency optimization methods in deep learning | policy learning, distillation | | |
| 13 | Towards Agent-based Test Support Systems: An Unsupervised Environment Design Approach | Proposes an agent-based test support system to address sensor placement in dynamic environments | reinforcement learning, curriculum learning | | |