| 1 |
Self-Improving Embodied Foundation Models |
Proposes self-improving embodied foundation models that enable autonomous robot skill learning through two-stage training. |
reinforcement learning, imitation learning, large language model |
|
|
| 2 |
Exploring multimodal implicit behavior learning for vehicle navigation in simulated cities |
Proposes data-augmented implicit behavior cloning to address multimodal decision-making in urban vehicle navigation. |
behavior cloning, multimodal |
|
|
| 3 |
Fleming-R1: Toward Expert-Level Medical Reasoning via Reinforcement Learning |
Fleming-R1: achieves expert-level medical reasoning via reinforcement learning. |
reinforcement learning, large language model, chain-of-thought |
|
|
| 4 |
The Energy-Efficient Hierarchical Neural Network with Fast FPGA-Based Incremental Learning |
Proposes an energy-efficient hierarchical neural network with FPGA acceleration for fast incremental learning. |
representation learning, large language model, foundation model |
|
|
| 5 |
FlowRL: Matching Reward Distributions for LLM Reasoning |
FlowRL: improves large language model reasoning by matching reward distributions. |
reinforcement learning, PPO, large language model |
|
|
| 6 |
Reinforcement Learning Agent for a 2D Shooter Game |
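Proposes a hybrid training method that combines imitation learning and reinforcement learning to improve AI agent performance in a 2D shooter game. |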
reinforcement learning, imitation learning |
|
|
| 7 |
Structure-Aware Contrastive Learning with Fine-Grained Binding Representations for Drug Discovery |
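Proposes a structure-aware contrastive learning framework with fine-grained binding representations to improve DTI prediction in drug discovery. |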
linear attention, contrastive learning |
|
|
| 8 |
ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning |
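Proposes the DSCL framework, which improves the efficiency of RL-based tool learning through dual dynamic sampling and curriculum learning. |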
reinforcement learning, curriculum learning |
|
|
| 9 |
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation |
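EVOL-RL: a label-free framework for evolving language models that achieves self-improvement through majority-driven selection and novelty-promoted variation. |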
reinforcement learning, large language model |
✅ |
|
| 10 |
Mind the Gap: Data Rewriting for Stable Off-Policy Supervised Fine-Tuning |
Proposes a data rewriting framework to address the distribution shift of off-policy learning in SFT. |
policy learning, large language model |
✅ |
|
| 11 |
Stochastic Bilevel Optimization with Heavy-Tailed Noise |
Proposes the N²SBA method for bilevel optimization problems with heavy-tailed noise. |
reinforcement learning, large language model |
|
|
| 12 |
Self-Explaining Reinforcement Learning for Mobile Network Resource Allocation |
Proposes a reinforcement learning method based on self-explaining neural networks for mobile network resource allocation. |
reinforcement learning |
|
|
| 13 |
Leveraging Reinforcement Learning, Genetic Algorithms and Transformers for background determination in particle physics |
Uses reinforcement learning, genetic algorithms, and Transformers to address background determination in particle physics. |
reinforcement learning |
|
|