| # | Title | Summary | Keywords |
| --- | --- | --- | --- |
| 1 | How Many Heads Make an SSM? A Unified Framework for Attention and State Space Models | Proposes a unified framework for analyzing the expressive power and training trade-offs of attention and state space models (SSMs). | latent dynamics, SSM, state space model |
| 2 | Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning | Proposes G2RL, which uses gradient-guided reinforcement learning to improve LLM reasoning. | reinforcement learning, PPO, large language model |
| 3 | Autonomous Pressure Control in MuVacAS via Deep Reinforcement Learning and Deep Learning Surrogate Models | Proposes autonomous pressure control for MuVacAS based on deep reinforcement learning and deep-learning surrogate models. | reinforcement learning, deep reinforcement learning |
| 4 | Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction | Reveals an equivalence between autoregressive language models and energy-based models, giving insight into the lookahead capabilities of next-token prediction. | reinforcement learning, distillation, large language model |
| 5 | Automatic Reward Shaping from Multi-Objective Human Heuristics | Proposes the MORSE framework, which automatically shapes reinforcement-learning rewards from multi-objective human heuristics. | reinforcement learning, reward shaping |
| 6 | A Teacher-Student Perspective on the Dynamics of Learning Near the Optimal Point | Studies the learning dynamics of neural networks near the optimal point, revealing the key role of the Hessian eigenspectrum. | teacher-student |
| 7 | EUBRL: Epistemic Uncertainty Directed Bayesian Reinforcement Learning | Proposes the EUBRL algorithm, which directs exploration in Bayesian reinforcement learning with epistemic uncertainty to improve sample efficiency. | reinforcement learning |
| 8 | Distillation-Guided Structural Transfer for Continual Learning Beyond Sparse Distributed Memory | Proposes a Selective Subnetwork Distillation (SSD) framework to improve continual learning in sparse neural networks. | distillation |
| 9 | TrajSyn: Privacy-Preserving Dataset Distillation from Federated Model Trajectories for Server-Side Adversarial Training | TrajSyn: privacy-preserving dataset distillation from federated model trajectories, used for server-side adversarial training. | distillation |
| 10 | Feature-Centric Unsupervised Node Representation Learning Without Homophily Assumption | FUEL: a feature-centric unsupervised node representation learning method that requires no homophily assumption. | representation learning |
| 11 | Spectral Representation-based Reinforcement Learning | Proposes a spectral-representation-based reinforcement learning framework that addresses the difficulties traditional methods face in complex environments. | reinforcement learning |
| 12 | SoFlow: Solution Flow Models for One-Step Generative Modeling | SoFlow: proposes solution flow models for one-step generative modeling, improving generation efficiency. | flow matching, classifier-free guidance |