cs.LG (2026-03-06)

📊 11 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (6) · Pillar 2: RL & Architecture (4) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (6 papers)

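| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 1 | When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models | Reveals the modality-collapse phenomenon in backdoor attacks on multimodal diffusion models, highlighting the risk of single-modality dominance. | multimodal |
| 2 | COLD-Steer: Steering Large Language Models via In-Context One-step Learning Dynamics | COLD-Steer: steers large language models via in-context one-step learning dynamics. | large language model |
| 3 | Adapter-Augmented Bandits for Online Multi-Constrained Multi-Modal Inference Scheduling | Proposes the M-CMAB framework for online multi-constrained multimodal inference scheduling, improving resource utilization. | large language model, multimodal |
| 4 | Stem: Rethinking Causal Information Flow in Sparse Attention | Proposes the Stem module, which reshapes causal information flow to address the long-context bottleneck in sparse attention. | large language model |
| 5 | Omni-Masked Gradient Descent: Memory-Efficient Optimization via Mask Traversal with Improved Convergence | Proposes Omni-Masked Gradient Descent to address the memory bottleneck in large language model training. | large language model |
| 6 | Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls | Studies the effectiveness, limits, and pitfalls of many-shot prompting for test-time adaptation. | large language model |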

🔬 Pillar 2: RL & Architecture (4 papers)

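| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 7 | Synthetic Monitoring Environments for Reinforcement Learning | Proposes Synthetic Monitoring Environments (SMEs) for white-box diagnosis and performance analysis of reinforcement learning algorithms. | reinforcement learning, PPO, SAC |
| 8 | From Entropy to Calibrated Uncertainty: Training Language Models to Reason About Uncertainty | Proposes an entropy-calibration-based method for training language models to reason about uncertainty, improving calibration and computational efficiency. | reinforcement learning, large language model |
| 9 | Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments | Addresses learning stagnation in PPO training by scaling to one million parallel environments. | PPO |
| 10 | A Persistent-State Dataflow Accelerator for Memory-Bound Linear Attention Decode on FPGA | Proposes an FPGA-based persistent-state dataflow accelerator for memory-bound linear attention decoding. | linear attention |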

🔬 Pillar 1: Robot Control (1 paper)

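| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 11 | Causal Interpretation of Neural Network Computations with Contribution Decomposition | Proposes the CODEC method, which uses contribution decomposition to make neural network computations interpretable and open to causal intervention. | manipulation |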
