cs.LG (2026-03-06)
📊 11 papers in total
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (6)
Pillar 2: RL & Architecture (4)
Pillar 1: Robot Control (1)
🔬 Pillar 9: Embodied Foundation Models (6 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models | Reveals modality collapse in backdoor attacks on multimodal diffusion models, highlighting the risk of single-modality dominance | multimodal | | |
| 2 | COLD-Steer: Steering Large Language Models via In-Context One-step Learning Dynamics | COLD-Steer: steers large language models via in-context one-step learning dynamics | large language model | | |
| 3 | Adapter-Augmented Bandits for Online Multi-Constrained Multi-Modal Inference Scheduling | Proposes the M-CMAB framework for online multi-constrained multimodal inference scheduling, improving resource utilization | large language model, multimodal | | |
| 4 | Stem: Rethinking Causal Information Flow in Sparse Attention | Proposes the Stem module, which reshapes causal information flow to address the long-context bottleneck in sparse attention | large language model | | |
| 5 | Omni-Masked Gradient Descent: Memory-Efficient Optimization via Mask Traversal with Improved Convergence | Proposes Omni-Masked Gradient Descent to address the memory bottleneck in large language model training | large language model | | |
| 6 | Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls | Studies the effectiveness, limitations, and pitfalls of many-shot prompting for test-time adaptation | large language model | | |
🔬 Pillar 2: RL & Architecture (4 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
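| 7 | Synthetic Monitoring Environments for Reinforcement Learning | Proposes Synthetic Monitoring Environments (SMEs) for white-box diagnosis and performance analysis of reinforcement learning algorithms | reinforcement learning, PPO, SAC | | |
| 8 | From Entropy to Calibrated Uncertainty: Training Language Models to Reason About Uncertainty | Proposes an entropy-calibration-based training method for uncertainty reasoning in language models, improving calibration and computational efficiency | reinforcement learning, large language model | | |
| 9 | Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments | Addresses learning stagnation in PPO training by scaling to one million parallel environments | PPO | | |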
| 10 | A Persistent-State Dataflow Accelerator for Memory-Bound Linear Attention Decode on FPGA | Proposes an FPGA-based persistent-state dataflow accelerator for memory-bound linear attention decoding | linear attention | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
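| 11 | Causal Interpretation of Neural Network Computations with Contribution Decomposition | Proposes the CODEC method, which uses contribution decomposition to enable interpretability of and causal intervention in neural network computations | manipulation | | |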