cs.LG (2026-03-02)

📊 33 papers in total | 🔗 6 with code

🎯 Navigation by interest area

Pillar 9: Embodied Foundation Models (16 🔗4) · Pillar 2: RL & Architecture (10 🔗1) · Pillar 1: Robot Control (3 🔗1) · Pillar 8: Physics-based Animation (3) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (16 papers)

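| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | IDProxy: Cold-Start CTR Prediction for Ads and Recommendation at Xiaohongshu with Multimodal LLMs | IDProxy: uses multimodal LLMs to address cold-start CTR prediction in Xiaohongshu ads and recommendation. | large language model, multimodal | |
| 2 | Orchestrating Multimodal DNN Workloads in Wireless Neural Processing | Proposes the O-WiN framework, which accelerates multimodal DNN inference in wireless neural processing through communication-computation pipelining. | multimodal | |
| 3 | CoVAE: correlated multimodal generative modeling | Proposes the CoVAE model, which captures cross-modal correlations to improve multimodal generative modeling performance and uncertainty quantification. | multimodal | |
| 4 | Causal Circuit Tracing Reveals Distinct Computational Architectures in Single-Cell Foundation Models: Inhibitory Dominance, Biological Coherence, and Cross-Model Convergence | Proposes a causal circuit tracing method that reveals distinct computational architectures in single-cell foundation models. | foundation model | |
| 5 | SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond | SafeSci: builds a comprehensive framework for evaluating and improving the safety of large language models in science domains. | large language model | |
| 6 | Frontier Models Can Take Actions at Low Probabilities | Frontier models can take specific actions at very low probabilities, which calls for vigilance against malicious exploitation. | chain-of-thought | |
| 7 | Symbol-Equivariant Recurrent Reasoning Models | Proposes symbol-equivariant recurrent reasoning models to improve the generalization and robustness of neural reasoning. | large language model | |
| 8 | Multi-Head Low-Rank Attention | Proposes Multi-Head Low-Rank Attention (MLRA) to remove the tensor-parallelism bottleneck of the KV cache in long-context inference for large models. | large language model | |
| 9 | Adam Converges Without Any Modification On Update Rules | Proves that Adam converges under suitable hyperparameters and reveals a convergence-divergence phase transition. | large language model | |
| 10 | Probabilistic Retrofitting of Learned Simulators | Converts pretrained deterministic simulators into probabilistic models via probabilistic retrofitting, improving PDE modeling performance. | foundation model | |
| 11 | Probing Materials Knowledge in LLMs: From Latent Embeddings to Reliable Predictions | Evaluates LLM knowledge of materials science, from latent embeddings to reliable predictions. | large language model | |
| 12 | Modular Memory is the Key to Continual Learning Agents | Proposes a modular memory architecture that combines In-Weight Learning and In-Context Learning to address catastrophic forgetting in continual learning. | foundation model | |
| 13 | DeLo: Dual Decomposed Low-Rank Experts Collaboration for Continual Missing Modality Learning | Proposes DeLo, which uses dual decomposed low-rank expert collaboration to address modality interference in continual missing-modality learning. | multimodal | |
| 14 | One Operator to Rule Them All? On Boundary-Indexed Operator Families in Neural PDE Solvers | Reveals a limitation of neural PDE solvers: they learn boundary-condition-indexed operator families rather than a universal operator. | foundation model | |
| 15 | Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verification | Quasar: achieves fast LLM inference through quantized self-speculative acceleration and memory-efficient verification. | large language model | |
| 16 | 3BASiL: An Algorithmic Framework for Sparse plus Low-Rank Compression of LLMs | Proposes the 3BASiL-TM algorithmic framework for sparse plus low-rank compression of large language models, improving performance. | large language model | |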

🔬 Pillar 2: RL & Architecture (10 papers)

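| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 17 | Reconstructing Content via Collaborative Attention to Improve Multimodal Embedding Quality | Proposes CoCoA, a content-reconstruction pretraining method based on collaborative attention, to improve multimodal embedding quality. | contrastive learning, large language model, multimodal | |
| 18 | UTICA: Multi-Objective Self-Distillation Foundation Model Pretraining for Time Series Classification | UTICA: a multi-objective self-distillation pretrained foundation model for time series classification. | distillation, foundation model | |
| 19 | LFPO: Likelihood-Free Policy Optimization for Masked Diffusion Models | Proposes LFPO, a likelihood-free policy optimization method for masked diffusion models, improving code generation and reasoning. | reinforcement learning, flow matching, large language model | |
| 20 | Expanding LLM Agent Boundaries with Strategy-Guided Exploration | Proposes Strategy-Guided Exploration (SGE) to improve the exploration efficiency and performance of LLM agents on complex tasks. | reinforcement learning, large language model | |
| 21 | The Expressive Limits of Diagonal SSMs for State-Tracking | Reveals the limited expressive power of diagonal SSMs on state-tracking tasks. | SSM | |
| 22 | Efficient RLVR Training via Weighted Mutual Information Data Selection | Proposes InSight, which improves RLVR training efficiency through weighted mutual information data selection. | reinforcement learning, large language model | |
| 23 | SEAR: Sample Efficient Action Chunking Reinforcement Learning | SEAR: a sample-efficient action chunking reinforcement learning algorithm that improves online RL performance. | reinforcement learning | |
| 24 | D3LM: A Discrete DNA Diffusion Language Model for Bidirectional DNA Understanding and Generation | D3LM: a discrete DNA diffusion language model for bidirectional DNA understanding and generation. | representation learning, foundation model | |
| 25 | Discrete World Models via Regularization | Proposes regularization-based discrete world models (DWMR) for unsupervised Boolean world model learning. | world model | |
| 26 | GAC: Stabilizing Asynchronous RL Training for LLMs via Gradient Alignment Control | Proposes Gradient Alignment Control (GAC) to stabilize asynchronous RL training for LLMs. | reinforcement learning, large language model | |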

🔬 Pillar 1: Robot Control (3 papers)

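| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 27 | Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning | Proposes a coupled policy optimization algorithm that regulates policy diversity through KL constraints, improving the efficiency of large-scale reinforcement learning. | manipulation, dexterous manipulation, reinforcement learning | |
| 28 | Temporal Representations for Exploration: Learning Complex Exploratory Behavior without Extrinsic Rewards | Proposes an exploration method based on temporal contrastive representations for learning complex behavior without extrinsic rewards. | locomotion, manipulation, reinforcement learning | |
| 29 | Accurate, private, secure, federated U-statistics with higher degree | Proposes a federated U-statistics protocol based on multi-party computation, improving computational accuracy under privacy protection. | MPC | |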

🔬 Pillar 8: Physics-based Animation (3 papers)

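| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 30 | TRAKNN: Efficient Trajectory Aware Spatiotemporal kNN for Rare Meteorological Trajectory Detection | Proposes TRAKNN for efficiently detecting rare patterns in meteorological spatiotemporal trajectories. | spatiotemporal | |
| 31 | DGNet: Discrete Green Networks for Data-Efficient Learning of Spatiotemporal PDEs | DGNet: discrete Green networks for data-efficient learning of spatiotemporal PDEs. | spatiotemporal | |
| 32 | Graph neural network force fields for adiabatic dynamics of lattice Hamiltonians | Proposes a graph neural network force-field model for simulating adiabatic dynamics of lattice Hamiltonians. | spatiotemporal | |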

🔬 Pillar 4: Generative Motion (1 paper)

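| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 33 | DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking | Proposes the DUEL framework to resolve the MDM dilemma and enable exact likelihood computation. | MDM | |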
