cs.LG (2026-03-02)

📊 33 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (16 🔗4) · Pillar 2: RL & Architecture (10 🔗1) · Pillar 1: Robot Control (3 🔗1) · Pillar 8: Physics-based Animation (3) · Pillar 4: Generative Motion (1)

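🔬 Pillar 9: Embodied Foundation Models (16 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	IDProxy: Cold-Start CTR Prediction for Ads and Recommendation at Xiaohongshu with Multimodal LLMs	IDProxy: uses multimodal LLMs to address cold-start CTR prediction in Xiaohongshu ads and recommendation.	large language model multimodal
2	Orchestrating Multimodal DNN Workloads in Wireless Neural Processing	Proposes the O-WiN framework, which accelerates multimodal DNN inference in wireless neural processing through communication-computation pipelining.	multimodal
3	CoVAE: correlated multimodal generative modeling	Proposes the CoVAE model, which captures cross-modal correlations to improve multimodal generative modeling performance and uncertainty quantification.	multimodal
4	Causal Circuit Tracing Reveals Distinct Computational Architectures in Single-Cell Foundation Models: Inhibitory Dominance, Biological Coherence, and Cross-Model Convergence	Proposes a causal circuit tracing method that reveals distinct computational architectures in single-cell foundation models.	foundation model
5	SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond	SafeSci: builds a comprehensive framework for evaluating and improving the safety of large language models in scientific domains.	large language model
6	Frontier Models Can Take Actions at Low Probabilities	Shows that frontier models can take specific actions at very low probabilities, which warrants vigilance against malicious exploitation.	chain-of-thought
7	Symbol-Equivariant Recurrent Reasoning Models	Proposes symbol-equivariant recurrent reasoning models to improve the generalization and robustness of neural reasoning.	large language model	✅
8	Multi-Head Low-Rank Attention	Proposes Multi-Head Low-Rank Attention (MLRA) to address the tensor-parallelism bottleneck of the KV cache in long-context LLM inference.	large language model	✅
9	Adam Converges Without Any Modification On Update Rules	Proves that Adam converges under appropriate hyperparameters and reveals its convergence-divergence phase transition.	large language model
10	Probabilistic Retrofitting of Learned Simulators	Converts pretrained deterministic simulators into probabilistic models via probabilistic retrofitting, improving PDE modeling performance.	foundation model
11	Probing Materials Knowledge in LLMs: From Latent Embeddings to Reliable Predictions	Evaluates LLM knowledge of materials science, from latent embeddings to reliable predictions.	large language model
12	Modular Memory is the Key to Continual Learning Agents	Proposes a modular memory architecture that combines In-Weight Learning and In-Context Learning to address catastrophic forgetting in continual learning.	foundation model
13	DeLo: Dual Decomposed Low-Rank Experts Collaboration for Continual Missing Modality Learning	Proposes DeLo, which uses dual decomposed low-rank expert collaboration to address modality interference in continual missing modality learning.	multimodal
14	One Operator to Rule Them All? On Boundary-Indexed Operator Families in Neural PDE Solvers	Reveals a limitation of neural PDE solvers: they learn boundary-condition-indexed operator families rather than a universal operator.	foundation model
15	Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verification	Quasar: achieves fast LLM inference through quantized self-speculative acceleration and memory-efficient verification.	large language model	✅
16	3BASiL: An Algorithmic Framework for Sparse plus Low-Rank Compression of LLMs	Proposes the 3BASiL-TM algorithmic framework for compressing large language models via sparse plus low-rank decomposition, improving performance.	large language model	✅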

🔬 Pillar 2: RL & Architecture (10 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
17	Reconstructing Content via Collaborative Attention to Improve Multimodal Embedding Quality	Proposes CoCoA, a content-reconstruction pretraining method based on collaborative attention, to improve multimodal embedding quality.	contrastive learning large language model multimodal
18	UTICA: Multi-Objective Self-Distillation Foundation Model Pretraining for Time Series Classification	UTICA: a foundation model for time series classification, pretrained with multi-objective self-distillation.	distillation foundation model
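19	LFPO: Likelihood-Free Policy Optimization for Masked Diffusion Models	Proposes LFPO, a likelihood-free policy optimization method for masked diffusion models that improves code generation and reasoning.	reinforcement learning flow matching large language model
20	Expanding LLM Agent Boundaries with Strategy-Guided Exploration	Proposes Strategy-Guided Exploration (SGE) to improve the exploration efficiency and performance of LLM agents on complex tasks.	reinforcement learning large language model
21	The Expressive Limits of Diagonal SSMs for State-Tracking	Reveals the limits of the expressive power of diagonal SSMs on state-tracking tasks.	SSM
22	Efficient RLVR Training via Weighted Mutual Information Data Selection	Proposes InSight, which improves RLVR training efficiency through weighted mutual information data selection.	reinforcement learning large language model
23	SEAR: Sample Efficient Action Chunking Reinforcement Learning	SEAR: a sample-efficient action chunking reinforcement learning algorithm that improves online RL performance.	reinforcement learning
24	D3LM: A Discrete DNA Diffusion Language Model for Bidirectional DNA Understanding and Generation	D3LM: a discrete DNA diffusion language model for bidirectional DNA understanding and generation.	representation learning foundation model	✅
25	Discrete World Models via Regularization	Proposes regularization-based discrete world models (DWMR) for unsupervised learning of Boolean world models.	world model
26	GAC: Stabilizing Asynchronous RL Training for LLMs via Gradient Alignment Control	Proposes Gradient Alignment Control (GAC) to stabilize asynchronous reinforcement learning training of LLMs.	reinforcement learning large language model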

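🔬 Pillar 1: Robot Control (3 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
27	Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning	Proposes a coupled policy optimization algorithm that regulates policy diversity through KL constraints to improve the efficiency of large-scale reinforcement learning.	manipulation dexterous manipulation reinforcement learning	✅
28	Temporal Representations for Exploration: Learning Complex Exploratory Behavior without Extrinsic Rewards	Proposes an exploration method based on temporal contrastive representations for learning complex behaviors without extrinsic rewards.	locomotion manipulation reinforcement learning
29	Accurate, private, secure, federated U-statistics with higher degree	Proposes a federated U-statistics protocol based on multi-party computation that improves computational accuracy under privacy protection.	MPC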

🔬 Pillar 8: Physics-based Animation (3 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
30	TRAKNN: Efficient Trajectory Aware Spatiotemporal kNN for Rare Meteorological Trajectory Detection	Proposes TRAKNN for efficient detection of rare patterns in meteorological spatiotemporal trajectories.	spatiotemporal
31	DGNet: Discrete Green Networks for Data-Efficient Learning of Spatiotemporal PDEs	DGNet: discrete Green networks for data-efficient learning of spatiotemporal PDEs.	spatiotemporal
32	Graph neural network force fields for adiabatic dynamics of lattice Hamiltonians	Proposes a graph neural network force-field model for simulating the adiabatic dynamics of lattice Hamiltonians.	spatiotemporal

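🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
33	DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking	Proposes the DUEL framework to resolve the masked diffusion model (MDM) dilemma and enable exact likelihood computation.	MDM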
