cs.LG (2026-04-01)

📊 22 papers in total | 🔗 2 with code

🎯 Interest Areas

Pillar 2: RL Algorithms & Architecture (8 🔗1) · Pillar 9: Embodied Foundation Models (8 🔗1) · Pillar 1: Robot Control (3) · Pillar 3: Spatial Perception & Semantics (2) · Pillar 6: Video Extraction & Matching (1)

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

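#	Title	One-line summary	Tags	🔗	⭐
1	MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding	Proposes MOON3.0, a reasoning-aware multimodal representation learning method for e-commerce product understanding.	reinforcement learning, representation learning, large language model
2	A Survey of On-Policy Distillation for Large Language Models	Surveys on-policy distillation methods for large language models, which address exposure bias.	imitation learning, distillation, large language model
3	Policy Improvement Reinforcement Learning	Proposes the PIRL framework, which improves LLM reasoning by explicitly optimizing cumulative improvement across policy iterations.	reinforcement learning, large language model
4	GUIDE: Reinforcement Learning for Behavioral Action Support in Type 1 Diabetes	Proposes the GUIDE framework, which uses reinforcement learning to provide behavioral intervention decision support for people with type 1 diabetes.	reinforcement learning, offline RL, CQL
5	Focal plane wavefront control with model-based reinforcement learning	Proposes PO4NCPA, a model-based reinforcement learning method for focal-plane wavefront control that corrects dynamic and static aberrations in high-contrast imaging.	reinforcement learning, model-based RL
6	NeuroDDAF: Neural Dynamic Diffusion-Advection Fields with Evidential Fusion for Air Quality Forecasting	NeuroDDAF: neural dynamic diffusion-advection fields with evidential fusion for air quality forecasting.	representation learning, MAE, spatiotemporal
7	Deconfounding Scores and Representation Learning for Causal Effect Estimation with Weak Overlap	Proposes deconfounding scores to address the overlap problem in causal effect estimation.	representation learning
8	Learning to Hint for Reinforcement Learning	Proposes the HiLL framework, which improves reinforcement learning performance on complex tasks through adaptive hint learning.	reinforcement learning	✅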

🔬 Pillar 9: Embodied Foundation Models (8 papers)

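#	Title	One-line summary	Tags	🔗	⭐
9	Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction	Proposes Spectral Compact Training (SCT), which pre-trains large language models via truncated SVD and QR retraction on the Stiefel manifold, significantly reducing memory usage.	large language model
10	Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning	Proposes Online Reasoning Calibration (ORCA), which uses test-time training to improve the generalization and efficiency of LLM reasoning.	large language model	✅
11	Reasoning Shift: How Context Silently Shortens LLM Reasoning	Shows that contextual interference shortens LLM reasoning chains and reduces self-verification.	large language model
12	Fast and Accurate Probing of In-Training LLMs' Downstream Performances	Proposes a fast, accurate probing method for evaluating the downstream performance of LLMs during training.	large language model
13	Optimal Brain Decomposition for Accurate LLM Low-Rank Approximation	Proposes Optimal Brain Decomposition to improve the accuracy of low-rank approximation for large language models.	large language model
14	Scalable Pretraining of Large Mixture of Experts Language Models on Aurora Super Computer	Pre-trains large mixture-of-experts language models on the Aurora supercomputer with efficient scaling.	large language model
15	Exploring Silent Data Corruption as a Reliability Challenge in LLM Training	Studies silent data corruption in LLM training and proposes lightweight detection with recomputation-based mitigation.	large language model
16	G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs	G-Drift MIA: a membership inference attack on large language models based on gradient-induced feature drift.	large language model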

🔬 Pillar 1: Robot Control (3 papers)

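#	Title	One-line summary	Tags	🔗	⭐
17	Flow-based Policy With Distributional Reinforcement Learning in Trajectory Optimization	Proposes FP-DRL, a flow-based policy with distributional reinforcement learning, to improve the expressiveness of multimodal policies in trajectory optimization.	trajectory optimization, reinforcement learning, DRL
18	Gradient-Based Data Valuation Improves Curriculum Learning for Game-Theoretic Motion Planning	Uses gradient-based data valuation to improve curriculum learning for game-theoretic motion planning.	motion planning, curriculum learning
19	Convergence of Byzantine-Resilient Gradient Tracking via Probabilistic Edge Dropout	Proposes a Byzantine-resilient gradient tracking method based on probabilistic edge dropout to counter malicious attacks in distributed optimization.	manipulation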

🔬 Pillar 3: Spatial Perception & Semantics (2 papers)

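#	Title	One-line summary	Tags	🔗	⭐
20	ActivityNarrated: An Open-Ended Narrative Paradigm for Wearable Human Activity Understanding	Proposes the ActivityNarrated framework, which uses an open-ended narrative paradigm to improve human activity understanding from wearable devices.	open-vocabulary, open vocabulary, language conditioned
21	Property-Level Flood Risk Assessment Using AI-Enabled Street-View Lowest Floor Elevation Extraction and ML Imputation Across Texas	Uses AI-processed street-view imagery and machine-learning imputation for property-level flood risk assessment across Texas.	elevation map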

🔬 Pillar 6: Video Extraction & Matching (1 paper)

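#	Title	One-line summary	Tags	🔗	⭐
22	LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED)	LAPIS-SHRED: uses shallow recurrent decoders to infer latent phase from short time sequences and reconstruct spatiotemporal dynamics.	sparse sensors, spatiotemporal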
