cs.LG (2026-04-01)

📊 22 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (8 🔗1) · Pillar 9: Embodied Foundation Models (8 🔗1) · Pillar 1: Robot Control (3) · Pillar 3: Perception & Semantics (2) · Pillar 6: Video Extraction (1)

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

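| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding | Proposes MOON3.0, a reasoning-aware multimodal representation learning method for e-commerce product understanding. | reinforcement learning, representation learning, large language model | |
| 2 | A Survey of On-Policy Distillation for Large Language Models | Surveys on-policy distillation methods for large language models, which address the exposure bias problem. | imitation learning, distillation, large language model | |
| 3 | Policy Improvement Reinforcement Learning | Proposes the PIRL framework, which improves LLM reasoning by explicitly optimizing the cumulative improvement across policy iterations. | reinforcement learning, large language model | |
| 4 | GUIDE: Reinforcement Learning for Behavioral Action Support in Type 1 Diabetes | Proposes the GUIDE framework, which uses reinforcement learning to provide behavioral intervention decision support for people with type 1 diabetes. | reinforcement learning, offline RL, CQL | |
| 5 | Focal plane wavefront control with model-based reinforcement learning | Proposes PO4NCPA, a model-based reinforcement learning method for focal-plane wavefront control that corrects dynamic and static aberrations in high-contrast imaging. | reinforcement learning, model-based RL | |
| 6 | NeuroDDAF: Neural Dynamic Diffusion-Advection Fields with Evidential Fusion for Air Quality Forecasting | NeuroDDAF: neural dynamic diffusion-advection fields with evidential fusion for air quality forecasting. | representation learning, MAE, spatiotemporal | |
| 7 | Deconfounding Scores and Representation Learning for Causal Effect Estimation with Weak Overlap | Proposes deconfounding scores to address the overlap problem in causal effect estimation. | representation learning | |
| 8 | Learning to Hint for Reinforcement Learning | Proposes the HiLL framework, which uses adaptive hint learning to improve reinforcement learning performance on complex tasks. | reinforcement learning | |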

🔬 Pillar 9: Embodied Foundation Models (8 papers)

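| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 9 | Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction | Proposes Spectral Compact Training (SCT), which pre-trains large language models via truncated SVD and QR retraction on the Stiefel manifold, significantly reducing memory usage. | large language model | |
| 10 | Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning | Proposes Online Reasoning Calibration (ORCA), which uses test-time training to improve the generalization and efficiency of LLM reasoning. | large language model | |
| 11 | Reasoning Shift: How Context Silently Shortens LLM Reasoning | Shows that contextual interference shortens LLM reasoning chains and reduces self-verification. | large language model | |
| 12 | Fast and Accurate Probing of In-Training LLMs' Downstream Performances | Proposes a fast and accurate probing method for evaluating the downstream performance of LLMs during training. | large language model | |
| 13 | Optimal Brain Decomposition for Accurate LLM Low-Rank Approximation | Proposes Optimal Brain Decomposition to improve the accuracy of low-rank approximation for large language models. | large language model | |
| 14 | Scalable Pretraining of Large Mixture of Experts Language Models on Aurora Super Computer | Pre-trains large mixture-of-experts language models on the Aurora supercomputer with efficient scaling. | large language model | |
| 15 | Exploring Silent Data Corruption as a Reliability Challenge in LLM Training | Studies silent data corruption in LLM training and proposes lightweight detection with recomputation-based mitigation. | large language model | |
| 16 | G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs | G-Drift MIA: a membership inference attack on large language models based on gradient-induced feature drift. | large language model | |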

🔬 Pillar 1: Robot Control (3 papers)

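| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 17 | Flow-based Policy With Distributional Reinforcement Learning in Trajectory Optimization | Proposes FP-DRL, an algorithm combining a flow-based policy with distributional reinforcement learning, to improve the expressiveness of multimodal policies in trajectory optimization. | trajectory optimization, reinforcement learning, DRL | |
| 18 | Gradient-Based Data Valuation Improves Curriculum Learning for Game-Theoretic Motion Planning | Uses gradient-based data valuation to improve curriculum learning for game-theoretic motion planning. | motion planning, curriculum learning | |
| 19 | Convergence of Byzantine-Resilient Gradient Tracking via Probabilistic Edge Dropout | Proposes a Byzantine-resilient gradient tracking method based on probabilistic edge dropout to counter malicious attacks in distributed optimization. | manipulation | |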

🔬 Pillar 3: Perception & Semantics (2 papers)

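| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 20 | ActivityNarrated: An Open-Ended Narrative Paradigm for Wearable Human Activity Understanding | Proposes the ActivityNarrated framework, which uses an open-ended narrative paradigm to improve human activity understanding from wearable devices. | open-vocabulary, open vocabulary, language conditioned | |
| 21 | Property-Level Flood Risk Assessment Using AI-Enabled Street-View Lowest Floor Elevation Extraction and ML Imputation Across Texas | Uses AI analysis of street-view imagery and machine-learning imputation for property-level flood risk assessment in Texas. | elevation map | |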

🔬 Pillar 6: Video Extraction (1 paper)

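| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 22 | LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED) | LAPIS-SHRED: uses shallow recurrent decoders to infer latent phase from short time sequences and reconstruct spatiotemporal dynamics. | sparse sensors, spatiotemporal | |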
