cs.LG (2026-03-04)

📊 21 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (10 🔗1) | Pillar 9: Embodied Foundation Models (7 🔗1) | Pillar 8: Physics-based Animation (2) | Pillar 1: Robot Control (1) | Pillar 3: Perception & Semantics (1 🔗1)

🔬 Pillar 2: RL & Architecture (10 papers)

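| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning | Finds that pretrained vision-language-action models are resistant to forgetting in continual learning | policy learning, behavior cloning, vision-language-action | |
| 2 | Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal Attacks | Proposes the DMAST framework to improve the robustness and task efficiency of multimodal web agents under cross-modal attacks | reinforcement learning, imitation learning, multimodal | |
| 3 | Architectural Proprioception in State Space Models: Thermodynamic Training Induces Anticipatory Halt Detection | Proposes a probabilistic navigation architecture to improve the self-awareness of state space models | SSM, state space model, zero-shot transfer | |
| 4 | What Does Flow Matching Bring To TD Learning? | Proposes a flow matching approach to improve temporal-difference learning | reinforcement learning, flow matching | |
| 5 | Fairness Begins with State: Purifying Latent Preferences for Hierarchical Reinforcement Learning in Interactive Recommendation | Proposes the DSRM-HRL framework to address fairness in interactive recommendation | reinforcement learning, reward shaping | |
| 6 | GIPO: Gaussian Importance Sampling Policy Optimization | GIPO: policy optimization based on Gaussian importance sampling, improving data efficiency in reinforcement learning | reinforcement learning, multimodal | |
| 7 | BD-Merging: Bias-Aware Dynamic Model Merging with Evidence-Guided Contrastive Learning | Proposes BD-Merging, which uses evidence-guided contrastive learning for bias-aware dynamic model merging, improving robustness under distribution shift | contrastive learning | |
| 8 | Harmonic Dataset Distillation for Time Series Forecasting | Proposes HDT, which distills time series datasets via frequency-domain harmonic matching, improving generalization and scalability | distillation | |
| 9 | A Constrained RL Approach for Cost-Efficient Delivery of Latency-Sensitive Applications | Proposes a constrained reinforcement learning scheme for low-cost delivery of latency-sensitive applications | reinforcement learning, deep reinforcement learning | |
| 10 | Freezing of Gait Prediction using Proactive Agent that Learns from Selected Experience and DDQN Algorithm | Proposes a reinforcement learning framework based on DDQN and experience replay for predicting freezing of gait in Parkinson's patients | reinforcement learning, reward shaping | |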

🔬 Pillar 9: Embodied Foundation Models (7 papers)

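| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 11 | Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading | Evaluates how well PDE foundation models generalize to material dynamics under extreme loading | foundation model | |
| 12 | LUMINA: Foundation Models for Topology Transferable ACOPF | LUMINA: a foundation model framework for topology-transferable ACOPF | foundation model | |
| 13 | Causality Elicitation from Large Language Models | Proposes a pipeline framework for eliciting causal hypotheses from large language models | large language model | |
| 14 | Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization | Proposes adversarially-aligned Jacobian regularization to improve the robustness of agentic AI systems | large language model | |
| 15 | A Multi-Dimensional Quality Scoring Framework for Decentralized LLM Inference with Proof of Quality | Proposes a multi-dimensional quality scoring framework for quality assessment and incentives in decentralized LLM inference | large language model | |
| 16 | Relational In-Context Learning via Synthetic Pre-training with Structural Prior | Proposes RDB-PFN, which enables in-context learning on relational databases through pre-training on synthetic data | foundation model | |
| 17 | MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier | Proposes MOOSE-Star to address the training complexity problem in scientific discovery | large language model | |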

🔬 Pillar 8: Physics-based Animation (2 papers)

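| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 18 | Non-Invasive Reconstruction of Cardiac Activation Dynamics Using Physics-Informed Neural Networks | Proposes a non-invasive method for reconstructing cardiac activation dynamics based on physics-informed neural networks | spatiotemporal | |
| 19 | Adaptive Sensing of Continuous Physical Systems for Machine Learning | Proposes an adaptive sensing framework that optimizes information extraction from physical systems and machine learning prediction | spatiotemporal | |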

🔬 Pillar 1: Robot Control (1 paper)

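| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 20 | IPD: Boosting Sequential Policy with Imaginary Planning Distillation in Offline Reinforcement Learning | IPD: boosts sequential policies in offline reinforcement learning through imaginary planning distillation | MPC, model predictive control, reinforcement learning | |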

🔬 Pillar 3: Perception & Semantics (1 paper)

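| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 21 | mlx-vis: GPU-Accelerated Dimensionality Reduction and Visualization on Apple Silicon | mlx-vis: a GPU-accelerated dimensionality reduction and visualization library for Apple Silicon | splatting | |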
