cs.LG (2026-03-03)
📊 26 papers in total | 🔗 3 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (11 🔗2)
Pillar 2: RL & Architecture (9 🔗1)
Pillar 1: Robot Control (2)
Pillar 8: Physics-based Animation (2)
Pillar 5: Interaction & Reaction (1)
Pillar 4: Generative Motion (1)
🔬 Pillar 9: Embodied Foundation Models (11 papers)
🔬 Pillar 2: RL & Architecture (9 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 12 | Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models | Proposes the GraphSSR framework, which improves LLM performance in zero-shot graph learning through adaptive subgraph denoising | reinforcement learning, large language model | | |
| 13 | Contextual Latent World Models for Offline Meta Reinforcement Learning | Proposes contextual latent world models for generalization in offline meta reinforcement learning | reinforcement learning, world model, representation learning | | |
| 14 | SaFeR-ToolKit: Structured Reasoning via Virtual Tool Calling for Multimodal Safety | SaFeR-ToolKit: structured reasoning for multimodal safety via virtual tool calling | DPO, multimodal | ✅ | |
| 15 | CGL: Advancing Continual GUI Learning via Reinforcement Fine-Tuning | Proposes the CGL framework, which improves the continual learning ability of GUI agents through reinforcement fine-tuning | reinforcement learning, large language model, multimodal | | |
| 16 | Breaking the Prototype Bias Loop: Confidence-Aware Federated Contrastive Learning for Highly Imbalanced Clients | Proposes confidence-aware federated contrastive learning to address data imbalance across clients | contrastive learning, geometric consistency | | |
| 17 | Next Embedding Prediction Makes World Models Stronger | NE-Dreamer: strengthens world models with Transformer-based next-embedding prediction | reinforcement learning, world model, dreamer | | |
| 18 | Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling | Proposes MIStar, a framework based on memory-enhanced improvement search, for the flexible job shop scheduling problem | reinforcement learning, deep reinforcement learning, DRL | | |
| 19 | Heterogeneous Agent Collaborative Reinforcement Learning | Proposes the HACRL framework, which improves sample utilization and knowledge transfer through heterogeneous agent collaborative reinforcement learning | reinforcement learning, distillation | | |
| 20 | Reinforcement Learning with Symbolic Reward Machines | Proposes Symbolic Reward Machines (SRM) to address the need for manually annotated reward functions in reinforcement learning | reinforcement learning | | |
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 21 | Real-Time Generative Policy via Langevin-Guided Flow Matching for Autonomous Driving | Proposes DACER-F, a real-time generative policy based on Langevin-guided flow matching, for autonomous driving | humanoid, reinforcement learning, flow matching | | |
| 22 | Improving Diffusion Planners by Self-Supervised Action Gating with Energies | SAGE: improves diffusion planners through self-supervised energy-based action gating, increasing dynamics consistency | locomotion, manipulation, reinforcement learning | | |
🔬 Pillar 8: Physics-based Animation (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 23 | Physics-informed post-processing of stabilized finite element solutions for transient convection-dominated problems | Proposes a physics-informed post-processing method that improves the accuracy of stabilized finite element solutions for transient convection-dominated problems | spatiotemporal | | |
| 24 | SynthCharge: An Electric Vehicle Routing Instance Generator with Feasibility Screening to Enable Learning-Based Optimization and Benchmarking | SynthCharge: an electric vehicle routing instance generator that supports learning-based optimization and benchmarking | spatiotemporal | | |
🔬 Pillar 5: Interaction & Reaction (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 25 | Integrating Homomorphic Encryption and Synthetic Data in FL for Privacy and Learning Quality | Proposes Alt-FL, which combines homomorphic encryption and synthetic data to improve privacy and model quality in federated learning | OMOMO | | |
🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 26 | Bridging Diffusion Guidance and Anderson Acceleration via Hopfield Dynamics | Bridges diffusion guidance and Anderson acceleration via Hopfield dynamics to improve generation quality | classifier-free guidance | | |