cs.LG(2026-04-10)

📊 25 papers in total | 🔗 1 with code

🎯 Interest area navigation

Pillar 2: RL Algorithms & Architecture (11 🔗1) · Pillar 9: Embodied Foundation Models (9) · Pillar 8: Physics-based Animation (3) · Pillar 1: Robot Control (2)

🔬 Pillar 2: RL Algorithms & Architecture (11 papers)

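| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | Revisiting the Capacity Gap in Chain-of-Thought Distillation from a Practical Perspective | Revisits the capacity gap in CoT distillation with a focus on practical application scenarios. | teacher-student, distillation, chain-of-thought | |
| 2 | WOMBET: World Model-based Experience Transfer for Robust and Sample-efficient Reinforcement Learning | WOMBET: world-model-based experience transfer that improves the robustness and sample efficiency of reinforcement learning. | reinforcement learning, world model, world models | |
| 3 | Toward World Models for Epidemiology | Proposes a world-model framework for epidemiology to address latent-variable inference and counterfactual reasoning in epidemic decision-making. | world model, world models, latent dynamics | |
| 4 | SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning | SafeAdapt: safe policy updates in deep reinforcement learning based on Rashomon sets. | reinforcement learning, deep reinforcement learning | |
| 5 | On the Role of DAG topology in Energy-Aware Cloud Scheduling: A GNN-Based Deep Reinforcement Learning Approach | Proposes a GNN-based deep reinforcement learning scheduler to optimize cloud resource allocation. | reinforcement learning, deep reinforcement learning | |
| 6 | Truncated Rectified Flow Policy for Reinforcement Learning with One-Step Sampling | Proposes the Truncated Rectified Flow Policy (TRFP) to address the limitations of policy modeling in maximum-entropy reinforcement learning. | reinforcement learning, flow matching, multimodal | |
| 7 | Bridging SFT and RL: Dynamic Policy Optimization for Robust Reasoning | Proposes the DYPO framework, which uses dynamic policy optimization to improve LLM robustness on complex reasoning tasks. | reinforcement learning, distillation, large language model | |
| 8 | Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning | Proposes efficient hierarchical implicit flow Q-learning to address long-horizon control in offline goal-conditioned reinforcement learning. | reinforcement learning | |
| 9 | A Closer Look at the Application of Causal Inference in Graph Representation Learning | For causal inference in graph representation learning, proposes a causal modeling method based on minimal indivisible units. | representation learning | |
| 10 | Post-Hoc Guidance for Consistency Models by Joint Flow Distribution Learning | Proposes JFDL, which enables post-hoc guidance of pretrained consistency models without a DM teacher. | distillation, classifier-free guidance | |
| 11 | Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication | Proposes SeqComm-DFL to address information sharing in multi-agent decision-making. | world model, world models | |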

🔬 Pillar 9: Embodied Foundation Models (9 papers)

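| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 12 | How does Chain of Thought decompose complex tasks? | By decomposing complex tasks, chain of thought (CoT) can significantly reduce the classification error of large language models. | large language model, chain-of-thought | |
| 13 | Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs | Proposes the DACO framework, which controls multimodal LLMs through concept-dictionary alignment to improve safety. | large language model, multimodal | |
| 14 | Modality-Aware Zero-Shot Pruning and Sparse Attention for Efficient Multimodal Edge Inference | The SentryFuse framework achieves efficient multimodal edge inference through modality-aware zero-shot pruning and sparse attention. | multimodal | |
| 15 | The nextAI Solution to the NeurIPS 2023 LLM Efficiency Challenge | Efficiently fine-tunes a 70B LLaMa2 model on a single A100 GPU, improving resource utilization. | large language model, foundation model | |
| 16 | Integrated electro-optic attention nonlinearities for transformers | Uses integrated electro-optic attention nonlinearity units to accelerate Transformer inference. | large language model | |
| 17 | OASIS: Online Activation Subspace Learning for Memory-Efficient Training | OASIS: online activation subspace learning for memory-efficient training of large models. | large language model | |
| 18 | Nexus: Same Pretraining Loss, Better Downstream Generalization via Common Minima | The Nexus optimizer: improves downstream generalization of large language models by seeking common minima. | large language model | |
| 19 | DiffHLS: Differential Learning for High-Level Synthesis QoR Prediction with GNNs and LLM Code Embeddings | DiffHLS: a differential learning framework for HLS QoR prediction using GNNs and LLM code embeddings. | large language model | |
| 20 | Uncertainty-Aware Transformers: Conformal Prediction for Language Models | Proposes the CONFIDE framework, which provides uncertainty quantification and interpretability for Transformer language models. | large language model | |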

🔬 Pillar 8: Physics-based Animation (3 papers)

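| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 21 | ANTIC: Adaptive Neural Temporal In-situ Compressor | ANTIC: an adaptive neural temporal in-situ compressor that addresses the data storage bottleneck of large-scale PDE simulations. | spatiotemporal | |
| 22 | Drift-Aware Online Dynamic Learning for Nonstationary Multivariate Time Series: Application to Sintering Quality Prediction | Proposes a drift-aware multi-scale dynamic learning framework for nonstationary multivariate time series forecasting, applied to sinter quality prediction. | spatiotemporal | |
| 23 | PDE-regularized Dynamics-informed Diffusion with Uncertainty-aware Filtering for Long-Horizon Dynamics | PDYffusion: long-horizon dynamics forecasting that combines PDE regularization with uncertainty-aware filtering. | spatiotemporal | |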

🔬 Pillar 1: Robot Control (2 papers)

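| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 24 | Event-Driven Temporal Graph Networks for Asynchronous Multi-Agent Cyber Defense in NetForge_RL | Proposes CT-GMARL for asynchronous multi-agent cyber defense in NetForge_RL, significantly improving defense effectiveness. | sim2real, reinforcement learning, zero-shot transfer | |
| 25 | Continuous Orthogonal Mode Decomposition: Haptic Signal Prediction in Tactile Internet | Proposes a haptic signal prediction method based on continuous orthogonal mode decomposition for the Tactile Internet. | teleoperation | |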
