cs.LG (2025-12-17)

📊 21 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (12) · Pillar 9: Embodied Foundation Models (7 🔗1) · Pillar 4: Generative Motion (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (12 papers)

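| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | How Many Heads Make an SSM? A Unified Framework for Attention and State Space Models | Proposes a unified framework for analyzing the expressivity and training trade-offs of attention and state space models (SSMs). | latent dynamics, SSM, state space model | |
| 2 | Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning | Proposes G2RL, which uses gradient-guided reinforcement learning to improve LLM reasoning. | reinforcement learning, PPO, large language model | |
| 3 | Autonomous Pressure Control in MuVacAS via Deep Reinforcement Learning and Deep Learning Surrogate Models | Proposes an autonomous pressure control method for MuVacAS based on deep reinforcement learning and deep learning surrogate models. | reinforcement learning, deep reinforcement learning | |
| 4 | Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction | Shows the equivalence between autoregressive language models and energy-based models, offering insight into their lookahead capabilities. | reinforcement learning, distillation, large language model | |
| 5 | Automatic Reward Shaping from Multi-Objective Human Heuristics | Proposes the MORSE framework, which automatically shapes reinforcement learning rewards from multi-objective human heuristics. | reinforcement learning, reward shaping | |
| 6 | A Teacher-Student Perspective on the Dynamics of Learning Near the Optimal Point | Studies learning dynamics near the optimal point of neural networks, revealing the key role of the Hessian eigenspectrum. | teacher-student | |
| 7 | EUBRL: Epistemic Uncertainty Directed Bayesian Reinforcement Learning | Proposes the EUBRL algorithm, which uses epistemic uncertainty to direct exploration in Bayesian reinforcement learning and improve sample efficiency. | reinforcement learning | |
| 8 | Distillation-Guided Structural Transfer for Continual Learning Beyond Sparse Distributed Memory | Proposes the Selective Subnetwork Distillation (SSD) framework to improve continual learning in sparse neural networks. | distillation | |
| 9 | TrajSyn: Privacy-Preserving Dataset Distillation from Federated Model Trajectories for Server-Side Adversarial Training | TrajSyn: privacy-preserving dataset distillation from model trajectories in federated learning, for server-side adversarial training. | distillation | |
| 10 | Feature-Centric Unsupervised Node Representation Learning Without Homophily Assumption | FUEL: a feature-centric unsupervised node representation learning method that does not assume homophily. | representation learning | |
| 11 | Spectral Representation-based Reinforcement Learning | Proposes a reinforcement learning framework based on spectral representations to address the difficulties traditional methods face in complex environments. | reinforcement learning | |
| 12 | SoFlow: Solution Flow Models for One-Step Generative Modeling | SoFlow: proposes Solution Flow Models for one-step generative modeling, improving generation efficiency. | flow matching, classifier-free guidance | |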

🔬 Pillar 9: Embodied Foundation Models (7 papers)

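| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 13 | Copyright Infringement Risk Reduction via Chain-of-Thought and Task Instruction Prompting | Combines chain-of-thought and task instruction prompting to reduce the copyright infringement risk of text-to-image generation models. | chain-of-thought | |
| 14 | Dynamic Rebatching for Efficient Early-Exit Inference with DREX | Proposes dynamic rebatching to address the efficiency of early-exit inference. | large language model | |
| 15 | Behavior Tokens Speak Louder: Disentangled Explainable Recommendation with Behavior Vocabulary | BEAT: explainable recommendation via a behavior vocabulary, addressing the semantic ambiguity and structural limitations of existing methods. | large language model | |
| 16 | DEER: Draft with Diffusion, Verify with Autoregressive Models | DEER: drafts with a diffusion model and verifies with an autoregressive model to improve LLM inference efficiency. | large language model | |
| 17 | The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks | FEAML: uses LLMs to bridge structured data and multi-label tasks, enabling automated feature engineering. | large language model | |
| 18 | SeBERTis: A Framework for Producing Classifiers of Security-Related Issue Reports | SeBERTis: a framework for producing classifiers of security-related issue reports. | large language model | |
| 19 | DreamPRM-Code: Function-as-Step Process Reward Model with Label Correction for LLM Coding | DreamPRM-Code: a process reward model that treats functions as steps and uses label correction to improve LLM code generation. | large language model | |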

🔬 Pillar 4: Generative Motion (1 paper)

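| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 20 | Robustness Evaluation of Machine Learning Models for Fault Classification and Localization In Power System Protection | Proposes a robustness evaluation framework for machine learning models in power system protection, addressing reliability under adverse operating conditions. | penetration | |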

🔬 Pillar 8: Physics-based Animation (1 paper)

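| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 21 | PIP$^2$ Net: Physics-informed Partition Penalty Deep Operator Network | Proposes PIP$^2$ Net, which uses a physics-informed partition penalty to improve the accuracy and robustness of DeepONet in solving parameterized partial differential equations. | spatiotemporal | |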
