| # | Title | Summary | Keywords |
| --- | --- | --- | --- |
| 1 | How Many Heads Make an SSM? A Unified Framework for Attention and State Space Models | Proposes a unified framework for analyzing the expressive power and training trade-offs of attention and state space models (SSMs). | latent dynamics, SSM, state space model |
| 2 | Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning | Proposes G2RL, which uses gradient-guided reinforcement learning to improve LLM reasoning. | reinforcement learning, PPO, large language model |
| 3 | Autonomous Pressure Control in MuVacAS via Deep Reinforcement Learning and Deep Learning Surrogate Models | Proposes autonomous pressure control for MuVacAS based on deep reinforcement learning and deep-learning surrogate models. | reinforcement learning, deep reinforcement learning |
| 4 | Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction | Reveals an equivalence between autoregressive language models and energy-based models, giving insight into the lookahead capabilities of next-token prediction. | reinforcement learning, distillation, large language model |
| 5 | Automatic Reward Shaping from Multi-Objective Human Heuristics | Proposes the MORSE framework, which automatically shapes reinforcement-learning rewards from multi-objective human heuristics. | reinforcement learning, reward shaping |
| 6 | A Teacher-Student Perspective on the Dynamics of Learning Near the Optimal Point | Studies the learning dynamics of neural networks near the optimal point, revealing the key role of the Hessian eigenspectrum. | teacher-student |
| 7 | EUBRL: Epistemic Uncertainty Directed Bayesian Reinforcement Learning | Proposes the EUBRL algorithm, which directs exploration in Bayesian reinforcement learning with epistemic uncertainty to improve sample efficiency. | reinforcement learning |
| 8 | Distillation-Guided Structural Transfer for Continual Learning Beyond Sparse Distributed Memory | Proposes a Selective Subnetwork Distillation (SSD) framework to improve continual learning in sparse neural networks. | distillation |
| 9 | TrajSyn: Privacy-Preserving Dataset Distillation from Federated Model Trajectories for Server-Side Adversarial Training | TrajSyn: privacy-preserving dataset distillation from federated model trajectories, used for server-side adversarial training. | distillation |
| 10 | Feature-Centric Unsupervised Node Representation Learning Without Homophily Assumption | FUEL: a feature-centric unsupervised node representation learning method that requires no homophily assumption. | representation learning |
| 11 | Spectral Representation-based Reinforcement Learning | Proposes a spectral-representation-based reinforcement learning framework that addresses the difficulties traditional methods face in complex environments. | reinforcement learning |
| 12 | SoFlow: Solution Flow Models for One-Step Generative Modeling | SoFlow: proposes solution flow models for one-step generative modeling, improving generation efficiency. | flow matching, classifier-free guidance |