| 1 |
Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective |
Proposes a large language model distillation method based on a constrained Markov decision process formulation.
reinforcement learning, distillation, large language model
|
|
| 2 |
Aurora: Towards Universal Generative Multimodal Time Series Forecasting |
Aurora: a foundation model for universal generative multimodal time series forecasting.
flow matching, distillation, foundation model
|
|
| 3 |
Learning the Neighborhood: Contrast-Free Multimodal Self-Supervised Molecular Graph Pretraining |
C-FREE: a contrast-free multimodal self-supervised molecular graph pretraining method that fuses 2D topology with 3D structural information.
representation learning, multimodal
|
|
| 4 |
SpinGPT: A Large-Language-Model Approach to Playing Poker Correctly |
SpinGPT: a large-language-model approach to playing Texas Hold'em poker.
reinforcement learning, large language model
|
|
| 5 |
Enriching Knowledge Distillation with Intra-Class Contrastive Learning |
Proposes a knowledge distillation method based on intra-class contrastive learning that makes soft labels more informative.
contrastive learning, distillation
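
Since the entry above only names the technique, here is a minimal reference sketch of the standard soft-label knowledge distillation loss (temperature-scaled KL term plus cross-entropy). It is a generic PyTorch baseline, not the paper's intra-class contrastive variant; the temperature and weighting values are illustrative.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard soft-label knowledge distillation loss:
    KL between temperature-softened teacher/student distributions
    plus ordinary cross-entropy on the hard labels."""
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    # T^2 rescaling keeps the soft-label gradients comparable to the hard-label term.
    distill = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * T * T
    ce = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1 - alpha) * ce

# Purely illustrative usage with random tensors.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(kd_loss(student_logits, teacher_logits, labels))
```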
|
|
| 6 |
Reinforcement Learning with Discrete Diffusion Policies for Combinatorial Action Spaces |
Proposes a reinforcement learning method with discrete diffusion policies to handle combinatorial action spaces.
reinforcement learning, diffusion policy
|
|
| 7 |
Adaptive Margin RLHF via Preference over Preferences |
Proposes DPO-PoP, which uses preference-over-preference information to adaptively adjust margins and improve RLHF performance.
reinforcement learning, RLHF, DPO
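
For reference, a minimal sketch of a DPO loss with an additive per-example margin, which is the general idea the summary points at. How DPO-PoP actually derives the margin from preferences over preferences is not described here, so `margin` is left as a free input; all names are illustrative.

```python
import torch
import torch.nn.functional as F

def dpo_loss_with_margin(policy_chosen_logps, policy_rejected_logps,
                         ref_chosen_logps, ref_rejected_logps,
                         margin, beta=0.1):
    """Margin-augmented DPO loss: -log sigmoid(beta * (r_chosen - r_rejected) - margin).
    `margin` is a per-example tensor (or scalar); how DPO-PoP sets it adaptively
    is not shown here."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio) - margin
    return -F.logsigmoid(logits).mean()

# Illustrative call with dummy log-probabilities and a fixed margin of 0.5.
b = torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4)
print(dpo_loss_with_margin(*b, margin=torch.full((4,), 0.5)))
```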
|
|
| 8 |
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning |
SPEAR: an agentic reinforcement learning method based on self-imitation and progressive exploration.
reinforcement learning, imitation learning, reward shaping
|
|
| 9 |
Linear Causal Representation Learning by Topological Ordering, Pruning, and Disentanglement |
Proposes a linear causal representation learning method based on topological ordering, pruning, and disentanglement.
representation learning, large language model
|
|
| 10 |
Context and Diversity Matter: The Emergence of In-Context Learning in World Models |
Proposes the in-context environment learning (ICEL) framework to improve world model adaptation to unseen environments.
world model, embodied AI
|
|
| 11 |
Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs) |
Proposes RealUID: a universal GAN-free inverse distillation framework for matching models that uses real data to accelerate generation.
flow matching, distillation
|
|
| 12 |
In-Context Learning can Perform Continual Learning Like Humans |
Proposes in-context continual learning (ICCL), enabling human-like long-term memory and cross-task knowledge accumulation.
Mamba, linear attention, large language model
|
|
| 13 |
Adaptive Dual-Mode Distillation with Incentive Schemes for Scalable, Heterogeneous Federated Learning on Non-IID Data |
Proposes adaptive dual-mode distillation with incentive schemes to make heterogeneous federated learning scalable on non-IID data.
distillation
|
|
| 14 |
RLP: Reinforcement as a Pretraining Objective |
Proposes RLP, which uses reinforcement learning as a pretraining objective to strengthen model reasoning.
reinforcement learning, chain-of-thought
|
|
| 15 |
A Theoretical Analysis of Discrete Flow Matching Generative Models |
Provides a theoretical analysis of discrete flow matching generative models and proves their convergence.
flow matching |
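
As background for the flow matching entries in this list, a minimal sketch of the standard continuous conditional flow matching training step with a linear interpolation path. The paper above analyzes the discrete variant, which this sketch does not cover; the `TinyVelocityNet` model and its `(x, t)` signature are assumptions made for illustration.

```python
import torch
from torch import nn

class TinyVelocityNet(nn.Module):
    """Toy velocity field v_theta(x_t, t); the (x, t) signature is an assumption."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))

def cfm_training_step(velocity_model, x1, optimizer):
    """One step of the continuous conditional flow matching objective:
    x_t = (1 - t) * x0 + t * x1, regression target = x1 - x0."""
    x0 = torch.randn_like(x1)                   # noise sample
    t = torch.rand(x1.shape[0], 1)              # uniform time in [0, 1]
    xt = (1 - t) * x0 + t * x1
    target = x1 - x0
    loss = ((velocity_model(xt, t) - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative usage on random 2-D "data".
model = TinyVelocityNet(dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
print(cfm_training_step(model, torch.randn(32, 2), opt))
```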
|
|
| 16 |
Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives |
Proposes the MA-SPL and MA-MPL algorithms for multi-agent online coordination.
policy learning |
|
|
| 17 |
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning |
Proposes the EPO algorithm to address exploration-exploitation collapse for LLM agents in multi-turn, sparse-reward reinforcement learning.
reinforcement learning |
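
The one-line summary does not detail EPO itself, so as a reference point here is a generic entropy-regularized policy-gradient loss, i.e., the standard way an entropy bonus is added to keep exploration alive; it is not EPO's algorithm, and all names are illustrative.

```python
import torch

def entropy_regularized_pg_loss(log_probs, entropies, advantages, entropy_coef=0.01):
    """Generic entropy-regularized policy-gradient loss (minimized):
    -E[advantage * log pi(a|s)] minus an entropy bonus."""
    pg_term = -(advantages.detach() * log_probs).mean()
    return pg_term - entropy_coef * entropies.mean()

# Illustrative call with dummy per-step quantities.
print(entropy_regularized_pg_loss(torch.randn(16), torch.rand(16), torch.randn(16)))
```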
|
|
| 18 |
From Parameters to Behavior: Unsupervised Compression of the Policy Space |
Proposes an unsupervised method for compressing the policy space to make deep reinforcement learning more efficient.
reinforcement learning, deep reinforcement learning, DRL
|
|
| 19 |
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning |
Proposes MASA, a self-alignment reinforcement learning method that improves the meta-awareness and generalization of reasoning models.
reinforcement learning |
|
|
| 20 |
Fairness-Aware Reinforcement Learning (FAReL): A Framework for Transparent and Balanced Sequential Decision-Making |
Proposes the FAReL framework to balance performance and fairness in reinforcement learning, with applications to hiring and fraud detection.
reinforcement learning |
|
|
| 21 |
Overclocking Electrostatic Generative Models |
Proposes inverse Poisson flow matching to accelerate electrostatic generative models.
flow matching, distillation
|
|
| 22 |
Triple-BERT: Do We Really Need MARL for Order Dispatch on Ride-Sharing Platforms? |
Triple-BERT: a single-agent reinforcement learning method for order dispatch on ride-sharing platforms that outperforms multi-agent reinforcement learning approaches.
reinforcement learning, TD3
|