| # | Title | Summary | Keywords | Status |
|---|---|---|---|---|
| 1 | Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO | Presents a fine-grained theoretical analysis of the performance gap between RLHF and DPO | reinforcement learning, preference learning, RLHF | |
| 2 | Alignment of large language models with constrained learning | Proposes an iterative method based on Lagrangian duality for constrained alignment | RLHF, large language model | |
| 3 | Learning a Pessimistic Reward Model in RLHF | Proposes PET to address reward hacking in offline RLHF | reinforcement learning, offline reinforcement learning, RLHF | |
| 4 | Rotary Masked Autoencoders are Versatile Learners | Proposes the Rotary Masked Autoencoder for time-series learning | representation learning, masked autoencoder, MAE | |
| 5 | JEDI: Latent End-to-end Diffusion Mitigates Agent-Human Performance Asymmetry in Model-Based Reinforcement Learning | Proposes JEDI to mitigate agent-human performance asymmetry in model-based reinforcement learning | reinforcement learning, world model | |
| 6 | The Limits of Preference Data for Post-Training | Studies the limits of preference data for post-training optimization and their implications | reinforcement learning, RLHF, large language model | |
| 7 | An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning | Proposes an explainable diagnostic framework for neurodegenerative dementias to improve diagnostic transparency | reinforcement learning, distillation, large language model | |
| 8 | DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning | Proposes DISCOVER to tackle exploration in sparse-reward reinforcement learning | reinforcement learning | |
| 9 | Equivariant Representation Learning for Symmetry-Aware Inference with Guarantees | Proposes an equivariant representation learning framework for symmetry-aware inference | representation learning | |
| 10 | Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning | Proposes a reinforcement learning framework to refine few-step text-to-multiview diffusion models | reinforcement learning | |
| 11 | The challenge of hidden gifts in multi-agent reinforcement learning | Proposes a new method for the hidden-gift problem in multi-agent reinforcement learning | reinforcement learning | |
| 12 | Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning | Proposes a Bayes-adaptive reinforcement learning method to strengthen reflective exploration in LLM reasoning | reinforcement learning, large language model | ✅ |
| 13 | Characterizing Pattern Matching and Its Limits on Compositional Task Structures | Proposes a formal framework for pattern matching to address generalization on compositional task structures | Mamba, chain-of-thought | |
| 14 | ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining | Proposes ESLM to improve the efficiency and robustness of large language model pretraining | distillation, large language model | |