cs.CL(2025-08-23)
📊 14 papers in total | 🔗 5 with code
🎯 Interest Area Navigation
🔬 Pillar 9: Embodied Foundation Models (8 papers)
🔬 Pillar 2: RL Algorithms & Architecture (6 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation | Proposes DeAR to balance reasoning and scoring in document reranking | distillation, large language model, chain-of-thought | ✅ | |
| 10 | KL-Regularised Q-Learning: A Token-level Action-Value perspective on Online RLHF | Proposes KL-regularised Q-learning to improve reinforcement learning for language models | reinforcement learning, PPO, RLHF | | |
| 11 | Decoding Alignment: A Critical Survey of LLM Development Initiatives through Value-setting and Data-centric Lens | Examines the alignment of large language models through value-setting and data-centric lenses | reinforcement learning, RLHF, large language model | | |
| 12 | Being Kind Isn't Always Being Safe: Diagnosing Affective Hallucination in LLMs | Proposes AHaBench and AHaPairs to address affective hallucination in large language models | DPO, direct preference optimization, large language model | ✅ | |
| 13 | Dream to Chat: Model-based Reinforcement Learning on Dialogues with User Belief Modeling | Proposes a dialogue world model to address user belief modeling | reinforcement learning, world model | | |
| 14 | Learning from Diverse Reasoning Paths with Routing and Collaboration | Proposes QR-Distill to address reasoning-path quality in knowledge distillation | distillation, large language model | ✅ | |