cs.CL(2025-09-25)
📊 35 papers in total | 🔗 6 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (25 🔗4)
Pillar 2: RL & Architecture (9 🔗2)
Pillar 3: Perception & Semantics (1)
🔬 Pillar 9: Embodied Foundation Models (25 papers)
🔬 Pillar 2: RL & Architecture (9 papers)
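| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 26 | Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective | Proposes a visionary framework for multi-objective reinforcement learning aimed at large language model optimization. | reinforcement learning, large language model |  |  |
| 27 | Retrieval over Classification: Integrating Relation Semantics for Multimodal Relation Extraction | Proposes the ROC framework, which recasts multimodal relation extraction as a retrieval task to improve fine-grained relation understanding. | contrastive learning, large language model, multimodal |  |  |
| 28 | Painless Activation Steering: An Automated, Lightweight Approach for Post-Training Large Language Models | Proposes Painless Activation Steering (PAS), a fully automated, lightweight activation-vector steering method for post-training large language models. | reinforcement learning, large language model |  |  |
| 29 | SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines | SciReasoner: builds a scientific reasoning foundation model that spans disciplines. | reinforcement learning, reward shaping, foundation model | ✅ |  |
| 30 | Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning | Proposes Dual-Head Reasoning Distillation (DHRD), which improves classifier accuracy without sacrificing inference speed. | distillation, chain-of-thought |  |  |
| 31 | Learning to Reason with Mixture of Tokens | Proposes a mixture-of-tokens generation method that improves LLM reasoning in reinforcement learning with verifiable rewards. | reinforcement learning, large language model, chain-of-thought |  |  |
| 32 | Hallucination reduction with CASAL: Contrastive Activation Steering For Amortized Learning | CASAL: contrastive activation steering for amortized learning, which effectively reduces hallucination in large language models. | DPO, large language model |  |  |
| 33 | A State-of-the-Art SQL Reasoning Model using RLVR | Uses reinforcement learning with verifiable rewards (RLVR) to train a SQL reasoning model that reaches state of the art on the BIRD dataset. | reinforcement learning, offline RL |  |  |
| 34 | RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards | Proposes RLBFF, which combines human feedback with verifiable rewards to improve LLM alignment and supports user-defined principles at inference time. | reinforcement learning, RLHF | ✅ |  |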
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 35 | Vision Language Models Cannot Plan, but Can They Formalize? | Proposes using VLMs as formalization tools to solve multimodal planning problems. | open-vocabulary, open vocabulary, multimodal |  |  |