cs.CL (2026-04-06)

📊 38 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (27) · Pillar 2: RL & Architecture (10) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (27 papers)

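# | Title | One-sentence summary | Tags
1 | SocioEval: A Template-Based Framework for Evaluating Socioeconomic Status Bias in Foundation Models | SocioEval: a template-based framework for evaluating socioeconomic status bias in foundation models. | large language model, foundation model
2 | Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy | Proposes ChomskyBench, which systematically evaluates the formal reasoning capabilities of large language models through the Chomsky hierarchy. | large language model
3 | Social Meaning in Large Language Models: Structure, Magnitude, and Pragmatic Prompting | Proposes the ESR and CDS metrics and uses pragmatic prompting to improve LLM social reasoning. | large language model
4 | Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation | Surveys contextual enrichment techniques for LLMs, from in-context prompting to causal retrieval-augmented generation. | large language model
5 | When Modalities Remember: Continual Learning for Multimodal Knowledge Graphs | Proposes the MRCKG model to address catastrophic forgetting in continual multimodal knowledge graph reasoning. | multimodal
6 | BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence | Proposes the Behavioral Alignment Score (BAS) for evaluating LLM confidence, optimizing decisions and avoiding overconfidence. | large language model
7 | Debating Truth: Debate-driven Claim Verification with Multiple Large Language Model Agents | Proposes the DebateCV framework, which uses multi-agent debate to drive claim verification and improves accuracy on complex claims. | large language model
8 | Quick on the Uptake: Eliciting Implicit Intents from Human Demonstrations for Personalized Mobile-Use Agents | Proposes the IFRAgent framework, which enhances personalized mobile-use agents by learning explicit and implicit intents. | large language model, multimodal
9 | VeriOS: Query-Driven Proactive Human-Agent-GUI Interaction for Trustworthy OS Agents | Proposes VeriOS, which improves the reliability of OS agents in untrustworthy environments through query-driven human-agent interaction. | large language model, multimodal
10 | Borderless Long Speech Synthesis | Proposes the Borderless long speech synthesis framework, enabling agent-driven, borderless speech generation. | instruction following, chain-of-thought
11 | Failing to Falsify: Evaluating and Mitigating Confirmation Bias in Language Models | Reveals confirmation bias in large language models and proposes intervention strategies to improve rule discovery. | large language model
12 | LLM Analysis of 150+ years of German Parliamentary Debates on Migration Reveals Shift from Post-War Solidarity to Anti-Solidarity in the Last Decade | Uses LLMs to analyze more than 150 years of German parliamentary debates, revealing a shift from post-war solidarity to anti-solidarity. | large language model
13 | Too Polite to Disagree: Understanding Sycophancy Propagation in Multi-Agent Systems | Uses prior knowledge to reduce sycophantic behavior in multi-agent systems, improving discussion accuracy. | large language model
14 | Council Mode: Mitigating Hallucination and Bias in LLMs via Multi-Agent Consensus | Proposes Council Mode, which mitigates hallucination and bias in LLMs through a multi-agent consensus mechanism. | large language model
15 | How Annotation Trains Annotators: Competence Development in Social Influence Recognition | Studies how the annotation process affects annotator competence, improving data quality for social influence recognition tasks. | large language model
16 | LogicPoison: Logical Attacks on Graph Retrieval-Augmented Generation | Proposes LogicPoison, an attack on the logical connections in graph retrieval-augmented generation systems. | large language model
17 | Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control | Proposes an emotion control method based on the valence-arousal subspace of LLM representation space, enabling multi-behavioral control. | large language model
18 | Human Psychometric Questionnaires Mischaracterize LLM Psychology: Evidence from Generation Behavior | Reveals the limitations of human psychometric questionnaires in characterizing LLM psychology and proposes a psychometric approach based on generation behavior. | large language model
19 | What Is The Political Content in LLMs' Pre- and Post-Training Data? | Analyzes political leanings in LLM training data, revealing how data bias affects models' political stances. | large language model
20 | Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding | Proposes the SemKey framework, which advances EEG-to-text decoding through decoupled semantic guidance. | large language model
21 | SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy | Proposes the SWAY metric and a counterfactual CoT mitigation strategy to address sycophancy in large language models. | large language model
22 | Beyond Precision: Importance-Aware Recall for Factuality Evaluation in Long-Form LLM Generation | Proposes an importance-aware recall metric for evaluating the factuality of long-form generation. | large language model
23 | Detecting and Correcting Reference Hallucinations in Commercial LLMs and Deep Research Agents | Systematically detects and corrects reference hallucinations in commercial LLMs and deep research agents. | large language model
24 | Measuring What Cannot Be Surveyed: LLMs as Instruments for Latent Cognitive Variables in Labor Economics | Uses large language models to measure latent cognitive variables that are hard to survey, with an application to labor economics. | large language model
25 | BibTeX Citation Hallucinations in Scientific Publishing Agents: Evaluation and Mitigation | Addresses BibTeX citation hallucinations in scientific publishing agents, proposing an evaluation benchmark and the clibib mitigation approach. | large language model
26 | Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! | Reveals a data leakage risk in fine-tuning open-source LLMs: attackers can extract fine-tuning data through a backdoor. | large language model
27 | IslamicMMLU: A Benchmark for Evaluating LLMs on Islamic Knowledge | IslamicMMLU: builds a benchmark of Islamic knowledge for evaluating large language models. | large language model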

🔬 Pillar 2: RL & Architecture (10 papers)

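# | Title | One-sentence summary | Tags
28 | Future Policy Approximation for Offline Reinforcement Learning Improves Mathematical Reasoning | Proposes Future Policy Approximation (FPA), which improves offline reinforcement learning on mathematical reasoning tasks. | reinforcement learning, offline RL, offline reinforcement learning
29 | Rubrics to Tokens: Bridging Response-level Rubrics and Token-level Rewards in Instruction Following Tasks | Proposes the RTT framework, which bridges response-level rubrics and token-level rewards in instruction-following tasks. | reinforcement learning, large language model, instruction following
30 | Student-in-the-Loop Chain-of-Thought Distillation via Generation-Time Selection | Proposes Gen-SSD, which distills chain-of-thought into student models via generation-time selection, improving mathematical reasoning. | distillation, chain-of-thought
31 | Reinforcement Learning-based Knowledge Distillation with LLM-as-a-Judge | Proposes a label-free knowledge distillation framework based on reinforcement learning and an LLM judge, improving mathematical reasoning. | reinforcement learning, distillation, large language model
32 | R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning | R2-Write: improves open-ended writing quality through reflection and revision within deep reasoning. | reinforcement learning, large language model, chain-of-thought
33 | JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency | JoyAI-LLM Flash: improves mid-scale LLM performance through token efficiency. | reinforcement learning, DPO, direct preference optimization
34 | Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization | Proposes a reliability-gated multi-teacher distillation method for low-resource abstractive summarization. | distillation
35 | Train Yourself as an LLM: Exploring Effects of AI Literacy on Persuasion via Role-playing LLM Training | Proposes LLMimic, which raises AI literacy through role-playing LLM training and reduces the persuasive effect of AI. | RLHF, large language model
36 | Multi-Aspect Knowledge Distillation for Language Model with Low-rank Factorization | Proposes MaKD, a multi-aspect knowledge distillation method that improves the performance of low-rank factorized language models. | distillation
37 | Learning the Signature of Memorization in Autoregressive Language Models | Proposes a transferable method for learning the memorization signature of autoregressive language models, improving membership inference attacks. | Mamba, linear attention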

🔬 Pillar 4: Generative Motion (1 paper)

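# | Title | One-sentence summary | Tags
38 | Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models | Proposes ARAM, an adaptive guidance framework that resolves retrieval-prior conflicts in retrieval-augmented masked diffusion models. | MDM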
