cs.CL (2026-01-06)

📊 40 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (29) · Pillar 2: RL & Architecture (9) · Pillar 1: Robot Control (1) · Pillar 6: Video Extraction (1)

🔬 Pillar 9: Embodied Foundation Models (29 papers)

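# | Title | One-sentence summary | Tags
1 | Limited Linguistic Diversity in Embodied AI Datasets | Analyzes the linguistic diversity of embodied AI datasets, reveals the problem of repetitive instructions, and proposes directions for improvement. | embodied AI, vision-language-action, VLA
2 | Decoupling the Effect of Chain-of-Thought Reasoning: A Human Label Variation Perspective | Disentangles the effect of chain-of-thought reasoning from the perspective of human label variation. | chain-of-thought
3 | Learning to Diagnose and Correct Moral Errors: Towards Enhancing Moral Sensitivity in Large Language Models | Proposes a pragmatic-reasoning-based method for enhancing moral sensitivity, improving LLMs' moral judgment and error-correction ability. | large language model
4 | NorwAI's Large Language Models: Technical Report | NorwAI releases Norwegian large language models, improving NLP capabilities for Scandinavian languages. | large language model
5 | MedDialogRubrics: A Comprehensive Benchmark and Evaluation Framework for Multi-turn Medical Consultations in Large Language Models | MedDialogRubrics: builds a comprehensive benchmark and evaluation framework for multi-turn medical consultations, improving the diagnostic ability of medical LLMs. | large language model
6 | MMFormalizer: Multimodal Autoformalization in the Wild | MMFormalizer: proposes a multimodal autoformalization method to address the challenges of mathematical reasoning in the physical world. | multimodal
7 | Beyond the Black Box: Theory and Mechanism of Large Language Models | Builds a theoretical framework for LLMs: a survey of theory and mechanisms from a lifecycle perspective. | large language model
8 | Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration | Uses linear script representations in speech foundation models to achieve zero-shot transliteration. | foundation model
9 | The performances of the Chinese and U.S. Large Language Models on the Topic of Chinese Culture | Compares how Chinese and U.S. large language models differ and perform in understanding Chinese culture. | large language model
10 | Punctuation-aware Hybrid Trainable Sparse Attention for Large Language Models | Proposes Punctuation-aware Hybrid Sparse Attention (PHSA), improving the performance of sparse attention in long-text modeling. | large language model
11 | EComStage: Stage-wise and Orientation-specific Benchmarking for Large Language Models in E-commerce | EComStage: a comprehensive stage-wise, scenario-oriented benchmark for large language models in e-commerce. | large language model
12 | Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration | Proposes an iterative structured pruning method with multi-domain calibration for compressing large language models. | large language model
13 | Towards Comprehensive Stage-wise Benchmarking of Large Language Models in Fact-Checking | FactArena: proposes an automated framework for comprehensive stage-wise evaluation of large language models in fact-checking. | large language model
14 | Lil: Less is Less When Applying Post-Training Sparse-Attention Algorithms in Long-Decode Stage | Proposes an early-stopping algorithm to mitigate the increase in sequence length caused by sparse attention in the long-decode stage. | large language model
15 | Self-Verification is All You Need To Pass The Japanese Bar Examination | Proposes a self-verification-based LLM that passes the Japanese bar examination for the first time. | large language model
16 | Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders | Proposes a sparse-autoencoder-based framework for retrieving and steering high-order semantic features in large language models. | large language model
17 | LongBench Pro: A More Realistic and Comprehensive Bilingual Long-Context Evaluation Benchmark | Proposes LongBench Pro, a more realistic and comprehensive bilingual long-context benchmark for evaluating LLMs' long-context understanding. | large language model
18 | TiMem: Temporal-Hierarchical Memory Consolidation for Long-Horizon Conversational Agents | TiMem: a temporal-hierarchical memory consolidation framework for long-horizon conversational agents. | large language model
19 | Maximizing Local Entropy Where It Matters: Prefix-Aware Localized LLM Unlearning | Proposes the PALU framework, which achieves efficient, low-damage targeted unlearning in large language models via local entropy maximization. | large language model
20 | Detecting Hallucinations in Retrieval-Augmented Generation via Semantic-level Internal Reasoning Graph | Proposes a RAG hallucination detection method based on a semantic-level internal reasoning graph, improving factual consistency. | large language model
21 | Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners | Examines the multilingual latent reasoning ability of large reasoning models: they are not fully multilingual and show an English-centric tendency. | chain-of-thought
22 | Stable-RAG: Mitigating Retrieval-Permutation-Induced Hallucinations in Retrieval-Augmented Generation | Proposes Stable-RAG to mitigate hallucinations in RAG caused by the ordering of retrieved documents. | large language model
23 | Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy | Proposes a System-2 strategy that improves LLM accuracy on large-scale counting tasks. | large language model
24 | LLM-Augmented Changepoint Detection: A Framework for Ensemble Detection and Automated Explanation | Proposes an LLM-augmented changepoint detection framework that provides ensemble detection and automated explanation. | large language model
25 | Enhancing Multilingual RAG Systems with Debiased Language Preference-Guided Query Fusion | Proposes the DeLP metric and the DELTA framework to address language preference issues caused by evaluation bias in multilingual RAG systems. | large language model
26 | Revisiting Data Compression with Language Modeling | Uses large language models to improve data compression, achieving SOTA on the enwik9 dataset. | large language model
27 | To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs | Proposes inverse socio-demographic prompting to address the problem of cultural alignment in LLMs. | large language model
28 | SYNAPSE: Empowering LLM Agents with Episodic-Semantic Memory via Spreading Activation | SYNAPSE: equips LLM agents with episodic-semantic memory via spreading activation, addressing the disconnection problem in long-term memory. | large language model
29 | EvoRoute: Experience-Driven Self-Routing LLM Agent Systems | EvoRoute: proposes an experience-driven self-routing LLM agent system that addresses the agent-system trilemma. | large language model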

🔬 Pillar 2: RL & Architecture (9 papers)

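# | Title | One-sentence summary | Tags
30 | Mitigating Prompt-Induced Hallucinations in Large Language Models via Structured Reasoning | Proposes a knowledge distillation chain model based on structured reasoning to mitigate prompt-induced hallucinations in large language models. | distillation, large language model, chain-of-thought
31 | Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning | Proposes F-DPO, which reduces hallucinations in LLMs through factuality-aware preference learning. | preference learning, RLHF, DPO
32 | Correct, Concise and Complete: Multi-stage Training For Adaptive Reasoning | Proposes a multi-stage training method that uses an adaptive length penalty to improve LLM reasoning efficiency and reduce "overthinking". | reinforcement learning, large language model, chain-of-thought
33 | MemRL: Self-Evolving Agents via Runtime Reinforcement Learning on Episodic Memory | MemRL: self-evolving agents via runtime reinforcement learning on episodic memory. | reinforcement learning, large language model
34 | Do LLMs Encode Functional Importance of Reasoning Tokens? | Proposes a greedy pruning method to probe the functional importance of LLM reasoning tokens and improve distillation. | distillation, large language model
35 | UltraLogic: Enhancing LLM Reasoning through Large-Scale Data Synthesis and Bipolar Float Reward | UltraLogic: enhances LLM reasoning through large-scale data synthesis and a bipolar float reward. | reinforcement learning, large language model
36 | STReasoner: Empowering LLMs for Spatio-Temporal Reasoning in Time Series via Spatial-Aware Reinforcement Learning | Proposes STReasoner, which uses spatial-aware reinforcement learning to enhance LLMs' spatio-temporal reasoning on time series data. | reinforcement learning
37 | WebAnchor: Anchoring Agent Planning to Stabilize Long-Horizon Web Reasoning | WebAnchor: stabilizes long-horizon web reasoning by anchoring agent planning. | reinforcement learning, large language model
38 | MiMo-V2-Flash Technical Report | MiMo-V2-Flash: a mixture-of-experts model with 309B total parameters and 15B active parameters, designed for fast inference and strong agent capabilities. | reinforcement learning, distillation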

🔬 Pillar 1: Robot Control (1 paper)

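# | Title | One-sentence summary | Tags
39 | Window-based Membership Inference Attacks Against Fine-tuned Large Language Models | Proposes WBC, a window-based comparison method that improves membership inference attacks against fine-tuned large language models. | WBC, large language model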

🔬 Pillar 6: Video Extraction (1 paper)

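# | Title | One-sentence summary | Tags
40 | Who Laughs with Whom? Disentangling Influential Factors in Humor Preferences across User Clusters and LLMs | Disentangles the influential factors in humor preferences through user clustering and LLM analysis. | HuMoR, large language model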
