cs.CL (2025-05-29)

📊 29 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (19 🔗2) · Pillar 2: RL & Architecture (9 🔗1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (19 papers)

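| # | Title | One-sentence summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 1 | TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine | Proposes TCM-Ladder to address the evaluation of multimodal question answering in Traditional Chinese Medicine | large language model, multimodal | |
| 2 | A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models | Proposes a new evaluation pipeline to address bias and reasoning faithfulness in large vision-language models | large language model, chain-of-thought | |
| 3 | SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models | Proposes the SocialMaze benchmark to evaluate the social reasoning ability of large language models | large language model, chain-of-thought | |
| 4 | ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs | Proposes the ARC framework to improve argument coverage analysis for zero-shot long document summarization | large language model, instruction following | |
| 5 | Large Language Model Meets Constraint Propagation | Proposes GenCP to address insufficient constraint enforcement in large language models | large language model | |
| 6 | FLAT-LLM: Fine-grained Low-rank Activation Space Transformation for Large Language Model Compression | Proposes FLAT-LLM to address efficiency and accuracy issues in large language model compression | large language model | |
| 7 | Retrieval Augmented Generation based Large Language Models for Causality Mining | Proposes a dynamic prompting scheme based on retrieval-augmented generation to improve causality mining performance | large language model | |
| 8 | Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models | Proposes Premise Critique Bench to improve the premise critique ability of large language models | large language model | |
| 9 | Gaussian mixture models as a proxy for interacting language models | Proposes interacting Gaussian mixture models as a substitute for complex language models | large language model | |
| 10 | Diversity of Transformer Layers: One Aspect of Parameter Scaling Laws | Proposes an inter-layer diversity analysis to optimize Transformer parameter scaling | large language model | |
| 11 | Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs | Proposes UCerF to address uncertainty in the fairness evaluation of large language models | large language model | |
| 12 | SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving | Proposes SwingArena to address the challenge of evaluating long-context GitHub issue solving | large language model | |
| 13 | Probing Association Biases in LLM Moderation Over-Sensitivity | Proposes topic association analysis to address over-sensitivity in LLM content moderation | large language model | |
| 14 | One Task Vector is not Enough: A Large-Scale Study for In-Context Learning | Proposes the QuiteAFew dataset to improve task vector performance in in-context learning | large language model | |
| 15 | Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time | Proposes the SITAlign framework to address the alignment problem of large language models | large language model | |
| 16 | LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition | Proposes the DEER method to improve named entity recognition performance | large language model | |
| 17 | Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation | Proposes a disentangled evaluation method to improve the reasoning ability of large language models on math problems | large language model | |
| 18 | ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions | Proposes ToolHaystack to address the inadequate evaluation of tool use in long-term interactions | large language model | |
| 19 | AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora | Proposes AutoSchemaKG to enable autonomous knowledge graph construction | large language model | |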

🔬 Pillar 2: RL & Architecture (9 papers)

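| # | Title | One-sentence summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 20 | Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation | Proposes active layer-contrastive decoding to reduce hallucination in large language model generation | reinforcement learning, large language model | |
| 21 | DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | Proposes DeepTheorem to improve the theorem-proving ability of large language models | reinforcement learning, IMoS, large language model | |
| 22 | Reinforcement Learning for Better Verbalized Confidence in Long-Form Generation | Proposes LoVeC to address confidence estimation in long-form generation | reinforcement learning, DPO, large language model | |
| 23 | The Surprising Soupability of Documents in State Space Models | Proposes a document merging strategy to improve the reasoning ability of state space models | Mamba, SSM, state space model | |
| 24 | ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering | Proposes ML-Agent to address the manual prompt engineering problem in autonomous machine learning engineering | reinforcement learning, large language model | |
| 25 | LoLA: Low-Rank Linear Attention With Sparse Caching | Proposes LoLA to improve the associative memory capacity of linear attention | linear attention | |
| 26 | Are Reasoning Models More Prone to Hallucination? | Examines the vulnerability of reasoning models to hallucination | distillation, chain-of-thought | |
| 27 | ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents | Proposes ChARM to address reward modeling for role-playing language agents | preference learning, direct preference optimization | |
| 28 | Table-R1: Inference-Time Scaling for Table Reasoning | Proposes Table-R1 to achieve inference-time scaling for table reasoning tasks | reinforcement learning, distillation | |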

🔬 Pillar 1: Robot Control (1 paper)

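| # | Title | One-sentence summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 29 | Hidden Persuasion: Detecting Manipulative Narratives on Social Media During the 2022 Russian Invasion of Ukraine | Proposes a method to detect manipulative narratives on social media | manipulation | |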
