cs.CL (2025-05-29)

📊 29 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (19 🔗2) | Pillar 2: RL & Architecture (9 🔗1) | Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (19 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine	Proposes TCM-Ladder to address the evaluation of multimodal question answering in traditional Chinese medicine	large language model multimodal
2	A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models	Proposes a new evaluation pipeline to examine bias and chain-of-thought faithfulness in large (vision) language models	large language model chain-of-thought
3	SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models	Proposes the SocialMaze benchmark to evaluate the social reasoning ability of large language models	large language model chain-of-thought	✅
4	ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs	Proposes the ARC framework to improve argument coverage analysis for zero-shot long-document summarization	large language model instruction following
5	Large Language Model Meets Constraint Propagation	Proposes GenCP to address the weak constraint enforcement of large language models	large language model
6	FLAT-LLM: Fine-grained Low-rank Activation Space Transformation for Large Language Model Compression	Proposes FLAT-LLM to address efficiency and accuracy in large language model compression	large language model
7	Retrieval Augmented Generation based Large Language Models for Causality Mining	Proposes a dynamic prompting scheme based on retrieval-augmented generation to improve causality mining performance	large language model
8	Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models	Proposes Premise Critique Bench to evaluate the premise critique ability of large language models	large language model	✅
9	Gaussian mixture models as a proxy for interacting language models	Proposes interacting Gaussian mixture models as a proxy for complex language models	large language model
10	Diversity of Transformer Layers: One Aspect of Parameter Scaling Laws	Proposes an analysis of inter-layer diversity to optimize Transformer parameter scaling	large language model
11	Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs	Proposes UCerF to account for uncertainty in fairness evaluation of large language models	large language model
12	SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving	Proposes SwingArena to address the challenge of evaluating long-context GitHub issue solving	large language model
13	Probing Association Biases in LLM Moderation Over-Sensitivity	Proposes topic association analysis to address over-sensitivity in LLM content moderation	large language model
14	One Task Vector is not Enough: A Large-Scale Study for In-Context Learning	Introduces the QuiteAFew dataset to improve task-vector performance in in-context learning	large language model
15	Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time	Proposes the SITAlign framework to address alignment of large language models at inference time	large language model
16	LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition	Proposes the DEER method to improve named entity recognition	large language model
17	Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation	Proposes a disentangled evaluation method that separates abstract formulation from arithmetic computation in LLM math word problem solving	large language model
18	ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions	Proposes ToolHaystack to address the lack of evaluation of tool use in long-term interactions	large language model
19	AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora	Proposes AutoSchemaKG for autonomous knowledge graph construction	large language model

🔬 Pillar 2: RL & Architecture (9 papers)

#	Title	One-line summary	Tags	🔗	⭐
20	Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation	Proposes active layer-contrastive decoding to reduce hallucination in large language model generation	reinforcement learning large language model
21	DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning	Proposes DeepTheorem to improve the theorem-proving ability of large language models	reinforcement learning IMoS large language model
22	Reinforcement Learning for Better Verbalized Confidence in Long-Form Generation	Proposes LoVeC to address confidence estimation in long-form generation	reinforcement learning DPO large language model
23	The Surprising Soupability of Documents in State Space Models	Proposes a document merging strategy to improve reasoning in state space models	Mamba SSM state space model
24	ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering	Proposes ML-Agent to address the reliance on manual prompt engineering in autonomous machine learning engineering	reinforcement learning large language model
25	LoLA: Low-Rank Linear Attention With Sparse Caching	Proposes LoLA to improve the associative memory of linear attention	linear attention
26	Are Reasoning Models More Prone to Hallucination?	Examines how prone reasoning models are to hallucination	distillation chain-of-thought
27	ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents	Proposes ChARM to address reward modeling for role-playing language agents	preference learning direct preference optimization	✅
28	Table-R1: Inference-Time Scaling for Table Reasoning	Proposes Table-R1 to bring inference-time scaling to table reasoning tasks	reinforcement learning distillation

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
29	Hidden Persuasion: Detecting Manipulative Narratives on Social Media During the 2022 Russian Invasion of Ukraine	Proposes a method for detecting manipulative narratives on social media	manipulation
