cs.CL(2025-05-27)

📊 78 papers in total | 🔗 14 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (65 🔗11) · Pillar 2: RL & Architecture (9 🔗2) · Pillar 1: Robot Control (2 🔗1) · Pillar 7: Motion Retargeting (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (65 papers)

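| # | Title | Key point | Tags |
|---|---|---|---|
| 1 | Leveraging Large Language Models for Bengali Math Word Problem Solving with Chain of Thought Reasoning | Proposes the SOMADHAN dataset for solving Bengali math word problems | large language model, chain-of-thought |
| 2 | Evaluating and Steering Modality Preferences in Multimodal Large Language Model | Proposes the MC² benchmark to evaluate and steer the modality preferences of multimodal large language models | large language model, multimodal |
| 3 | Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities | Proposes the SPARCOM framework to reveal the sparse neurons behind LLM instruction-following capabilities | large language model, instruction following |
| 4 | Rethinking Information Synthesis in Multimodal Question Answering: A Multi-Agent Perspective | Proposes the MAMMQA framework to address information synthesis in multimodal question answering | large language model, multimodal |
| 5 | Explaining Large Language Models with gSMILE | Proposes gSMILE to address the interpretability of large language models | large language model |
| 6 | LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions | Proposes LayerIF to estimate layer quality in large language models | large language model |
| 7 | Test-Time Learning for Large Language Models | Proposes a test-time learning method to improve the adaptation of large language models to specific domains | large language model |
| 8 | Rethinking the Outlier Distribution in Large Language Models: An In-depth Study | Studies the outlier distribution in large language models in depth to improve quantization performance | large language model |
| 9 | How does Misinformation Affect Large Language Model Behaviors and Preferences? | Proposes MisBench to evaluate how large language models respond to misinformation | large language model |
| 10 | RelationalFactQA: A Benchmark for Evaluating Tabular Fact Retrieval from Large Language Models | Proposes RelationalFactQA to evaluate the tabular fact retrieval ability of large language models | large language model |
| 11 | Who Reasons in the Large Language Models? | Proposes Stethoscope for Networks to reveal the reasoning capabilities of large language models | large language model |
| 12 | Multi-objective Large Language Model Alignment with Hierarchical Experts | Proposes the HoE method to address multi-objective alignment of large language models | large language model |
| 13 | Automatic Transmission for LLM Tiers: Optimizing Cost and Accuracy in Large Language Models | Proposes an "automatic transmission" framework for LLM tiers to optimize the cost and accuracy of large language models | large language model |
| 14 | DenseLoRA: Dense Low-Rank Adaptation of Large Language Models | Proposes DenseLoRA to improve the parameter efficiency of large language models | large language model |
| 15 | DLP: Dynamic Layerwise Pruning in Large Language Models | Proposes a dynamic layerwise pruning method to improve the inference efficiency of large language models | large language model |
| 16 | CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models | Proposes the CogniBench framework to assess the cognitive faithfulness of large language models | large language model |
| 17 | From prosthetic memory to prosthetic denial: Auditing whether large language models are prone to mass atrocity denialism | Audits the impact of large language models on historical memory and the risk of denialism | large language model |
| 18 | Rethinking Data Mixture for Large Language Models: A Comprehensive Survey and New Perspectives | Surveys data mixture methods for improving large language model training | large language model |
| 19 | DecisionFlow: Advancing Large Language Model as Principled Decision Maker | Proposes DecisionFlow to address the lack of transparency in language model decision-making | large language model |
| 20 | Leveraging large language models and traditional machine learning ensembles for ADHD detection from narrative transcripts | Proposes an ADHD detection method that ensembles large language models with traditional machine learning | large language model |
| 21 | Assessment of L2 Oral Proficiency using Speech Large Language Models | Uses multimodal large language models to assess L2 oral proficiency for automatic scoring | large language model |
| 22 | RPM: Reasoning-Level Personalization for Black-Box Large Language Models | Proposes the RPM framework to address personalization of black-box large language models | large language model |
| 23 | Uncertainty Unveiled: Can Exposure to More In-context Examples Mitigate Uncertainty for Large Language Models? | Proposes adding more in-context examples to reduce the uncertainty of large language models | large language model |
| 24 | Research Community Perspectives on "Intelligence" and Large Language Models | Surveys researchers' views on "intelligence" and large language models | large language model |
| 25 | On VLMs for Diverse Tasks in Multimodal Meme Classification | Proposes multimodal models to improve meme classification performance | multimodal |
| 26 | Automated Privacy Information Annotation in Large Language Model Interactions | Builds an automated privacy information annotation system to address privacy leakage in LLM interactions | large language model |
| 27 | STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models | Proposes STEER-BENCH to evaluate the steerability of large language models | large language model |
| 28 | SV-TrustEval-C: Evaluating Structure and Semantic Reasoning in Large Language Models for Source Code Vulnerability Analysis | Proposes SV-TrustEval-C to address the evaluation of LLMs in code vulnerability analysis | large language model |
| 29 | Towards Pretraining Robust ASR Foundation Model with Acoustic-Aware Data Augmentation | Proposes acoustic-aware data augmentation to improve the robustness of ASR models | foundation model |
| 30 | Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning | Proposes Self-Route to address the resource consumption of reasoning models | large language model, chain-of-thought |
| 31 | AutoJudger: An Agent-Driven Framework for Efficient Benchmarking of MLLMs | Proposes AutoJudger to address the high cost of evaluating multimodal large language models | large language model, multimodal |
| 32 | Visual Cues Enhance Predictive Turn-Taking for Two-Party Human Interaction | Proposes MM-VAP to address predictive turn-taking in human-machine interaction | multimodal |
| 33 | Predicting Implicit Arguments in Procedural Video Instructions | Proposes the Implicit-VidSRL dataset to address implicit argument prediction | multimodal |
| 34 | LLMPR: A Novel LLM-Driven Transfer Learning based Petition Ranking Model | Proposes LLMPR to address the backlog of judicial cases in India | large language model |
| 35 | R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing | Proposes R2R to efficiently navigate the reasoning paths of language models | large language model |
| 36 | Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling | Studies factual self-awareness in language models to improve the accuracy of generated content | large language model |
| 37 | Exploring the Hidden Capacity of LLMs for One-Step Text Generation | Explores the hidden capacity of LLMs for one-step text generation | large language model |
| 38 | A Lightweight Multi-Expert Generative Language Model System for Engineering Information and Knowledge Extraction | Proposes a Small Language Graph to address computational resource constraints in engineering information extraction | large language model |
| 39 | SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences | Proposes SpecExtend to address performance degradation in long-sequence inference | large language model |
| 40 | CodeMirage: A Multi-Lingual Benchmark for Detecting AI-Generated and Paraphrased Source Code from Production-Level LLMs | Proposes CodeMirage as a benchmark for detecting AI-generated code | large language model |
| 41 | REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning | Proposes REAL-Prover to address theorem proving in advanced mathematics | large language model |
| 42 | Calibrating LLMs for Text-to-SQL Parsing by Leveraging Sub-clause Frequencies | Proposes a calibration method based on sub-clause frequencies to improve the reliability of text-to-SQL parsing | large language model |
| 43 | RefTool: Enhancing Model Reasoning with Reference-Guided Tool Creation | Proposes RefTool to address the shortcomings of tool creation | large language model |
| 44 | Improving Research Idea Generation Through Data: An Empirical Investigation in Social Science | Uses data to help LLMs generate high-quality research ideas | large language model |
| 45 | Evaluating LLM Adaptation to Sociodemographic Factors: User Profile vs. Dialogue History | Proposes a framework to evaluate LLM adaptation to sociodemographic factors | large language model |
| 46 | Pretrained LLMs Learn Multiple Types of Uncertainty | Studies how large language models capture multiple types of uncertainty to improve task accuracy | large language model |
| 47 | BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge | Proposes the BLUCK dataset to evaluate Bengali linguistic understanding and cultural knowledge | large language model |
| 48 | MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation | Proposes MARS-Bench to address the robustness of LLMs in complex dialogues | large language model |
| 49 | LLM-Driven E-Commerce Marketing Content Optimization: Balancing Creativity and Conversion | Proposes an LLM-based framework for optimizing e-commerce marketing content to raise conversion rates | multimodal |
| 50 | Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties | Proposes the Trans-EnV framework to evaluate the linguistic robustness of LLMs against English varieties | large language model |
| 51 | Calibrating LLM Confidence by Probing Perturbed Representation Stability | Proposes CCPS to address confidence calibration in large language models | large language model |
| 52 | Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing | Proposes a new method to identify consistency issues when probing the knowledge gaps of large language models | large language model |
| 53 | MAKIEval: A Multilingual Automatic WiKidata-based Framework for Cultural Awareness Evaluation for LLMs | Proposes the MAKIEval framework to evaluate the cultural awareness of LLMs | large language model |
| 54 | Are Language Models Consequentialist or Deontological Moral Reasoners? | Proposes a moral reasoning classification framework to analyze the ethical decisions of language models | large language model |
| 55 | Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance | Explores how latent language consistency affects LLM task performance | large language model |
| 56 | PEDANTIC: A Dataset for the Automatic Examination of Definiteness in Patent Claims | Proposes the PEDANTIC dataset to address indefiniteness in patent claims | large language model |
| 57 | Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead | Analyzes 884 papers on NLP for African languages to promote inclusive development | large language model |
| 58 | rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset | Proposes rStar-Coder to address the scarcity of large-scale code reasoning datasets | large language model |
| 59 | Towards Objective Fine-tuning: How LLMs' Prior Knowledge Causes Potential Poor Calibration? | Proposes the CogCalib framework to address poor calibration in LLMs | large language model |
| 60 | MSA at SemEval-2025 Task 3: High Quality Weak Labeling and LLM Ensemble Verification for Multilingual Hallucination Detection | Proposes a multilingual hallucination detection method to address the reliability of LLM-generated text | large language model |
| 61 | Concealment of Intent: A Game-Theoretic Analysis | Proposes intent-concealing adversarial prompting to address safety issues in large language models | large language model |
| 62 | CHIMERA: A Knowledge Base of Scientific Idea Recombinations for Research Analysis and Ideation | Proposes the CHIMERA knowledge base to support scientific idea recombination and research analysis | large language model |
| 63 | Beyond Templates: Dynamic Adaptation of Reasoning Demonstrations via Feasibility-Aware Exploration | Proposes the DART framework to address the limited reasoning ability of small language models | large language model |
| 64 | Long Context Scaling: Divide and Conquer via Multi-Agent Question-driven Collaboration | Proposes the XpandA framework to address information loss in long-text processing | large language model |
| 65 | POLAR: A Benchmark for Multilingual, Multicultural, and Multi-Event Online Polarization | Proposes the POLAR dataset to address online polarization | large language model |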

🔬 Pillar 2: RL & Architecture (9 papers)

| # | Title | Key point | Tags |
|---|---|---|---|
| 66 | EasyDistill: A Comprehensive Toolkit for Effective Knowledge Distillation of Large Language Models | Proposes the EasyDistill toolkit for effective knowledge distillation of large language models | reinforcement learning, distillation, large language model |
| 67 | Towards Better Instruction Following Retrieval Models | Proposes InF-IR to address the shortcomings of instruction-following information retrieval models | representation learning, contrastive learning, instruction following |
| 68 | Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning | Proposes ConciseR to address redundancy in long chain-of-thought reasoning | reinforcement learning, large language model, chain-of-thought |
| 69 | A Course Correction in Steerability Evaluation: Revealing Miscalibration and Side Effects in LLMs | Proposes a multi-dimensional goal-space framework to evaluate LLM steerability | reinforcement learning, large language model, instruction following |
| 70 | FCKT: Fine-Grained Cross-Task Knowledge Transfer with Semantic Contrastive Learning for Targeted Sentiment Analysis | Proposes the FCKT framework to address knowledge transfer in targeted sentiment analysis | contrastive learning, large language model |
| 71 | TAT-R1: Terminology-Aware Translation with Reinforcement Learning and Word Alignment | Proposes TAT-R1 to address insufficient accuracy in terminology translation | reinforcement learning, large language model |
| 72 | Contrastive Learning on LLM Back Generation Treebank for Cross-domain Constituency Parsing | Proposes an LLM back-generation method to address cross-domain constituency parsing | contrastive learning, large language model |
| 73 | SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation | Proposes SeqPO-SiMT to address the latency-quality trade-off in simultaneous machine translation | reinforcement learning, PPO, RLHF |
| 74 | Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG | Proposes Divide-Then-Align to address the knowledge boundary problem of RAG systems | DPO, direct preference optimization, large language model |

🔬 Pillar 1: Robot Control (2 papers)

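| # | Title | Key point | Tags |
|---|---|---|---|
| 75 | SELF-PERCEPT: Introspection Improves Large Language Models' Detection of Multi-Person Mental Manipulation in Conversations | Proposes SELF-PERCEPT to address the detection of mental manipulation in multi-party conversations | manipulation, large language model |
| 76 | Tracing and Reversing Rank-One Model Edits | Proposes a method to trace and reverse malicious manipulation in knowledge editing | manipulation, large language model |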

🔬 Pillar 7: Motion Retargeting (1 paper)

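| # | Title | Key point | Tags |
|---|---|---|---|
| 77 | Can LLMs Learn to Map the World from Local Descriptions? | Proposes using LLMs to build global spatial cognition from local descriptions | spatial relationship, large language model |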

🔬 Pillar 3: Perception & Semantics (1 paper)

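| # | Title | Key point | Tags |
|---|---|---|---|
| 78 | Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity | Proposes a Mixture of Grouped Experts model to address expert load imbalance | MoGe, large language model |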
