| 1 | Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models | Contrastive search improves the effectiveness of uncertainty estimation in large language models | large language model | | |
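| 2 | USB-Rec: An Effective Framework for Improving Conversational Recommendation Capability of Large Language Model | Proposes the USB-Rec framework to improve the training and inference capabilities of large language models in conversational recommender systems | large language model | | |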
| 3 | ConceptViz: A Visual Analytics Approach for Exploring Concepts in Large Language Models | ConceptViz: a visual analytics approach for exploring concepts in large language models | large language model | ✅ | |
| 4 | LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts | LLMsPark: proposes a game-theory-based benchmark for evaluating the strategic capabilities of large language models | large language model | ✅ | |
| 5 | The Sound of Syntax: Finetuning and Comprehensive Evaluation of Language Models for Speech Pathology | Fine-tunes language models for speech pathology and evaluates them comprehensively, filling a gap in clinical applications | multimodal chain-of-thought | | |
| 6 | PruneCD: Contrasting Pruned Self Model to Improve Decoding Factuality | Proposes PruneCD, which improves the factuality of large language model decoding by contrasting against a pruned model | large language model | | |
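| 7 | Rethinking the Role of Text Complexity in Language Model Pretraining | Studies how text complexity affects language model pretraining, revealing the relationship between data diversity and downstream task performance | large language model | | |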
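| 8 | InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative Grounding | InteGround: proposes an evaluation framework for integrative knowledge grounding that assesses LLMs' knowledge retrieval and verification abilities in complex reasoning scenarios | large language model | | |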
| 9 | AIPsychoBench: Understanding the Psychometric Differences between LLMs and Humans | AIPsychoBench: builds a psychometric benchmark for LLMs, revealing how they differ from humans and how language choice affects the results | large language model | | |
| 10 | Assessing Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text | Evaluates the performance of classical machine learning and Transformer models in detecting AI-generated research text | large language model | | |
| 11 | Can an Individual Manipulate the Collective Decisions of Multi-Agents? | M-Spoiler: uses knowledge of a single agent to attack multi-agent collaborative decision-making systems | large language model | | |
| 12 | The Oracle Has Spoken: A Multi-Aspect Evaluation of Dialogue in Pythia | Evaluates the dialogue ability of Pythia models along multiple dimensions, revealing the effects of model scale and fine-tuning | large language model | | |
| 13 | Cognitive Linguistic Identity Fusion Score (CLIFS): A Scalable Cognition-Informed Approach to Quantifying Identity Fusion from Text | Proposes CLIFS, a scalable method for quantifying identity fusion based on cognitive linguistics and LLMs | large language model | ✅ | |
| 14 | EG-MLA: Embedding-Gated Multi-head Latent Attention for Scalable and Efficient LLMs | Proposes EG-MLA, which compresses the KV cache through an embedding gating mechanism to improve LLM inference efficiency | large language model | | |
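| 15 | Robust Native Language Identification through Agentic Decomposition | Proposes an agent-decomposition-based method for native language identification (NLI) that improves model robustness against adversarial cues | large language model | | |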
| 16 | Redefining Experts: Interpretable Decomposition of Language Models for Toxicity Mitigation | Proposes EigenShift, which achieves interpretable suppression of toxic content through language model decomposition | large language model | | |
| 17 | Challenging the Evaluator: LLM Sycophancy Under User Rebuttal | Reveals LLMs' sycophantic behavior under user rebuttal and warns of potential risks in evaluation tasks | large language model | | |