| 1 |
Hidden in Plain Sight: Evaluation of the Deception Detection Capabilities of LLMs in Multimodal Settings |
Evaluates the deception detection capabilities of large language models in multimodal settings |
large language model multimodal chain-of-thought |
|
|
| 2 |
Causal Sufficiency and Necessity Improves Chain-of-Thought Reasoning |
Proposes a causal framework to improve chain-of-thought reasoning |
large language model chain-of-thought |
|
|
| 3 |
Debunk and Infer: Multimodal Fake News Detection via Diffusion-Generated Evidence and LLM Reasoning |
Proposes the Debunk-and-Infer framework for fake news detection |
large language model multimodal |
|
|
| 4 |
VAT-KG: Knowledge-Intensive Multimodal Knowledge Graph Dataset for Retrieval-Augmented Generation |
Proposes VAT-KG to address insufficient knowledge coverage in multimodal knowledge graphs |
large language model multimodal |
|
|
| 5 |
Improved Supervised Fine-Tuning for Large Language Models to Mitigate Catastrophic Forgetting |
Proposes an improved supervised fine-tuning method to mitigate catastrophic forgetting |
large language model instruction following |
|
|
| 6 |
Alzheimer's Dementia Detection Using Perplexity from Paired Large Language Models |
Detects Alzheimer's disease using perplexity from paired large language models |
large language model instruction following |
|
|
| 7 |
When Large Language Models are Reliable for Judging Empathic Communication |
Evaluates the reliability of large language models in judging empathic communication |
large language model |
|
|
| 8 |
Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages |
Uses large language models to detect toxic language in low-resource Balkan languages |
large language model |
|
|
| 9 |
CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios |
Proposes CRITICTOOL to evaluate the self-critique capabilities of large language models in tool-calling error scenarios |
large language model |
✅ |
|
| 10 |
The Emergence of Abstract Thought in Large Language Models Beyond Any Language |
Proposes a language-agnostic parameter space to support abstract thought in large language models |
large language model |
|
|
| 11 |
Continuously Updating Digital Twins using Large Language Models |
Proposes CALM-DT to address the problem of updating digital twins |
large language model |
|
|
| 12 |
Query-Level Uncertainty in Large Language Models |
Proposes a query-level uncertainty method to improve how well large language models recognize their knowledge boundaries |
large language model |
|
|
| 13 |
From Symbolic to Neural and Back: Exploring Knowledge Graph-Large Language Model Synergies |
Explores synergies between knowledge graphs and large language models to improve reasoning |
large language model |
|
|
| 14 |
MEDUSA: A Multimodal Deep Fusion Multi-Stage Training Framework for Speech Emotion Recognition in Naturalistic Conditions |
Proposes the MEDUSA framework for speech emotion recognition in naturalistic conditions |
multimodal |
|
|
| 15 |
Token Constraint Decoding Improves Robustness on Question Answering for Large Language Models |
Proposes Token Constraint Decoding to improve the question-answering robustness of large language models |
large language model |
|
|
| 16 |
DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts |
Proposes the DIVE method to enhance diversity when reconstructing large language models |
large language model |
|
|
| 17 |
DrVoice: Parallel Speech-Text Voice Conversation Model via Dual-Resolution Speech Representations |
Proposes DrVoice to address modality inconsistency in speech generation |
large language model foundation model |
|
|
| 18 |
Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval |
Proposes the Q2E method for zero-shot multilingual text-to-video retrieval |
multimodal |
|
|
| 19 |
When Meaning Stays the Same, but Models Drift: Evaluating Quality of Service under Token-Level Behavioral Instability in LLMs |
Proposes the PBSS framework to evaluate behavioral drift in LLMs under semantically equivalent prompts |
large language model |
|
|
| 20 |
Chat-of-Thought: Collaborative Multi-Agent System for Generating Domain Specific Information |
Proposes Chat-of-Thought for generating FMEA documents for industrial assets |
large language model |
|
|
| 21 |
A quantum semantic framework for natural language processing |
Proposes a quantum semantic framework to address semantic degeneracy in natural language processing |
large language model |
|
|
| 22 |
TaskCraft: Automated Generation of Agentic Tasks |
Proposes TaskCraft to address the shortcomings of existing agentic task generation |
foundation model |
|
|
| 23 |
Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs |
Proposes step-by-step instructions and a simplified tabular format to improve the dependency parsing accuracy of LLMs |
large language model |
|
|
| 24 |
GraphLAMA: Enabling Efficient Adaptation of Graph Language Models with Limited Annotations |
Proposes GraphLAMA to address the limited adaptability of graph language models |
large language model |
|
|
| 25 |
Bench to the Future: A Pastcasting Benchmark for Forecasting Agents |
Proposes the BTF benchmark for evaluating forecasting agents |
chain-of-thought |
|
|
| 26 |
PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants |
Proposes PersonaLens to address the challenges of evaluating personalization in conversational AI assistants |
large language model |
|
|
| 27 |
Attention Head Embeddings with Trainable Deep Kernels for Hallucination Detection in LLMs |
Proposes attention head embeddings based on trainable deep kernels to detect hallucinations in LLMs |
large language model |
|
|
| 28 |
AI shares emotion with humans across languages and cultures |
Proposes an emotion modulation method to enhance emotional communication between humans and machines |
large language model |
|
|
| 29 |
Do LLMs Give Psychometrically Plausible Responses in Educational Assessments? |
Evaluates the psychometric plausibility of large language model responses in educational assessments |
large language model |
|
|
| 30 |
Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models |
Proposes the Inv-Entropy framework to quantify uncertainty in language models |
large language model |
✅ |
|
| 31 |
Is Fine-Tuning an Effective Solution? Reassessing Knowledge Editing for Unstructured Data |
Proposes a fine-tuning method to address the limitations of knowledge editing for unstructured data |
large language model |
|
|
| 32 |
Benchmarking Debiasing Methods for LLM-based Parameter Estimates |
Compares debiasing methods for LLM-based parameter estimates to address bias |
large language model |
|
|
| 33 |
Towards Open Foundation Language Model and Corpus for Macedonian: A Low-Resource Language |
Proposes an open foundation language model and corpus for Macedonian to address the low-resource language problem |
large language model |
|
|
| 34 |
Can LLMs Reason About Trust?: A Pilot Study |
Explores the use of large language models for reasoning about trust |
large language model |
|
|
| 35 |
Understanding and Mitigating Numerical Sources of Nondeterminism in LLM Inference |
Proposes LayerCast to address numerical nondeterminism in LLM inference |
large language model |
✅ |
|
| 36 |
ASP2LJ: An Adversarial Self-Play Lawyer Augmented Legal Judgment Framework |
Proposes the ASP2LJ framework to address long-tailed distributions and the insufficient role of lawyers in legal judgment prediction |
large language model |
|
|
| 37 |
UniToMBench: Integrating Perspective-Taking to Improve Theory of Mind in LLMs |
Proposes UniToMBench to improve the theory-of-mind capabilities of large language models |
large language model |
✅ |
|
| 38 |
GigaChat Family: Efficient Russian Language Modeling Through Mixture of Experts Architecture |
Proposes the GigaChat family for efficient Russian language modeling |
large language model |
✅ |
|
| 39 |
Comparing human and LLM politeness strategies in free production |
Compares the politeness strategies of humans and large language models to address alignment challenges |
large language model |
|
|
| 40 |
Taming SQL Complexity: LLM-Based Equivalence Evaluation for Text-to-SQL |
Proposes an LLM-based SQL equivalence evaluation method to address complexity in text-to-SQL |
large language model |
|
|
| 41 |
ScholarSearch: Benchmarking Scholar Searching Ability of LLMs |
Proposes ScholarSearch to address the evaluation of academic search ability |
large language model |
✅ |
|