| 1 |
Advancing Multi-Step Mathematical Reasoning in Large Language Models through Multi-Layered Self-Reflection with Auto-Prompting |
Proposes the MAPS framework to improve multi-step mathematical reasoning in large language models |
large language model chain-of-thought |
|
|
| 2 |
The Trilemma of Truth in Large Language Models |
Proposes the sAwMIL framework to address veracity verification in large language models |
large language model |
|
|
| 3 |
Impact of Fine-Tuning Methods on Memorization in Large Language Models |
Proposes fine-tuning methods to address memorization leakage in large language models |
large language model |
|
|
| 4 |
Table Understanding and (Multimodal) LLMs: A Cross-Domain Case Study on Scientific vs. Non-Scientific Data |
Proposes a cross-domain evaluation approach to improve table understanding |
multimodal |
|
|
| 5 |
Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective |
Examines the limitations of large language models in solving mathematical problems |
large language model |
|
|
| 6 |
On Recipe Memorization and Creativity in Large Language Models: Is Your Model a Creative Cook, a Bad Cook, or Merely a Plagiator? |
Proposes an automated framework to evaluate recipe memorization and creativity in large language models |
large language model |
|
|
| 7 |
Why Reinforcement Fine-Tuning Enables MLLMs Preserve Prior Knowledge Better: A Data Perspective |
Proposes reinforcement fine-tuning to better preserve prior knowledge in multimodal large language models |
large language model multimodal |
|
|
| 8 |
Prompting as Scientific Inquiry |
Treats prompting as scientific inquiry to improve the understanding and control of large language models |
large language model chain-of-thought |
|
|
| 9 |
Graft: Integrating the Domain Knowledge via Efficient Parameter Synergy for MLLMs |
Proposes a unified parameter integration framework to address knowledge fragmentation in multimodal large language models |
large language model multimodal |
|
|
| 10 |
IMPACT: Inflectional Morphology Probes Across Complex Typologies |
Proposes the IMPACT framework to evaluate large language models on morphology |
large language model chain-of-thought |
|
|
| 11 |
EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations |
Proposes EXPERT to address standardization in image captioning evaluation |
large language model |
✅ |
|
| 12 |
PBa-LLM: Privacy- and Bias-aware NLP using Named-Entity Recognition (NER) |
Proposes PBa-LLM to address privacy and bias issues |
large language model |
|
|
| 13 |
Less Data, More Security: Advancing Cybersecurity LLMs Specialization via Resource-Efficient Domain-Adaptive Continuous Pre-training with Minimal Tokens |
Improves cybersecurity LLM specialization through resource-efficient domain-adaptive pre-training |
large language model |
|
|
| 14 |
Two-Stage Reasoning-Infused Learning: Improving Classification with LLM-Generated Reasoning |
Proposes two-stage reasoning-infused learning to improve text classification performance |
large language model |
|
|
| 15 |
User Behavior Prediction as a Generic, Robust, Scalable, and Low-Cost Evaluation Strategy for Estimating Generalization in LLMs |
Proposes user behavior prediction to address the evaluation of generalization in LLMs |
large language model |
|
|
| 16 |
LineRetriever: Planning-Aware Observation Reduction for Web Agents |
Proposes LineRetriever to address redundant observation information in web navigation tasks |
large language model |
|
|
| 17 |
TaP: A Taxonomy-Guided Framework for Automated and Scalable Preference Data Generation |
Proposes the TaP framework to automate the generation of multilingual preference datasets |
large language model |
|
|
| 18 |
Unveiling Decision-Making in LLMs for Text Classification : Extraction of influential and interpretable concepts with Sparse Autoencoders |
Proposes sparse autoencoders to improve interpretability in text classification |
large language model |
|
|
| 19 |
Evaluating the Simulation of Human Personality-Driven Susceptibility to Misinformation with LLMs |
Evaluates the use of large language models to simulate personality-driven susceptibility to misinformation |
large language model |
|
|