| 1 |
Do MLLMs Capture How Interfaces Guide User Behavior? A Benchmark for Multimodal UI/UX Design Understanding |
Proposes WiserUI-Bench to address gaps in multimodal UI/UX design understanding |
large language model multimodal |
|
|
| 2 |
Chain-of-Thought Tokens are Computer Program Variables |
Proposes treating chain-of-thought tokens as computer program variables to address complex reasoning problems |
large language model chain-of-thought |
✅ |
|
| 3 |
Toward Reasonable Parrots: Why Large Language Models Should Argue with Us by Design |
Proposes reasonable conversational technology to strengthen argumentation skills |
large language model |
|
|
| 4 |
A Benchmark Dataset and a Framework for Urdu Multimodal Named Entity Recognition |
Proposes the U-MNER framework for Urdu multimodal named entity recognition |
multimodal |
|
|
| 5 |
Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders |
Proposes using sparse autoencoders to reveal language-specific features in large language models |
large language model |
✅ |
|
| 6 |
Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization |
Evaluates the performance of large language models on Bangla consumer health query summarization |
large language model |
|
|
| 7 |
Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization |
Proposes a multi-stage influence function to address scalability for large language models |
large language model |
✅ |
|
| 8 |
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging |
Combines visual perception and reasoning capabilities through model merging |
large language model multimodal |
|
|
| 9 |
Crosslingual Reasoning through Test-Time Scaling |
Improves crosslingual reasoning through test-time scaling |
large language model chain-of-thought |
|
|
| 10 |
KG-HTC: Integrating Knowledge Graphs into LLMs for Effective Zero-shot Hierarchical Text Classification |
Proposes KG-HTC for zero-shot hierarchical text classification |
large language model |
✅ |
|
| 11 |
ComPO: Preference Alignment via Comparison Oracles |
Proposes ComPO for preference alignment of large language models |
large language model |
|
|
| 12 |
UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections |
Builds the UKElectionNarratives dataset to identify misleading narratives in UK elections |
large language model |
|
|
| 13 |
clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations |
Proposes the clem:todd framework for systematically evaluating task-oriented dialogue systems |
large language model |
|
|
| 14 |
Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data |
Proposes Ultra-FineWeb for filtering and verifying high-quality LLM training data |
large language model |
|
|
| 15 |
Frame In, Frame Out: Do LLMs Generate More Biased News Headlines than Humans? |
Examines bias in news headlines generated by large language models |
large language model |
|
|
| 16 |
RICo: Refined In-Context Contribution for Automatic Instruction-Tuning Data Selection |
Proposes RICo for automatic instruction-tuning data selection |
large language model |
|
|
| 17 |
Product of Experts with LLMs: Boosting Performance on ARC Is a Matter of Perspective |
Proposes an expert-based (product-of-experts) LLM method to improve performance on ARC-AGI |
large language model |
|
|
| 18 |
Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction |
Proposes a multiscaled conformal prediction framework to reduce the false positive rate in machine-generated text detection |
large language model |
|
|
| 19 |
Rethinking Invariance in In-context Learning |
Proposes Invariant ICL to address invariance in in-context learning |
large language model |
✅ |
|