| 1 |
Early Stopping Chain-of-thoughts in Large Language Models |
Proposes ES-CoT, which cuts LLM inference cost by stopping chain-of-thought generation early |
large language model chain-of-thought |
|
|
| 2 |
Simulating a Bias Mitigation Scenario in Large Language Models |
Builds a simulation framework to evaluate strategies for mitigating bias in large language models |
large language model |
|
|
| 3 |
Annotating Training Data for Conditional Semantic Textual Similarity Measurement using Large Language Models |
Uses large language models to re-annotate a conditional semantic textual similarity dataset, improving model performance. |
large language model |
✅ |
|
| 4 |
Do Large Language Models Understand Word Senses? |
Evaluates how well large language models understand word senses and verifies their effectiveness on word sense disambiguation |
large language model |
|
|
| 5 |
Large Language Models Discriminate Against Speakers of German Dialects |
Large language models show discriminatory bias against speakers of German dialects |
large language model |
|
|
| 6 |
How Can Quantum Deep Learning Improve Large Language Models? |
Explores the potential of quantum deep learning to improve the adaptability of large language models |
large language model |
|
|
| 7 |
DSCC-HS: A Dynamic Self-Reinforcing Framework for Hallucination Suppression in Large Language Models |
Proposes the DSCC-HS framework to proactively suppress hallucinations in large language models |
large language model |
|
|
| 8 |
Can Large Language Models Robustly Perform Natural Language Inference for Japanese Comparatives? |
Builds an NLI dataset of Japanese comparatives and evaluates the robustness of large language models on this task |
large language model |
|
|
| 9 |
AssoCiAm: A Benchmark for Evaluating Association Thinking while Circumventing Ambiguity |
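Proposes the AssoCiAm benchmark, which uses a hybrid computational method to evaluate the associative thinking of multimodal large language models while circumventing ambiguity. |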
large language model multimodal |
|
|
| 10 |
Enhancing Time Awareness in Generative Recommendation |
Proposes the GRUT model, which improves generative recommendation through time awareness |
large language model TAMP |
✅ |
|
| 11 |
Integrating Text and Time-Series into (Large) Language Models to Predict Medical Outcomes |
Uses DSPy to optimize prompts, integrating text and time series into LLMs to predict medical outcomes |
large language model multimodal |
|
|
| 12 |
Estimating Semantic Alphabet Size for LLM Uncertainty Quantification |
Proposes an improved semantic alphabet size estimator that makes LLM uncertainty quantification more accurate and interpretable |
large language model |
|
|
| 13 |
Ticket-Bench: A Kickoff for Multilingual and Regionalized Agent Evaluation |
Ticket-Bench: a multilingual, regionalized agent evaluation benchmark aimed at improving performance on real-world tasks |
large language model |
|
|
| 14 |
Correct-Detect: Balancing Performance and Ambiguity Through the Lens of Coreference Resolution in LLMs |
Reveals a trade-off in LLM coreference resolution between performance and ambiguity detection: the Correct-Detect dilemma |
large language model |
|
|
| 15 |
Causal-Counterfactual RAG: The Integration of Causal-Counterfactual Reasoning into RAG |
Proposes Causal-Counterfactual RAG, which integrates causal reasoning into RAG to improve performance on knowledge-intensive tasks |
large language model |
|
|
| 16 |
Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings |
Proposes a method that uses large language models to augment psycholinguistic norming datasets |
large language model |
|
|
| 17 |
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments |
Apertus: building open, compliant large language models that support global language environments |
large language model |
|
|
| 18 |
ShinkaEvolve: Towards Open-Ended And Sample-Efficient Program Evolution |
ShinkaEvolve: proposes an efficient, open-source program evolution framework that addresses sample efficiency in scientific discovery. |
large language model |
|
|
| 19 |
Enhancing Multi-Agent Debate System Performance via Confidence Expression |
Proposes the ConfMAD framework, which improves multi-agent debate system performance through confidence expression |
large language model |
|
|
| 20 |
Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale |
Proposes the Hala models to improve the quality of Arabic instruction following and translation |
instruction following |
|
|
| 21 |
Do LLMs Align Human Values Regarding Social Biases? Judging and Explaining Social Biases with LLMs |
Evaluates how well large language models align with human values in social bias scenarios |
large language model |
|
|
| 22 |
Characterizing Knowledge Graph Tasks in LLM Benchmarks Using Cognitive Complexity Frameworks |
Uses cognitive complexity frameworks to characterize knowledge graph tasks in LLM benchmarks |
large language model |
|
|
| 23 |
Exploring Data and Parameter Efficient Strategies for Arabic Dialect Identifications |
Explores data- and parameter-efficient strategies for Arabic dialect identification |
large language model |
|
|
| 24 |
Thinking in a Crowd: How Auxiliary Information Shapes LLM Reasoning |
Studies how auxiliary information affects LLM reasoning: harmful information significantly degrades LLM reasoning ability |
large language model |
✅ |
|
| 25 |
Implementing a Logical Inference System for Japanese Comparatives |
Proposes ccg-jcomp, a logical inference system for natural language inference on Japanese comparatives. |
large language model |
|
|
| 26 |
DSPC: Dual-Stage Progressive Compression Framework for Efficient Long-Context Reasoning |
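Proposes DSPC, a training-free dual-stage progressive compression framework that efficiently compresses long contexts and improves LLM inference efficiency. |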
large language model |
|
|
| 27 |
Improving Context Fidelity via Native Retrieval-Augmented Reasoning |
Proposes the CARE framework, which improves LLM context fidelity through native retrieval-augmented reasoning |
large language model |
|
|
| 28 |
A Simple and Efficient Jailbreak Method Exploiting LLMs' Helpfulness |
HILL: a simple and efficient jailbreak method that exploits LLMs' helpfulness |
large language model |
|
|
| 29 |
CL$^2$GEC: A Multi-Discipline Benchmark for Continual Learning in Chinese Literature Grammatical Error Correction |
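Proposes the CL$^2$GEC benchmark for evaluating continual learning across multiple disciplines in Chinese grammatical error correction systems. |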
large language model |
|
|
| 30 |
Sparse Neurons Carry Strong Signals of Question Ambiguity in LLMs |
Identifies sparse neurons in LLMs that encode question ambiguity, enabling ambiguity detection and behavior control |
large language model |
|
|
| 31 |
ZERA: Zero-init Instruction Evolving Refinement Agent -- From Zero Instructions to Structured Prompts via Principle-based Optimization |
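ZERA: a zero-init instruction evolving refinement agent that generates structured prompts from zero instructions via principle-based optimization |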
large language model |
✅ |
|
| 32 |
Latent Traits and Cross-Task Transfer: Deconstructing Dataset Interactions in LLM Fine-tuning |
Deconstructs dataset interactions in LLM fine-tuning through latent traits and cross-task transfer |
large language model |
|
|