| 1 |
CoDAE: Adapting Large Language Models for Education via Chain-of-Thought Data Augmentation |
Proposes the CoDAE framework to address the insufficient adaptation of LLMs to educational settings |
large language model chain-of-thought |
|
|
| 2 |
Capabilities of GPT-5 on Multimodal Medical Reasoning |
Presents GPT-5 as a general-purpose solution for multimodal medical reasoning |
large language model multimodal chain-of-thought |
|
|
| 3 |
Evaluating Large Language Models as Expert Annotators |
Evaluates the effectiveness of large language models as expert annotators |
large language model chain-of-thought |
|
|
| 4 |
Momentum Point-Perplexity Mechanics in Large Language Models |
Proposes momentum point-perplexity mechanics to study changes in the internal states of large language models |
large language model |
|
|
| 5 |
Signature vs. Substance: Evaluating the Balance of Adversarial Resistance and Linguistic Quality in Watermarking Large Language Models |
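Evaluates the balance between adversarial resistance and linguistic quality of watermarking techniques for large language models |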
large language model |
|
|
| 6 |
Echoes of Agreement: Argument Driven Opinion Shifts in Large Language Models |
Examines how prompts affect political bias in large language models |
large language model |
|
|
| 7 |
Human-Alignment and Calibration of Inference-Time Uncertainty in Large Language Models |
Proposes methods for human alignment and calibration of inference-time uncertainty to improve the LLM user experience |
large language model |
|
|
| 8 |
Large Language Models for Czech Aspect-Based Sentiment Analysis |
Evaluates large language models on Czech aspect-based sentiment analysis |
large language model |
|
|
| 9 |
Exploring Causal Effect of Social Bias on Faithfulness Hallucinations in Large Language Models |
Explores the causal effect of social bias on faithfulness hallucinations in large language models |
large language model |
|
|
| 10 |
What am I missing here?: Evaluating Large Language Models for Masked Sentence Prediction |
Evaluates large language models on masked sentence prediction to address long-range coherence |
large language model |
|
|
| 11 |
Assessing LLM Text Detection in Educational Contexts: Does Human Contribution Affect Detection? |
Proposes the GEDE dataset to address LLM text detection in educational settings |
large language model |
✅ |
|
| 12 |
Jinx: Unlimited LLMs for Probing Alignment Failures |
Proposes Jinx to probe alignment failures in language models |
instruction following |
|
|
| 13 |
Efficient Speculative Decoding for Llama at Scale: Challenges and Solutions |
Proposes an efficient speculative decoding method to address the challenges of large-scale inference |
large language model |
|
|
| 14 |
Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models |
Proposes an uncertainty-guided probing method to improve LLM reliability |
large language model |
|
|
| 15 |
Dual Information Speech Language Models for Emotional Conversations |
Proposes a dual-information speech language model to address information capture in emotional conversations |
large language model |
|
|
| 16 |
Progressive Depth Up-scaling via Optimal Transport |
Proposes optimal-transport-based depth up-scaling to address neuron permutation mismatch |
large language model |
|
|
| 17 |
WideSearch: Benchmarking Agentic Broad Info-Seeking |
Proposes the WideSearch benchmark to evaluate the reliability of agents on large-scale information seeking |
large language model |
|
|
| 18 |
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL |
Proposes ASearcher to address the limited search intelligence of agents on long-horizon tasks |
zero-shot transfer |
✅ |
|
| 19 |
Joint Transcription of Acoustic Guitar Strumming Directions and Chords |
Proposes a deep learning model for automatic transcription of guitar strumming directions and chords |
multimodal |
|
|
| 20 |
Punctuation and Predicates in Language Models |
Examines the importance of punctuation in language models and the associated reasoning mechanisms |
large language model |
|
|
| 21 |
Can You Trick the Grader? Adversarial Persuasion of LLM Judges |
Reveals persuasion bias in language-model-based evaluation |
large language model |
|
|
| 22 |
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts |
Proposes Grove MoE to address the computational inefficiency of conventional MoE models |
large language model |
|
|
| 23 |
LoSemB: Logic-Guided Semantic Bridging for Inductive Tool Retrieval |
Proposes the LoSemB framework to address distribution shift in tool retrieval |
large language model |
|
|
| 24 |
Keyword-Centric Prompting for One-Shot Event Detection with Self-Generated Rationale Enhancements |
Proposes KeyCP++ to address the shortcomings of LLMs in event detection |
chain-of-thought |
|
|
| 25 |
IBPS: Indian Bail Prediction System |
Proposes the Indian Bail Prediction System to address judicial delays |
large language model |
|
|
| 26 |
Augmenting Bias Detection in LLMs Using Topological Data Analysis |
Uses topological data analysis to enhance bias detection in large language models |
large language model |
|
|