| # | Title | Summary | Keywords | |
|---|-------|---------|----------|---|
| 1 | SeLaR: Selective Latent Reasoning in Large Language Models | Proposes SeLaR, which improves the reasoning ability of large language models through selective latent-space reasoning. | large language model, chain-of-thought | |
| 2 | Self-Debias: Self-correcting for Debiasing Large Language Models | Self-Debias: eliminates bias propagation in large language models through a self-correction mechanism. | large language model, chain-of-thought | |
| 3 | Rethinking Data Mixing from the Perspective of Large Language Models | Proposes the DoGraph framework, which reweights training data via graph-constrained optimization to improve LLM generalization. | large language model | |
| 4 | GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning | GRASS: gradient-based adaptive layer-wise importance sampling for memory-efficient LLM fine-tuning. | large language model | |
| 5 | Distributed Multi-Layer Editing for Rule-Level Knowledge in Large Language Models | Proposes Distributed Multi-Layer Editing (DMLE) to tackle rule-level knowledge editing in LLMs. | large language model | ✅ |
| 6 | Detecting HIV-Related Stigma in Clinical Narratives Using Large Language Models | Uses large language models to detect HIV-related stigma in clinical narratives. | large language model | |
| 7 | Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces | OmniBehavior: a benchmark for real-world human behavior simulation that exposes the limitations of LLMs in complex behavior modeling. | large language model | |
| 8 | Beyond Social Pressure: Benchmarking Epistemic Attack in Large Language Models | Proposes the PPT-Bench benchmark for evaluating the vulnerability of LLMs to epistemic attacks. | large language model | |
| 9 | Are GUI Agents Focused Enough? Automated Distraction via Semantic-level UI Element Injection | Proposes semantic-level UI element injection to evaluate the robustness of GUI agents and uncover potential security vulnerabilities. | visual grounding | |
| 10 | Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers | Proposes recurrent-depth Transformers to address the weak compositional generalization of Transformers in implicit reasoning. | large language model | |
| 11 | A GAN and LLM-Driven Data Augmentation Framework for Dynamic Linguistic Pattern Modeling in Chinese Sarcasm Detection | Proposes a GAN- and LLM-driven data augmentation framework for dynamically modeling linguistic patterns in Chinese sarcasm detection. | large language model | |
| 12 | TEMPER: Testing Emotional Perturbation in Quantitative Reasoning | TEMPER: investigates how emotional perturbation affects quantitative reasoning and how to neutralize its effects. | large language model | |
| 13 | Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts | Proposes training-loss-based data pruning to improve factual memorization in large language models. | large language model | |
| 14 | What do Language Models Learn and When? The Implicit Curriculum Hypothesis | Reveals an implicit curriculum in LLM pretraining: skills emerge in a predictable, compositional order. | large language model | |
| 15 | AI generates well-liked but templatic empathic responses | Large language models generate well-liked but templatic empathic responses. | large language model | |
| 16 | Training Data Size Sensitivity in Unsupervised Rhyme Recognition | Shows how training data size affects unsupervised rhyme recognition performance and introduces RhymeTagger. | large language model | |
| 17 | HCRE: LLM-based Hierarchical Classification for Cross-Document Relation Extraction with a Prediction-then-Verification Strategy | Proposes HCRE, an LLM-based hierarchical classification model that addresses the large number of relation types in cross-document relation extraction. | large language model | |
| 18 | Tool Retrieval Bridge: Aligning Vague Instructions with Retriever Preferences via Bridge Model | Proposes the Tool Retrieval Bridge (TRB) to handle tool retrieval for LLMs under vague instructions. | large language model | ✅ |
| 19 | An Empirical Analysis of Static Analysis Methods for Detection and Mitigation of Code Library Hallucinations | Uses static analysis to detect and mitigate code library hallucinations and characterizes the limits of its capability. | large language model | |
| 20 | SepSeq: A Training-Free Framework for Long Numerical Sequence Processing in LLMs | SepSeq: a training-free framework that uses separators to improve LLM processing of long numerical sequences. | large language model | |
| 21 | LLMs Underperform Graph-Based Parsers on Supervised Relation Extraction for Complex Graphs | Graph-based parsers outperform large language models on relation extraction for complex graphs. | large language model | |
| 22 | Optimal Multi-bit Generative Watermarking Schemes Under Worst-Case False-Alarm Constraints | Proposes optimal multi-bit generative watermarking schemes for large language models under worst-case false-alarm constraints. | large language model | |
| 23 | EXAONE 4.5 Technical Report | LG AI Research releases EXAONE 4.5, its first open-weight vision-language model, with improved document understanding and long-text reasoning. | multimodal | |