| 1 |
CRANE: Causal Relevance Analysis of Language-Specific Neurons in Multilingual Large Language Models |
CRANE: analyzing language-specific neurons in multilingual large language models through causal relevance |
large language model language conditioned |
|
|
| 2 |
V-FAT: Benchmarking Visual Fidelity Against Text-bias |
The V-FAT benchmark reveals degraded visual fidelity under text bias in multimodal large language models |
large language model multimodal visual grounding |
|
|
| 3 |
See, Explain, and Intervene: A Few-Shot Multimodal Agent Framework for Hateful Meme Moderation |
Proposes a multimodal framework based on generative AI and few-shot learning for detecting, explaining, and intervening on harmful memes. |
multimodal |
|
|
| 4 |
BanglaLorica: Design and Evaluation of a Robust Watermarking Algorithm for Large Language Models in Bangla Text Generation |
BanglaLorica: proposes a robust layered watermarking algorithm for Bangla LLMs that resists cross-lingual translation attacks. |
large language model |
|
|
| 5 |
Can Large Language Models Resolve Semantic Discrepancy in Self-Destructive Subcultures? Evidence from Jirai Kei |
Proposes the Subcultural Alignment Solver (SAS) to address semantic discrepancy when LLMs detect self-destructive behavior in subcultures |
large language model |
|
|
| 6 |
Learning from Mistakes: Negative Reasoning Samples Enhance Out-of-Domain Generalization |
Uses negative reasoning samples to improve the out-of-domain generalization of large language models |
large language model chain-of-thought |
|
|
| 7 |
THaLLE-ThaiLLM: Domain-Specialized Small LLMs for Finance and Thai -- Technical Report |
THaLLE-ThaiLLM: domain-specialized small LLMs for finance and Thai, made multi-capable through model merging. |
large language model instruction following |
|
|
| 8 |
Measuring and Fostering Peace through Machine Learning and Artificial Intelligence |
Measuring and fostering peace with machine learning and artificial intelligence |
large language model |
|
|
| 9 |
RelayLLM: Efficient Reasoning via Collaborative Decoding |
Proposes RelayLLM to address the low reasoning efficiency of large language models |
large language model |
|
|
| 10 |
CuMA: Aligning LLMs with Sparse Cultural Values via Demographic-Aware Mixture of Adapters |
Proposes CuMA to address cultural value alignment in large language models |
large language model |
✅ |
|
| 11 |
Differential syntactic and semantic encoding in LLMs |
Analyzes internal LLM representations to reveal that syntactic and semantic information are encoded differently |
large language model |
|
|
| 12 |
Prior-Informed Zeroth-Order Optimization with Adaptive Direction Alignment for Memory-Efficient LLM Fine-Tuning |
Proposes a prior-guided zeroth-order optimization method for efficiently fine-tuning large language models |
large language model |
|
|
| 13 |
SampoNLP: A Self-Referential Toolkit for Morphological Analysis of Subword Tokenizers |
SampoNLP: a self-referential toolkit for morphological analysis of subword tokenizers |
large language model |
✅ |
|
| 14 |
Agent-as-a-Judge |
Proposes the Agent-as-a-Judge framework to improve the reliability and verifiability of complex AI evaluation |
large language model |
|
|
| 15 |
Belief in Authority: Impact of Authority in Multi-Agent Evaluation Framework |
Analyzes how role authority affects multi-agent evaluation frameworks and reveals the mechanism behind authority bias. |
large language model |
|
|
| 16 |
NC2C: Automated Convexification of Generic Non-Convex Optimization Problems |
Proposes NC2C, which uses LLMs to automatically convert non-convex optimization problems into solvable convex forms |
large language model |
|
|
| 17 |
PILOT-Bench: A Benchmark for Legal Reasoning in the Patent Domain with IRAC-Aligned Classification Tasks |
Proposes PILOT-Bench: an IRAC-aligned classification benchmark for legal reasoning in the patent domain. |
large language model |
✅ |
|
| 18 |
RiskAtlas: Exposing Domain-Specific Risks in LLMs through Knowledge-Graph-Guided Harmful Prompt Generation |
RiskAtlas: exposes domain-specific risks in LLMs through knowledge-graph-guided harmful prompt generation |
large language model |
|
|
| 19 |
DSC2025 -- ViHallu Challenge: Detecting Hallucination in Vietnamese LLMs |
DSC2025 ViHallu Challenge: the first large-scale shared task on hallucination detection in Vietnamese LLMs. |
large language model |
|
|
| 20 |
ToolGate: Contract-Grounded and Verified Tool Execution for LLMs |
ToolGate: a contract-based, verifiable execution framework for LLM tool calls |
large language model |
|
|
| 21 |
Semantically Orthogonal Framework for Citation Classification: Disentangling Intent and Content |
Proposes the SOFT framework, which disentangles citation intent from content type to improve citation classification performance |
large language model |
✅ |
|
| 22 |
Multi-Disciplinary Dataset Discovery from Citation-Verified Literature Contexts |
Proposes a multi-disciplinary dataset discovery framework based on citation contexts that improves dataset retrieval recall. |
large language model |
✅ |
|
| 23 |
GenProve: Learning to Generate Text with Fine-Grained Provenance |
GenProve: learns to generate text with fine-grained provenance to address LLM hallucination. |
large language model |
|
|
| 24 |
Faithful Summarisation under Disagreement via Belief-Level Aggregation |
Proposes a belief-level aggregation framework to address the way existing opinion summarization methods ignore disagreement. |
large language model |
|
|
| 25 |
Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis |
Proposes Mind2Report, which emulates a commercial analyst to synthesize expert-level commercial reports |
large language model |
✅ |
|
| 26 |
Fame Fades, Nature Remains: Disentangling the Character Identity of Role-Playing Agents |
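Proposes a character identity disentanglement framework that distinguishes the parametric and attributive identities of role-playing agents |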
large language model |
|
|
| 27 |
PRISM: A Unified Framework for Post-Training LLMs Without Verifiable Rewards |
PRISM: a unified framework for post-training LLMs without verifiable rewards |
large language model |
|
|
| 28 |
Thunder-KoNUBench: A Corpus-Aligned Benchmark for Korean Negation Understanding |
Proposes Thunder-KoNUBench for evaluating and improving Korean negation understanding |
large language model |
|
|
| 29 |
LinguaGame: A Linguistically Grounded Game-Theoretic Paradigm for Multi-Agent Dialogue Generation |
LinguaGame: a linguistically grounded, game-theoretic paradigm for multi-agent dialogue generation |
large language model |
|
|