| 1 |
Memorization in Large Language Models in Medicine: Prevalence, Characteristics, and Implications |
First comprehensive evaluation of memorization in medical LLMs: reveals its prevalence, characteristics, and implications |
large language model |
|
|
| 2 |
MultimodalHugs: Enabling Sign Language Processing in Hugging Face |
MultimodalHugs: a framework enabling sign language processing in Hugging Face |
multimodal |
|
|
| 3 |
Acquiescence Bias in Large Language Models |
Reveals a reverse acquiescence bias in large language models: a tendency to answer "no" |
large language model |
|
|
| 4 |
DiTTO-LLM: Framework for Discovering Topic-based Technology Opportunities via Large Language Model |
DiTTO-LLM: proposes an LLM-based framework for discovering topic-based technology opportunities |
large language model |
|
|
| 5 |
ALIGNS: Unlocking nomological networks in psychological measurement through a large language model |
ALIGNS: uses a large language model to unlock nomological networks in psychological measurement. |
large language model |
|
|
| 6 |
A Role-Aware Multi-Agent Framework for Financial Education Question Answering with LLMs |
Proposes a role-aware multi-agent framework that improves LLM accuracy in financial education question answering. |
large language model chain-of-thought |
|
|
| 7 |
Stated Preference for Interaction and Continued Engagement (SPICE): Evaluating an LLM's Willingness to Re-engage in Conversation |
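Proposes the SPICE metric, which uses a stated-preference survey to evaluate an LLM's willingness to converse across different contexts |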
large language model |
|
|
| 8 |
Documents Are People and Words Are Items: A Psychometric Approach to Textual Data with Contextual Embeddings |
Proposes a psychometric approach based on contextual embeddings for analyzing latent knowledge dimensions in textual data. |
large language model |
|
|
| 9 |
Building High-Quality Datasets for Portuguese LLMs: From Common Crawl Snapshots to Industrial-Grade Corpora |
Proposes a method for building high-quality datasets for Portuguese LLMs, with performance comparable to industrial-grade corpora |
large language model |
|
|
| 10 |
Evaluating LLMs Without Oracle Feedback: Agentic Annotation Evaluation Through Unsupervised Consistency Signals |
Proposes an agentic annotation evaluation method based on consistency signals that assesses LLM annotation quality without human feedback. |
large language model |
|
|
| 11 |
The meaning of prompts and the prompts of meaning: Semiotic reflections and modelling |
Drawing on Peirce's semiotic theory, reframes LLM prompt engineering as a dynamic process of semiotic interaction |
large language model |
|
|
| 12 |
Discrimination by LLMs: Cross-lingual Bias Assessment and Mitigation in Decision-Making and Summarisation |
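Assesses and mitigates cross-lingual bias in LLM decision-making and summarisation tasks, focusing on discrimination by background, gender, and age. |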
large language model |
|
|
| 13 |
Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning |
Proposes the AncientDoc benchmark to evaluate vision-language models on OCR and knowledge reasoning in ancient document understanding. |
large language model |
|
|
| 14 |
Too Helpful, Too Harmless, Too Honest or Just Right? |
TrinityX: proposes a modular alignment framework based on a calibrated mixture of experts that improves HHH alignment of LLMs. |
large language model |
|
|
| 15 |
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs |
Findings show that LLM-generated toxic text performs worse than human-annotated data in text detoxification tasks |
large language model |
|
|