| 1 |
MSEarth: A Multimodal Scientific Dataset and Benchmark for Phenomena Uncovering in Earth Science |
Proposes MSEarth to address the lack of multimodal benchmarks in Earth science |
large language model multimodal |
|
|
| 2 |
Privacy-Preserving Chest X-ray Report Generation via Multimodal Federated Learning with ViT and GPT-2 |
Proposes a multimodal federated learning framework for privacy-preserving chest X-ray report generation |
multimodal |
|
|
| 3 |
WDMIR: Wavelet-Driven Multimodal Intent Recognition |
Proposes the WDMIR framework to improve the accuracy of multimodal intent recognition |
multimodal |
|
|
| 4 |
Large Language Models Miss the Multi-Agent Mark |
Proposes drawing on multi-agent systems theory to improve how large language models are applied |
large language model |
|
|
| 5 |
Complex System Diagnostics Using a Knowledge Graph-Informed and Large Language Model-Enhanced Framework |
Proposes a diagnostic framework that combines knowledge graphs with large language models for complex system diagnostics |
large language model |
|
|
| 6 |
Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs) |
Examines how system prompts affect bias in large language models and the transparency issues involved |
large language model |
|
|
| 7 |
StreamLink: Large-Language-Model Driven Distributed Data Engineering System |
Proposes StreamLink to address the inefficiency of data engineering tasks |
large language model |
|
|
| 8 |
CoderAgent: Simulating Student Behavior for Personalized Programming Learning with Large Language Models |
Proposes CoderAgent to address data scarcity in personalized programming learning |
large language model |
|
|
| 9 |
Comparisons between a Large Language Model-based Real-Time Compound Diagnostic Medical AI Interface and Physicians for Common Internal Medicine Cases using Simulated Patients |
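Proposes a large language model-based real-time compound diagnostic medical AI interface to improve the efficiency of internal medicine diagnosis |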
large language model |
|
|
| 10 |
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs |
Proposes MME-Reasoning to address the insufficient evaluation of logical reasoning in multimodal large language models |
large language model multimodal |
|
|
| 11 |
Robust Hypothesis Generation: LLM-Automated Language Bias for Inductive Logic Programming |
Proposes an LLM-based framework for automated language bias generation to improve hypothesis generation |
large language model symbolic grounding |
|
|
| 12 |
Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations |
Proposes ChemCoTBench to address shortcomings in chemical reasoning |
large language model chain-of-thought |
|
|
| 13 |
Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning |
Proposes a memory-augmented in-context learning framework to predict startup success |
large language model |
|
|
| 14 |
Scientific Paper Retrieval with LLM-Guided Semantic-Based Ranking |
Proposes SemRank to address semantic matching in scientific paper retrieval |
large language model |
|
|
| 15 |
Make Planning Research Rigorous Again! |
Proposes bringing the rigor of the planning field to research on planning with large language models |
large language model |
|
|
| 16 |
The Feasibility of Topic-Based Watermarking on Academic Peer Reviews |
Proposes topic-based watermarking to address attribution in academic peer reviews |
large language model |
|
|
| 17 |
The Multilingual Divide and Its Impact on Global AI Safety |
Proposes closing the language gap to strengthen global AI safety |
large language model |
|
|
| 18 |
Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space |
Proposes expanding the strategy space to address jailbreak attacks |
large language model |
✅ |
|
| 19 |
Interpreting Social Bias in LVLMs via Information Flow Analysis and Multi-Round Dialogue Evaluation |
Reveals social bias in LVLMs through information flow analysis and multi-round dialogue evaluation |
multimodal |
|
|
| 20 |
Herd Behavior: Investigating Peer Influence in LLM-based Multi-Agent Systems |
Studies herd behavior to improve collaboration in LLM-based multi-agent systems |
large language model |
|
|
| 21 |
Agent-Environment Alignment via Automated Interface Generation |
Proposes the ALIGN framework to address mismatches between agents and their environments |
large language model |
✅ |
|
| 22 |
AITEE -- Agentic Tutor for Electrical Engineering |
Proposes AITEE to address personalized learning in electrical engineering education |
large language model |
|
|
| 23 |
Towards Conversational Development Environments: Using Theory-of-Mind and Multi-Agent Architectures for Requirements Refinement |
Proposes AlignMind to address inadequate requirements capture in software development |
foundation model |
|
|
| 24 |
RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving |
Proposes RepoMaster to address autonomous exploration of GitHub repositories for complex tasks |
large language model |
✅ |
|
| 25 |
Step-Wise Formal Verification for LLM-Based Mathematical Problem Solving |
Proposes the MATH-VF framework to address verification in LLM-based mathematical problem solving |
large language model |
|
|
| 26 |
Respond to Change with Constancy: Instruction-tuning with LLM for Non-I.I.D. Network Traffic Classification |
Proposes ETooL to address non-i.i.d. network traffic classification |
large language model |
|
|
| 27 |
An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks |
Proposes SE-Jury to address the evaluation of generated software artifacts in software engineering tasks |
large language model |
|
|
| 28 |
Code Researcher: Deep Research Agent for Large Systems Code and Commit History |
Proposes Code Researcher to address patch generation for systems code |
large language model |
|
|
| 29 |
GIFARC: Synthetic Dataset for Leveraging Human-Intuitive Analogies to Elevate AI Reasoning |
Proposes GIFARC to improve AI reasoning and tackle the ARC challenge |
large language model |
|
|
| 30 |
MIRROR: Multi-agent Intra- and Inter-Reflection for Optimized Reasoning in Tool Learning |
Proposes the MIRROR framework to optimize multi-agent reflection in tool learning |
large language model |
|
|