| 1 |
MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models |
Proposes MedSeg-R to address reasoning in medical image segmentation |
large language model multimodal |
|
|
| 2 |
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation |
Proposes Pisces to address the challenge of unifying multimodal image understanding and generation in one model |
large language model foundation model multimodal |
|
|
| 3 |
UrbanSense: A Framework for Quantitative Analysis of Urban Streetscapes leveraging Vision Large Language Models |
Proposes the UrbanSense framework for quantitative analysis of urban streetscapes |
large language model multimodal |
|
|
| 4 |
Lifting Data-Tracing Machine Unlearning to Knowledge-Tracing for Foundation Models |
Proposes knowledge-tracing machine unlearning to meet the diverse needs of foundation models |
foundation model |
|
|
| 5 |
BrainMAP: Multimodal Graph Learning For Efficient Brain Disease Localization |
Proposes BrainMAP to address inefficient brain disease localization |
multimodal |
|
|
| 6 |
MF2Summ: Multimodal Fusion for Video Summarization with Temporal Alignment |
Proposes MF2Summ to address multimodal information fusion in video summarization |
multimodal |
|
|
| 7 |
GeoCAD: Local Geometry-Controllable CAD Generation with Large Language Models |
Proposes GeoCAD to address local geometry-controllable CAD generation |
large language model |
✅ |
|
| 8 |
Towards Scalable SOAP Note Generation: A Weakly Supervised Multimodal Framework |
Proposes a weakly supervised multimodal framework for generating SOAP notes, reducing the clinical documentation burden |
multimodal |
|
|
| 9 |
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs |
Proposes CDPruner to address visual token redundancy in multimodal large language models |
large language model multimodal |
✅ |
|
| 10 |
Prompts to Summaries: Zero-Shot Language-Guided Video Summarization |
Proposes a zero-shot video summarization method to address the insufficient expression of user intent |
large language model multimodal |
|
|
| 11 |
Defensive Adversarial CAPTCHA: A Semantics-Driven Framework for Natural Adversarial Example Generation |
Proposes a source-free adversarial CAPTCHA to address the vulnerability of traditional CAPTCHAs to attack |
large language model multimodal |
|
|
| 12 |
From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations |
Proposes an explainable biodiversity monitoring framework to address ecosystem understanding |
large language model multimodal |
✅ |
|
| 13 |
CogStream: Context-guided Streaming Video Question Answering |
Proposes CogStream to address context dependence in streaming video question answering |
large language model multimodal |
✅ |
|
| 14 |
CreatiPoster: Towards Editable and Controllable Multi-Layer Graphic Design Generation |
Proposes CreatiPoster to address editable multi-layer graphic design generation |
multimodal |
✅ |
|