cs.CV (2025-09-01)
📊 4 papers in total
🎯 Interest Area Navigation
🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | A Unified Low-level Foundation Model for Enhancing Pathology Image Quality | Proposes a unified low-level pathology foundation model to enhance pathology image quality | foundation model | | |
| 2 | RT-VLM: Re-Thinking Vision Language Model with 4-Clues for Real-World Object Recognition Robustness | Proposes RT-VLM to address robustness in real-world object recognition | multimodal | | |
| 3 | Do Video Language Models Really Know Where to Look? Diagnosing Attention Failures in Video Language Models | Diagnoses attention failures in video language models and reveals the limitations of keyframe selection | large language model, multimodal | | |
🔬 Pillar 2: RL Algorithms & Architecture (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection | Proposes SpectMamba to address efficiency and accuracy in medical image detection | Mamba, state space model | | |