cs.CV (2025-06-24)
📊 5 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (3 🔗2)
Pillar 1: Robot Control (1)
Pillar 3: Perception & Semantics (1)
🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models | Proposes MSR-Align to address safety alignment in multimodal models | multimodal chain-of-thought | ✅ | |
| 2 | Mem4Nav: Boosting Vision-and-Language Navigation in Urban Environments with a Hierarchical Spatial-Cognition Long-Short Memory System | Proposes Mem4Nav to address vision-and-language navigation in urban environments | VLN multimodal | ✅ | |
| 3 | Implementing blind navigation through multi-modal sensing and gait guidance | Proposes a blind-navigation system based on multimodal sensing and gait guidance to address navigation difficulties for visually impaired people | multimodal | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | Unified Vision-Language-Action Model | Proposes the UniVLA model to address vision-language-action understanding | manipulation policy learning world model | | |
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | Da Yu: Towards USV-Based Image Captioning for Waterway Surveillance and Scene Understanding | Proposes Da Yu to address image captioning for waterway surveillance | scene understanding large language model | | |