cs.CV (2025-06-24)

📊 5 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3 🔗2) Pillar 1: Robot Control (1) Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

#	Title	Key Point	Tags	🔗	⭐
1	MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models	Proposes MSR-Align to address safety alignment in multimodal models	multimodal chain-of-thought	✅
2	Mem4Nav: Boosting Vision-and-Language Navigation in Urban Environments with a Hierarchical Spatial-Cognition Long-Short Memory System	Proposes Mem4Nav to address vision-and-language navigation in urban environments	VLN multimodal	✅
3	Implementing blind navigation through multi-modal sensing and gait guidance	Proposes a blind-navigation system based on multimodal sensing and gait guidance to address navigation difficulties for visually impaired people	multimodal

🔬 Pillar 1: Robot Control (1 paper)

#	Title	Key Point	Tags	🔗	⭐
4	Unified Vision-Language-Action Model	Proposes the UniVLA model to address vision-language-action understanding	manipulation policy learning world model

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	Key Point	Tags	🔗	⭐
5	Da Yu: Towards USV-Based Image Captioning for Waterway Surveillance and Scene Understanding	Proposes Da Yu to address image captioning for waterway surveillance	scene understanding large language model
