cs.CV (2025-06-23)

📊 6 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3 🔗1) · Pillar 2: RL & Architecture (1 🔗1) · Pillar 1: Robot Control (1) · Pillar 3: Perception & Semantics (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

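| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | OmniGen2: Exploration to Advanced Multimodal Generation | Proposes OmniGen2 to unify multimodal generation tasks | multimodal | |
| 2 | Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations | Proposes a multimodal framework that unifies visual understanding and generation | large language model, multimodal | |
| 3 | CaughtCheating: Is Your MLLM a Good Cheating Detective? Exploring the Boundary of Visual Perception and Reasoning | Proposes CaughtCheating to address visual reasoning challenges in multimodal large language models | large language model | |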

🔬 Pillar 2: RL & Architecture (1 paper)

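| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 4 | MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation | Proposes MCN-SLAM to address communication bandwidth constraints in multi-agent collaborative SLAM | distillation, visual SLAM, NeRF | |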

🔬 Pillar 1: Robot Control (1 paper)

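| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 5 | Drive-R1: Bridging Reasoning and Planning in VLMs for Autonomous Driving with Reinforcement Learning | Proposes Drive-R1 to address reasoning and planning in vision-language models for autonomous driving | motion planning, reinforcement learning | |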

🔬 Pillar 3: Perception & Semantics (1 paper)

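| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 6 | Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions | Surveys multimodal methods to advance talking head generation | NeRF, neural radiance field | |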
