cs.CV (2025-06-23)

📊 6 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3 🔗1) · Pillar 2: RL & Architecture (1 🔗1) · Pillar 1: Robot Control (1) · Pillar 3: Perception & Semantics (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
1	OmniGen2: Exploration to Advanced Multimodal Generation	Proposes OmniGen2 to unify multimodal generation tasks	multimodal	✅
2	Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations	Proposes a multimodal framework that unifies visual understanding and generation	large language model multimodal
3	CaughtCheating: Is Your MLLM a Good Cheating Detective? Exploring the Boundary of Visual Perception and Reasoning	Proposes CaughtCheating to address the visual reasoning challenges of multimodal large language models	large language model

🔬 Pillar 2: RL & Architecture (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
4	MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation	Proposes MCN-SLAM to address the communication bandwidth problem in multi-agent collaborative SLAM	distillation visual SLAM NeRF	✅

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
5	Drive-R1: Bridging Reasoning and Planning in VLMs for Autonomous Driving with Reinforcement Learning	Proposes Drive-R1 to address reasoning and planning in vision-language models for autonomous driving	motion planning reinforcement learning

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
6	Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions	Surveys multimodal methods to advance talking head generation	NeRF neural radiance field	✅
