cs.CV (2025-11-01)

📊 11 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (4 🔗1) | Pillar 3: Perception & SLAM (4) | Pillar 1: Robot Control (2 🔗1) | Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 2: RL & Architecture (4 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	Rethinking Facial Expression Recognition in the Era of Multimodal Large Language Models: Benchmark, Datasets, and Beyond	Proposes UniFER-7B to improve the reasoning and interpretability of multimodal large language models in facial expression recognition.	reinforcement learning, large language model, foundation model
2	Towards classification-based representation learning for place recognition on LiDAR scans	Proposes a classification-based representation learning method for LiDAR point clouds to address place recognition.	representation learning, contrastive learning
3	Saliency-R1: Incentivizing Unified Saliency Reasoning Capability in MLLM with Confidence-Guided Reinforcement Learning	Saliency-R1: uses confidence-guided reinforcement learning to improve the unified saliency reasoning capability of MLLMs.	reinforcement learning
4	VinciCoder: Unifying Multimodal Code Generation via Coarse-to-fine Visual Reinforcement Learning	VinciCoder: unifies multimodal code generation via coarse-to-fine visual reinforcement learning.	reinforcement learning	✅

🔬 Pillar 3: Perception & SLAM (4 papers)

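#	Title	One-line summary	Tags	🔗	⭐
5	4D Neural Voxel Splatting: Dynamic Scene Rendering with Voxelized Guassian Splatting	Proposes 4D Neural Voxel Splatting for efficient dynamic scene rendering and novel view synthesis.	3D gaussian splatting, gaussian splatting, novel view synthesis
6	Weakly Supervised Pneumonia Localization from Chest X-Rays Using Deep Neural Network and Grad-CAM Explanations	Proposes a pneumonia localization method based on weakly supervised deep learning and Grad-CAM to improve the efficiency of chest X-ray diagnosis.	localization
7	Benchmarking individual tree segmentation using multispectral airborne laser scanning data: the FGI-EMIT dataset	FGI-EMIT: a benchmark dataset for tree segmentation from multispectral LiDAR, with a performance evaluation of deep learning methods.	point cloud
8	Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models	Diff4Splat: controllable 4D scene generation from a single image, based on a dynamic reconstruction model.	novel view synthesis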

🔬 Pillar 1: Robot Control (2 papers)

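#	Title	One-line summary	Tags	🔗	⭐
9	OmniTrack++: Omnidirectional Multi-Object Tracking by Learning Large-FoV Trajectory Feedback	OmniTrack++: omnidirectional multi-object tracking by learning large-FoV trajectory feedback.	quadruped, legged robot, bipedal	✅
10	iFlyBot-VLA Technical Report	Proposes iFlyBot-VLA, a vision-language-action model based on a dual-level action representation, to improve robot manipulation.	manipulation, cross-embodiment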

🔬 Pillar 9: Embodied Foundation Models (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
11	Oitijjo-3D: Generative AI Framework for Rapid 3D Heritage Reconstruction from Street View Imagery	Oitijjo-3D: a generative AI framework for rapid 3D heritage reconstruction from street view imagery.	multimodal
