cs.CV (2025-11-12)

📊 22 papers in total | 🔗 7 with code

🎯 Interest Area Navigation

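Pillar 3: Spatial Perception (Perception & SLAM) (14 🔗3) · Pillar 2: RL Algorithms and Architecture (RL & Architecture) (4 🔗2) · Pillar 1: Robot Control (Robot Control) (2 🔗1) · Pillar 5: Interaction and Reaction (Interaction & Reaction) (1 🔗1) · Pillar 6: Video Extraction and Matching (Video Extraction & Matching) (1)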

🔬 Pillar 3: Spatial Perception (Perception & SLAM) (14 papers)

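| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | DualVision ArthroNav: Investigating Opportunities to Enhance Localization and Reconstruction in Image-based Arthroscopy Navigation via External Cameras | DualVision ArthroNav: uses external cameras to enhance localization and reconstruction in image-guided arthroscopy navigation | visual odometry, SLAM, scene reconstruction | |
| 2 | PALMS+: Modular Image-Based Floor Plan Localization Leveraging Depth Foundation Model | Proposes PALMS+ to address insufficient indoor localization accuracy | depth estimation, monocular depth, point cloud | |
| 3 | OUGS: Active View Selection via Object-aware Uncertainty Estimation in 3DGS | OUGS: active view selection for 3DGS based on object-aware uncertainty estimation | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 4 | STORM: Segment, Track, and Object Re-Localization from a Single Image | Proposes STORM, which performs object segmentation, tracking, and re-localization from a single image without manual annotation | pose estimation, localization, feature matching | |
| 5 | DreamPose3D: Hallucinative Diffusion with Prompt Learning for 3D Human Pose Estimation | DreamPose3D: a hallucinative diffusion model with prompt learning for 3D human pose estimation | pose estimation, 3D pose estimation | |
| 6 | BronchOpt: Vision-Based Pose Optimization with Fine-Tuned Foundation Models for Accurate Bronchoscopy Navigation | BronchOpt: vision-based pose optimization with fine-tuned foundation models for bronchoscopy navigation | localization, navigation | |
| 7 | Spatio-Temporal Data Enhanced Vision-Language Model for Traffic Scene Understanding | Proposes ST-CLIP, which augments a vision-language model with spatio-temporal information for traffic scene understanding | scene understanding, navigation | |
| 8 | EPSegFZ: Efficient Point Cloud Semantic Segmentation for Few- and Zero-Shot Scenarios with Language Guidance | Proposes EPSegFZ, which uses language guidance for efficient few-shot and zero-shot point cloud semantic segmentation | point cloud | |
| 9 | OG-PCL: Efficient Sparse Point Cloud Processing for Human Activity Recognition | Proposes the OG-PCL network for efficient processing of sparse radar point clouds in human activity recognition | point cloud | |
| 10 | Task-Aware 3D Affordance Segmentation via 2D Guidance and Geometric Refinement | Proposes the TASA framework, which combines 2D guidance with geometric refinement for task-aware 3D affordance segmentation | affordance detection, point cloud | |
| 11 | RadHARSimulator V2: Video to Doppler Generator | RadHARSimulator V2: a video-to-Doppler-spectrum simulator for radar-based human activity recognition | pose estimation, PULSE | |
| 12 | HOTFLoc++: End-to-End Hierarchical LiDAR Place Recognition, Re-Ranking, and 6-DoF Metric Localisation in Forests | HOTFLoc++: end-to-end hierarchical LiDAR localization and re-ranking in forest environments | point cloud | |
| 13 | PIFF: A Physics-Informed Generative Flow Model for Real-Time Flood Depth Mapping | Proposes the PIFF model to address real-time flood depth mapping | depth estimation | |
| 14 | Neural B-frame Video Compression with Bi-directional Reference Harmonization | Proposes BRHVC, which improves neural B-frame video compression through bi-directional reference frame harmonization | optical flow | |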

🔬 Pillar 2: RL Algorithms and Architecture (RL & Architecture) (4 papers)

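| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 15 | SasMamba: A Lightweight Structure-Aware Stride State Space Model for 3D Human Pose Estimation | SasMamba: a lightweight structure-aware stride state space model for 3D human pose estimation | Mamba, SSM, state space model | |
| 16 | PAN: A World Model for General, Interactable, and Long-Horizon World Simulation | PAN: a world model for general, interactable, long-horizon world simulation | world model, latent dynamics | |
| 17 | Learning by Neighbor-Aware Semantics, Deciding by Open-form Flows: Towards Robust Zero-Shot Skeleton Action Recognition | Proposes Flora, which uses neighbor-aware semantics and open-form flows for robust zero-shot skeleton action recognition | flow matching, geometric consistency | |
| 18 | 4KDehazeFlow: Ultra-High-Definition Image Dehazing via Flow Matching | Proposes 4KDehazeFlow, which uses flow matching for ultra-high-definition image dehazing with improved color fidelity | flow matching | |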

🔬 Pillar 1: Robot Control (Robot Control) (2 papers)

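| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 19 | Understanding the Representation of Older Adults in Motion Capture Locomotion Datasets | Analyzes motion capture locomotion datasets and shows that existing datasets inadequately represent the gait of older adults | locomotion, gait, walking | |
| 20 | RF-DETR: Neural Architecture Search for Real-Time Detection Transformers | RF-DETR: neural architecture search for real-time detection transformers | running | |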

🔬 Pillar 5: Interaction and Reaction (Interaction & Reaction) (1 paper)

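| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 21 | PressTrack-HMR: Pressure-Based Top-Down Multi-Person Global Human Mesh Recovery | PressTrack-HMR: proposes a pressure-based method for multi-person global human mesh recovery | multi-person interaction, human mesh recovery, HMR | |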

🔬 Pillar 6: Video Extraction and Matching (Video Extraction & Matching) (1 paper)

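| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 22 | Enriching Knowledge Distillation with Cross-Modal Teacher Fusion | Proposes RichKD, which improves knowledge distillation by fusing cross-modal CLIP knowledge | feature matching | |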
