cs.CV (2025-11-27)

📊 25 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

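Pillar 3: Spatial Perception (Perception & SLAM) (15 🔗2) · Pillar 2: RL Algorithms and Architecture (RL & Architecture) (4) · Pillar 1: Robot Control (Robot Control) (3 🔗1) · Pillar 7: Motion Retargeting (Motion Retargeting) (2) · Pillar 4: Generative Motion (Generative Motion) (1)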

🔬 Pillar 3: Spatial Perception (Perception & SLAM) (15 papers)

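| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | IE-SRGS: An Internal-External Knowledge Fusion Framework for High-Fidelity 3D Gaussian Splatting Super-Resolution | Proposes the IE-SRGS framework, which fuses internal and external knowledge to improve the quality of 3D Gaussian Splatting super-resolution reconstruction. | depth estimation, 3D gaussian splatting, 3DGS | |
| 2 | Can Protective Watermarking Safeguard the Copyright of 3D Gaussian Splatting? | Proposes the GSPure framework, which effectively removes watermarks from 3D Gaussian Splatting while preserving scene integrity. | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 3 | RemedyGS: Defend 3D Gaussian Splatting against Computation Cost Attacks | Proposes the RemedyGS framework to defend 3D Gaussian Splatting against computation cost attacks. | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 4 | MG-Nav: Dual-Scale Visual Navigation via Sparse Spatial Memory | Proposes MG-Nav to address planning and control in zero-shot visual navigation. | localization, navigation, VGGT | |
| 5 | Splat-SAP: Feed-Forward Gaussian Splatting for Human-Centered Scene with Scale-Aware Point Map Reconstruction | Splat-SAP: a feed-forward Gaussian Splatting method based on scale-aware point map reconstruction for sparse, human-centered scenes. | stereo matching, gaussian splatting | |
| 6 | RoadSceneBench: A Lightweight Benchmark for Mid-Level Road Scene Understanding | RoadSceneBench: a lightweight benchmark for road scene understanding that improves visual reasoning. | scene understanding | |
| 7 | DocVAL: Validated Chain-of-Thought Distillation for Grounded Document VQA | Proposes DocVAL, a validated chain-of-thought distillation framework that improves spatial reasoning in document VQA. | localization, geometric consistency | |
| 8 | Gaussians on Fire: High-Frequency Reconstruction of Flames | Proposes a Gaussian-based spatiotemporal representation for reconstructing high-frequency flame dynamics from limited viewpoints. | monocular depth, optical flow | |
| 9 | Emergent Extreme-View Geometry in 3D Foundation Models | Reveals the emergent extreme-view geometry capability of 3D foundation models and proposes a lightweight alignment scheme. | pose estimation | |
| 10 | GazeTrack: High-Precision Eye Tracking Based on Regularization and Spatial Computing | GazeTrack: high-precision eye tracking based on regularization and spatial computing. | localization | |
| 11 | Text Condition Embedded Regression Network for Automated Dental Abutment Design | Proposes the TCEAD framework, which uses a text-guided regression network to automate dental implant abutment design. | localization | |
| 12 | MoLT: Mixture of Layer-Wise Tokens for Efficient Audio-Visual Learning | Proposes MoLT, which mixes layer-wise tokens for efficient audio-visual learning. | localization | |
| 13 | Fin3R: Fine-tuning Feed-forward 3D Reconstruction Models via Monocular Knowledge Distillation | Fin3R: fine-tunes feed-forward 3D reconstruction models via monocular knowledge distillation to improve geometric accuracy. | VGGT | |
| 14 | Prompt-based Consistent Video Colorization | Proposes a prompt-based method for consistent video colorization that addresses temporal flicker and the need for manual intervention. | optical flow | |
| 15 | BrepGPT: Autoregressive B-rep Generation with Voronoi Half-Patch | BrepGPT: a single-stage autoregressive B-rep generation framework based on the Voronoi Half-Patch. | point cloud | |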

🔬 Pillar 2: RL Algorithms and Architecture (RL & Architecture) (4 papers)

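| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 16 | IMTalker: Efficient Audio-driven Talking Face Generation with Implicit Motion Transfer | IMTalker: efficient audio-driven talking face generation via implicit motion transfer. | flow matching, optical flow, motion latent | |
| 17 | SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model | Proposes a trajectory-conditioned sparse occupancy world model for predicting future 3D scene occupancy. | world model, VGGT | |
| 18 | MRI-Based Brain Age Estimation with Supervised Contrastive Learning of Continuous Representation | Proposes an MRI-based brain age estimation method using supervised contrastive learning, improving the accuracy of modeling neuromorphological changes. | MAE, contrastive learning | |
| 19 | Guiding the Inner Eye: A Framework for Hierarchical and Flexible Visual Grounded Reasoning | Proposes the GRiP framework, which improves the robustness and flexibility of visually grounded reasoning through cognition-guided reinforcement learning. | reinforcement learning, localization | |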

🔬 Pillar 1: Robot Control (Robot Control) (3 papers)

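| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 20 | Revisiting the Necessity of Lengthy Chain-of-Thought in Vision-centric Reasoning Generalization | Shows that concise chain-of-thought (CoT) outperforms lengthy CoT for generalization in visual reasoning. | manipulation | |
| 21 | Content Adaptive Encoding For Interactive Game Streaming | Proposes an adaptive-resolution encoding method based on encoding metadata for interactive game streaming. | running | |
| 22 | DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action | DualVLA: builds a generalizable embodied agent by decoupling reasoning from action. | manipulation | |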

🔬 Pillar 7: Motion Retargeting (Motion Retargeting) (2 papers)

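| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration | Fast3Dcache: a training-free, geometry-aware caching framework that accelerates 3D geometry synthesis. | geometric consistency | |
| 24 | DiffStyle360: Diffusion-Based 360° Head Stylization via Style Fusion Attention | DiffStyle360: a diffusion-based 360° head stylization method that achieves multi-view-consistent style transfer. | structure preservation | |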

🔬 Pillar 4: Generative Motion (Generative Motion) (1 paper)

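| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 25 | AI killed the video star. Audio-driven diffusion model for expressive talking head generation | Proposes Dimitra++, an audio-driven diffusion model for generating expressive talking heads. | motion diffusion | |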
