| 1 |
C3G: Learning Compact 3D Representations with 2K Gaussians |
C3G: learns compact 3D representations with 2K Gaussians, improving scene reconstruction and understanding |
3D gaussian splatting gaussian splatting novel view synthesis |
|
|
| 2 |
Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding |
Motion4D: learns 3D-consistent motion and semantics for 4D scene understanding |
gaussian splatting novel view synthesis scene understanding |
✅ |
|
| 3 |
SyncTrack4D: Cross-Video Motion Alignment and Video Synchronization for Multi-Video 4D Gaussian Splatting |
SyncTrack4D: dynamic scene reconstruction with 4D Gaussian splatting for unsynchronized multi-view videos. |
gaussian splatting |
|
|
| 4 |
Memory-Guided Point Cloud Completion for Dental Reconstruction |
Proposes a memory-guided point cloud completion framework for dental reconstruction that improves completion accuracy. |
point cloud |
|
|
| 5 |
Mind-to-Face: Neural-Driven Photorealistic Avatar Synthesis via EEG Decoding |
Mind-to-Face: the first framework for generating photorealistic face avatars by decoding EEG signals |
3D gaussian splatting gaussian splatting |
|
|
| 6 |
MVRoom: Controllable 3D Indoor Scene Generation with Multi-View Diffusion Models |
MVRoom: controllable 3D indoor scene generation based on multi-view diffusion models |
novel view synthesis |
|
|
| 7 |
Emergent Outlier View Rejection in Visual Geometry Grounded Transformers |
Uncovers an implicit outlier-suppression capability in VGGT, improving the robustness of 3D reconstruction from in-the-wild images |
VGGT |
|
|
| 8 |
ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation |
Proposes ReCamDriving, a vision-only, camera-controlled framework for novel-trajectory video generation |
3DGS |
|
|
| 9 |
Beyond Boundary Frames: Audio-Visual Semantic Guidance for Context-Aware Video Interpolation |
Proposes the BBF framework, which uses audio-visual semantics to guide context-aware video frame interpolation |
optical flow |
|
|
| 10 |
GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models |
GAOT: proposes a framework for generating articulated objects with text-guided diffusion models |
point cloud |
|
|
| 11 |
CartoMapQA: A Fundamental Benchmark Dataset Evaluating Vision-Language Models on Cartographic Map Understanding |
CartoMapQA: proposes a foundational benchmark dataset for evaluating the map-understanding ability of vision-language models. |
navigation |
✅ |
|
| 12 |
OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation |
OpenTrack3D: towards accurate and generalizable open-vocabulary 3D instance segmentation |
point cloud |
|
|
| 13 |
AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model |
Proposes an AfroBeats dance movement analysis framework combining YOLO and SAM that requires no specialized equipment. |
pose estimation |
|
|
| 14 |
EEA: Exploration-Exploitation Agent for Long Video Understanding |
Proposes EEA, an exploration-exploitation agent framework for long video understanding |
navigation |
|
|