cs.CV (2023-12-11)

📊 11 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception & Semantics (4) · Pillar 2: RL Algorithms & Architecture (3) · Pillar 6: Video Extraction & Matching (2) · Pillar 9: Embodied Foundation Models (2 🔗1)

🔬 Pillar 3: Spatial Perception & Semantics (4 papers)

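| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Creating Visual Effects with Neural Radiance Fields | Proposes a NeRF visual effects compositing pipeline built on Nerfstudio and Blender. | NeRF, neural radiance field | |
| 2 | Gaussian Splatting SLAM | The first monocular SLAM system based on 3D Gaussian Splatting, achieving efficient, high-quality real-time reconstruction. | visual SLAM, 3D gaussian splatting, 3DGS | |
| 3 | Inferring Hybrid Neural Fluid Fields from Videos | Proposes HyFluid, a hybrid neural fluid field that reconstructs fluid density and velocity fields from sparse multi-view videos. | optical flow, physically plausible | |
| 4 | Nuvo: Neural UV Mapping for Unruly 3D Representations | Nuvo: a neural UV mapping method for texturing complex geometry produced by 3D reconstruction and generation. | neural radiance field | |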

🔬 Pillar 2: RL Algorithms & Architecture (3 papers)

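| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 5 | AnyHome: Open-Vocabulary Generation of Structured and Textured 3D Homes | AnyHome: an open-vocabulary framework for generating structured and textured 3D home scenes. | distillation, open-vocabulary, open vocabulary | |
| 6 | CAD: Photorealistic 3D Generation via Adversarial Distillation | CAD: photorealistic 3D generation via adversarial distillation. | distillation | |
| 7 | Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior | Sherpa3D: uses a coarse 3D prior to improve high-fidelity text-to-3D generation. | distillation, geometric consistency | |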

🔬 Pillar 6: Video Extraction & Matching (2 papers)

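| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 8 | EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning | EgoPlan-Bench: evaluates the human-level planning ability of multimodal large language models from an egocentric (first-person) perspective. | egocentric, large language model, multimodal | |
| 9 | 3D Hand Pose Estimation in Everyday Egocentric Images | WildHands: a 3D hand pose estimation system for everyday egocentric images. | egocentric | |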

🔬 Pillar 9: Embodied Foundation Models (2 papers)

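| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 10 | Honeybee: Locality-enhanced Projector for Multimodal LLM | Honeybee: a locality-enhanced projector that improves multimodal large language model performance. | large language model, multimodal | |
| 11 | Evaluation of Large Language Models for Decision Making in Autonomous Driving | Quantitatively evaluates the spatial awareness and rule-following abilities of large language models in autonomous driving decision making. | large language model | |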
