cs.CV (2025-10-21)

📊 7 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception & Semantics (4 🔗1) · Pillar 6: Video Extraction & Matching (2) · Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 3: Spatial Perception & Semantics (4 papers)

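| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | OpenInsGaussian: Open-vocabulary Instance Gaussian Segmentation with Context-aware Cross-view Fusion | Proposes OpenInsGaussian, which achieves open-vocabulary instance Gaussian segmentation through context-aware cross-view fusion. | gaussian splatting, splatting, scene understanding | |
| 2 | BlendCLIP: Bridging Synthetic and Real Domains for Zero-Shot 3D Object Classification with Multimodal Pretraining | BlendCLIP: bridges synthetic and real domains through multimodal pretraining to enable zero-shot 3D object classification. | open-vocabulary, open vocabulary, multimodal | |
| 3 | UWBench: A Comprehensive Vision-Language Benchmark for Underwater Understanding | UWBench: a comprehensive vision-language benchmark dataset for underwater environment understanding. | scene understanding, multimodal, visual grounding | |
| 4 | VelocityNet: Real-Time Crowd Anomaly Detection via Person-Specific Velocity Analysis | VelocityNet: real-time crowd anomaly detection based on per-person velocity analysis. | optical flow | |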

🔬 Pillar 6: Video Extraction & Matching (2 papers)

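| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 5 | Latent-Info and Low-Dimensional Learning for Human Mesh Recovery and Parallel Optimization | Proposes a human mesh recovery and parallel optimization method based on latent information and low-dimensional learning. | human mesh recovery | |
| 6 | Hyperbolic Space Learning Method Leveraging Temporal Motion Priors for Human Mesh Recovery | Proposes a hyperbolic space learning method that leverages temporal motion priors for human mesh reconstruction. | human mesh recovery | |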

🔬 Pillar 9: Embodied Foundation Models (1 paper)

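| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 7 | VLSU: Mapping the Limits of Joint Multimodal Understanding for AI Safety | VLSU: builds a multimodal AI safety evaluation framework that reveals the limits of joint vision-language understanding. | foundation model, multimodal | |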
