cs.CV (2025-10-21)

📊 7 papers in total | 🔗 1 with code

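🎯 Interest Area Navigation

Pillar 3: Spatial Perception and Semantics (Perception & Semantics) (4 🔗1) · Pillar 6: Video Extraction and Matching (Video Extraction) (2) · Pillar 9: Embodied Large Models (Embodied Foundation Models) (1)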

🔬 Pillar 3: Spatial Perception and Semantics (Perception & Semantics) (4 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	OpenInsGaussian: Open-vocabulary Instance Gaussian Segmentation with Context-aware Cross-view Fusion	Proposes OpenInsGaussian, which achieves open-vocabulary instance Gaussian segmentation through context-aware cross-view fusion.	gaussian splatting splatting scene understanding
2	BlendCLIP: Bridging Synthetic and Real Domains for Zero-Shot 3D Object Classification with Multimodal Pretraining	BlendCLIP: bridges synthetic and real domains through multimodal pretraining for zero-shot 3D object classification.	open-vocabulary open vocabulary multimodal	✅
3	UWBench: A Comprehensive Vision-Language Benchmark for Underwater Understanding	UWBench: a comprehensive vision-language benchmark dataset for underwater understanding.	scene understanding multimodal visual grounding
4	VelocityNet: Real-Time Crowd Anomaly Detection via Person-Specific Velocity Analysis	VelocityNet: real-time crowd anomaly detection based on person-specific velocity analysis.	optical flow

🔬 Pillar 6: Video Extraction and Matching (Video Extraction) (2 papers)

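#	Title	One-sentence summary	Tags	🔗	⭐
5	Latent-Info and Low-Dimensional Learning for Human Mesh Recovery and Parallel Optimization	Proposes a human mesh recovery and parallel optimization method based on latent information and low-dimensional learning.	human mesh recovery
6	Hyperbolic Space Learning Method Leveraging Temporal Motion Priors for Human Mesh Recovery	Proposes a hyperbolic space learning method that leverages temporal motion priors for human mesh recovery.	human mesh recovery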

🔬 Pillar 9: Embodied Large Models (Embodied Foundation Models) (1 paper)

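#	Title	One-sentence summary	Tags	🔗	⭐
7	VLSU: Mapping the Limits of Joint Multimodal Understanding for AI Safety	VLSU: builds a multimodal AI safety evaluation framework that reveals the limits of joint vision-language understanding.	foundation model multimodal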
