cs.CV (2025-05-12)
📊 22 papers in total | 🔗 5 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (9 🔗2)
Pillar 3: Perception & Semantics (7 🔗2)
Pillar 2: RL & Architecture (4)
Pillar 6: Video Extraction (1 🔗1)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 9: Embodied Foundation Models (9 papers)
🔬 Pillar 3: Perception & Semantics (7 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset | Proposes TUM2TWIN to address the shortage of urban digital twin datasets | gaussian splatting, splatting, NeRF | | |
| 11 | SLAG: Scalable Language-Augmented Gaussian Splatting | Proposes SLAG to improve encoding efficiency for large-scale scenes | gaussian splatting, splatting | ✅ | |
| 12 | TUGS: Physics-based Compact Representation of Underwater Scenes by Tensorized Gaussian | Proposes TUGS to address reconstruction of complex underwater scenes | gaussian splatting, splatting, NeRF | | |
| 13 | GIFStream: 4D Gaussian-based Immersive Video with Feature Stream | Proposes GIFStream to balance storage cost and quality in immersive video | gaussian splatting, splatting | ✅ | |
| 14 | Geometric Prior-Guided Neural Implicit Surface Reconstruction in the Wild | Proposes a geometric prior-guided neural implicit surface reconstruction method for complex scenes | NeRF, neural radiance field | | |
| 15 | Asynchronous Multi-Object Tracking with an Event Camera | Proposes an asynchronous event-based multi-object tracking algorithm to address object detection in dynamic environments | optical flow | | |
| 16 | Deep Learning Advances in Vision-Based Traffic Accident Anticipation: A Comprehensive Review of Methods, Datasets, and Future Directions | Reviews the applications and challenges of deep learning in vision-based traffic accident anticipation | scene understanding | | |
🔬 Pillar 2: RL & Architecture (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 17 | SAMChat: Introducing Chain of Thought Reasoning and GRPO to a Multimodal Small Language Model for Small Scale Remote Sensing | Proposes SAMChat to address small-scale remote sensing image analysis | reinforcement learning, large language model, multimodal | | |
| 18 | Learning to Reason and Navigate: Parameter Efficient Action Planning with Large Language Models | Proposes PEAP-LLM to address complex indoor navigation | DPO, direct preference optimization, large language model | | |
| 19 | DanceGRPO: Unleashing GRPO on Visual Generation | Proposes DanceGRPO to address optimization stability in visual generation | reinforcement learning, RLHF, foundation model | | |
| 20 | RealRep: Generalized SDR-to-HDR Conversion via Attribute-Disentangled Representation Learning | Proposes RealRep to address appearance diversity in SDR-to-HDR conversion | representation learning | | |
🔬 Pillar 6: Video Extraction (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 21 | Boosting Global-Local Feature Matching via Anomaly Synthesis for Multi-Class Point Cloud Anomaly Detection | Proposes the GLFM method to address feature confusion in multi-class point cloud anomaly detection | feature matching | ✅ | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 22 | Hybrid Spiking Vision Transformer for Object Detection with Event Cameras | Proposes a hybrid spiking vision transformer for object detection with event cameras | spatiotemporal | | |