cs.CV(2025-08-04)
📊 7 papers in total
🎯 Interest Area Navigation
Pillar 1: Robot Control (2)
Pillar 9: Embodied Foundation Models (2)
Pillar 8: Physics-based Animation (1)
Pillar 6: Video Extraction (1)
Pillar 4: Generative Motion (1)
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis | Proposes the Human-X framework to address physical plausibility in real-time human-machine interaction | humanoid humanoid robot reinforcement learning | | |
| 2 | Modality Bias in LVLMs: Analyzing and Mitigating Object Hallucination via Attention Lens | Proposes an attention adjustment method to mitigate object hallucination in LVLMs | manipulation large language model multimodal | | |
🔬 Pillar 9: Embodied Foundation Models (2 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 3 | MonoDream: Monocular Vision-Language Navigation with Panoramic Dreaming | Proposes MonoDream to address the insufficient performance of monocular visual navigation | VLA VLN | | |
| 4 | VisuCraft: Enhancing Large Vision-Language Models for Complex Visual-Guided Creative Content Generation via Structured Information Extraction | Proposes VisuCraft to address the limitations of large vision-language models in creative content generation | multimodal visual grounding | | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | How Would It Sound? Material-Controlled Multimodal Acoustic Profile Generation for Indoor Scenes | Proposes material-controlled multimodal acoustic profile generation to address indoor acoustic modeling | PULSE multimodal | | |
🔬 Pillar 6: Video Extraction (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | Following Route Instructions using Large Vision-Language Models: A Comparison between Low-level and Panoramic Action Spaces | Uses large vision-language models to follow route instructions, comparing low-level and panoramic action spaces | egocentric VLN | | |
🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | X-Actor: Emotional and Expressive Long-Range Portrait Acting from Audio | Proposes X-Actor to address emotional expression in long videos | motion latent | | |