| 1 |
KinTwin: Imitation Learning with Torque and Muscle Driven Biomechanical Models Enables Precise Replication of Able-Bodied and Impaired Movement from Markerless Motion Capture |
Proposes KinTwin to address inverse dynamics computation in movement analysis |
imitation learning markerless motion capture |
|
|
| 2 |
Unlocking the Potential of Difficulty Prior in RL-based Multimodal Reasoning |
Improves reinforcement learning for multimodal reasoning by modeling difficulty priors |
reinforcement learning multimodal |
|
|
| 3 |
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition |
Proposes Mamba-Adaptor to address long-range forgetting and spatial modeling in visual recognition |
Mamba SSM state space model |
|
|
| 4 |
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning |
Proposes VLM-Gym to address the weak decision-making ability of vision-language models in games |
reinforcement learning multimodal |
✅ |
|
| 5 |
SPKLIP: Aligning Spike Video Streams with Natural Language |
Proposes SPKLIP to address the alignment of spike video with natural language |
contrastive learning VLA multimodal |
|
|
| 6 |
AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use |
Proposes AutoMat to address the challenge of converting microscopy images into crystal structures |
MAE large language model multimodal |
✅ |
|
| 7 |
BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation |
Proposes the BusterX framework to address detection and explanation of AI-generated video forgeries |
reinforcement learning large language model multimodal |
|
|
| 8 |
Few-Step Diffusion via Score identity Distillation |
Proposes Score identity Distillation to address high-resolution image generation |
distillation classifier-free guidance |
✅ |
|
| 9 |
Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping |
Proposes the Sat2Sound framework to address soundscape mapping |
representation learning contrastive learning multimodal |
|
|
| 10 |
Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking |
Proposes Safe-Sora to address copyright protection for AI-generated video |
Mamba state space model spatiotemporal |
✅ |
|
| 11 |
DD-Ranking: Rethinking the Evaluation of Dataset Distillation |
Proposes DD-Ranking to address inaccurate evaluation of dataset distillation |
distillation |
|
|
| 12 |
RMMSS: Towards Advanced Robust Multi-Modal Semantic Segmentation with Hybrid Prototype Distillation and Feature Selection |
Proposes RMMSS to address robustness in multi-modal semantic segmentation |
distillation |
|
|
| 13 |
Coarse Attribute Prediction with Task Agnostic Distillation for Real World Clothes Changing ReID |
Proposes the RLQ framework to address clothes-changing re-identification on low-quality images |
distillation |
|
|
| 14 |
RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers |
Proposes RoPECraft to address video motion transfer |
flow matching optical flow |
|
|
| 15 |
Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction |
Proposes Touch2Shape to address local detail capture in 3D shape reconstruction |
reinforcement learning reward design |
|
|
| 16 |
Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach |
Proposes a Slow-Fast tracking method to address low-latency visual object tracking |
representation learning distillation |
✅ |
|