| 1 |
KineST: A Kinematics-guided Spatiotemporal State Space Model for Human Motion Tracking from Sparse Signals |
KineST: a kinematics-guided spatiotemporal state space model for tracking human motion from sparse signals |
state space model representation learning spatiotemporal |
✅ |
|
| 2 |
BrepLLM: Native Boundary Representation Understanding with Large Language Models |
BrepLLM: proposes a large language model framework for native boundary representation understanding |
contrastive learning semantic mapping semantic map |
|
|
| 3 |
SNOW: Spatio-Temporal Scene Understanding with World Knowledge for Open-World Embodied Reasoning |
SNOW: uses world knowledge for spatiotemporal scene understanding to enable open-world embodied reasoning |
world model scene understanding multimodal |
|
|
| 4 |
AdaTooler-V: Adaptive Tool-Use for Images and Videos |
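Proposes AdaTooler-V, which uses adaptive tool use to improve the reasoning efficiency and performance of multimodal large language models on image and video tasks. |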
reinforcement learning large language model multimodal |
|
|
| 5 |
Instant Expressive Gaussian Head Avatar via 3D-Aware Expression Distillation |
Proposes a fast, highly expressive Gaussian head avatar method based on 3D-aware expression distillation |
distillation gaussian splatting splatting |
|
|
| 6 |
SARMAE: Masked Autoencoder for SAR Representation Learning |
SARMAE: a noise-aware masked autoencoder for SAR image representation learning |
representation learning masked autoencoder |
✅ |
|
| 7 |
The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text |
WorldCanvas: combines text, trajectories, and reference images for controllable simulation of world events. |
world model multimodal visual grounding |
✅ |
|
| 8 |
Task-Oriented Data Synthesis and Control-Rectify Sampling for Remote Sensing Semantic Segmentation |
Proposes the TODSynth framework for data synthesis and control optimization in remote sensing semantic segmentation. |
flow matching foundation model multimodal |
|
|
| 9 |
MACL: Multi-Label Adaptive Contrastive Learning Loss for Remote Sensing Image Retrieval |
Proposes MACL to address multi-label semantic overlap and class imbalance in remote sensing image retrieval |
representation learning contrastive learning |
✅ |
|
| 10 |
Skeleton-Snippet Contrastive Learning with Multiscale Feature Fusion for Action Localization |
Proposes an action localization method based on skeleton-snippet contrastive learning and multiscale feature fusion |
contrastive learning |
|
|
| 11 |
MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Model for Embodied Task Planning |
Proposes MomaGraph, which uses a vision-language model to build state-aware unified scene graphs for embodied task planning. |
reinforcement learning scene understanding |
|
|
| 12 |
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times |
TurboDiffusion: speeds up video diffusion models by 100-200x through multiple acceleration strategies |
linear attention distillation |
✅ |
|