Modeling and Measuring Redundancy in Multisource Multimodal Data for Autonomous Driving

📄 arXiv: 2603.06544v1

Authors: Yuhan Zhou, Mehri Sattari, Haihua Chen, Kewei Sha

Category: cs.CV

Published: 2026-03-06

Note: This paper has been accepted by the Fourth IEEE International Conference on Mobility: Operations, Services, and Technologies (MOST) 2026

🔗 Code/Project: https://github.com/yhZHOU515/RedundancyAD


💡 One-Sentence Takeaway

Proposes redundancy modeling to improve the quality of multisource multimodal data for autonomous driving

🎯 Matched Area: Pillar 9: Embodied Foundation Models

Keywords: autonomous driving, data quality, redundancy modeling, multimodal data, object detection, YOLOv8, sensor fusion

📋 Key Points

  1. Existing autonomous driving research focuses mainly on algorithm design and largely neglects data-quality analysis, redundancy in particular.
  2. This paper proposes a new data-quality analysis approach by modeling and measuring redundancy in multisource camera data and multimodal image-LiDAR data.
  3. Experiments show that removing redundant labels noticeably improves YOLOv8 object-detection performance, most prominently on the nuScenes dataset.

📝 Abstract (translated)

Next-generation autonomous vehicles (AVs) rely on large volumes of multisource multimodal ($M^2$) data to support real-time decision-making. However, data quality (DQ) varies with environmental conditions and sensor limitations, and existing research has focused mainly on algorithm design while neglecting DQ analysis. This paper examines redundancy as a fundamental but underexplored DQ issue in AV datasets. Using the nuScenes and Argoverse 2 (AV2) datasets, it models and measures redundancy in multisource camera data and multimodal image-LiDAR data, and evaluates how removing redundant labels affects the YOLOv8 object-detection task. Experiments show that selectively removing object labels from multisource images with shared fields of view improves detection: in nuScenes, mAP$_{50}$ rises from $0.66$ to $0.70$, from $0.64$ to $0.67$, and from $0.53$ to $0.55$ on three overlap regions, while in AV2, removing $4.1$-$8.6\%$ of labels keeps mAP$_{50}$ near the $0.64$ baseline. These findings show that redundancy is a measurable and actionable DQ factor with direct implications for AV performance.

🔬 Method Details

Problem definition: The paper addresses how redundant labels in autonomous driving datasets affect data quality. Existing methods tend to overlook redundancy, leading to uneven data quality that degrades algorithm performance.

Core idea: Model and measure redundancy in multisource multimodal data, yielding a data-driven way to refine datasets and improve object-detection performance.

Technical framework: The pipeline comprises four modules: data collection, redundancy modeling, redundant-label removal, and performance evaluation. The nuScenes and AV2 datasets are first collected, redundancy is then analyzed, and detection performance after redundancy removal is finally evaluated.

Key innovation: The paper's novelty lies in systematically analyzing redundancy as a data-quality factor and experimentally validating its impact on object-detection performance, filling a gap in existing research.

Key design: The experiments use YOLOv8 as the object-detection model, apply different redundancy-removal strategies, and evaluate performance with the mAP$_{50}$ metric, ensuring rigor and reproducibility.
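As a concrete illustration of how cross-camera label redundancy might be flagged, here is a minimal sketch: a second camera's box is considered redundant if it overlaps a first camera's box above an IoU threshold. The function names, the threshold value, and the assumption that both label sets are already projected into a shared reference frame are illustrative choices, not the paper's exact criterion.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def redundant_labels(labels_cam_a, labels_cam_b, thresh=0.5):
    """Indices of boxes in camera B that duplicate some box in camera A,
    assuming both label sets live in a shared (projected) frame."""
    return [j for j, b in enumerate(labels_cam_b)
            if any(iou(a, b) >= thresh for a in labels_cam_a)]
```

For example, `redundant_labels([(0, 0, 10, 10)], [(1, 1, 9, 9), (50, 50, 60, 60)])` flags only the first box of camera B, since it overlaps camera A's box with IoU 0.64.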


📊 Experimental Highlights

On nuScenes, removing redundant labels raises mAP$_{50}$ to $0.70$, $0.67$, and $0.55$ on the three overlap regions, while on AV2, removing $4.1$-$8.6\%$ of labels keeps mAP$_{50}$ at the $0.64$ baseline, indicating that redundant labels can be removed without hurting detection performance and, in some overlap regions, while improving it.
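For reference, the mAP$_{50}$ metric averages per-class average precision with detections matched to ground truth at IoU ≥ 0.5. A minimal single-class sketch, assuming detections have already been matched and using all-point interpolation over the precision-recall curve (one of several common AP conventions):

```python
def average_precision(detections, num_gt):
    """AP for one class. detections = [(confidence, is_true_positive)],
    where matching at IoU >= 0.5 is assumed to have been done upstream.
    Uses all-point interpolation of the precision-recall curve."""
    dets = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    points = []  # (recall, precision) after each ranked detection
    for _, is_tp in dets:
        tp += bool(is_tp)
        fp += not is_tp
        points.append((tp / num_gt, tp / (tp + fp)))
    ap, prev_recall = 0.0, 0.0
    for i, (recall, _) in enumerate(points):
        # precision envelope: best precision at any recall >= current recall
        precision = max(p for _, p in points[i:])
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

With two ground-truth objects and ranked detections `[(0.9, True), (0.8, False), (0.7, True)]`, this yields AP = 5/6 ≈ 0.83; mAP$_{50}$ would then average such per-class APs.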

🎯 Application Scenarios

Potential applications include dataset optimization for autonomous driving systems, sensor fusion, and intelligent traffic management. Improving data quality can markedly strengthen an AV's perception and decision-making efficiency, advancing intelligent transportation.

📄 Abstract (original)

Next-generation autonomous vehicles (AVs) rely on large volumes of multisource and multimodal ($M^2$) data to support real-time decision-making. In practice, data quality (DQ) varies across sources and modalities due to environmental conditions and sensor limitations, yet AV research has largely prioritized algorithm design over DQ analysis. This work focuses on redundancy as a fundamental but underexplored DQ issue in AV datasets. Using the nuScenes and Argoverse 2 (AV2) datasets, we model and measure redundancy in multisource camera data and multimodal image-LiDAR data, and evaluate how removing redundant labels affects the YOLOv8 object detection task. Experimental results show that selectively removing redundant multisource image object labels from cameras with shared fields of view improves detection. In nuScenes, mAP$_{50}$ gains from $0.66$ to $0.70$, $0.64$ to $0.67$, and from $0.53$ to $0.55$, on three representative overlap regions, while detection on other overlapping camera pairs remains at the baseline even under stronger pruning. In AV2, $4.1$-$8.6\%$ of labels are removed, and mAP$_{50}$ stays near the $0.64$ baseline. Multimodal analysis also reveals substantial redundancy between image and LiDAR data. These findings demonstrate that redundancy is a measurable and actionable DQ factor with direct implications for AV performance. This work highlights the role of redundancy as a data quality factor in AV perception and motivates a data-centric perspective for evaluating and improving AV datasets. Code, data, and implementation details are publicly available at: https://github.com/yhZHOU515/RedundancyAD