nuScenes Autonomous Driving Dataset: Data Format Explained, Format Conversion, and Model Data Loading (Part 1)

Published: 2023/12/29

Introduction to the nuScenes Dataset and the nuScenes Devkit

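Contents

Introduction to the nuScenes Dataset and the nuScenes Devkit
- 1.1 Introduction to the nuScenes dataset
- 1.2 Data collection
  - 1.2.1 Sensor placement
  - 1.2.2 Data format and dataset structure
  - 1.2.3 Key dataset attributes
- 1.3 Introduction to data annotation
- 1.4 Introduction to the devkit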

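Background: my project needs a dataset for other target types, built in the nuScenes format and used for training, so I am studying the nuScenes dataset and recording what I learn along the way (to be expanded). If anything here is wrong, please leave a comment with corrections. Let's learn together and discuss any questions.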

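I have published a series of articles on the nuScenes dataset. You are welcome to read them:
nuScenes Autonomous Driving Dataset: Data Format Explained, Format Conversion, and Model Data Loading (Part 1)
nuScenes Autonomous Driving Dataset: Format Conversion and Model Data Loading (Part 2)
CenterFusion (a Multi-Sensor Fusion Object Detection Network) and the nuScenes Autonomous Driving Dataset: Model Data Loading (Part 3)
CenterFusion Source Code Deep Dive: CenterNet Network Structure, DLA34 (Part 4), Multi-Sensor Fusion Object Detection Series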

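1.1 Introduction to the nuScenes dataset

nuImages is a large-scale autonomous driving dataset with image-level 2D annotations, labeled using Scale's annotation tools.
Compared with other datasets such as KITTI and ApolloScape, nuScenes adds radar (millimeter-wave radar) sensors; for a comparison of the sensor types, see "Lidar vs Radar vs Camera". Radar gives an autonomous driving system a fallback in adverse conditions where cameras and lidar fail, and it is also cost-effective.
Main features of nuScenes:
1. A complete sensor suite: one lidar, five millimeter-wave radars, six cameras, IMU, and GPS.
2. Plenty of data: 1000 scenes collected in different cities. The features are rich: extra information such as visibility enriches the image features and can be used for other tasks. The set of annotated objects is also very large: 1.1B lidar points and 32 manually annotated classes.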

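1.2 Data collection

1.2.1 Sensor placement

Figure 1.1: Sensor positions on the vehicle

The sensors are placed on the vehicle as shown in the figure above. Sensor data fusion and registration require calibrating the sensors first: intrinsic and extrinsic calibration for the cameras, and extrinsic calibration for the lidar, radar, and other sensors.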

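Camera extrinsic calibration: a cube-shaped calibration target is placed in front of the camera and lidar (see the nuScenes website for the exact procedure; for cube-based calibration, see the paper at https://www.researchgate.net/publication/327516843).

Camera intrinsic calibration: a flat board with a known pattern is used (the common planar checkerboard method).

Radar extrinsic calibration: the radar is mounted horizontally on the vehicle, and data is collected while driving in an urban environment. Dynamic objects are filtered out of the collected radar returns, and the yaw angle is then calibrated to minimize the compensated range rate of the static objects.

Lidar calibration: a laser liner is used to measure precisely the position of the lidar relative to the vehicle's ego frame.
With these steps done, the coordinate transformation between the lidar and the cameras can be computed, and data collection can begin.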

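1.2.2 Data format and dataset structure

Dataset structure

As shown in the figure above, annotation_3sweeps holds the annotation files produced by converting the nuScenes dataset to COCO format. nuScenes provides a range of conversion tools that turn the data into other formats.
sample holds the raw lidar/radar and camera files.
v1.0-mini is the mini version of nuScenes that I downloaded. Its JSON files describe every kind of record in the dataset, along with the sensor information.

nuScenes is annotated more thoroughly than most other datasets, so following the authors' format definitions is a good approach when building and using your own dataset.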

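Basic definitions used in the dataset format:
Because the dataset is so large, and to keep it extensible and diverse later on, every record carries a token that serves as its globally unique identifier.
log is the log information from which the data was extracted; scene is a 20 s clip of driving data; sample is an annotated snapshot of a scene at a particular timestamp; instance enumerates every object instance that was observed; sample_annotation is an annotation of an object instance of interest. token: everything in the dataset is identified by a token, including objects, sensors, scenes, and keyframes, and each token is a unique code.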
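
To make the role of token concrete, here is a minimal sketch of a token lookup. It assumes each table has already been loaded as a list of dict records (for example from the JSON files under v1.0-mini). The records, the short token values, and the helper names build_token_index and get are made up for illustration; this is not the devkit's own code.

```python
from typing import Dict, List

Record = Dict[str, object]


def build_token_index(records: List[Record]) -> Dict[str, Record]:
    """Map each record's unique token to the record itself."""
    index: Dict[str, Record] = {}
    for record in records:
        token = str(record["token"])
        if token in index:
            raise ValueError(f"duplicate token: {token}")
        index[token] = record
    return index


# Made-up records for illustration; real tokens are long opaque strings.
scenes = [{"token": "scene-a", "name": "scene-0001", "first_sample_token": "sample-1"}]
samples = [
    {"token": "sample-1", "scene_token": "scene-a", "timestamp": 1000000},
    {"token": "sample-2", "scene_token": "scene-a", "timestamp": 1500000},
]

tables = {
    "scene": build_token_index(scenes),
    "sample": build_token_index(samples),
}


def get(table_name: str, token: str) -> Record:
    """Look up one record by table name and token."""
    return tables[table_name][token]


# Follow a foreign key: scene -> its first sample.
first_sample = get("sample", str(get("scene", "scene-a")["first_sample_token"]))
print(first_sample["token"], first_sample["timestamp"])
```

Because every foreign key is just another token, following a relationship is always a dictionary lookup in the target table.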

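Figure 1.2: Dataset schema (annotated by me, so it is a bit messy)

Notes on Figure 1.2:
The arrows in the figure denote a "belongs to" relationship, that is, a data dependency. For example, sample_data belongs to ego_pose, because the data of a keyframe has to be computed from the vehicle's own pose.
The key parts are log, sample_data, and sample_annotation; these are what you load directly when using the dataset.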

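1.2.3 Key dataset attributes

Relationship diagram of the dataset tables

Explanation: within each scene, many instances appear across many samples. A sample is a keyframe picked at a fixed interval, and it always has matching records in sample_annotation and sample_data. sample_data also contains non-keyframes, though. sample_data describes the scene as captured by a sensor, whereas sample_annotation describes an instance, so there is no direct link between sample_annotation and sample_data; they are tied together only through the sample they both reference.
A sample together with an instance pins down the state of one object in a specific scene. Adding the matching sample_data gives both the annotation of that object and what the sensors recorded about it.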
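
The lookup described above can be sketched as follows. This is only an illustration under the assumption that the tables are plain lists of dicts; the records, file names, and the function name describe_object are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Made-up records; only the fields needed for the join are included.
sample_annotations = [
    {"token": "ann-1", "sample_token": "sample-1", "instance_token": "car-7",
     "translation": [10.0, 2.0, 0.5]},
    {"token": "ann-2", "sample_token": "sample-2", "instance_token": "car-7",
     "translation": [12.0, 2.0, 0.5]},
    {"token": "ann-3", "sample_token": "sample-1", "instance_token": "ped-3",
     "translation": [4.0, -1.0, 0.9]},
]
sample_data = [
    {"token": "sd-1", "sample_token": "sample-1", "is_key_frame": True,
     "filename": "samples/CAM_FRONT/a.jpg"},
    {"token": "sd-2", "sample_token": "sample-1", "is_key_frame": False,
     "filename": "sweeps/CAM_FRONT/b.jpg"},
    {"token": "sd-3", "sample_token": "sample-1", "is_key_frame": True,
     "filename": "samples/LIDAR_TOP/c.pcd.bin"},
]


def describe_object(sample_token: str, instance_token: str) -> Tuple[Optional[Dict], List[str]]:
    """Return the annotation of one instance in one sample, plus the
    key-frame sensor files recorded for that sample."""
    annotation = next(
        (a for a in sample_annotations
         if a["sample_token"] == sample_token and a["instance_token"] == instance_token),
        None,
    )
    key_frame_files = [
        d["filename"] for d in sample_data
        if d["sample_token"] == sample_token and d["is_key_frame"]
    ]
    return annotation, key_frame_files


annotation, files = describe_object("sample-1", "car-7")
print(annotation["token"], files)
```

Note that the annotation and the sensor files never reference each other; the sample token is the only thing they share.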

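attribute: describes a property of an instance that can change while the category stays the same, for example whether an annotated vehicle is parked or moving, or whether a bicycle has a rider.

attribute {
   "token": <str> -- Unique record identifier.
   "name": <str> -- Attribute name.
   "description": <str> -- Attribute description.
}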

calibrated_sensor: describes the extrinsic and intrinsic parameters of a particular sensor as mounted on the vehicle. All extrinsic parameters are given relative to the vehicle's ego frame.

calibrated_sensor {
   "token": <str> -- Unique record identifier.
   "sensor_token": <str> -- Foreign key pointing to the sensor type.
   "translation": <float> [3] -- Coordinate system origin in meters: x, y, z.
   "rotation": <float> [4] -- Coordinate system orientation as quaternion: w, x, y, z.
   "camera_intrinsic": <float> [3, 3] -- Intrinsic camera calibration. Empty for sensors that are not cameras.
}
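
As an illustration of how translation and rotation are used, the sketch below moves a point from the sensor frame into the ego frame. It assumes the record maps sensor coordinates to ego coordinates as p_ego = R * p_sensor + t, with the quaternion in (w, x, y, z) order as stated in the schema. The mounting values and function names are made up, and the code uses only the standard library rather than the devkit.

```python
import math
from typing import Dict, List, Sequence


def quaternion_to_matrix(q: Sequence[float]) -> List[List[float]]:
    """3x3 rotation matrix for a unit quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def sensor_to_ego(point: Sequence[float], calibrated_sensor: Dict) -> List[float]:
    """Rotate the point by the sensor orientation, then add the sensor origin."""
    rot = quaternion_to_matrix(calibrated_sensor["rotation"])
    trans = calibrated_sensor["translation"]
    return [
        sum(rot[i][j] * point[j] for j in range(3)) + trans[i]
        for i in range(3)
    ]


# Hypothetical mounting: 1 m forward, 1.5 m up, rotated 90 degrees about z.
half_angle = math.radians(90.0) / 2.0
calib = {
    "translation": [1.0, 0.0, 1.5],
    "rotation": [math.cos(half_angle), 0.0, 0.0, math.sin(half_angle)],
}

# A point 2 m along the sensor's x axis.
p_ego = sensor_to_ego([2.0, 0.0, 0.0], calib)
print([round(v, 6) for v in p_ego])
```

The same two steps, applied with an ego_pose record instead, would carry the point on from the ego frame to the global frame.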

category: the class of an object. A subcategory is written after its parent category, separated by a period, for example vehicle.car.

category {
   "token": <str> -- Unique record identifier.
   "name": <str> -- Category name. Subcategories indicated by period.
   "description": <str> -- Category description.
   "index": <int> -- The index of the label used for efficiency reasons in the .bin label files of nuScenes-lidarseg. This field did not exist previously.
}

ego_pose: the pose of the vehicle at a particular timestamp, expressed in the global (world) coordinate system. It is produced by a lidar map-based localization algorithm (see the self-localization part of the nuScenes paper) and is output as a two-dimensional position (x, y).

ego_pose {
   "token": <str> -- Unique record identifier.
   "translation": <float> [3] -- Coordinate system origin in meters: x, y, z. Note that z is always 0.
   "rotation": <float> [4] -- Coordinate system orientation as quaternion: w, x, y, z.
   "timestamp": <int> -- Unix time stamp.
}
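
Since the position is effectively planar, it is often convenient to reduce an ego_pose to (x, y, yaw). A minimal sketch, assuming a unit quaternion in (w, x, y, z) order and using the standard yaw formula; the pose values and the function names are made up for illustration.

```python
import math
from typing import Dict, Sequence, Tuple


def yaw_from_quaternion(q: Sequence[float]) -> float:
    """Heading angle about the z axis, in radians, for a quaternion (w, x, y, z)."""
    w, x, y, z = q
    return math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))


def planar_pose(ego_pose: Dict) -> Tuple[float, float, float]:
    """Reduce an ego_pose record to (x, y, yaw)."""
    x, y, _z = ego_pose["translation"]
    return x, y, yaw_from_quaternion(ego_pose["rotation"])


# Hypothetical pose: vehicle at (100, 50), heading 30 degrees from the x axis.
half_angle = math.radians(30.0) / 2.0
pose = {
    "translation": [100.0, 50.0, 0.0],
    "rotation": [math.cos(half_angle), 0.0, 0.0, math.sin(half_angle)],
    "timestamp": 1532402927647951,
}

x, y, yaw = planar_pose(pose)
print(x, y, round(math.degrees(yaw), 3))
```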

instance: an object instance, e.g., a particular vehicle. This table enumerates all object instances the authors observed. Note that instances are not tracked across scenes: within one scene an instance is tracked continuously (the same car appearing in a clip is tracked and annotated throughout), but instances in different scenes are unrelated.

instance {
   "token": <str> -- Unique record identifier.
   "category_token": <str> -- Foreign key pointing to the object category.
   "nbr_annotations": <int> -- Number of times this instance is annotated within a scene.
   "first_annotation_token": <str> -- Foreign key. Points to the first annotation of this instance.
   "last_annotation_token": <str> -- Foreign key. Points to the last annotation of this instance.
}
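
The first_annotation_token field, together with the next field of sample_annotation, forms a linked list over time. A small sketch of walking it, assuming that an "empty" foreign key is stored as an empty string; the records and the function name annotation_chain are hypothetical.

```python
from typing import Dict, List


def annotation_chain(instance: Dict, annotations: Dict[str, Dict]) -> List[Dict]:
    """Collect all annotations of one instance in time order by following 'next'."""
    chain: List[Dict] = []
    token = instance["first_annotation_token"]
    while token:  # assumed: an empty string marks the end of the chain
        record = annotations[token]
        chain.append(record)
        token = record["next"]
    return chain


# Made-up annotation table, indexed by token.
annotations = {
    "ann-1": {"token": "ann-1", "instance_token": "car-7", "prev": "", "next": "ann-2"},
    "ann-2": {"token": "ann-2", "instance_token": "car-7", "prev": "ann-1", "next": "ann-3"},
    "ann-3": {"token": "ann-3", "instance_token": "car-7", "prev": "ann-2", "next": ""},
}
instance = {
    "token": "car-7",
    "nbr_annotations": 3,
    "first_annotation_token": "ann-1",
    "last_annotation_token": "ann-3",
}

chain = annotation_chain(instance, annotations)
print([a["token"] for a in chain])
```

Checking that the chain length equals nbr_annotations and that it ends at last_annotation_token is a cheap consistency test when building your own dataset in this format.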

lidarseg: maps the nuScenes-lidarseg annotations to the sample_data records of the lidar point clouds that belong to keyframes.

lidarseg {
   "token": <str> -- Unique record identifier.
   "filename": <str> -- Name of the .bin file containing the lidar labels, stored in binary format as an array of uint8.
   "sample_data_token": <str> -- Foreign key. Sample_data corresponding to the annotated lidar pointcloud with is_key_frame=True.
}
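
Reading such a label file can be sketched with the standard library alone, assuming the file is a flat sequence of uint8 values with one label per lidar point. The file written below is a made-up stand-in, not real nuScenes data, and the label values are arbitrary. With NumPy installed, np.fromfile(path, dtype=np.uint8) reads the same bytes into an array.

```python
import os
import tempfile
from collections import Counter
from typing import List


def load_lidarseg_labels(path: str) -> List[int]:
    """Read a .bin label file in which every byte is one uint8 class index."""
    with open(path, "rb") as handle:
        return list(handle.read())


# Write a tiny stand-in label file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_lidarseg.bin")
    with open(demo_path, "wb") as handle:
        handle.write(bytes([17, 17, 24, 0, 24, 17]))
    labels = load_lidarseg_labels(demo_path)

# Count how many points carry each class index.
points_per_class = Counter(labels)
print(labels, dict(points_per_class))
```

Each value is meant to be matched against the index field of the category table.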

log: information about the log file from which the data was extracted.

log {
   "token": <str> -- Unique record identifier.
   "logfile": <str> -- Log file name.
   "vehicle": <str> -- Vehicle name.
   "date_captured": <str> -- Date (YYYY-MM-DD).
   "location": <str> -- Area where log was captured, e.g. singapore-onenorth.
}

map: map data, stored as binary semantic masks seen from a top-down view.

map {
   "token": <str> -- Unique record identifier.
   "log_tokens": <str> [n] -- Foreign keys.
   "category": <str> -- Map category, currently only semantic_prior for drivable surface and sidewalk.
   "filename": <str> -- Relative path to the file with the map mask.
}

sample: an annotated keyframe, taken every 0.5 s (2 Hz). The data in a sample is collected at approximately the same timestamp, as part of a single lidar sweep.

sample {
   "token": <str> -- Unique record identifier.
   "timestamp": <int> -- Unix time stamp.
   "scene_token": <str> -- Foreign key pointing to the scene.
   "next": <str> -- Foreign key. Sample that follows this in time. Empty if end of scene.
   "prev": <str> -- Foreign key. Sample that precedes this in time. Empty if start of scene.
}
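
The 0.5 s spacing can be checked from the timestamp field. The sketch below assumes timestamps are Unix time in microseconds, so differences are divided by 1e6 to get seconds. The sample records are made up and the function name keyframe_gaps_seconds is hypothetical.

```python
from typing import Dict, List

# Made-up samples; timestamps are assumed to be in microseconds.
samples = [
    {"token": "sample-2", "scene_token": "scene-a", "timestamp": 1532402928147847},
    {"token": "sample-1", "scene_token": "scene-a", "timestamp": 1532402927647951},
    {"token": "sample-3", "scene_token": "scene-a", "timestamp": 1532402928698048},
    {"token": "sample-9", "scene_token": "scene-b", "timestamp": 1532402999000000},
]


def keyframe_gaps_seconds(records: List[Dict], scene_token: str) -> List[float]:
    """Time between consecutive keyframes of one scene, in seconds."""
    stamps = sorted(r["timestamp"] for r in records if r["scene_token"] == scene_token)
    return [(later - earlier) / 1e6 for earlier, later in zip(stamps, stamps[1:])]


gaps = keyframe_gaps_seconds(samples, "scene-a")
print(gaps)
```

Gaps close to, but not exactly, 0.5 s are expected, because a keyframe is tied to an actual sensor capture rather than to an ideal clock tick.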

sample_annotation: a 3D bounding box describing the position, orientation, and related information of an object seen in a sample. All location data is given in the global coordinate system.

sample_annotation {
   "token": <str> -- Unique record identifier.
   "sample_token": <str> -- Foreign key. The sample this annotation comes from.
   "instance_token": <str> -- Foreign key. The instance this annotation belongs to. One instance can have many annotations.
   "attribute_tokens": <str> [n] -- Foreign keys. Attributes of the object in this annotation. An object's attributes keep changing over time, so they are managed here rather than on the instance.
   "visibility_token": <str> -- Foreign key. Visibility of the object, which also changes over time.
   "translation": <float> [3] -- Center coordinates of the bounding box.
   "size": <float> [3] -- Size of the bounding box.
   "rotation": <float> [4] -- Orientation of the bounding box as a quaternion.
   "num_lidar_pts": <int> -- Number of lidar points inside the box during one lidar sweep.
   "num_radar_pts": <int> -- Number of radar points in this box. Points are counted during the radar sweep identified with this sample. This number is summed across all radar sensors without any invalid point filtering.
   "next": <str> -- Foreign key. The next sample_annotation of the same object instance.
   "prev": <str> -- Foreign key. Sample annotation from the same object instance that precedes this in time. Empty if this is the first annotation for this object.
}

sample_data: data returned by a sensor, e.g., a lidar point cloud or an image. A sample_data record with is_key_frame = true is very close in time to the sample it points to; a record with is_key_frame = false points to the sample nearest to it in time.

sample_data {
   "token": <str> -- Unique record identifier.
   "sample_token": <str> -- Foreign key. The sample this sample_data is associated with.
   "ego_pose_token": <str> -- Foreign key.
   "calibrated_sensor_token": <str> -- Foreign key.
   "filename": <str> -- Relative path to data-blob on disk.
   "fileformat": <str> -- Data file format.
   # The next two fields apply only when the data is an image.
   "width": <int> -- If the sample data is an image, this is the image width in pixels.
   "height": <int> -- If the sample data is an image, this is the image height in pixels.
   "timestamp": <int> -- Unix time stamp.
   "is_key_frame": <bool> -- True if sample_data is part of key_frame, else False.
   "next": <str> -- Foreign key. Sample data from the same sensor that follows this in time. Empty if end of scene.
   "prev": <str> -- Foreign key. Sample data from the same sensor that precedes this in time. Empty if start of scene.
}
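
A sample_data record does not name its sensor directly; the channel is reached through calibrated_sensor and then sensor. The sketch below follows that foreign-key chain and counts keyframes versus non-keyframes per channel. All records and short tokens are made up, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Made-up tables indexed by token; only the fields used here are included.
sensors = {
    "sensor-cam": {"token": "sensor-cam", "channel": "CAM_FRONT", "modality": "camera"},
    "sensor-lidar": {"token": "sensor-lidar", "channel": "LIDAR_TOP", "modality": "lidar"},
}
calibrated_sensors = {
    "cs-cam": {"token": "cs-cam", "sensor_token": "sensor-cam"},
    "cs-lidar": {"token": "cs-lidar", "sensor_token": "sensor-lidar"},
}
sample_data = [
    {"token": "sd-1", "calibrated_sensor_token": "cs-cam", "is_key_frame": True},
    {"token": "sd-2", "calibrated_sensor_token": "cs-cam", "is_key_frame": False},
    {"token": "sd-3", "calibrated_sensor_token": "cs-lidar", "is_key_frame": True},
    {"token": "sd-4", "calibrated_sensor_token": "cs-lidar", "is_key_frame": False},
    {"token": "sd-5", "calibrated_sensor_token": "cs-lidar", "is_key_frame": False},
]


def channel_of(record: Dict) -> str:
    """Follow sample_data -> calibrated_sensor -> sensor to get the channel name."""
    calibrated = calibrated_sensors[record["calibrated_sensor_token"]]
    return sensors[calibrated["sensor_token"]]["channel"]


def count_frames(records: List[Dict]) -> Dict[str, Dict[str, int]]:
    """Count keyframes and non-keyframes for every sensor channel."""
    counts = defaultdict(lambda: {"key_frames": 0, "non_key_frames": 0})
    for record in records:
        kind = "key_frames" if record["is_key_frame"] else "non_key_frames"
        counts[channel_of(record)][kind] += 1
    return dict(counts)


frame_counts = count_frames(sample_data)
print(frame_counts)
```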

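scene: a 20 s sequence of consecutive frames extracted from a log file. Several scenes can come from the same log. Instance identities are not preserved across scenes.

scene {
   "token": <str> -- Unique record identifier.
   "name": <str> -- Short string identifier.
   "description": <str> -- Descriptive text, e.g., that a vehicle is keeping to the right on a certain road.
   "log_token": <str> -- Foreign key. Points to the log this scene was extracted from.
   "nbr_samples": <int> -- Number of samples in the scene.
   "first_sample_token": <str> -- Foreign key. Points to the first sample in scene.
   "last_sample_token": <str> -- Foreign key. Points to the last sample in scene.
}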

sensor: describes a sensor type.

sensor {
   "token": <str> -- Unique record identifier.
   "channel": <str> -- Sensor channel name.
   "modality": <str> {camera, lidar, radar} -- Sensor modality. Supports category(ies) in brackets.
}

visibility: the visibility of an instance.

visibility {
   "token": <str> -- Unique record identifier.
   "level": <str> -- Visibility level.
   "description": <str> -- Description of visibility level.
}

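1.3 Introduction to data annotation

The authors label the data with the nuScenes annotator.
After collecting the data, the authors sample keyframes from the recordings at 2 Hz and have them annotated with Scale, which yields highly accurate annotations. Every object in the dataset receives a semantic category, and in every frame of every scene where it appears it also receives a 3D bounding box and attributes. Compared with 2D datasets, this makes it possible to infer an object's orientation and angle more precisely.
For the lidar point clouds, the authors give every lidar point a semantic label: besides the 23 foreground classes, there are 9 background classes.
This is only an overview. For the exact annotation procedure, see the annotation details published by the authors. (To be updated.)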

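1.4 Introduction to the devkit

I am short on time and have not written this part yet; I will add it later. For now, here is a link to someone else's write-up (click here).
Finally, I am new to multi-sensor fusion. After several days of reading the literature I could not find a good translated reference for this dataset, so I tried translating and paraphrasing it myself. The goal is to share what I have learned with everyone. This post surely contains some mistakes, so please point them out.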
