Author: Jhong, Zih-Cing (鍾子清)
Thesis title: Object detection in nighttime urban traffic videos (夜間城鎮交通影片之物件偵測)
Advisor: Liu, Jinn-Liang (劉晉良)
Oral defense committee: 陳仁純; 陳人豪
Degree: Master
Department: College of Science, Institute of Computational and Modeling Science
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar)
Language: Chinese
Number of pages: 62
Keywords (Chinese): 物件偵測, YOLOv4, 深度學習, 夜間, 城鎮, 資料增強, Mosaic, Mixup
Keywords (English): Object detection, YOLOv4, Deep learning, Nighttime, Urban, Data augmentation, Mosaic, Mixup
    In this thesis, we study object detection in a vehicle's forward view, focusing on real nighttime driving scenes on urban roads in Taiwan. Starting from the YOLOv4 [1] neural network and its pre-trained weights, we modified the original YOLOv4 to develop our model, YOLOZC, which outputs the usual detection results: object class, bounding box, confidence level, and average object detection accuracy.
    We split real road-scene videos into individual frames and manually annotated the four most common object classes: people, cars, motorcycles, and traffic lights. We then applied our own custom Mosaic and Mixup algorithms for data augmentation to increase data diversity and model flexibility.
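    The thesis's custom Mixup algorithm is not described in this record, so the following is only a minimal sketch of the common detection variant of mixup [3]: two same-sized frames are blended with a weight `lam`, and the bounding-box labels of both frames are kept. The function name `mixup_detection`, the box layout, the class ids, and the Beta(1.5, 1.5) sampling are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np


def mixup_detection(img_a, boxes_a, img_b, boxes_b, lam):
    """Blend two same-sized images and keep the boxes of both.

    img_*   : float arrays of shape (H, W, 3)
    boxes_* : arrays of shape (N, 5) holding
              [x_min, y_min, x_max, y_max, class_id]  (assumed layout)
    lam     : blend weight in [0, 1] applied to img_a
    """
    if img_a.shape != img_b.shape:
        raise ValueError("images must share the same shape")
    # Pixel-wise convex combination of the two frames.
    mixed = lam * img_a + (1.0 - lam) * img_b
    # Blending does not move objects, so the labels are simply concatenated.
    boxes = np.concatenate([boxes_a, boxes_b], axis=0)
    return mixed, boxes


rng = np.random.default_rng(0)
lam = rng.beta(1.5, 1.5)  # assumed distribution, not from the thesis

# Tiny stand-in frames: one all black, one all white.
img_a = np.zeros((4, 4, 3))
img_b = np.ones((4, 4, 3))
boxes_a = np.array([[0, 0, 2, 2, 1]], dtype=float)  # hypothetical class id 1
boxes_b = np.array([[1, 1, 3, 3, 3]], dtype=float)  # hypothetical class id 3

mixed, boxes = mixup_detection(img_a, boxes_a, img_b, boxes_b, lam)
```

    Keeping every box from both source frames is one simple labeling choice; a custom variant might instead weight or filter the labels according to `lam`.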
    To keep experimental conditions broadly consistent, we used the YOLOv4 anchor settings in all experiments. We compared the performance of the original YOLOv4 with that of our model, YOLOZC. The results show that improving the data and labels yields higher accuracy in nighttime urban object recognition, and that the data augmentation algorithms improve it further. We also compared the models on video detection, which makes it easy to see where YOLOZC improves accuracy.
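    Detection accuracy is commonly scored by matching predicted boxes to hand-labeled boxes through intersection over union (IoU). The evaluation protocol used in the thesis is not stated in this record, so the sketch below is only an illustration under assumptions: the IoU threshold of 0.5, the dictionary layout, and the helper names `iou` and `count_true_positives` are all hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_true_positives(predictions, ground_truths, iou_threshold=0.5):
    """Greedily match predictions, highest confidence first, to unused labels."""
    matched = set()
    true_positives = 0
    for pred in sorted(predictions, key=lambda p: p["score"], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched or gt["label"] != pred["label"]:
                continue
            overlap = iou(pred["box"], gt["box"])
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    return true_positives


ground_truths = [{"label": "car", "box": [0, 0, 10, 10]}]
predictions = [
    {"label": "car", "box": [0, 0, 10, 8], "score": 0.9},      # overlaps the label
    {"label": "car", "box": [20, 20, 30, 30], "score": 0.6},   # overlaps nothing
]
tp = count_true_positives(predictions, ground_truths)
precision = tp / len(predictions)
```

    Here the first prediction has an IoU of 0.8 with the labeled car and counts as a true positive, while the second matches nothing, giving a precision of 0.5 for this frame. Average precision extends this idea by sweeping the confidence threshold and averaging precision over recall levels.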
    In conclusion, by collecting data and training on it, we improved object detection accuracy in nighttime road traffic videos and made the model better suited to traffic conditions in Taiwan.

    Table of Contents
    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Related Work
    Chapter 3 YOLOZC
        3-1 Architecture
        3-2 Backbone
        3-3 Neck
            3-3-1 SPP-Block
            3-3-2 PANet
            3-3-3 Overall Integration
        3-4 Head
        3-5 Post-processing
            3-5-1 Bounding Box Decoding
            3-5-2 Threshold Filtering
            3-5-3 Non-Maximum Suppression (NMS)
            3-5-4 Bounding Box Aggregation
        3-6 Loss Function
    Chapter 4 Experimental Setup
        4-1 Data Preprocessing Steps
            4-1-1 Data Labels and Structure
            4-1-2 Data Split and Model Definition
        4-2 Data Augmentation Steps
            4-2-1 Mixup Algorithm
            4-2-2 Mosaic Algorithm
        4-3 Training Steps
        4-4 Other Considerations
            4-4-1 Anchor Boxes
            4-4-2 Matching the Validation Dataset's Data Format to the Original YOLOv4
            4-4-3 Convergence
    Chapter 5 Experimental Results
        5-1 Model Output in Tables
        5-2 Model Output in Images
    Chapter 6 Conclusion
    References
    Appendix

    References

    [1] Bochkovskiy, Alexey; Wang, Chien-Yao; Liao, Hong-Yuan Mark, “YOLOv4: Optimal Speed and Accuracy of Object Detection,” arXiv preprint, 2020.
    [2] 李碧寒, “Instance Segmentation in Urban Nighttime Traffic Videos” (都市夜間交通影片之實例分割), Master's thesis, National Tsing Hua University, 2021.
    [3] Zhang, Hongyi; Cisse, Moustapha; Dauphin, Yann N.; Lopez-Paz, David, “mixup: Beyond Empirical Risk Minimization,” arXiv preprint, 2017.
    [4] Wang, Chien-Yao; Bochkovskiy, Alexey; Liao, Hong-Yuan Mark, “CSPNet: A New Backbone that can Enhance Learning Capability of CNN,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2020.
    [5] Lin, Tsung-Yi; Dollár, Piotr; Girshick, Ross; He, Kaiming; Hariharan, Bharath; Belongie, Serge, “Feature Pyramid Networks for Object Detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
    [6] He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian, “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1904-1916, 2015.
    [7] Liu, Shu; Qi, Lu; Qin, Haifang; Shi, Jianping; Jia, Jiaya, “Path Aggregation Network for Instance Segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
    [8] Redmon, Joseph; Farhadi, Ali, “YOLOv3: An Incremental Improvement,” arXiv preprint, 2018.
    [9] Zheng, Zhaohui; Wang, Ping; Liu, Wei; Li, Jinze; Ye, Rongguang; Ren, Dongwei, “Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
    [10] Neubeck, Andreas; Van Gool, Luc, “Efficient Non-Maximum Suppression,” in Proceedings of the 18th International Conference on Pattern Recognition (ICPR), 2006.
    [11] Wang, Hao; Yao, Qiang, “Multi-Scale Object Detection Algorithm Based on Feature Aggregation,” in Proceedings of the IEEE International Conference on Computer and Information Technology (CIT), 2019.
    [12] MacQueen, J., “Some Methods for Classification and Analysis of Multivariate Observations,” in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967.
