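Graduate student: 陳昱甫 (Chen, Yu Fu)
Thesis title: 用於人體姿勢估測之簡化回歸方法 (Simplified Regression for Human Pose Estimation)
Advisor: 陳煥宗 (Chen, Hwann Tzong)
Committee members: 賴尚宏 (Lai, Shang Hong); 劉庭錄 (Liu, Tyng Luh)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: English
Number of pages: 27
Keywords (Chinese): 捲積類神經網路, 人體姿勢估測
Keywords (English): Convolutional Neural Network, Human Pose Estimation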
  • We present a two-stage deep convolutional neural network for human pose estimation. In the first stage, the network extracts features directly from the input image and combines all of them to produce a compact yet effective prediction of the keypoint locations, instead of producing one heatmap per keypoint. We then feed the input image, together with synthetic heatmaps derived from the first-stage output, into the second stage to obtain a refined pose estimate. We evaluate our method on two datasets, FLIC and LSP. Our method achieves state-of-the-art performance on the FLIC dataset.
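
The abstract does not say how the synthetic heatmaps are built from the first-stage keypoints. As a minimal sketch of one plausible construction, the code below renders a 2D Gaussian centred on each predicted keypoint and stacks the resulting maps with the image channels. The Gaussian form, the `sigma` value, the channel-first layout, and the names `gaussian_heatmap` and `second_stage_input` are all assumptions made for illustration and are not taken from the thesis.

```python
import math


def gaussian_heatmap(x, y, height, width, sigma=2.0):
    """Return a height x width grid (list of rows) peaking at pixel (x, y).

    Assumption: an isotropic Gaussian is used; the thesis may differ.
    """
    return [
        [math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2.0 * sigma ** 2))
         for col in range(width)]
        for row in range(height)
    ]


def second_stage_input(image_channels, keypoints, sigma=2.0):
    """Stack the image channels with one synthetic heatmap per keypoint.

    image_channels: list of C grids, each height x width (channel-first).
    keypoints: list of (x, y) pixel coordinates from the first stage.
    Returns a list of C + K grids.
    """
    height = len(image_channels[0])
    width = len(image_channels[0][0])
    heatmaps = [gaussian_heatmap(x, y, height, width, sigma)
                for x, y in keypoints]
    return list(image_channels) + heatmaps


# Hypothetical example: a blank 3-channel 64x48 image and two keypoints.
height, width = 64, 48
image = [[[0.0] * width for _ in range(height)] for _ in range(3)]
stage1_keypoints = [(10, 20), (30, 40)]  # (x, y), illustrative values only
stacked = second_stage_input(image, stage1_keypoints)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))  # prints: 5 64 48
```

Under these assumptions, the second-stage network would receive 3 + K input channels, where K is the number of keypoints.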

    1 Introduction
    2 Related Work
      2.1 Human Pose Estimation
      2.2 YOLO
    3 Simplified Regression for Human Pose Estimation
      3.1 The First Stage
        3.1.1 Network
        3.1.2 Training
        3.1.3 Inference
      3.2 The Second Stage
        3.2.1 Network
        3.2.2 Training
        3.2.3 Inference
    4 Experiments
      4.1 Dataset
      4.2 Evaluation Metrics
      4.3 Results
        4.3.1 FLIC
        4.3.2 LSP
    5 Conclusion

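    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)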
