
Author: Tsai, Bo-Lin (蔡博鄰)
Title: Vision-based Sign Language Recognition System (基於視覺的手語辨識系統)
Advisor: Huang, Chung-Lin (黃仲陵)
Committee Members:
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2009
Graduation Academic Year: 97
Language: English
Number of Pages: 69
Chinese Keywords: 手語、手勢
English Keywords: sign language, hand gesture
  • Abstract

    Taiwan Sign Language is one of the basic communication tools of the hearing-impaired, so a sign language recognition system serving as a communication interface would greatly help hearing people communicate with the hearing-impaired. In this thesis, we recognize Taiwan Sign Language using a vision-based approach. According to linguistic studies of articulation, most gestures in sign language are composed of three phonemes: the hand posture, the hand location, and the hand movement. We segment a sign-language video into a sequence of Hold and Movement segments: each Hold segment is transformed and analyzed into a hand-posture phoneme and a hand-location phoneme, while each Movement segment is analyzed and represented as a hand-movement phoneme. These phonemes serve as the building blocks of sign language, and every sign is composed of them; the advantage of this approach is that new vocabulary can easily be added later, giving the system good extensibility. For each gesture, we train a hidden Markov model (HMM) on its corresponding phoneme sequence. For recognition, the input phoneme sequence is scored against every trained HMM, and the model with the highest likelihood is selected and compared against a threshold. If the likelihood exceeds the threshold, the corresponding gesture is accepted as the recognition result; otherwise the input is treated as meaningless. For sentence recognition, we also apply grammar-based correction to fix errors and improve the system's performance.
    We selected twenty Taiwan Sign Language sentences for our experiments, and each subject was recorded performing these twenty sentences to form our test samples. Averaged over the tests, our system achieves a sign-word recognition rate of 94% and a sentence recognition rate of 83.3%.
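    To make the Hold/Movement segmentation concrete, here is a minimal Python sketch. It assumes hand tracking has already produced a per-frame centroid; the function name and the speed threshold are illustrative assumptions, not details from the thesis.

```python
import numpy as np

def segment_movement_hold(centroids, speed_thresh=2.0):
    """Split a per-frame hand-centroid track into Hold ('H') and
    Movement ('M') segments.

    centroids:    (N, 2) array of tracked hand positions, one per frame.
    speed_thresh: pixels/frame below which a frame counts as a Hold
                  (illustrative value, not taken from the thesis).

    Returns a list of (label, start_frame, end_frame) tuples.
    """
    centroids = np.asarray(centroids, dtype=float)
    # Per-frame speed of the hand centroid.
    speeds = np.linalg.norm(np.diff(centroids, axis=0), axis=1)
    if speeds.size == 0:
        return []
    labels = ["H" if s < speed_thresh else "M" for s in speeds]

    # Merge consecutive frames with the same label into one segment.
    segments, start = [], 0
    for i in range(1, len(labels)):
        if labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    segments.append((labels[start], start, len(labels)))
    return segments
```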


    We present a vision-based sign language recognition system that works efficiently to recognize Taiwan Sign Language. Sign language can be divided into a sequence of sign-words, and each sign-word consists of three different phonemes: the hand posture, the location of the hand, and the hand movement. The number of phonemes in a sign language is limited; however, an unlimited number of words can be built from them. We use the phonemes as the basic units to represent a sign-word, and this strategy has the advantage of allowing a further increase in vocabulary size. We segment each sign-word into a sequence of Hold and Movement segments. The Hold segment is analyzed and represented in terms of the hand-posture and hand-location phonemes. The Movement segment is analyzed and converted to the hand-movement phoneme. A hand gesture is thus composed of a sequence of phonemes. To recognize a dynamic hand gesture, we select the most probable HMM, which represents the specific gesture. In the experiments, we chose twenty Taiwan Sign Language (TSL) sentences for our system to recognize and collected sign-language videos made by different signers. The experimental results demonstrate that our system achieves a sign-word recognition accuracy of 94% and a sentence recognition accuracy of 83.3%.
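    The recognition step described in both abstracts, scoring the input phoneme sequence against every trained HMM and rejecting low-likelihood inputs, can be sketched as below. This is a minimal illustration assuming discrete phoneme observations and already-trained model parameters (pi, A, B); the function names and the rejection threshold are hypothetical.

```python
import numpy as np

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete HMM.

    obs: sequence of phoneme indices observed for one sign-word.
    pi:  (S,) initial state distribution.
    A:   (S, S) state-transition matrix.
    B:   (S, V) emission matrix over the phoneme alphabet.
    """
    alpha = pi * B[:, obs[0]]
    log_p = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        scale = alpha.sum()          # rescale to avoid numeric underflow
        log_p += np.log(scale)
        alpha /= scale
    return log_p

def recognize(obs, models, reject_logp=-50.0):
    """Score obs against every trained HMM, keep the best-scoring model,
    and reject the input as meaningless if even the best score falls
    below the threshold (the threshold value here is an illustrative
    assumption, not a number from the thesis)."""
    best_name, best_lp = None, -np.inf
    for name, (pi, A, B) in models.items():
        lp = log_likelihood(obs, pi, A, B)
        if lp > best_lp:
            best_name, best_lp = name, lp
    return best_name if best_lp > reject_logp else None
```

    In practice the rejection threshold would likely be tuned per model, for example from the likelihoods of held-out training samples, rather than shared across all gestures.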

    Abstract
    Contents
    List of Figures
    List of Tables
    CHAPTER 1 INTRODUCTION
      1.1 Motivation
      1.2 Related Works
      1.3 System Overview
      1.4 Organization
    CHAPTER 2 PHONEME SEGMENTATION
      2.1 Phoneme Introduction
      2.2 Stokoe’s System and Movement-Hold Model
      2.3 Phoneme Segmentation
    CHAPTER 3 HOLD PHONEME ANALYSIS
      3.1 Hand Tracking
      3.2 Hand Segmentation
      3.3 Hand Position
      3.4 Hybrid-Feature Vector Composition
        3.4.1 Fourier Descriptors
        3.4.2 Seven Hu Moments
        3.4.3 Orientation of Major Axis
        3.4.4 Principal Component Analysis
      3.5 Hand Posture Recognition
        3.5.1 Basic Theory of Support Vector Machines
        3.5.2 Multi-class Recognition
        3.5.3 The Data Collection of Hand Posture
      3.6 The Results of Hold Phoneme
    CHAPTER 4 MOVEMENT PHONEME ANALYSIS
      4.1 Orientation of Hand Trajectory Quantization
      4.2 Hand Movement Recognition
    CHAPTER 5 DYNAMIC HAND GESTURE RECOGNITION
      5.1 Dynamic Hand Gesture
      5.2 Dynamic Hand Gesture Recognition
      5.3 Dynamic Hand Gesture Recognition Result
    CHAPTER 6 SIGN LANGUAGE RECOGNITION
      6.1 Sign Language Recognition
      6.2 Correction using Grammar
    CHAPTER 7 EXPERIMENTAL RESULTS
      7.1 The Result of Sign Language Recognition
      7.2 Discussion
    CHAPTER 8 CONCLUSION AND FUTURE WORKS
    REFERENCES


    Full text: not authorized for public release (both on-campus and off-campus networks).
