Author: 王書凡 (Wang, Shu-Fan)
Title: Estimating 3D Shape, Expression Deformation, Albedo, and Illumination from a Single Face Image (從單張人臉影像中估測三維外形、表情變化、照度與光源)
Advisor: 賴尚宏 (Lai, Shang-Hong)
Degree: Doctorate
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication: 2010
Graduation academic year: 99
Language: English
Pages: 93
Keywords: 3D Reconstruction, Face Reconstruction, Manifold Learning
Three-dimensional face models are a popular topic in many applications, such as facial animation, face recognition, and virtual avatar conferencing. Accurately modeling facial geometry, albedo, and illumination variation has therefore become increasingly important in both computer vision and graphics. In the literature, modeling a 3D face from a single image relies mainly on prior training information. However, accurately reconstructing facial geometry from a single image containing an expression remains difficult, because expressions deform the 3D facial surface in complex ways. The main challenge is that the neutral facial geometry and the expression deformation superimposed on it must be analyzed simultaneously, which makes the problem ill-posed. Illumination variation, which depends on facial albedo and surface orientation, raises the difficulty further. In this thesis, we develop a complete system that reconstructs 3D facial geometry, expression, albedo, and illumination. The system includes a training stage that automatically establishes point-to-point correspondences between face models and builds models of the neutral geometry and the expression deformation. To determine correspondences between 3D surfaces, we first parameterize the 3D models onto a 2D plane, turning the task into a 2D point-matching problem. The neutral face shape is represented by linear principal component analysis (PCA), while the expression deformation is described by nonlinear manifold analysis. We also propose a 3D face reconstruction algorithm: based on the trained models, it combines linear and nonlinear subspace representations to describe the neutral face geometry and a probabilistic manifold of 3D expression deformations. By integrating facial geometry, expression deformation, albedo, and illumination, the problem becomes well constrained and yields plausible solutions. Experiments validate the proposed system, and its results extend to applications such as image synthesis, illumination estimation, expression removal and transfer, and facial feature caricature.
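The linear half of this representation is a PCA morphable model: a neutral face is the mean shape plus a linear combination of basis shapes, and fitting reduces to recovering the coefficients. The sketch below is a toy illustration of that idea only, with made-up dimensions and random data; the thesis fits the coefficients jointly with expression, albedo, and illumination terms, which this does not attempt.

```python
import numpy as np

# Toy PCA morphable model: shape = mean + basis @ alpha.
# Dimensions are illustrative; a real model has thousands of vertices.
rng = np.random.default_rng(0)
n_vertices, n_basis = 50, 5
mean_shape = rng.normal(size=3 * n_vertices)               # stacked (x, y, z)
# Orthonormal basis (stand-in for PCA eigenvectors of training shapes)
basis = np.linalg.qr(rng.normal(size=(3 * n_vertices, n_basis)))[0]

def synthesize(alpha):
    """Neutral shape as mean plus linear combination of basis shapes."""
    return mean_shape + basis @ alpha

def fit(observed):
    """Recover the PCA coefficients of an observed shape by least squares."""
    return np.linalg.lstsq(basis, observed - mean_shape, rcond=None)[0]

true_alpha = rng.normal(size=n_basis)
alpha_hat = fit(synthesize(true_alpha))
print(np.allclose(alpha_hat, true_alpha))  # True: coefficients recovered
```

Because the synthetic observation lies exactly in the span of the basis, least squares recovers the coefficients exactly; with real image evidence the fit is regularized by the coefficient prior instead.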
Three-dimensional human face modeling is a very popular topic with many applications, such as facial animation, face recognition, and model-based facial video communication. Therefore, how to model the facial geometry, texture intensity, and illumination variation is important in computer vision and graphics. Previous works on 3D head modeling from a single face image utilized prior information on 3D head models. However, it is difficult to accurately reconstruct the 3D face model from a single face image with expression, since the facial expression deforms the 3D face model in a complex manner. The main challenge is the coupling of the neutral 3D face model and the 3D deformation due to expression, which makes the estimation from a single image ill-posed. The unknown illumination condition makes the problem more difficult still. In this thesis, we focus on developing a 3D face model reconstruction system, including surface registration and training of 3D face models with expressional deformations, as well as the estimation of the 3D neutral shape and the 3D expressional deformation from a single face image. The proposed reconstruction algorithm integrates a linear subspace representation for a prior 3D neutral morphable model with a probabilistic, manifold-based nonlinear representation of 3D expressional deformation. We incorporate face geometry, expression deformation, texture, and illumination information into the problem so that it is well constrained. The reconstructed 3D face models can be further extended and applied to many real-world applications.
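The nonlinear half of the representation is a low-dimensional manifold of expression deformations. As a hedged illustration of the underlying idea, the sketch below implements a minimal locally linear embedding (one of the standard manifold-learning methods cited in this line of work) on toy data; the thesis builds a probabilistic manifold model on real 3D deformation data, which this simplified, self-contained version does not reproduce.

```python
import numpy as np

def lle(X, n_neighbors=6, n_components=1, reg=1e-3):
    """Minimal locally linear embedding: reconstruct each point from its
    neighbors, then find low-dimensional coordinates preserving the weights."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(dists)[1:n_neighbors + 1]   # skip the point itself
        Z = X[nbrs] - X[i]                            # center the neighbors
        G = Z @ Z.T                                   # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()                      # weights sum to one
    # Embedding: bottom eigenvectors of (I - W)^T (I - W), dropping the
    # constant eigenvector associated with the (near-)zero eigenvalue.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)
    return vecs[:, 1:1 + n_components]

t = np.linspace(0.0, 3.0, 40)
X = np.c_[np.cos(t), np.sin(t)]   # toy data: points along a 1D arc in 2D
Y = lle(X)
print(Y.shape)                     # (40, 1): a 1D embedding of the arc
```

In the face setting, each sample would be a high-dimensional 3D deformation vector rather than a 2D point, and the learned coordinates parameterize the expression manifold on which the reconstruction is optimized.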