Author: 王書凡 (Wang, Shu-Fan)
Title: Estimating 3D Shape, Expression Deformation, Albedo, and Illumination from a Single Face Image (從單張人臉影像中估測三維外形、表情變化、照度與光源)
Advisor: 賴尚宏 (Lai, Shang-Hong)
Degree: Doctorate
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication: 2010
Graduation academic year: 99
Language: English
Pages: 93
Keywords: 3D Reconstruction, Face Reconstruction, Manifold Learning
Three-dimensional face models are a popular topic in many applications, such as facial animation, face recognition, and virtual avatar conferencing. Accurately modeling facial geometry, albedo, and illumination variation has therefore become increasingly important in both computer vision and graphics. In the literature, modeling a 3D face from a single image relies mainly on prior training information. However, accurately reconstructing facial geometry from a single image containing an expression remains difficult, because expressions deform the 3D facial surface in complex ways. The main challenge is that the neutral facial geometry and the expression deformation superimposed on it must be analyzed simultaneously, which makes the problem ill-posed. Illumination variation, which depends on facial albedo and surface orientation, raises the difficulty further. In this thesis, we develop a complete system that reconstructs 3D facial geometry, expression, albedo, and illumination. The system includes a training stage that automatically establishes point-to-point correspondences between face models and builds models of the neutral geometry and the expression deformation. To determine correspondences between 3D surfaces, we first parameterize the 3D models onto a 2D plane, turning the task into a 2D point-matching problem. The neutral face shape is represented by linear principal component analysis (PCA), while the expression deformation is described by nonlinear manifold analysis. We also propose a 3D face reconstruction algorithm: based on the trained models, it combines linear and nonlinear subspace representations to describe the neutral face geometry and a probabilistic manifold of 3D expression deformations. By integrating facial geometry, expression deformation, albedo, and illumination, the problem becomes well constrained and yields plausible solutions. Experiments validate the proposed system, and its results extend to applications such as image synthesis, illumination estimation, expression removal and transfer, and facial feature caricature.
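The linear half of this representation is a PCA morphable model: a neutral face is the mean shape plus a linear combination of basis shapes, and fitting reduces to recovering the coefficients. The sketch below is a toy illustration of that idea only, with made-up dimensions and random data; the thesis fits the coefficients jointly with expression, albedo, and illumination terms, which this does not attempt.

```python
import numpy as np

# Toy PCA morphable model: shape = mean + basis @ alpha.
# Dimensions are illustrative; a real model has thousands of vertices.
rng = np.random.default_rng(0)
n_vertices, n_basis = 50, 5
mean_shape = rng.normal(size=3 * n_vertices)               # stacked (x, y, z)
# Orthonormal basis (stand-in for PCA eigenvectors of training shapes)
basis = np.linalg.qr(rng.normal(size=(3 * n_vertices, n_basis)))[0]

def synthesize(alpha):
    """Neutral shape as mean plus linear combination of basis shapes."""
    return mean_shape + basis @ alpha

def fit(observed):
    """Recover the PCA coefficients of an observed shape by least squares."""
    return np.linalg.lstsq(basis, observed - mean_shape, rcond=None)[0]

true_alpha = rng.normal(size=n_basis)
alpha_hat = fit(synthesize(true_alpha))
print(np.allclose(alpha_hat, true_alpha))  # True: coefficients recovered
```

Because the synthetic observation lies exactly in the span of the basis, least squares recovers the coefficients exactly; with real image evidence the fit is regularized by the coefficient prior instead.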
Three-dimensional human face modeling is a very popular topic with many applications, such as facial animation, face recognition, and model-based facial video communication. Therefore, how to model the facial geometry, texture intensity, and illumination variation is important in computer vision and graphics. Previous works on 3D head modeling from a single face image utilized prior information on 3D head models. However, it is difficult to accurately reconstruct the 3D face model from a single face image with expression, since the facial expression deforms the 3D face model in a complex manner. The main challenge is the coupling of the neutral 3D face model and the 3D deformation due to expression, which makes the estimation from a single image ill-posed. The unknown illumination condition makes the problem more difficult still. In this thesis, we focus on developing a 3D face model reconstruction system, including surface registration and training of 3D face models with expressional deformations, as well as the estimation of the 3D neutral shape and the 3D expressional deformation from a single face image. The proposed reconstruction algorithm integrates a linear subspace representation for a prior 3D neutral morphable model with a probabilistic, manifold-based nonlinear representation of 3D expressional deformation. We incorporate face geometry, expression deformation, texture, and illumination information into the problem so that it is well constrained. The reconstructed 3D face models can be further extended and applied to many real-world applications.
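The nonlinear half of the representation is a low-dimensional manifold of expression deformations. As a hedged illustration of the underlying idea, the sketch below implements a minimal locally linear embedding (one of the standard manifold-learning methods cited in this line of work) on toy data; the thesis builds a probabilistic manifold model on real 3D deformation data, which this simplified, self-contained version does not reproduce.

```python
import numpy as np

def lle(X, n_neighbors=6, n_components=1, reg=1e-3):
    """Minimal locally linear embedding: reconstruct each point from its
    neighbors, then find low-dimensional coordinates preserving the weights."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(dists)[1:n_neighbors + 1]   # skip the point itself
        Z = X[nbrs] - X[i]                            # center the neighbors
        G = Z @ Z.T                                   # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()                      # weights sum to one
    # Embedding: bottom eigenvectors of (I - W)^T (I - W), dropping the
    # constant eigenvector associated with the (near-)zero eigenvalue.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)
    return vecs[:, 1:1 + n_components]

t = np.linspace(0.0, 3.0, 40)
X = np.c_[np.cos(t), np.sin(t)]   # toy data: points along a 1D arc in 2D
Y = lle(X)
print(Y.shape)                     # (40, 1): a 1D embedding of the arc
```

In the face setting, each sample would be a high-dimensional 3D deformation vector rather than a 2D point, and the learned coordinates parameterize the expression manifold on which the reconstruction is optimized.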