
Student: Fan, Shu-Ho (范書賀)
Title: Integrating Dual-Task Strategies and Temporal Information for Aliasing Detection in Computer Rendered Images
Advisors: Chu, Hung-Kuo (朱宏國); Chen, Hwann-Tzong (陳煥宗)
Committee members: Hu, Min-Chun (胡敏君); Yao, Chih-Yuan (姚智原)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2024
Academic year of graduation: 112
Language: English
Pages: 35
Keywords: aliasing detection, multi-task learning, image quality assessment, machine learning
    As rendering technology continues to advance, the complexity of 3D scenes keeps increasing. The pursuit of high-quality visuals in games and interactive media has driven progress in image quality assessment (IQA). Traditional reference-based metrics such as PSNR and SSIM face limitations in practical applications, underscoring the importance of non-reference IQA methods designed specifically for rendered images. Targeting aliasing, a common rendering artifact, this study proposes a novel multi-task learning architecture that simultaneously corrects and predicts aliasing artifacts, improving prediction accuracy without requiring reference images. Our method also integrates temporal information to enhance visual coherence and smoothness. We use an automated labeling pipeline built with Unity to create a stable and unbiased dataset for model training and evaluation. Experimental results show that our method reliably detects aliasing artifacts of varying complexity and achieves state-of-the-art performance. By addressing the specific challenges of rendered-image assessment and leveraging novel learning techniques, this thesis advances non-reference IQA for aliasing in game scenes, ensuring a good balance between high-quality visuals and efficiency.


    As technology advances from simple 2D designs to intricate 3D environments, the demand for high-quality visuals in video games and interactive media necessitates robust image quality assessment (IQA) techniques. Traditional methods like PSNR and SSIM, reliant on reference images, struggle with the unique challenges of 3D rendered content, highlighting the need for specialized non-reference IQA approaches. This paper introduces a novel multi-task learning architecture that corrects and predicts aliasing artifacts simultaneously, enhancing predictive accuracy without reference images. It also incorporates temporal information to improve visual coherence and smoothness. An automated labeling pipeline developed using Unity ensures a stable and unbiased dataset for model training and evaluation. Our experiments demonstrate that this approach reliably detects aliasing across various complexities, achieving state-of-the-art performance. By addressing specific challenges in rendered image assessment and leveraging innovative learning techniques, our work advances IQA for video games and simulations, ensuring high visual quality.
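The dual-task idea in the abstract can be sketched in code: a shared encoder feeds both an aliasing classifier head and a reconstruction head, with temporal information represented by channel-stacking the current and previous frames. This is a minimal illustrative sketch, not the thesis's actual architecture; the class name, layer sizes, input format, and loss weighting are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class DualTaskAliasingNet(nn.Module):
    """Hypothetical dual-task sketch: shared features drive both an
    aliasing score and an anti-aliased reconstruction."""

    def __init__(self, in_channels: int = 6):  # current + previous RGB frame stacked
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Task 1: a single per-image aliasing logit.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
        )
        # Task 2: reconstruct an anti-aliased RGB frame from the same features.
        self.reconstructor = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, x):
        feats = self.encoder(x)
        return self.classifier(feats), self.reconstructor(feats)

def dual_task_loss(score, label, recon, target, w: float = 0.5):
    """Combine the two task losses; the weight w is an arbitrary choice here."""
    cls = nn.functional.binary_cross_entropy_with_logits(score, label)
    rec = nn.functional.l1_loss(recon, target)
    return cls + w * rec

model = DualTaskAliasingNet()
frames = torch.randn(2, 6, 64, 64)  # batch of two stacked frame pairs
score, recon = model(frames)
print(score.shape, recon.shape)     # torch.Size([2, 1]) torch.Size([2, 3, 64, 64])
```

The intuition, as the abstract describes it, is that forcing the shared encoder to also reconstruct a clean frame pushes it toward features that localize aliasing, which in turn improves the classifier's predictions.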

    Abstract (Chinese)
    Abstract
    Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Background
      1.2 Contribution
    2 Related Work
      2.1 Image Quality Analysis
      2.2 Aliasing Reconstruction
      2.3 Image Restoration
    3 Proposed Method
      3.1 Overview
      3.2 Temporal Information Utilization
      3.3 Aliasing Feature Extraction
      3.4 Leveraging Reconstruction for Classifier Improvement
      3.5 Kernel Prediction
      3.6 Implementation Details
        3.6.1 Loss Function
        3.6.2 Model Architecture Details
        3.6.3 Training Details
    4 Experimental Results
      4.1 Dataset and Labeling Pipeline
        4.1.1 Auto-Labeling Pipeline
      4.2 Evaluation and Baseline Comparisons
        4.2.1 Baseline Methods
        4.2.2 Quantitative Performance Metrics
      4.3 Ablation Studies
        4.3.1 Reconstruction
        4.3.2 Benefits of Temporal Information and Reconstruction
      4.4 Performance Evaluation
      4.5 Visual Insights with Grad-CAM
        4.5.1 Challenges of FBCNN in Aliasing Detection
        4.5.2 Insights into Temporal Information and Reconstruction
    5 Conclusion
      5.1 Conclusion
      5.2 Limitations and Future Work

