Graduate student: | 林柏廷 Lin Bo-Ting |
---|---|
Thesis title: | 用於嵌入式硬體故障偵錯之軟體開發方法 A Software Development Methodology for Embedded Hardware Fault Detection |
Advisor: | 黃慶育 |
Oral defense committee: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2005 |
Academic year of graduation: | 93 |
Language: | Chinese |
Number of pages: | 62 |
Chinese keywords: | 嵌入式系統, 軟體方法, RESO, 冗餘, GNU |
English keywords: | Embedded System, Software Methodology, RESO, Redundancy, GNU |
As hardware and process technology advance rapidly, embedded systems are being applied ever more widely, and demand is growing for embedded systems with high reliability and availability. Applications such as image processing and data transmission require large amounts of fault-free computation. Traditionally, hardware correctness is ensured by adding error-detection components or by performing hardware verification, both of which raise development cost considerably. In this thesis, we propose a modified software methodology that detects and reports errors in hardware functional units on-line and automatically while the system is running. The core idea is to duplicate instructions and compare their results: using time redundancy, the results of two similar instructions are compared to determine whether a transient functional-unit error occurred during execution. In addition, the concept of RESO (ReComputation of Shifted Operands) is used to detect permanent errors and increase error coverage. The methodology requires no modification of the original hardware or software. It aims to maintain hardware reliability at run time at the cost of a limited performance degradation, and it saves extra hardware cost in the early stage of debugging. For the implementation, we chose the ARM microprocessor and the GNU toolchain, including the compiler and simulator, as the platform for experiments and verification, and carried out the experiments by modifying the toolchain. We also tune the methodology to the characteristics of the ARM microprocessor to improve performance, and we automate the application and optimization of the methodology so that it can be embedded in applications easily. Compared with experimental results on other hardware test platforms, the error-detection capability is satisfactory and the performance degradation stays within a reasonable range.
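The two checks described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's implementation, which works at the instruction level through a modified GNU toolchain for ARM. It is a minimal Python model under stated assumptions: `FaultyAdder`, `checked_add_duplicate`, and `checked_add_reso` are hypothetical names, the fault model (one stuck-at-0 output bit, or a one-shot bit flip) is chosen only for illustration, and operands are assumed small enough that their sum fits in `WIDTH` bits.

```python
WIDTH = 32                          # assumed operand width
ALU_MASK = (1 << (WIDTH + 1)) - 1   # a 1-bit RESO shift needs one extra ALU bit


class FaultyAdder:
    """Hypothetical adder model with optional injected faults."""

    def __init__(self, stuck_at_zero_bit=None):
        self.stuck_at_zero_bit = stuck_at_zero_bit  # permanent fault, or None
        self.flip_next_bit = None                   # one-shot transient fault

    def add(self, a, b):
        result = (a + b) & ALU_MASK
        if self.stuck_at_zero_bit is not None:
            # Permanent fault: this output bit always reads 0.
            result &= ~(1 << self.stuck_at_zero_bit)
        if self.flip_next_bit is not None:
            # Transient fault: corrupt one bit of this result only.
            result ^= 1 << self.flip_next_bit
            self.flip_next_bit = None
        return result


def checked_add_duplicate(alu, a, b):
    """Time redundancy: run the same operation twice and compare."""
    first = alu.add(a, b)
    second = alu.add(a, b)
    return first, first == second   # (result, check passed)


def checked_add_reso(alu, a, b):
    """RESO: recompute with operands shifted left, shift back, compare."""
    plain = alu.add(a, b)
    shifted = alu.add(a << 1, b << 1) >> 1
    return plain, plain == shifted  # (result, check passed)


if __name__ == "__main__":
    alu = FaultyAdder(stuck_at_zero_bit=2)
    print(checked_add_duplicate(alu, 5, 7))  # (8, True): permanent fault missed
    print(checked_add_reso(alu, 5, 7))       # (8, False): permanent fault caught
```

Plain duplication catches a transient fault because only one of the two runs is corrupted, but a permanent fault corrupts both runs identically. Shifting the operands moves the true result onto different ALU bits, so the stuck bit affects the two computations differently and the comparison fails.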