Graduate student: | 周宣明 Chou, Hsuan Ming |
---|---|
Thesis title: | 考慮多功率模式及匯流排亂序交易的容錯設計最佳化 Optimizations for Error-Tolerant Designs Considering Multi-Power Modes and Out-of-Order Transactions |
Advisor: | 張世杰 Chang, Shih Chieh |
Oral defense committee: | 金仲達 King, Chung Ta; 王廷基 Wang, Ting Chi; 鍾文邦 Jone, Wen Ben; 林泰吉 Lin, Tay Jyi |
Degree: | Doctor |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2015 |
Academic year of graduation: | 103 (ROC calendar) |
Language: | English |
Number of pages: | 86 |
Keywords: | clock skew (時脈偏移), soft error (軟錯誤), bus deadlock (匯流排死結) |
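To meet the demands of low power and high performance, today's designs support multiple power modes and out-of-order bus transactions; however, these features also cause timing errors, soft errors, and deadlock problems. To tolerate errors efficiently and avoid these problems, we propose optimization methods at the gate level and the computer-architecture level.
In the first part, for designs with multiple power modes, we propose a gate-level timing optimization method. We use adjustable delay buffers to build a tunable clock tree so that useful clock skew can be assigned under different power modes. We use linear programming to assign the delays of the adjustable delay buffers under each power mode, and we propose a speedup theorem that accelerates the linear program. Finally, we propose an efficient method for selecting the positions of the adjustable delay buffers.
In the second part, we propose a dual-level soft-error tolerance methodology that trades off performance, power, and reliability for different applications. We propose four flip-flop designs with different tolerance capabilities. Next, we propose a dual-level vulnerability analysis that identifies the flip-flops most likely to cause program errors. Finally, we propose an optimization flow for placing these error-tolerant flip-flops.
In the third part, for designs with out-of-order bus transactions, we propose a deadlock-avoidance mechanism at the computer-architecture level. We provide a novel ID assignment method that guarantees no deadlock will occur. More flexible rules for assigning IDs are also proposed, greatly reducing bus stalls.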
To meet low-power and high-performance requirements, modern designs support advanced features such as multiple power modes and out-of-order transactions. These features, however, can introduce timing errors, soft errors, and deadlocks. To tolerate or avoid these problems efficiently, we propose several gate-level and architecture-level optimization methods.
First, we propose a gate-level timing optimization for designs with multiple power modes. We use Adjustable Delay Buffers (ADBs) to construct a tunable clock tree so that useful skew can be assigned in each power mode. We then assign the ADB delays for each power mode by Linear Programming (LP). A speedup theorem is proposed that greatly reduces the number of inequalities in the LP. We also propose an efficient heuristic for selecting the positions of the ADBs.
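As a rough illustration of useful-skew assignment, the sketch below checks whether clock arrival times exist that satisfy setup and hold constraints at a given clock period, and runs the check once per power mode. It is not the LP formulation of this work: it uses a Bellman-Ford feasibility check on difference constraints, ignores ADB positions and delay ranges, and the function name `skew_schedule`, the two-FF circuit, the mode names, and all delay values are assumptions made for the example.

```python
def skew_schedule(num_ffs, paths, period, setup=0.0, hold=0.0):
    """Return clock arrival times meeting setup/hold, or None if infeasible.

    paths: list of (i, j, d_min, d_max), a combinational path from FF i to
    FF j with minimum and maximum delay (arbitrary time units).
    """
    # A constraint x_u - x_v <= c becomes an edge v -> u with weight c.
    edges = []
    for i, j, d_min, d_max in paths:
        edges.append((j, i, period - d_max - setup))  # setup: t_i - t_j <= T - d_max - setup
        edges.append((i, j, d_min - hold))            # hold:  t_j - t_i <= d_min - hold
    dist = [0.0] * num_ffs  # implicit source at distance 0 from every FF
    for _ in range(num_ffs):
        changed = False
        for v, u, c in edges:
            if dist[v] + c < dist[u]:
                dist[u] = dist[v] + c
                changed = True
        if not changed:
            return dist  # converged: dist is a valid skew schedule
    return None  # still relaxing after num_ffs passes: negative cycle


# Hypothetical two-FF loop whose path delays differ between power modes.
MODES = {
    "high_voltage": (6, [(0, 1, 2, 8), (1, 0, 1, 4)]),
    "low_voltage": (9, [(0, 1, 3, 12), (1, 0, 2, 6)]),
}

for mode, (period, paths) in MODES.items():
    print(mode, skew_schedule(2, paths, period))
```

In this toy circuit the two modes need different arrival-time offsets between the same pair of FFs, which is the situation a tunable clock tree is meant to handle. With zero skew, the high-voltage mode could not run at period 6, because its longest path has delay 8.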
Second, we present a dual-level soft-error-tolerant design methodology that trades off performance, power, and reliability for different applications. Four novel detection and correction Flip-Flop (FF) structures are proposed, providing different levels of tolerance against soft errors. Architecture-level vulnerability analysis and gate-level susceptibility analysis are then employed to identify weak FFs that can easily cause program execution errors. Finally, an optimization framework is developed to synthesize the four proposed FF structures into the weak and highly observable storage bits.
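The trade-off between protection cost and reliability can be pictured as a budgeted selection problem. The sketch below is one possible reading, not the optimization framework of this work: it assumes each FF has a single vulnerability weight, that each protection option has an integer cost and leaves a fixed fraction of errors uncovered, and it solves the choice with a multiple-choice knapsack dynamic program. The option names, costs, and weights are hypothetical and do not correspond to the four proposed FF structures.

```python
def assign_ff_types(vulnerability, options, budget):
    """Pick one option per FF to minimize residual error within a cost budget.

    vulnerability: one weight per FF (higher means more harmful if upset).
    options: list of (name, cost, residual_fraction), with integer cost.
    Returns (residual_error, [option name per FF]); (inf, None) if no fit.
    """
    inf = float("inf")
    # best[b] = best (error, choices) for the FFs seen so far with cost <= b
    best = [(0.0, [])] * (budget + 1)
    for weight in vulnerability:
        nxt = [(inf, None)] * (budget + 1)
        for b in range(budget + 1):
            for name, cost, residual in options:
                if cost <= b and best[b - cost][1] is not None:
                    cand = best[b - cost][0] + weight * residual
                    if cand < nxt[b][0]:
                        nxt[b] = (cand, best[b - cost][1] + [name])
        best = nxt
    return best[budget]


# Hypothetical options: unprotected, detect-only, detect-and-correct.
OPTIONS = [("plain", 0, 1.0), ("detect", 1, 0.5), ("correct", 2, 0.0)]

error, choice = assign_ff_types([50, 10, 30], OPTIONS, budget=3)
print(error, choice)
```

With a budget of 3 cost units, the most vulnerable FF receives the strongest option and the least vulnerable one is left unprotected, which mirrors the idea of spending protection only on the weak storage bits.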
Third, we develop architecture-level deadlock-free mechanisms for designs that support out-of-order transactions. We provide a novel ID assignment mechanism that guarantees issued transactions are deadlock-free. We also present flexible rules for ID assignment that greatly reduce the number of transaction stalls.
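To see why ID assignment affects deadlock, consider a simplified model, which is an assumption of this sketch and not the mechanism developed in this work: responses to transactions from one master with the same ID must return in issue order, and each slave commits to a fixed order for its own responses. A cycle in the resulting wait-for graph means no transaction can complete. The function name `has_deadlock` and the transaction labels are hypothetical.

```python
from collections import defaultdict


def has_deadlock(transactions, slave_order):
    """Detect a cyclic wait among outstanding transactions.

    transactions: list of (txn, master, txn_id, slave) in issue order.
    slave_order: {slave: [txn, ...]}, the order each slave returns responses.
    """
    waits_on = defaultdict(list)  # edge a -> b: a cannot complete before b
    last_same_id = {}
    for txn, master, txn_id, _slave in transactions:
        key = (master, txn_id)
        if key in last_same_id:  # same master and ID: in-order completion
            waits_on[txn].append(last_same_id[key])
        last_same_id[key] = txn
    for order in slave_order.values():  # slave-side response ordering
        for earlier, later in zip(order, order[1:]):
            waits_on[later].append(earlier)

    state = {}

    def visit(node):  # depth-first search for a back edge
        state[node] = "active"
        for nxt in waits_on[node]:
            if state.get(nxt) == "active" or (nxt not in state and visit(nxt)):
                return True
        state[node] = "done"
        return False

    return any(t[0] not in state and visit(t[0]) for t in transactions)


ORDER = {"S1": ["d", "a"], "S2": ["b", "c"]}
SAME_ID = [("a", "M1", 0, "S1"), ("b", "M1", 0, "S2"),
           ("c", "M2", 0, "S2"), ("d", "M2", 0, "S1")]
# Giving M1's second transaction a different ID removes one ordering edge.
SPLIT_ID = [("a", "M1", 0, "S1"), ("b", "M1", 1, "S2"),
            ("c", "M2", 0, "S2"), ("d", "M2", 0, "S1")]

print(has_deadlock(SAME_ID, ORDER), has_deadlock(SPLIT_ID, ORDER))
```

In the first case the two masters and two slaves wait on each other in a cycle. Changing a single ID breaks the cycle without stalling any transaction, which is the kind of freedom an ID assignment rule can exploit.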