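Graduate student: Yi-Ping You (游逸平)
Thesis title: Compiler Optimizations on Embedded Processors for Low Power (低功率嵌入式處理器之編譯器最佳化研究)
Advisor: Jenq-Kuen Lee (李政崑)
Oral examination committee:
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2007
Academic year of graduation: 95
Language: English
Number of pages: 158
Keywords (translated from Chinese): embedded processors, low-power compilers, low-power operating systems, leakage power dissipation, power gating, variable-voltage scheduling
Keywords (English): Embedded Processors, Power-Aware Compilers, Power-Aware Operating Systems, Leakage Power Dissipation, Power Gating, DVS Scheduling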
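  • Power-constrained mobile and embedded computing applications are now commonplace. Because these systems are limited by battery lifetime, power consumption has become a primary consideration in system design; in other words, reducing energy consumption is one of the major challenges facing today's software and hardware developers. In general, in complementary metal-oxide-semiconductor (CMOS) circuits, a logic gate dissipates most of its power during transitions (when the gate output changes from 0 to 1 or from 1 to 0), but as semiconductor process technology advances and transistors keep shrinking, leakage accounts for a growing share of total energy consumption. The problem of reducing processor power or energy consumption has been widely explored at the physical, logic, and circuit levels, and it remains an active research topic in computer architecture and system software design.

    In this thesis we present two studies that use system software techniques to reduce energy consumption. The first uses compiler techniques to insert instructions into programs that turn function units off and on, reducing leakage energy consumption. That is, power-control instructions are generated whenever a function unit becomes idle, and they are placed so that the number of inserted instructions is minimized. The second addresses real-time scheduling problems in operating systems on a single processor, multiple processors, or even processors with multiple power domains, where the processors can scale their operating voltage and shut off power.

    Simulation results show that both approaches effectively reduce energy consumption. Specifically, compared with a system without power gating, the compiler-generated power-gating control reduces the overall energy consumption of the processor (including dynamic and leakage energy) by about 11.9% on average, while code size grows by about 25% and performance degrades by less than 1%. For the real-time scheduling approach using dynamic voltage scaling, compared with a scheduler without it, experiments show that the normalized energy consumption for three real-world applications (CNC, GAP, and videophone) is 53.74%, 37.81%, and 13.10%, respectively; when the approach is applied to a specific security processor, it reduces the power consumption of the processor's AES and RSA modules by about 45% and 20%, respectively.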


    Power-constrained mobile and embedded computing applications are rapidly growing in popularity. Because battery lifetime is limited, power consumption is a key design consideration for such systems, and reducing it is a crucial challenge for today's software and hardware developers. In CMOS circuits, power is mainly dissipated in a gate during transitions, when the gate output switches from 0 to 1 or from 1 to 0, but leakage power accounts for a growing proportion of total power dissipation as the feature size of semiconductor technology continues to shrink. Reducing processor power and energy consumption has been studied extensively at the physical, logic, and circuit levels, and it is now actively studied by architects and system software designers.

    In this thesis we investigate how system software techniques can be used to minimize energy consumption, following two approaches. The first uses compilation techniques to insert instructions into programs that shut down and wake up function units as appropriate, in order to reduce leakage energy consumption. The compiler not only generates power-control instructions whenever a function unit goes idle, but also places them so as to reduce the number of such instructions. The second focuses on real-time scheduling problems in operating systems on a single processor or multiple processors that can scale their operating supply voltages, and on processors with multiple voltage domains.

    Simulations demonstrate the effectiveness of both methodologies. Specifically, using the compiler to generate power-gating controls reduces the overall energy consumption of a processor, including both dynamic and leakage energy, by an average of 11.9% compared with a processor without a power-gating mechanism, while code size (measured in total instructions) grows by 25% and performance degrades by less than 1% on average. For the real-time scheduling approach with variable supply voltages, the average normalized energy consumption of three real-world applications (CNC, GAP, and videophone) under an eight-voltage-level system is 53.74%, 37.81%, and 13.10%, respectively. Moreover, when the approach is applied to a specific security processor, it achieves average reductions of 45% and 20% for the AES and RSA modules, respectively, compared with an ordinary scheduler that does not consider energy. We also address the potential issues that arise when compiler and operating system techniques are combined for power management.
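    The variable-voltage scheduling idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm, which is not given in this record. It assumes a processor with a few discrete (voltage, frequency) levels, picks the lowest-voltage level that still meets a task's deadline, and uses the standard approximation that dynamic energy per cycle scales with the square of the supply voltage. The level table, the cycle count, the deadline, and the names `pick_level` and `normalized_energy` are all hypothetical.

```python
from typing import List, Optional, Tuple

# A level is (supply voltage in volts, clock frequency in Hz).
Level = Tuple[float, float]


def pick_level(levels: List[Level], cycles: float, deadline_s: float) -> Optional[Level]:
    """Return the lowest-voltage level that finishes `cycles` within the deadline.

    Returns None when no level is fast enough.
    """
    feasible = [lv for lv in levels if cycles / lv[1] <= deadline_s]
    return min(feasible, key=lambda lv: lv[0]) if feasible else None


def normalized_energy(level: Level, max_level: Level) -> float:
    """Dynamic energy relative to running the same cycles at the top level.

    Assumes dynamic energy per cycle is proportional to V^2, so the ratio
    depends only on the two supply voltages.
    """
    return (level[0] / max_level[0]) ** 2


# Hypothetical three-level processor and a task of 3 million cycles
# that must finish within 10 ms.
levels = [(1.0, 200e6), (1.4, 400e6), (1.8, 600e6)]
chosen = pick_level(levels, cycles=3e6, deadline_s=0.010)
if chosen is not None:
    ratio = normalized_energy(chosen, levels[-1])
    print(f"run at {chosen[0]} V / {chosen[1] / 1e6:.0f} MHz, "
          f"normalized dynamic energy {ratio:.2f}")
```

    In this toy setting the slowest level misses the deadline, so the middle level is selected. A real-time scheduler of the kind the abstract describes would also have to account for slack from other tasks, voltage-transition overhead, and leakage, none of which this sketch models.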

    Preface
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Sources of Power Dissipation
        1.1.1 Switching/Dynamic Power
        1.1.2 Short-Circuit Power
        1.1.3 Leakage Power
          Subthreshold Leakage
          Gate-oxide Leakage
          Band-to-band-tunneling Leakage
      1.2 Power Trend
      1.3 Architectural-Level Designs for Low Power
        1.3.1 Dynamic Voltage Scaling
        1.3.2 Multiple Threshold Voltage
        1.3.3 Power Gating
      1.4 Contributions
      1.5 Thesis Organization
    2 Related Work
      2.1 Leakage Power Reduction
      2.2 Power Management Using Dynamic Voltage Scaling
      2.3 Collaboration between Compilers and Operating Systems for Power Management
    3 Compiler-Assisted Power-Gating Control Placement
      3.1 Introduction
        3.1.1 Overview
        3.1.2 Power-Gating Mechanisms
      3.2 Leakage-Power-Reduction Framework
        3.2.1 Component-Activity Data-Flow Analysis (CADFA)
        3.2.2 Power-Gating-Instruction Scheduling
        3.2.3 Sink-N-Hoist Analysis
          Sinkable and Grouping-Off Analysis
          Hoistable and Grouping-On Analysis
          Grouping-Switch Analysis
        3.2.4 Power-Gating-Instruction Placement
      3.3 Architecture Models
      3.4 Evaluations and Discussion
        3.4.1 Evaluation Platform
        3.4.2 Evaluations of CADFA
        3.4.3 Evaluations of CADFA with Sink-N-Hoist
      3.5 Employing Power Gating on Out-Of-Order Issue Processors
      3.6 Employing Power Gating for Multithreaded Programs
        3.6.1 May-Happen-in-Parallel Analysis
        3.6.2 Predicated-Power-Gating Mechanism
        3.6.3 Multithreaded Power-Gating Analysis (MTPGA)
    4 DVS Techniques for Real-Time Applications
      4.1 Introduction
        4.1.1 Overview
        4.1.2 Dynamic Voltage Scaling Mechanisms
        4.1.3 Preliminaries
          Task Model
          Power Model
      4.2 Variable-Voltage Scheduling Approach
        4.2.1 Scheduling Algorithm
        4.2.2 Slack-Time Computation
        4.2.3 Decision Algorithm
        4.2.4 A Running Example
      4.3 Multiple-Level Voltage Scheduling Approach
        4.3.1 Extension of Variable-Voltage Scheduling Approach
        4.3.2 Decision Algorithm
      4.4 Evaluations on Real-World Applications
        4.4.1 Application Overview
        4.4.2 Evaluation Results
      4.5 Evaluations on Security Processors
        4.5.1 Architecture Model
          System Overview
          Power Management
        4.5.2 Energy-Aware Scheduling Approach
          Scheduling Method for PEs
          Scheduling Method for Non-PEs
          Iterative Scheduling Method for PEs and Non-PEs
        4.5.3 Evaluation Results
    5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    Bibliography

    [1] Nevine AbouGhazaleh, Daniel Mosse, Bruce Childers, and Rami Mel-
    hem. Collaborative operating system and compiler power management
    for real-time applications. ACM Transactions on Embedded Computing
    Systems, 5(1):82-115, 2006.
    [2] Nevine AbouGhazaleh, Daniel Mosse, Bruce Childers, Rami Melhem,
    and Matthew Craven. Collaborative operating system and compiler
    power management for real-time applications. In Proceedings of the
    Ninth IEEE Real-Time and Embedded Technology and Applications
    Symposium (RTAS'03), pages 133-141, Washington, D.C., USA, May
    2003.
    [3] Nevine AbouGhazaleh, Daniel Mosse, Bruce Childers, Rami Melhem,
    and Matthew Craven. Energy management for real-time embedded
    applications with compiler support. In Proceedings of the 2003 ACM
    SIGPLAN Conference on Language, Compiler, and Tool for Embedded
    Systems (LCTES'03), pages 284-293, San Diego, California, USA, June
    2003.
    [4] Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Prin-
    ciples, Techniques, and Tools. Addison-Wesley Publishing Company,
    Reading, MA, 1986.
    [5] M. Alidina, J. Monteiro, S. Devadas, A. Ghosh, and M. Papaefthymiou.
    Precomputation-based sequential logic optimization for low power.
    IEEE Transactions on Very Large Scale Integration Systems, 2(4):426-
    436, December 1994.
    [6] ARM. Intelligent Energy Controller Technical Overview, Aug 2004.
    [7] Semiconductor Industry Association. Overview and working group
    summaries. International Technology Roadmap for Semiconductors
    2006 Update, pages 15-16, 2006.
    [8] Semiconductor Industry Association. System drivers. International
    Technology Roadmap for Semiconductors 2006 Update, page 7, 2006.
    [9] Hakan Aydin, Pedro Mejia-Alvarez, Daniel Mosse, and Rami Melhem.
    Dynamic and aggressive scheduling techniques for power-aware real-
    time systems. In Proceedings of the Real-Time Systems Symposium
    (RTSS'01), pages 95-105, London, UK, December 2001.
    [10] Kemal Aygun, Michael J. Hill, Kimberly Eilert, Kaladhar Radhakrish-
    nan, and Alex Levin. Power delivery for high-performance micropro-
    cessors. Intel Technology Journal, 9(4):273-283, 2005.
    [11] Rajkishore Barik. Efficient computation of may-happen-in-parallel
    information for concurrent Java programs. In Proceedings of the
    International Workshop on Languages and Compilers for Parallel Computing
    (LCPC'05), Hawthorne, New York, USA, October 2005. Lecture Notes
    in Computer Science, Vol. 4339, Springer Verlag.
    [12] Sanjoy Baruah, Gilad Koren, Decao Mao, Bud Mishra, Arvind Raghu-
    nathan, Louis Rosier, Dennis Shasha, and Fuxing Wang. On the com-
    petitiveness of on-line real-time task scheduling. In Proceedings of the
    12th Real-Time Systems Symposium (RTSS'91), pages 106-115, San
    Antonio, Texas, December 1991.
    [13] Nikolaos Bellas, Ibrahim N. Hajj, and Constantine D. Polychronopou-
    los. Architectural and compiler techniques for energy reduction in high-
    performance microprocessors. IEEE Transactions on Very Large Scale
    Integration Systems, 8(3):317-326, June 2000.
    [14] Luca Benini and G. De Micheli. State assignment for low power dis-
    sipation. IEEE Journal of Solid State Circuits, 30(3):258-268, March
    1995.
    [15] Sanjukta Bhanja and N. Ranganathan. Dependency preserving prob-
    abilistic modeling of switching activity using bayesian networks. In
    Proceedings of the Design Automation Conference (DAC'01), pages
    209-214, Las Vegas, Nevada, June 2001.
    [16] D. Brooks, V. Tiwari, and M. Martonosi. Wattch: A framework for
    architectural-level power analysis and optimizations. In Proceedings of
    the International Symposium on Computer Architecture, pages 83-94,
    Vancouver, Canada, June 2000.
    [17] J. Adam Butts and Gurindar S. Sohi. A static power model for archi-
    tects. In Proceedings of the Annual IEEE/ACM International Sympo-
    sium on Microarchitecture, pages 191-201, Monterey, California, De-
    cember 2000.
    [18] David Callahan and Jaspal Subhlok. Static analysis of low level
    synchronization. In Proceedings of the 1988 ACM SIGPLAN and SIGOPS
    Workshop on Parallel and Distributed Debugging, pages 100-111,
    Madison, Wisconsin, USA, January 1989.
    [19] Anantha Chandrakasan, William J. Bowhill, and Frank Fox. Design of
    High-Performance Microprocessor Circuits. Wiley-IEEE Press, 2000.
    [20] Anantha P. Chandrakasan, Samuel Sheng, and Robert W. Brodersen.
    Low-power CMOS digital design. IEEE Journal of Solid-State Circuits,
    27(4):473-484, 1992.
    [21] Jui-Ming Chang and Massoud Pedram. Register allocation and binding
    for low power. In Proceedings of the Design Automation Conference,
    pages 29-35, San Francisco, California, USA, June 1995.
    [22] Jui-Ming Chang and Massoud Pedram. Energy minimization using
    multiple supply voltages. IEEE Transactions on Very Large Scale In-
    tegration (VLSI) Systems, 5(4), Dec 1997.
    [23] Rong-Guey Chang, Tyng-Ruey Chuang, and Jenq-Kuen Lee. Efficient
    support of parallel sparse computation for array intrinsic functions of
    Fortran 90. In Proceedings of the ACM International Conference on
    Supercomputing, pages 13-17, Melbourne, Australia, July 1998.
    [24] Rong-Guey Chang, Jia-Shing Li, Tyng-Ruey Chuang, and Jenq Kuen
    Lee. Probabilistic inference schemes for sparsity structures of Fortran
    90 array intrinsics. In Proceedings of the International Conference on
    Parallel Processing, pages 61-68, Valencia, Spain, September 2001.
    [25] Jian-Jia Chen and Tei-Wei Kuo. Procrastination for leakage-aware
    rate-monotonic scheduling on a dynamic voltage scaling processor. In
    Proceedings of ACM SIGPLAN/SIGBED Conference on Languages,
    Compilers, and Tools for Embedded Systems (LCTES'06), pages 153-
    162, Ottawa, Canada, June 2006.
    [26] Peng-Sheng Chen, Yuan-Shin Hwang, Roy Dz-Ching Ju, and
    Jenq Kuen Lee. Interprocedural probabilistic pointer analysis. IEEE
    Transactions on Parallel and Distributed Systems, 15(10):893-907, Oc-
    tober 2004.
    [27] Compaq Computer Corporation. Alpha 21264 Microprocessor Hard-
    ware Reference Manual. 1999.
    [28] V. De and S. Borkar. Technology and design challenges for low power
    and high performance. In Proceedings of the International Symposium
    on Low Power Electronics and Design, pages 163-168, San Diego, Cal-
    ifornia, August 1999.
    [29] Brian Doyle, Reza Arghavani, Doug Barlage, Suman Datta, Mark
    Doczy, Jack Kavalieros, Anand Murthy, and Robert Chau. Transistor
    elements for 30nm physical gate lengths and beyond. Intel Technology
    Journal, 6(2):42-54, May 2002.
    [30] Brian Doyle, Reza Arghavani, Doug Barlage, Suman Datta, Mark
    Doczy, Jack Kavalieros, Anand Murthy, and Robert Chau. Transistor
    elements for 30nm physical gate lengths and beyond. Intel Technology
    Journal, 6(2):42-54, May 2002.
    [31] Steven Dropsho, Volkan Kursun, David H. Albonesi, Sandhya
    Dwarkadas, and Eby G. Friedman. Managing static leakage energy
    in microprocessor functional units. In Proceedings of the 35th Interna-
    tional Symposium on Microarchitecture (MICRO'02), pages 321-332,
    Istanbul, Turkey, November 2002.
    [32] Evelyn Duesterwald and Mary Lou Soffa. Concurrency analysis in the
    presence of procedures using a data-flow framework. In Proceedings of
    the Symposium on Testing, Analysis, and Verification (TAV'91), pages
    36-48, British Columbia, Canada, October 1991.
    [33] Yunsi Fei, Srivaths Ravi, Anand Raghunathan, and Niraj K. Jha.
    Energy-optimizing source code transformations for OS-driven embed-
    ded software. In Proceedings of the 17th International Conference on
    VLSI Design (VLSID'04), pages 261-266, Mumbai, India, January
    2004.
    [34] Marc Fleischmann. Crusoe power management - reducing the operat-
    ing power with LongRun. In Proceedings of the Hot Chips Symposium
    XII, Palo Alto, California, August 2000.
    [35] Neil Gammage and Geoff Waters. Securing the Smart Network with
    Motorola Security Processors, March 2003.
    [36] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide
    to the Theory of NP-Completeness. W.H. Freeman & Co., New York,
    NY, 1979.
    [37] Ricardo Gonzalez, Benjamin M. Gordon, and Mark A. Horowitz. Sup-
    ply and threshold voltage scaling for low power CMOS. IEEE Journal
    of Solid-State Circuits, 32(8):1210-1216, 1997.
    [38] Ricardo E. Gonzalez. Xtensa: A configurable and extensible processor.
    IEEE Micro, 20(2):60-70, 2000.
    [39] Flavius Gruian. Hard real-time scheduling for low-energy using stochas-
    tic data and DVS processors. In Proceedings of the International Sym-
    posium on Low-Power Electronics and Design (ISLPED'01), pages 46-
    51, Huntington Beach, California, August 2001.
    [40] Vadim Gutnik and Anantha Chandrakasan. An efficient controller for
    variable supply-voltage low power processing. In Proceedings of the
    Symposium on VLSI Circuits, pages 158-159, Honolulu, Hawaii, June
    1996.
    [41] G. Hachtel, M. Hermida, M. Poncino, A. Pardo, and F. Somenzi.
    Re-encoding sequential circuits to reduce power dissipation. In Proceedings
    of the International Conference on Computer-Aided Design, pages 70-
    73, San Jose, California, November 1994.
    [42] George Hadjiyiannis, Silvina Hanono, and Srinivas Devadas. ISDL: An
    instruction set description language for retargetability. In Proceedings
    of the 34th Conference on Design Automation (DAC'97), pages 299-
    302, Anaheim, California, USA, June 1997.
    [43] Hifn. 7954 security processor Data Sheet, December 2003.
    [44] Inki Hong, Darko Kirovski, Gang Qu, Miodrag Potkonjak, and Mani B.
    Srivastava. Power optimization of variable voltage core-based systems.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits
    and Systems, 18(12):1702-1714, December 1999.
    [45] Inki Hong, Miodrag Potkonjak, and Mani B. Srivastava. On-line
    scheduling of hard real-time tasks on variable voltage processor. In
    Proceedings of the International Conference on Computer-Aided Design
    (ICCAD'98), pages 653-656, San Jose, California, November 1998.
    [46] Inki Hong, Gang Qu, Miodrag Potkonjak, and Mani B. Srivastava.
    Synthesis techniques for low-power hard real-time systems on variable-
    voltage processors. In Proceedings of the Real-Time Systems Sympo-
    sium (RTSS'98), pages 178-187, Madrid, Spain, December 1998.
    [47] J.-H. Hong and C.-W. Wu. Cellular array modular multiplier for the RSA
    public-key cryptosystem based on modified Booth's algorithm. IEEE
    Transactions on VLSI Systems, 11:474-484, 2003.
    [48] M. Horowitz, T. Indermaur, and R. Gonzalez. Low-power digital de-
    sign. In Proceedings of the IEEE Symposium on Low Power Electronics,
    pages 8-11, San Diego, California, USA, October 1994.
    [49] Peter Y. T. Hsu and Edward S. Davidson. Highly concurrent scalar
    processing. In Proceedings of the 13th Annual International Symposium
    on Computer Architecture (ISCA'86), pages 386-395, Tokyo, Japan,
    June 1986.
    [50] Zhigang Hu, Alper Buyuktosunoglu, Viji Srinivasan, Victor V. Zyuban,
    Hans M. Jacobson, and Pradip Bose. Microarchitectural techniques for
    power gating of execution. In Proceedings of the International Sympo-
    sium on Low Power Electronics and Design (ISLPED'04), pages 32-37,
    Newport Beach, California, USA, August 2004.
    [51] Chung-Wen Huang, Young-Chia Lin, Yi-Ping You, Jenq-Kuen Lee, and
    Ting-Ting Hwang. Architecture-level simulations with rapid power
    estimations for security processors with multiple power domains. In
    Proceedings of Asia and South Pacific International Conference on
    Embedded SoCs (ASPICES'05), Bangalore, India, July 2005.
    [52] Gwan-Hwan Hwang, Jenq Kuen Lee, and Roy Dz-Ching Ju. A function-
    composition approach to synthesize Fortran 90 array operations. Jour-
    nal of Parallel and Distributed Computing, 54(1):1-47, October 1998.
    [53] Yuan-Shin Hwang, Peng-Sheng Chen, Jenq-Kuen Lee, and Roy Ju.
    Probabilistic points-to analysis. Lecture Notes in Computer Science,
    Languages and Compilers for Parallel Computing (LCPC 2001 Issue),
    2624:290-305, 2003.
    [54] Intel Corporation. Intel XScale core developer's manual. 27347301.pdf,
    December 2000.
    [55] Intel Corporation. Intel Pentium M processor datasheet. 25261202.pdf,
    June 2003.
    [56] Henry Ip, James Low, Peter Y. K. Cheung, George A. Constantinides,
    Wayne Luk, Shay P. Seng, and Paul Metzgen. Strassen's matrix mul-
    tiplication for customisable processors. In Proceedings of the IEEE In-
    ternational Conference on Field-Programmable Technology (FPT'02),
    pages 453-456, Hong Kong, December 2002.
    [57] Tohru Ishihara and Hiroto Yasuura. Voltage scheduling problem for
    dynamically variable voltage processors. In Proceedings of the Interna-
    tional Symposium on Low Power Electronics and Design (ISLPED'98),
    pages 197-202, Monterey, California, August 1998.
    [58] Hailin Jiang, Malgorzata Marek-Sadowska, and Sani R. Nassif. Benefits
    and costs of power-gating technique. In Proceedings of the IEEE Inter-
    national Conference on Computer Design (ICCD'05), pages 559-566,
    San Jose, California, USA, October 2005.
    [59] Robert Jones. Modeling and design techniques reduce 90 nm
    power. EE Times 08/06/2004, 2004. Available online at
    http://www.eetimes.com/showArticle.jhtml?articleID=26806450.
    [60] J. T. Kao and A. P. Chandrakasan. Dual-threshold voltage techniques
    for low-power digital circuits. IEEE Journal of Solid-State Circuits,
    35(7):1009-1018, 2000.
    [61] Tanay Karnik, Shekhar Borkar, and Vivek De. Sub-90nm technologies
    - challenges and opportunities for CAD. In Proceedings of the Inter-
    national Conference on Computer-Aided Design (ICCAD'02), pages
    203-206, San Jose, California, USA, November 2002.
    [62] Stefanos Kaxiras, Zhigang Hu, and Margaret Martonosi. Cache decay:
    Exploiting generational behavior to reduce cache leakage power. In
    Proceedings of the International Symposium on Computer Architecture,
    pages 240-251, Gothenburg, Sweden, June 2001.
    [63] Ali Keshavarzi, Kaushik Roy, and Charles F. Hawkins. Intrinsic
    leakage in deep submicron CMOS ICs - measurement-based test solutions.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
    8(6):717-723, 2000.
    [64] Nam Sung Kim, Todd Austin, David Blaauw, Trevor Mudge, Krisz-
    tian Flautner, Jie S. Hu, Mary Jane Irwin, Mahmut Kandemir, and
    Vijaykrishnan Narayanan. Leakage current: Moore's law meets static
    power. IEEE Computer, 36(12):68-75, 2003.
    [65] Namyun Kim, Minsoo Ryu, Seongsoo Hong, Manas Saksena, Chongho
    Choi, and Heonsik Shin. Visual assessment of a real-time system de-
    sign: A case study on a CNC controller. In Proceedings of the Real-Time
    Systems Symposium (RTSS'96), pages 300-310, Washington, D.C., De-
    cember 1996.
    [66] Woonseok Kim, Jihong Kim, and Sang Lyul Min. A dynamic voltage
    scaling algorithm for dynamic-priority hard real-time systems using
    slack time analysis. In Proceedings of the Design, Automation and Test
    in Europe Conference (DATE'02), pages 788-794, Paris, France, March
    2002.
    [67] Woo-Cheol Kwon and Taewhan Kim. Optimal voltage allocation tech-
    niques for dynamically variable voltage processors. ACM Transactions
    on Embedded Computing Systems, 4(1):211-230, 2005.
    [68] Lawrence L. Lapin. Modern Engineering Statistics. Wadsworth Pub-
    lishing Company, Belmont, CA, 1997.
    [69] Cheol-Hoon Lee and K.G. Shin. On-line dynamic voltage scaling for
    hard real-time systems using the EDF algorithm. In Proceedings of the
    25th IEEE International Real-Time Systems Symposium (RTSS'04),
    pages 319-335, Lisbon, Portugal, December 2004.
    [70] Chingren Lee, Jenq Kuen Lee, Ting-Ting Hwang, and Shi-Chun
    Tsai. Compiler optimizations on VLIW instruction scheduling for low
    power. ACM Transactions on Design Automation of Electronic Sys-
    tems, 8(2):252-268, 2003.
    [71] M.-C. Lee, J.-R. Huang, C.-P. Su, T.-Y. Chang, C.-T. Huang, and
    C.-W. Wu. A true random generator design. In 13th VLSI Design/CAD
    Symp., August 2002.
    [72] Mike Tien-Chien Lee, Vivek Tiwari, Sharad Malik, and Masahiro Fu-
    jita. Power analysis and minimization techniques for embedded DSP
    software. IEEE Transactions on Very Large Scale Integration Systems,
    5(1):123-133, March 1997.
    [73] Lin Li and Clark Verbrugge. A practical MHP information analysis for
    concurrent Java programs. In Proceedings of the International Work-
    shop on Languages and Compilers for Parallel Computing (LCPC'04),
    pages 194-208, West Lafayette, Indiana, USA, September 2004. Lec-
    ture Notes in Computer Science, Vol. 3602, Springer Verlag.
    [74] Sheng-Chih Lin, Anirban Basu, Ali Keshavarzi, Vivek De, Amit Mehro-
    tra, and Kaustav Banerjee.
    [75] T.-F. Lin, C.-P. Su, C.-T. Huang, and C.-W. Wu. A high-throughput
    low-cost AES cipher chip. In 3rd IEEE Asia-Pacific Conf. ASIC,
    August 2002.
    [76] Yung-Chia Lin, Chia Han Lu, Chung-Ju Wu, Chung-Lin Tang, Yi-Ping
    You, Ya-Chiao Moo, and Jenq Kuen Lee. Effective code generation for
    distributed and ping-pong register files: a case study on PAC VLIW
    DSP cores. The Journal of VLSI Signal Processing Systems, accepted.
    [77] Yung-Chia Lin, Chung-Lin Tang, Chung-Ju Wu, Ming-Yu Hung, Yi-
    Ping You, Ya-Chiao Moo, Sheng-Yuan Chen, and Jenq Kuen Lee. Com-
    piler supports and optimizations for PAC VLIW DSP processors. Lec-
    ture Notes in Computer Science, Languages and Compilers for Parallel
    Computing (LCPC 2005 Issue), 4339, 2005.
    [78] Yung-Chia Lin, Yi-Ping You, Chung-Wen Huang, Jenq-Kuen Lee, Wei-
    Kuan Shih, and Ting-Ting Hwang. Power-aware scheduling for parallel
    security processors with analytical models. Lecture Notes in Computer
    Science, Languages and Compilers for Parallel Computing (LCPC 2004
    Issue), 3602:470-484, 2005.
    [79] Yung-Chia Lin, Yi-Ping You, and Jenq Kuen Lee. Register allocation
    for VLIW DSP processors with irregular register files. In Proceedings
    of Compilers for Parallel Computing (CPC'06), A Coruña, Spain,
    January 2005.
    [80] Yung-Chia Lin, Yi-Ping You, and Jenq Kuen Lee. PALF: Compiler
    supports for irregular register files in clustered VLIW processors.
    Concurrency and Computation: Practice and Experience (Special Issue on
    CPC 2006), accepted.
    [81] Chung Laung Liu and James W. Layland. Scheduling algorithms for
    multiprogramming in a hard real-time environment. Journal of the
    ACM, 20(1):46-61, 1973.
    [82] C. Douglas Locke, David R. Vogel, and Thomas J. Mesler. Building a
    predictable avionics platform in Ada: A case study. In Proceedings of
    the IEEE Real-Time Systems Symposium (RTSS'91), pages 181-189,
    San Antonio, Texas, December 1991.
    [83] Ali Manzak and C. Chakrabarti. Variable voltage task scheduling
    for minimizing energy or minimizing power. In Proceedings of the
    International Conference on Acoustics, Speech and Signal Processing
    (ICASSP'00), pages 3239-3242, Istanbul, Turkey, June 2000.
    [84] Radu Marculescu, Diana Marculescu, and Massoud Pedram. Prob-
    abilistic modeling of dependencies during switching activity analysis.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits
    and Systems, 17(2):73-83, 1998.
    [85] Stephen P. Masticola and Barbara G. Ryder. Non-concurrency analysis.
    In Proceedings of the fourth ACM SIGPLAN Symposium on Principles
    and Practice of Parallel Programming (PPoPP'93), pages 129-138, San
    Diego, California, USA, May 1993.
    [86] Huzefa Mehta, Manjit Borah, Robert Michael Owens, and Mary Jane
    Irwin. Accurate estimation of combinational circuit activity. In Pro-
    ceedings of the Design Automation Conference (DAC'95), pages 618-
    622, San Francisco, CA, June 1995.
    [87] Pedro Mejia-Alvarez, Eugene Levner, and Daniel Mosse. Adaptive
    scheduling server for power-aware real-time tasks. ACM Transactions
    on Embedded Computing Systems, 3(2):284-306, 2004.
    [88] Jose Monteiro, Srinivas Devadas, and Bill Lin. A methodology for
    efficient estimation of switching activity in sequential logic circuits.
    In Proceedings of the Design Automation Conference (DAC'94), pages
    12-17, San Diego, California, June 1994.
    [89] Daniel Mosse, Hakan Aydin, Bruce Childers, and Rami Melhem.
    Compiler-assisted dynamic power-aware scheduling for real-time ap-
    plications. In Proceedings of the Workshop on Compiler and OS for
    Low Power (COLP'00), Philadelphia, PA, October 2000.
    [90] Saibal Mukhopadhyay, Arijit Raychowdhury, and Kaushik Roy.
    Accurate estimation of total leakage current in scaled CMOS logic circuits
    based on compact current modeling. In Proceedings of the 40th
    Conference on Design Automation (DAC'03), pages 169-174, Anaheim,
    California, USA, June 2003.
    [91] Farid Najm. Transition density: A new measure of activity in digital
    circuits. IEEE Transactions on Computer-Aided Design of Integrated
    Circuits and Systems, 12(2):310-323, 1993.
    [92] Won Namgoong, Mengchen Yu, and Teresa Meng. A high-efficiency
    variable-voltage CMOS dynamic DC-DC switching regulator. In Pro-
    ceedings of the IEEE International Solid-State Circuits Conference
    (ISSCC'97), pages 380-381, San Francisco, CA, February 1997.
    [93] Ripal Nathuji, Balasubramanian Seshasayee, and Karsten Schwan.
    Combining compiler and operating system support for energy efficient
    I/O on embedded platforms. In Proceedings of the 9th International
    Workshop on Software and Compilers for Embedded Systems
    (SCOPES'05), pages 80-90, Dallas, Texas, USA, September 2005.
    [94] Gleb Naumovich and George S. Avrunin. A conservative data flow
    algorithm for detecting all pairs of statements that may happen in
    parallel for rendezvous-based concurrent programs. In Proceedings of
    the 6th ACM SIGSOFT Symposium on the Foundations of Software
    Engineering (FSE'98), pages 24-34, Lake Buena Vista, Florida, USA,
    November 1998.
    [95] Gleb Naumovich, George S. Avrunin, and Lori A. Clarke. An efficient
    algorithm for computing MHP information for concurrent Java
    programs. In 7th ACM SIGSOFT Symposium on the Foundations of
    Software Engineering (FSE'99), volume 1687 of Lecture Notes in Com-
    puter Science, pages 338-354. Springer, 1999.
    [96] Trevor Pering, Tom Burd, and Robert Brodersen. The simulation and
    evaluation of dynamic voltage scaling algorithms. In Proceedings of
    the International Symposium on Low Power Electronics and Design
    (ISLPED'98), pages 76-81, Monterey, California, May 1998.
    [97] Padmanabhan Pillai and Kang G. Shin. Real-time dynamic voltage
    scaling for low-power embedded operating systems. In Proceedings of
    the 18th ACM Symposium on Operating Systems Principles (SOSP'01),
    pages 89-102, Banff, Canada, October 2001.
    [98] M.D. Powell, S-H. Yang, B. Falsafi, K. Roy, and T.N. Vijaykumar.
    Gated-Vdd: A circuit technique to reduce leakage in deep-submicron
    cache memories. In Proceedings of the ACM/IEEE International Sym-
    posium on Low Power Electronics and Design, pages 90-95, Rapallo,
    Italy, August 2000.
    [99] S. C. Prasad and K. Roy. Circuit activity driven multilevel logic
    optimization for low power reliable operation. In Proceedings of the
    EDAC'93 EURO-ASIC, pages 368-372, Paris, France, February 1993.
    [100] Gang Quan and Xiaobo (Sharon) Hu. Energy efficient fixed-priority
    scheduling for real-time systems on variable voltage processors. In Pro-
    ceedings of the Design Automation Conference (DAC'01), pages 828-
    833, Las Vegas, Nevada, June 2001.
    [101] Daler Rakhmatov, Sarma Vrudhula, and Chaitali Chakrabarti.
    Battery-conscious task sequencing for portable devices including volt-
    age/clock scaling. In Proceedings of the Design Automation Conference
    (DAC'02), pages 189-194, New Orleans, Louisiana, June 2002.
    [102] Siddharth Rele, Santosh Pande, Soner Onder, and Rajiv Gupta. Opti-
    mizing static power dissipation by functional units in superscalar pro-
    cessors. In Proceedings of the 11th International Conference on Com-
    piler Construction (CC'02), pages 261-275, Grenoble, France, April
    2002.
    [103] K. Roy and S. C. Prasad. SYCLOP: Synthesis of CMOS logic for low
    power applications. In Proceedings of the IEEE International Confer-
    ence on Computer Design, pages 464-467, Cambridge, Massachusetts,
    USA, October 1992.
    [104] Kaushik Roy, Saibal Mukhopadhyay, and Hamid Mahmoodi-Meimand.
    Leakage current mechanisms and leakage reduction techniques in deep-
    submicrometer CMOS circuits. Proceedings of the IEEE, 91(2):305-
    327, 2003.
    [105] Marcus T. Schmitz, Bashir M. Al-Hashimi, and Petru Eles.
    Energy-efficient mapping and scheduling for DVS enabled distributed embedded
    systems. In Proceedings of the Design, Automation and Test in Europe
    Conference (DATE'02), pages 514-521, Paris, France, March 2002.
    [106] Wei-Kuan Shih and Jane W. S. Liu. On-line scheduling of impre-
    cise computations to minimize error. SIAM Journal on Computing,
    25(5):1105-1121, 1996.
    [107] Wei-Kuan Shih, Jane W. S. Liu, and Jen-Yao Chung. Algorithms
    for scheduling imprecise computations with timing constraints. SIAM
    Journal on Computing, 20(3):537-552, 1991.
    [108] Dongkun Shin, Jihong Kim, and Seongsoo Lee. Intra-task voltage
    scheduling for low-energy hard real-time applications. IEEE Design
    and Test of Computers, 18(2):20-30, 2001.
    [109] Youngsoo Shin and Kiyoung Choi. Power conscious fixed priority
    scheduling for hard real-time systems. In Proceedings of the De-
    sign Automation Conference (DAC'99), pages 134-139, New Orleans,
    Louisiana, June 1999.
    [110] Michael D. Smith. The SUIF Machine Library. Division of
    Engineering and Applied Science, Harvard University, 1998.
    [111] Stanford Compiler Group. The SUIF Library. Stanford Compiler
    Group, Stanford University, 1995.
    [112] C.-P. Su, T.-F. Lin, C.-T. Huang, and C.-W. Wu. A highly efficient
    AES cipher chip. In ASP-DAC, January 2003.
    [113] C.-Y. Su, S.-A. Hwang, P.-S. Chen, and C.-W. Wu. An improved
    Montgomery algorithm for high-speed RSA public-key cryptosystem.
    IEEE Transactions on VLSI Systems, 7:280-284, 1999.
    [114] Ching-Long Su and Alvin M. Despain. Cache designs for energy
    efficiency. In Proceedings of the 28th Annual Hawaii International
    Conference on System Sciences, pages 306-315, Los Angeles, California,
    USA, January 1995.
    [115] Vishnu Swaminathan and Krishnendu Chakrabarty. Real-time task
    scheduling for energy-aware embedded systems. In Proceedings of
    the IEEE Real-Time Systems Symposium (RTSS'00) Work-In-Progress
    Sessions, Orlando, Florida, November 2000.
    [116] Vishnu Swaminathan and Krishnendu Chakrabarty. Investigating the
    effect of voltage-switching on low-energy task scheduling in hard
    real-time systems. In Proceedings of the Asia South Pacific Design
    Automation Conference (ASP-DAC'01), pages 251-254, Yokohama, Japan,
    January 2001.
    [117] Richard N. Taylor. Complexity of analyzing the synchronization struc-
    ture of concurrent programs. Acta Informatica, 19:57-84, 1983.
    [118] Scott Thompson, Paul Packan, and Mark Bohr. MOS scaling: Tran-
    sistor challenges for the 21st century. Intel Technology Journal, Q3,
    1998.
    [119] V. Tiwari, R. Donnelly, S. Malik, and R. Gonzalez. Dynamic power
    management for microprocessors: A case study. In Proceedings of the
    International Conference on VLSI Design, pages 185-192, Hyderabad,
    India, January 1997.
    [120] V. Tiwari, D. Singh, S. Rajgopal, G. Mehta, R. Patel, and F. Baez.
    Reducing power in high-performance microprocessors. In Proceedings
    of the Design Automation Conference, pages 732-737, San Francisco,
    California, USA, June 1998.
    [121] C.Y. Tsui, M. Pedram, and A.M. Despain. Technology decomposition
    and mapping targeting low power dissipation. In Proceedings of the
    Design Automation Conference, pages 68-73, Dallas, Texas, June 1993.
    [122] Hiroshi Tsutsui, Takahiko Masuzaki, Tomonori Izumi, Takao Onoye,
    and Yukihiro Nakamura. High speed JPEG2000 encoder by configurable
    processor. In Proceedings of the IEEE Asia Pacific Conference
    on Circuits and Systems (APCCAS'02), pages 45-50, Singapore,
    December 2002.
    [123] Harry J. M. Veendrick. Short-circuit dissipation of static CMOS
    circuitry and its impact on the design of buffer circuits. IEEE Journal
    of Solid-State Circuits, 19(4):468-473, 1984.
    [124] M.-Y. Wang, C.-P. Su, C.-T. Huang, and C.-W. Wu. An HMAC
    processor with integrated SHA-1 and MD5 algorithms. In ASP-DAC,
    January 2004.
    [125] Chi Wu, Kun-Yuan Hsieh, Yung-Chia Lin, Chung-Ju Wu, Wen-Li Shih,
    S. C. Chen, Chung-Kai Chen, Chien-Ching Huang, Yi-Ping You, and
    Jenq Kuen Lee. Integrating compiler and system toolkit flow for
    embedded VLIW DSP processors. In Proceedings of the 12th IEEE
    International Conference on Embedded and Real-Time Computing Systems
    and Applications (RTCSA'06), pages 215-222, Sydney, Australia,
    August 2006.
    [126] Frances Yao, Alan Demers, and Scott Shenker. A scheduling model
    for reduced CPU energy. In Proceedings of the 36th Annual Sympo-
    sium on Foundations of Computer Science (FOCS'95), pages 374-382,
    Milwaukee, Wisconsin, October 1995.
    [127] Yi-Ping You, Chung-Wen Huang, and Jenq Kuen Lee. A sink-n-hoist
    framework for leakage power reduction. In Proceedings of the ACM
    International Conference on Embedded Software (EMSOFT'05), pages
    124-133, Jersey City, New Jersey, USA, September 2005.
    [128] Yi-Ping You, Ching-Ren Lee, Jenq-Kuen Lee, and Wei-Kuan Shih.
    Real-time task scheduling for dynamically variable voltage processors.
    In Proceedings of the IEEE Workshop on Power Management for Real-
    Time and Embedded Systems, pages 5-10, Taipei, Taiwan, May 2001.
    [129] Yi-Ping You, Chingren Lee, and Jenq Kuen Lee. Compiler analysis and
    supports for leakage power reduction on microprocessors. Lecture Notes
    in Computer Science, Languages and Compilers for Parallel Computing
    (LCPC 2002 Issue), 2481:45-60, 2005.
    [130] Yi-Ping You, Chingren Lee, and Jenq Kuen Lee. Compilers for leakage
    power reduction. ACM Transactions on Design Automation of Elec-
    tronic Systems, 11(1):147-164, January 2006.
    [131] Yi-Ping You and Jenq Kuen Lee. Compiler frameworks for leak-
    age power reduction. In Student Poster Session of ACM SIG-
    PLAN/SIGBED 2005 Conference on Languages, Compilers, and Tools
    for Embedded Systems (LCTES'05), Chicago, Illinois, USA, June 2005.
    [132] Yi-Ping You, Chun-Yen Tseng, Yu-Hui Huang, Po-Chiun Huang,
    TingTing Hwang, and Sheng-Yu Hsu. Low-power techniques for net-
    work security processors. In Proceedings of the 10th Asia and South
    Pacific Design Automation Conference (ASP-DAC'05), pages 355-360,
    Shanghai, China, January 2005.
    [133] Chien-Cheng Yu, Wei-Ping Wang, and Bin-Da Liu. A new level con-
    verter for low-power applications. In The 2001 IEEE International
    Symposium on Circuits and Systems, 2001, pages 113-116, May 2001.
    [134] W. Zhang, Mahmut T. Kandemir, Narayanan Vijaykrishnan,
    Mary Jane Irwin, and V. De. Compiler support for reducing leak-
    age energy consumption. In Proceedings of the 6th Design Automation
    and Test in Europe Conference (DATE'03), pages 1146-1147, Messe
    Munich, Germany, March 2003.
    [135] Yumin Zhang, Xiaobo (Sharon) Hu, and Danny Z. Chen. Task schedul-
    ing and voltage selection for energy minimization. In Proceedings of the
    Design Automation Conference (DAC'02), pages 183-188, New Or-
    leans, Louisiana, June 2002.
    [136] V. Zivojnovic, J. Martinez, C. Schlager, and H. Meyr. DSPstone:
    A DSP-oriented benchmarking methodology. In Proceedings of the
    International Conference on Signal Processing and Technology (IC-
    SPAT'94), pages 715-720, Dallas, Texas, USA, October 1994.

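    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)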
