
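Graduate student: Albert Y.-M. Liu (劉元明)
Thesis title: A Multiply-And-Accumulate Module Generator (乘加模組產生器之研製)
Advisor: Youn-Long Lin (林永隆)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2000
Academic year of graduation: 88
Language: English
Number of pages: 32
Keywords (Chinese): multiply-accumulate unit, multiplier, adder, digital signal processing
Keywords (English): MAC, Multiplier, Booth Encoding, Sign-extension Prevention, Wallace addition tree, DSP, Accumulator, Multiply-and-Accumulate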
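    Multiply-And-Accumulate (MAC) is one of the most frequently used operations in most DSP applications. We propose a software approach that automatically generates high-performance, synthesizable MAC modules in the Verilog hardware description language. The software combines several novel techniques: radix-4 Booth encoding, the three dimensional method, a sign-extension prevention scheme, and an adder that combines carry-select and carry-look-ahead. It also offers a number of user-configurable options: (1) the bit widths of the input and output signals; (2) the number representation of the input and output signals (signed, unsigned, or determined by other input signals); (3) the number of pipeline stages; (4) whether the result is altered when overflow occurs; (5) the accumulator type (addition only, or addition and subtraction); (6) pipeline stall; (7) the initial value of the result. A commonly used MAC module (16x16 input signals, 40-bit output, two-stage pipeline) can be generated within seconds and runs at up to 280 MHz (in post-layout simulation, typical case, when targeted toward a TSMC 0.35μm CMOS cell library).
    To reduce design time, software that automatically generates user-customized MAC modules would be a great help to designers of DSP integrated circuits. We have developed a MAC module generator that provides many options, which users set according to their design requirements to produce the MAC module they need.

    Our MAC module generator can produce four different architectures: three-stage pipeline, two-stage pipeline, one-stage pipeline, and combinational circuit. A MAC module contains one multiplier and one adder. The multiplier consists of a radix-4 Booth encoder, a modified Wallace addition tree, and a sign-extension prevention scheme. The composition of the adder depends on the MAC architecture: if the MAC module is a three-stage or two-stage pipeline, the adder is a three-operand adder; otherwise it is a two-operand adder.

    The main contribution of our software is that it proposes a corresponding design method for each MAC architecture in order to achieve the highest performance.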
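    The multiplier structure named in the abstract above (radix-4 Booth encoding followed by carry-save reduction of the partial products) can be illustrated with a small behavioral sketch in Python. This is not the thesis's generator or its Verilog output: the function names `booth_radix4_digits`, `carry_save_add`, and `booth_multiply` are hypothetical, the reduction is a simple chain of 3:2 carry-save steps rather than the three-dimensional Wallace tree, and the sign-extension prevention scheme is not modeled because Python integers are unbounded.

```python
def booth_radix4_digits(multiplier, width):
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}; digit i carries weight 4**i.
    `multiplier` is assumed to be a Python int within the signed range.
    """
    if width % 2:
        width += 1  # sign-extend by one bit so the bits pair up evenly
    bits = multiplier & ((1 << width) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, width, 2):
        low = (bits >> i) & 1
        high = (bits >> (i + 1)) & 1
        digits.append(prev + low - 2 * high)
        prev = high
    return digits


def carry_save_add(a, b, c):
    """3:2 compressor: reduce three operands to a sum word and a carry word."""
    partial_sum = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return partial_sum, carry


def booth_multiply(multiplicand, multiplier, width):
    """Multiply by summing Booth partial products with carry-save steps."""
    digits = booth_radix4_digits(multiplier, width)
    rows = [(d * multiplicand) << (2 * i) for i, d in enumerate(digits)]
    # Illustrative linear reduction; a Wallace tree would do this in parallel.
    while len(rows) > 2:
        s, c = carry_save_add(rows[0], rows[1], rows[2])
        rows = rows[3:] + [s, c]
    return sum(rows)  # stands in for the final carry-propagate adder
```

    For a 16-bit multiplier the recoding yields 8 partial products instead of 16, which is the usual motivation for radix-4 Booth encoding. The same `carry_save_add` step also shows, under the assumption that the three-operand adder takes a sum word, a carry word, and the accumulator value, how three operands can be reduced to two before a single carry-propagate addition.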


    Multiply-And-Accumulate (MAC) is the most frequently used operation in many DSP applications. We propose a software method that generates high-performance MAC units in synthesizable HDL format. Our tool integrates several novel techniques, including a modified radix-4 Booth encoding, a three-dimensional Wallace tree, a sign-extension prevention scheme, and a hybrid carry-select/carry-look-ahead adder. It allows users to specify the number of bits in both inputs and the output, the number system (signed, unsigned, or decided by command inputs), the number of pipeline stages, a saturation option on overflow, the accumulator type (“addition only” or “addition and subtraction”), and pipeline stall as well as accumulator initialization capability. A typical MAC unit (16x16 inputs, 40-bit accumulation, 2-stage pipeline) can be generated within seconds and runs at over 280 MHz in typical-case post-layout simulation when targeted toward a TSMC 0.35μm CMOS cell library.
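    To make the configuration options in the abstract concrete, the following Python sketch models the arithmetic behavior of a signed MAC with a configurable accumulator width, an optional saturation mode, an add/subtract accumulator, and an initial accumulator value. It is an assumption-laden illustration, not the generated Verilog: the class name `MacModel`, its parameters, and the wrap-around behavior when saturation is disabled are hypothetical, and pipelining, stall, and unsigned operands are not modeled.

```python
class MacModel:
    """Behavioral model of a signed multiply-and-accumulate unit (hypothetical)."""

    def __init__(self, acc_bits=40, saturate=True, initial=0):
        self.acc_bits = acc_bits
        self.saturate = saturate
        self.max_value = (1 << (acc_bits - 1)) - 1
        self.min_value = -(1 << (acc_bits - 1))
        self.acc = self._fit(initial)  # accumulator initialization

    def _fit(self, value):
        """Bring `value` into the accumulator range by clamping or wrapping."""
        if self.min_value <= value <= self.max_value:
            return value
        if self.saturate:
            # Saturation option: clamp to the nearest representable value.
            return self.max_value if value > self.max_value else self.min_value
        # Assumed default without saturation: two's-complement wrap-around.
        wrapped = value & ((1 << self.acc_bits) - 1)
        if wrapped > self.max_value:
            wrapped -= 1 << self.acc_bits
        return wrapped

    def step(self, a, b, subtract=False):
        """Accumulate a*b (or subtract it) and return the new accumulator."""
        product = a * b
        result = self.acc - product if subtract else self.acc + product
        self.acc = self._fit(result)
        return self.acc
```

    With 16-bit signed inputs and a 40-bit accumulator, as in the typical configuration quoted above, each product needs at most 32 bits, so the accumulator has 8 guard bits and overflow handling only matters after many accumulations.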

    ABSTRACT
    CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    CHAPTER 1. INTRODUCTION
    CHAPTER 2. RELATED WORK
    CHAPTER 3. MAC CIRCUIT DESIGN
    3.1 MAC Architecture
    3.2 MAC Design Key Features
    3.2.1 Multiplier Circuit Design
    3.2.1-1 Radix-4 Booth Encoding
    3.2.1-2 Sign-extension Prevention Scheme
    3.2.1-3 Three Dimensional Wallace Tree
    3.2.2 Accumulation Circuit Design
    3.2.2-1 Carry Save Adder Structure
    3.2.2-2 Two Inputs Final Adder
    3.3 The I/O Interface of MAC/Multiplier Module
    3.3.1 Pin Description
    3.3.2 Timing Waveform Analysis
    CHAPTER 4. EXPERIMENTS
    CHAPTER 5. APPLICATION NOTE
    5.1 Application Method
    5.1.1 System Requirements
    5.1.2 Program Graphic User Interface
    5.2 Application Example-Dual MAC Architecture
    5.2.1 Functionality Description
    5.2.2 Interface Description
    CHAPTER 6. CONCLUSION
    REFERENCE


    Full-text availability: not authorized for public release on the campus network, the off-campus network, or the National Central Library's Taiwan theses and dissertations system.