
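Graduate student: 黃莞爾
Huang, Wan-Eih
Thesis title: A High Accuracy Low Complexity Singular Value Decomposition Processor for MIMO Communications
Chinese title: 適用於多重輸入輸出通訊系統之高精確度低複雜度奇異值分解處理器設計
Advisor: 馬席彬
Ma, Hsi-Pin
Committee members:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2010
Academic year of graduation: 99 (ROC calendar)
Language: English
Number of pages: 82
Keywords (Chinese): singular value decomposition, multiple-input multiple-output
Keywords (English): SVD, MIMO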
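  • In recent years, multiple-input multiple-output (MIMO) technology has become increasingly important in wireless communication systems because it can increase either the data rate or the reliability of transmission. One class of MIMO techniques uses precoding to shape the signal at the transmitter and thereby improve reliability. The most widely used precoding technique today derives the precoding matrix from the singular value decomposition (SVD) of the channel matrix. This diagonalizes the MIMO channel into parallel eigen-subchannels, which reduces the mutual interference among the data streams of different antennas and also allows water-filling to achieve the maximum channel capacity. However, computing the SVD of the channel matrix requires a large number of operations and increases system complexity, so engineers must trade off hardware size, speed, and accuracy.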
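
    To make the water-filling step concrete, here is a minimal Python sketch of power allocation over the parallel eigen-subchannels that an SVD of the channel produces. It is an illustration under stated assumptions, not code from the thesis: the function names `water_filling` and `capacity`, the unit noise power, and the example singular values are all hypothetical.

```python
import math


def water_filling(singular_values, total_power, noise_power=1.0):
    """Split total_power across subchannels to maximize sum log2(1 + p_i * s_i^2 / N0).

    Hypothetical helper: each subchannel i has gain s_i^2 / N0, and its
    "floor" N0 / s_i^2 is the depth that must be filled before it gets power.
    """
    floors = [noise_power / (s * s) if s > 0 else math.inf for s in singular_values]
    # Strongest subchannels (lowest floors) first.
    order = sorted(range(len(floors)), key=lambda i: floors[i])
    powers = [0.0] * len(floors)
    # Try using the k strongest subchannels, dropping the weakest until
    # the common water level lies above every active floor.
    for k in range(len(order), 0, -1):
        active = order[:k]
        level = (total_power + sum(floors[i] for i in active)) / k
        if level > floors[active[-1]]:
            for i in active:
                powers[i] = level - floors[i]
            break
    return powers


def capacity(singular_values, powers, noise_power=1.0):
    """Sum rate in bits per channel use over the parallel subchannels."""
    return sum(
        math.log2(1.0 + p * s * s / noise_power)
        for s, p in zip(singular_values, powers)
    )


# Example (assumed values): three eigen-subchannels, unit power budget.
sigma = [2.0, 1.0, 0.1]
powers = water_filling(sigma, total_power=1.0)
equal = [1.0 / len(sigma)] * len(sigma)
```

    With these assumed singular values the sketch gives no power to the weakest subchannel and splits the budget 0.875 / 0.125 between the other two; `capacity(sigma, powers)` is then larger than `capacity(sigma, equal)`.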

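    This thesis proposes a new fast, high-accuracy, low-complexity SVD algorithm and its hardware design. The algorithm partitions the matrix and processes the two submatrices in parallel, then permutes rows and columns to form an arrowhead matrix, and then applies Givens rotations to that matrix. Each time a singular value is obtained, the problem dimension is reduced by one, so all singular values are found in sequence. The parallel processing of the submatrices and the one-dimension-at-a-time deflation increase the computation speed, while the Jacobi-like iteration ensures accuracy. The Givens rotation circuit can be built from CORDIC (Coordinate Rotation Digital Computer) circuits combined with the Approximate Rotation algorithm to reduce area, yielding a fast, high-accuracy, low-complexity design. For an 8x8 matrix, the proposed method saves up to 53.35% of the computation compared with the conventional two-sided Jacobi method.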
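
    The idea of realizing a Givens rotation with CORDIC can be sketched in Python as follows. This is a floating-point emulation under stated assumptions, not the thesis's circuit: the names `cordic_vector` and `cordic_rotate` are hypothetical, multiplication by `2.0 ** -i` stands in for a hardware right shift, the iteration count is arbitrary, and the Approximate Rotation variant is not modeled.

```python
import math


def _gain(iterations):
    # Each micro-rotation scales the vector by sqrt(1 + 2^(-2i)).
    return math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(iterations))


def cordic_vector(x, y, iterations=24):
    """Vectoring mode: rotate (x, y) onto the positive x-axis.

    Returns (r, flip, directions), where r approximates hypot(x, y) and
    (flip, directions) records the rotation so it can be replayed.
    """
    flip = x < 0  # pre-rotate by 180 degrees into the right half-plane
    if flip:
        x, y = -x, -y
    directions = []
    for i in range(iterations):
        d = -1 if y > 0 else 1  # rotate toward y = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        directions.append(d)
    return x / _gain(iterations), flip, directions


def cordic_rotate(x, y, flip, directions):
    """Rotation mode: replay recorded micro-rotations on another pair."""
    if flip:
        x, y = -x, -y
    for i, d in enumerate(directions):
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
    g = _gain(len(directions))
    return x / g, y / g


# Example (assumed values): zero the lower-left entry of [[3, 1], [4, 2]].
r, flip, dirs = cordic_vector(3.0, 4.0)        # first column -> (r, ~0)
top, bottom = cordic_rotate(1.0, 2.0, flip, dirs)  # same rotation, second column
```

    Here the first column (3, 4) is rotated to approximately (5, 0), and replaying the same micro-rotations on the second column (1, 2) gives approximately (2.2, 0.4), which matches an exact Givens rotation with c = 0.6 and s = 0.8.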

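    The proposed SVD processor was verified on a Xilinx Virtex-4 XC4VLX160 FPGA, where it reaches an operating frequency of 200 MHz. Implemented in a TSMC 0.18 μm process, it has an estimated total area of about 577 k, a power consumption of 178.19 mW, and an operating frequency of up to 190 MHz.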


    1 Introduction
      1.1 Background of the Thesis
        1.1.1 Multiple Input Multiple Output Techniques
        1.1.2 Singular Value Decomposition
      1.2 Motivation of the Thesis
      1.3 Organization of the Thesis
    2 Review of Prior Singular Value Decomposition (SVD) Algorithms
      2.1 Basic Matrix Transformations
        2.1.1 Householder Transformations
        2.1.2 Givens Rotations
      2.2 GR SVD Algorithm
      2.3 Jacobi SVD Algorithm
    3 Proposed SVD Algorithm
      3.1 Introduction
      3.2 Pre-processing
      3.3 Eigenvalue Decompositions of Submatrices
        3.3.1 Step 1: Dividing
        3.3.2 Step 2: Conquering
        3.3.3 Step 3: Combining
      3.4 Singular Value Decomposition of Arrowhead Matrix
        3.4.1 Step 1: Ordering
        3.4.2 Step 2: Deflation
      3.5 Coordinate Rotation Digital Computer Algorithm
        3.5.1 Basic Concept
        3.5.2 Approximate Rotation
        3.5.3 Givens Rotation Implemented by CORDIC Algorithm
      3.6 Mean Square Error Analysis
      3.7 Computational Complexity Analysis
    4 Architecture Design
      4.1 Proposed Architecture
      4.2 Pre-processing Architecture
        4.2.1 Multiplier and Accumulator (MAC) Set
        4.2.2 Normal Vector Generation
      4.3 Data Arrangement and CORDIC Array Architectures
        4.3.1 CORDIC Module A
        4.3.2 CORDIC Module B
        4.3.3 Ordering
      4.4 Left- and Right-Singular Matrix Accumulator Architecture
      4.5 Word-Length Determination
        4.5.1 Word-Length Determination
        4.5.2 Word-Length in Proposed Architecture
    5 Hardware Design
      5.1 Specification
      5.2 Hardware Design for Pre-processing
        5.2.1 General Component Circuit
        5.2.2 Pre-processing
      5.3 Hardware Design for CORDIC
        5.3.1 CORDIC Module A
        5.3.2 CORDIC Module B
      5.4 Hardware Design for Singular Matrix Accumulator
    6 Implementation Results
      6.1 Introduction
      6.2 FPGA Emulation
        6.2.1 FPGA Emulation
        6.2.2 Hardware Cost Analysis
      6.3 Pre-Layout Simulation Result
        6.3.1 RTL and Pre-layout Simulation
        6.3.2 Comparison
    7 Conclusions and Future Works
      7.1 Conclusions
      7.2 Future Works


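    Full text release date: full text not authorized for public release (on-campus network)
    Full text release date: full text not authorized for public release (off-campus network)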
