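Graduate student: 郭景翔 (Ching-Hsiang Kuo)
Thesis title: Evaluation of PAC DSP – Based on Typical DSP Algorithm Kernels (基於數位訊號處理核心演算法之PAC DSP效能評估)
Advisor: 石維寬 (Wei-Kuan Shih)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 48
Keywords: PACDSP, Evaluation, Algorithm Kernel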
  • Advances in hardware technology have driven fast-growing demand for portable multimedia and communication devices, which now play a large part in daily life. The digital signal processing algorithms behind them, such as audio and video codecs, audio filtering, and error correction for communications, are also being improved continuously. To develop products quickly against the latest specifications and gain a time-to-market advantage without sacrificing execution efficiency, more and more portable multimedia device vendors are adopting DSP solutions. In response to this trend, the SoC Technology Center (STC) of ITRI began in 2004 to develop the first VLIW DSP in Taiwan, PACDSP (Parallel Architecture Core). The objective of PACDSP is to provide portable multimedia device vendors with a complete solution and to become the best development platform.

    In this thesis, based on a study of the PACDSP core architecture, we present implementations of a set of evaluation algorithm kernels that aim for the best performance on PACDSP. We describe these implementations from the software programmer's point of view in terms of algorithm flow, data partitioning across the two clusters, the relationship between the instruction combinations used and the hardware architecture, and algorithm-level optimization, and we point out how the hardware architecture and instruction set affect the development of basic DSP algorithms. We then compare cycle counts and code sizes with those of existing DSPs on the market, analyze the impact of the PACDSP architecture and instruction set on our implementations, and propose improvements to the PACDSP architecture.
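As a rough illustration of what partitioning an algorithm kernel across two clusters can mean, the sketch below splits a vector addition into two contiguous halves. It is plain C, not PACDSP assembly and not the thesis's implementation. The function names, the half-and-half split, and the saturating 16-bit arithmetic are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating 16-bit add; saturation is an assumption here, chosen because
 * it is common in fixed-point DSP code. */
static int16_t sat_add16(int16_t a, int16_t b) {
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* Work for one hypothetical cluster: the slice [begin, end) of the vectors. */
static void vec_add_slice(const int16_t *x, const int16_t *y, int16_t *out,
                          size_t begin, size_t end) {
    for (size_t i = begin; i < end; ++i) {
        out[i] = sat_add16(x[i], y[i]);
    }
}

/* Split the vectors into two halves, one per cluster. In plain C the two
 * calls run one after the other; on a two-cluster VLIW machine the two
 * slices could be scheduled side by side because they share no data. */
void vec_add_two_clusters(const int16_t *x, const int16_t *y, int16_t *out,
                          size_t n) {
    size_t half = n / 2;
    vec_add_slice(x, y, out, 0, half);   /* hypothetical cluster 1 */
    vec_add_slice(x, y, out, half, n);   /* hypothetical cluster 2 */
}
```

The two slices are independent, which is what makes this kernel easy to partition. Kernels with dependencies between samples, such as an IIR filter, would need a different split.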

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Chapter 1 Introduction
    Chapter 2 Benchmarking the DSP Processors
        2.1 Benchmarking Approaches
            2.1.1 Traditional Evaluation Approach
            2.1.2 Application Base Evaluation Approach
            2.1.3 Algorithm-Kernel Based Evaluation Approach
        2.2 PACDSP
            2.2.1 Overview
            2.2.2 Scalar Unit
            2.2.3 VLIW Datapath
            2.2.4 Register Organization
            2.2.5 Memory System
            2.2.6 Pipeline
            2.2.7 Memory Addressing Modes
            2.2.8 Instruction Set
            2.2.9 Assembly Language Format
    Chapter 3 Typical Algorithm Kernel Benchmarking Functions
        3.1 Introduction to the Benchmarking Functions
        3.2 Benchmarking Functions Implementation on PACDSP v3.0
            3.2.1 Vector Addition
            3.2.2 Vector Multiply-and-Accumulate
            3.2.3 Vector Maximum
            3.2.4 Real Block FIR filter
            3.2.5 Complex Block FIR filter
            3.2.6 Single-Sample FIR filter
            3.2.7 IIR filter
            3.2.8 LMS filter
            3.2.9 Viterbi Decoder
            3.2.10 Bit Extracting
            3.2.11 FFT
    Chapter 4 Experiment Results and Analysis
        4.1 Experiment Environments
        4.2 Experiment Results
            4.2.1 Cycle Count
            4.2.2 Code Size
    Chapter 5 Conclusion and Future Work
        5.1 Conclusion
        5.2 Future Work
    References

