Evaluation of PAC DSP – Based on Typical DSP Algorithm Kernels

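Graduate student:	郭景翔 Ching-Hsiang Kuo
Thesis title:	Evaluation of PAC DSP – Based on Typical DSP Algorithm Kernels (基於數位訊號處理核心演算法之PAC DSP效能評估)
Advisor:	石維寬 Wei-Kuan Shih
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2006
Academic year of graduation:	94 (ROC calendar, i.e. 2005-2006)
Language:	English
Number of pages:	48
Keywords (Chinese):	PACDSP, 效能評估 (performance evaluation), 核心演算法 (algorithm kernel)
Keywords (English):	PACDSP, Evaluation, Algorithm Kernel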

With advances in hardware technology, portable multimedia and communication devices, for which demand grows steadily, have come to dominate daily life. The digital signal processing algorithms behind them, such as audio and video codecs, audio filtering, and error correction for electronic communication, are also being improved continually. To develop new products quickly against the latest specifications and gain a time-to-market advantage without sacrificing execution efficiency, more and more multimedia device designers are adopting DSP solutions. In response to this trend, the SoC Technology Center (STC) at ITRI began developing Taiwan's first VLIW DSP, PACDSP (Parallel Architecture Core), in 2004. Its objective is to give users a complete solution for portable multimedia applications and, in turn, to become the best development platform in the world.

In this thesis, based on a study of the PACDSP core architecture, we propose implementations that allow the algorithm kernels used in our evaluation to achieve the best performance on PACDSP. We examine the algorithm flow, the partitioning of data across the two clusters, the relationship between the instruction combinations used and the hardware architecture, and algorithm-level optimizations. From the programmer's point of view, we identify how the hardware architecture and instruction set affect the development of basic DSP application algorithms. We then compare cycle counts and code sizes with those of existing commercial DSPs, analyze the impact of the PACDSP architecture and instruction set on our implementations, and offer suggestions for improving the architecture.
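As a rough illustration of what partitioning a kernel's data across two clusters can look like, the sketch below splits a vector multiply-and-accumulate into two halves that are accumulated independently and then combined. It is plain portable C, not PACDSP assembly, and it is not taken from the thesis: the function names `mac_slice` and `vector_mac_two_way`, the 16-bit sample type, and the even split are all assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply-accumulate over one contiguous slice.
 * On a two-cluster machine, each cluster could run a loop like this on
 * its own slice; here the two calls simply run one after the other. */
static int64_t mac_slice(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Widen before multiplying so the 16x16 product cannot overflow. */
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}

/* Hypothetical kernel: split the input into two halves (one per assumed
 * cluster), accumulate each half separately, then add the partial sums. */
int64_t vector_mac_two_way(const int16_t *a, const int16_t *b, size_t n)
{
    size_t half = n / 2;
    int64_t acc0 = mac_slice(a, b, half);                   /* first half  */
    int64_t acc1 = mac_slice(a + half, b + half, n - half); /* second half */
    return acc0 + acc1;
}
```

Because the two partial sums have no data dependence on each other until the final addition, a split of this kind is what would let two clusters work in parallel; how the actual PACDSP implementations divide the data is described in the thesis itself, not here.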

Chinese Abstract    I
Abstract    II
Acknowledgements    III
Chapter 1 Introduction    1
Chapter 2 Benchmarking the DSP Processors    2
1 Benchmarking Approaches    2
1.1 Traditional Evaluation Approach    2
1.2 Application Base Evaluation Approach    2
1.3 Algorithm-Kernel Based Evaluation Approach    3
2 PACDSP    3
2.1 Overview    3
2.2 Scalar Unit    4
2.3 VLIW Datapath    5
2.4 Register Organization    5
2.5 Memory System    6
2.6 Pipeline    7
2.7 Memory Addressing Modes    8
2.8 Instruction Set    8
2.9 Assembly Language Format    10
Chapter 3 Typical Algorithm Kernel Benchmarking Functions    11
1 Introduction to the Benchmarking Functions    11
2 Benchmarking Functions Implementation on PACDSP v3.0    20
2.1 Vector Addition    20
2.2 Vector Multiply-and-Accumulate    22
2.3 Vector Maximum    23
2.4 Real Block FIR filter    23
2.5 Complex Block FIR filter    24
2.6 Single-Sample FIR filter    24
2.7 IIR filter    25
2.8 LMS filter    26
2.9 Viterbi Decoder    26
2.10 Bit Extracting    29
2.11 FFT    30
Chapter 4 Experiment Results and Analysis    34
1 Experiment Environments    34
2 Experiment Results    34
2.1 Cycle Count    34
2.2 Code Size    40
Chapter 5 Conclusion and Future Work    46
1 Conclusion    46
2 Future Work    46
References    47

                                

[1] Jia-Ming Chen, Hsin-Wen Wei, Shau-Yin Tseng, Jenq-Kuen Lee, and Wei-Kuan Shih, "An Experience on the Programming for the PACDSP, a VLIW DSP Processor," Conference on VLSI Design/CAD, Taiwan, August 9-12, 2005.
[2] Te-Shin Yang and Jih-Ching Chiu, "Vectorized Code Scheduling Method for the FFT Algorithm in VLIW Architecture."
[3] BDTI Whitepaper, "Benchmarking Processors for DSP Applications," presented at the University of Texas at Dallas, March 1, 2006.
[4] Eleanor Chu and Alan George, "Inside the FFT Black Box: Serial and Parallel Fast Fourier Transform Algorithms (Computational Mathematics S.)," CRC Press Inc., US.
[5] Rulph Chassaing, "DSP Applications Using C and the TMS320C6x DSK," John Wiley & Sons, Inc., 2002.
[6] Motorola, Inc., and Agere Systems, "How To Implement a Viterbi Decoder on the StarCore SC140," Application Note ANSC140VIT/D, July 18, 2000.
[7] I-Dao Liao and Chuan-Hua Chang, "A Digital Signal Processor for Next-generation Mobile Platform," SoC Technical Journal, Vol. 2, pp. 9-21, May 2005.
[8] E. Tan and W. Heinzelman, "DSP Architectures: Past, Present and Future," Computer Architecture News, Vol. 31, No. 3, pp. 6-19, June 2003.
[9] BDTI Whitepaper, "Evaluating DSP Processor Performance."
[10] Norbert A. Pilz, Boris Lerner, and Kenneth Adamson, "Parallel FFT Implementations on Fixed-Point DSP-Cores with Subword-Parallelism," Irish Signals and Systems Conference, September 2005.
[11] PACDSP v3.0 Software Developer's Bible – Vol.1 Software Developer's Guide.
[12] PACDSP v3.0 Software Developer's Bible – Vol.2 Instruction Set Manual.
[13] K. Nadehara, T. Miyazaki, and I. Kuroda, "Radix-4 FFT implementation using SIMD multimedia instructions," IEEE Conference on Acoustics, Speech, and Signal Processing, 1999, pp. 2131-2134.

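Full-text availability: not authorized for public release (campus network)
Full-text availability: not authorized for public release (off-campus network)
Full-text availability: not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)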
