Graduate student: | 洪明郁 Ming-Yu Hung |
---|---|
Thesis title: | 支援推測式多緒計算機結構的編譯器設計 Compiler supports for Optimizing Speculative Multithreading Architecture |
Advisor: | 李政崑 Jenq-Kuen Lee |
Committee members: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2004 |
Academic year of graduation: | 92 |
Language: | English |
Number of pages: | 39 |
Keywords (Chinese): | 推測式多緒計算機結構、相依關係分析、同名機率資訊分析、平行化 |
Keywords (English): | Speculative multithreading, Dependence analysis, Probabilistic points-to analysis, Parallelization |
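With advances in VLSI technology, many special-purpose features can now be built into a single processor. The speculative multithreading (SpMT) architecture is one of them. It supports both speculative execution and multithreading, and it can exploit parallelism across threads. By running threads speculatively and in parallel, such an architecture can improve the performance of ordinary programs. Speculation on an SpMT architecture can fail, however, so when a program contains a high degree of dependence, performance may degrade rather than improve: on a misspeculation the processor must recover, and the recovery code must first squash the existing threads and then execute the original, non-speculative program. Quantifying the dependences in a program is therefore important for an SpMT architecture, because the result can be used to decide whether the architecture should execute speculatively at all, which prevents performance degradation.
This thesis proposes a method for quantifying the dependences between loop iterations; this quantitative information can be used to judge whether a loop is suitable for speculative multithreaded execution. We first propose probabilistic points-to analysis (PPA), a method for collecting probability information about pointer aliases in a program, and then use PPA to derive the dependence probabilities between pointers. We then use the derived dependence probabilities so that the speculative multithreading machine always makes the correct decision and does not suffer performance degradation. Our experimental platform is the SImulator for Multithreaded Computer Architectures (SIMCA). We use the special inserted instructions that SIMCA provides to simulate SpMT behavior.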
With the progress of VLSI technology, more and more features are being added to a single processor. The speculative multithreading (SpMT) architecture is one of them. It combines speculative execution and multithreading in one processor, and it can exploit thread-level parallelism that cannot be identified statically. Speedup can be obtained by speculatively executing, in parallel, threads that are extracted from a sequential program. However, performance may degrade if the threads are highly dependent. A recovery mechanism is activated when a speculative thread violates the sequential semantics. The recovery action usually incurs a very high penalty, because it must squash all live threads before running the recovery code. Therefore, it is essential for SpMT to quantify the degree of dependence and to turn off speculation if the degree of loop-carried dependence exceeds a certain threshold.
This paper presents a technique that quantitatively computes loop-carried dependences; this information can be used to determine whether loop iterations should be executed in parallel by speculative threads. The technique has two steps. First, probabilistic points-to analysis is performed to estimate the probabilities of points-to relationships when programs contain pointer references. Second, the degree of dependence between loop iterations is computed quantitatively from those probabilities. Experimental results show that compiler-directed thread-level speculation based on the information gathered by this technique guarantees that the architecture always makes the right decision on the experimental platform, the SImulator for Multithreaded Computer Architectures (SIMCA). SIMCA is modeled as an SpMT architecture by inserting SIMCA-specific instructions.
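The two steps described above can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the function names, the example points-to distributions, the 0.2 threshold, and the assumption that the two pointers' distributions are independent are all hypothetical choices made here for illustration.

```python
def alias_probability(pts_p, pts_q):
    """Estimate the probability that two pointers refer to the same location.

    Each argument maps a location name to the probability that the pointer
    points to it. Treating the two distributions as independent is an
    illustrative assumption, not something stated in the abstract.
    """
    return sum(prob * pts_q.get(loc, 0.0) for loc, prob in pts_p.items())


def should_speculate(dep_probability, threshold=0.2):
    """Allow speculative threads only when the estimated loop-carried
    dependence probability does not exceed the threshold.

    The default threshold of 0.2 is a hypothetical value.
    """
    return dep_probability <= threshold


# Hypothetical points-to distributions: a pointer written in iteration i
# and a pointer read in iteration i + 1.
write_ptr = {"a": 0.9, "b": 0.1}
read_ptr = {"a": 0.1, "c": 0.9}

dep = alias_probability(write_ptr, read_ptr)
print(f"estimated dependence probability: {dep:.2f}")
print("speculate" if should_speculate(dep) else "run sequentially")
```

In this example the two pointers can only collide on location `a`, so the estimated dependence probability is low and the loop would be a candidate for speculative execution. A loop whose pointers almost always alias would fall above the threshold and run sequentially instead.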