
Author: 謝昀珊 (Hsieh, Yun-Shan)
Thesis title: 基於斯格明子賽道記憶體之記憶體–儲存系統 (Memory–Storage Systems based on Skyrmion Racetrack Memories)
Advisor: 石維寬 (Shih, Wei-Kuan)
Oral defense committee:
黃彥男 (Huang, Yen-Nun)
張原豪 (Chang, Yuan-Hao)
王廷基 (Wang, Ting-Chi)
何宗易 (Ho, Tsung-Yi)
陳俊良 (Chen, Jiann-Liang)
謝仁偉 (Hsieh, Jen-Wei)
黃柏鈞 (Huang, Po-Chun)
梁郁珮 (Liang, Yu-Pei)
Degree: Doctor
Department: 電機資訊學院 - 資訊工程學系 (College of Electrical Engineering and Computer Science - Department of Computer Science)
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: English
Number of pages: 91
Keywords (Chinese): 斯格明子賽道記憶體、記憶體–儲存系統、背靠背資料配置方式、排序演算法、移位錯誤、資料表現錯誤
Keywords (English): skyrmion racetrack memories (SK-RMs), memory–storage systems, back-to-back data placement, sorting algorithm, position error, data representation error

    Modern nonvolatile memories (NVMs) are widely recognized as energy-efficient replacements for classical memory/storage media such as SRAM, DRAM, and mechanical hard disks. Among the popular NVMs, the skyrmion racetrack memory (SK-RM) is well known for its nanosecond-level access performance, ultra-high storage density, and support for random data insertion and deletion. However, SK-RM relies on time-consuming shift operations to align a data bit with an access port before the bit can be read or written, which can significantly degrade access performance. Shift operations also introduce issues of their own, namely the position error and the data representation problem, which considerably affect data reliability.

    Because of the distinct characteristics of SK-RM, existing algorithms designed for classical media might suffer serious performance degradation when run on SK-RM. To fully unleash the potential of SK-RM, such algorithms should be redesigned with these characteristics in mind. In particular, many existing algorithms tend to access in-memory data in a random-hopping fashion, which generates many shift operations on SK-RM. Eliminating unnecessary shift operations is therefore crucial to algorithm performance. In many modern applications, such as multimedia and data analysis, it is common to process two or more arrays/vectors of data to perform a computation, and an appropriate placement of the data in these arrays/vectors is critical for avoiding unnecessary shifts. This observation motivates the first part of this dissertation, which proposes a recursive back-to-back data placement scheme to minimize shift operations. To evaluate the scheme, we take sorting as a case study and propose a novel shift-limited sorting algorithm for SK-RM.
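
    The cost of random-hopping access can be illustrated with a toy model. The sketch below is not the dissertation's algorithm or placement scheme. It assumes a single track with one access port, where moving from the currently aligned bit to another bit costs one shift per position of distance. The function name `total_shifts` and the track length of 64 are hypothetical choices made only for illustration.

```python
import random


def total_shifts(access_order, start=0):
    """Count shifts on a hypothetical single-port track.

    Assumption: aligning bit b with the port when bit a is currently
    aligned costs |a - b| shifts (one shift per bit position).
    """
    aligned = start
    shifts = 0
    for index in access_order:
        shifts += abs(index - aligned)
        aligned = index
    return shifts


n = 64  # hypothetical number of bit positions on the track
sequential = list(range(n))

# A random-hopping order over the same positions (fixed seed for repeatability).
hopping = sequential[:]
random.Random(0).shuffle(hopping)

print("sequential:", total_shifts(sequential))
print("random hopping:", total_shifts(hopping))
```

    Under this model, any order that visits every position starting from bit 0 needs at least as many shifts as the sequential scan, which is why access order and data placement matter so much on a shift-based medium.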

    Besides access performance, data reliability is another key issue that must be considered when designing a memory or storage system. Although error correction codes have been proposed to detect and correct bit errors, their time-consuming encoding and decoding processes cannot match the nanosecond-level access latency of SK-RM. Observing the trade-off among data reliability, access performance, and space utilization, the second part of this dissertation proposes a granularity-driven management scheme for SK-RM. While eliminating the errors caused by the position error and the data representation problem, the proposed scheme aims to jointly optimize access performance and space utilization. To this end, it adaptively selects different combinations of data encoding, layout, and indexing schemes for data of different granularities. Moreover, we investigate the port selection problem under the proposed data layouts to minimize the shift overhead of data accesses.
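
    To give a feel for why port selection affects shift overhead, here is a minimal sketch under simplifying assumptions. It is not the port selection method proposed in the dissertation. It assumes a track whose data can be shifted freely in either direction, ignores any overhead regions at the track ends, and picks the cheapest port greedily for each access, which is not necessarily optimal over a whole access sequence. The names `cheapest_access` and `run`, and the port positions, are hypothetical.

```python
def cheapest_access(bit, ports, offset):
    """Pick the port that needs the fewest shifts to reach `bit`.

    Assumption: `offset` is how far the data are currently shifted, so
    bit i sits at physical position i + offset. Aligning bit i with a
    port at position p requires the offset to become p - i.
    Returns (shifts, chosen_port, new_offset).
    """
    best_port = min(ports, key=lambda p: abs((p - bit) - offset))
    new_offset = best_port - bit
    return abs(new_offset - offset), best_port, new_offset


def run(accesses, ports):
    """Total shifts for a sequence of accesses with greedy port choice."""
    offset = 0
    total = 0
    for bit in accesses:
        shifts, _, offset = cheapest_access(bit, ports, offset)
        total += shifts
    return total


accesses = [0, 20, 3, 30]
print("one port: ", run(accesses, ports=[0]))      # 64 shifts
print("two ports:", run(accesses, ports=[0, 16]))  # 16 shifts
```

    In this toy setting, adding a second port and choosing between the two per access cuts the total from 64 shifts to 16 for the same access sequence.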

    Abstract (in Chinese)
    Abstract
    Acknowledgement
    Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Background
        2.1 Overview of Racetrack Memories (RMs)
        2.2 SK-RM in the Memory Hierarchy
        2.3 Organization of SK-RMs
        2.4 The Access Interface of SK-RMs
    Chapter 3 Shift-limited Sort: Optimizing Sorting Performance on Skyrmion Memory-Based Systems
        3.1 Motivations
            3.1.1 The Sorting Algorithms and a Motivational Example
            3.1.2 Motivations
        3.2 Shift-limited Sort for SK-RMs
            3.2.1 Working Principle
            3.2.2 Deriving the Back-to-Back Sorting Order of Data Segments
            3.2.3 Segmental-Merging Routine
            3.2.4 Working Example
        3.3 Analytical Studies
        3.4 Experimental Studies
            3.4.1 Experimental Settings
            3.4.2 Experimental Results
        3.5 Work Summary
    Chapter 4 Granularity-driven Management for Reliable and Efficient Skyrmion Racetrack Memories
        4.1 Motivations
        4.2 Granularity-driven Data Management of SK-RM
            4.2.1 Flyweight Data Encoding with Guard Skyrmions
            4.2.2 Granularity-driven Data Layouts and Management
            4.2.3 Implementation Remarks
        4.3 Analytical Studies
            4.3.1 Mode S
            4.3.2 Mode M
            4.3.3 Mode L
        4.4 Experimental Studies
            4.4.1 Experimental Settings
            4.4.2 Experimental Results
        4.5 Discussions
        4.6 Work Summary
    Chapter 5 Conclusion and Future Work
    Bibliography

