
Author: Lin, Yi-Syuan (林憶萱)
Thesis title: Designing an integrated DNA storage and computing for next-generation computer system (整合大型DNA儲存系統的新興DNA運算電腦系統)
Advisor: Shih, Wei-Kuan (石維寬)
Oral defense committee: Chang, Yuan-Hao (張原豪)
Chen, Shuo-Han (陳碩漢)
Liang, Yu-Pei (梁郁珮)
Chen, Yen-Ting (陳彥廷)
Chen, Yu-Fang (陳郁方)
Chao, Jerry (周志遠)
Degree: Doctoral
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2023
Academic year of graduation: 112 (ROC calendar)
Language: English
Number of pages: 77
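Keywords (translated from Chinese): Very Large Datasets, Clustering Algorithm, DNA Storage System, DNA Computer, Microfluidics
Keywords (English): DNA Storage, DNA Computing, Digital Microfluidics, Archived Storage, Clustering Algorithm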
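  • As technology advances, the demand for data storage keeps growing, and traditional storage media can no longer keep pace with this rapid growth. Deoxyribonucleic acid (DNA) storage is therefore regarded as an attractive alternative storage medium. DNA storage offers high density together with long-term durability and stability, which makes it an ideal choice for meeting ever-growing data demands. However, DNA data storage still faces several challenges, including high read and write costs.
    Fortunately, DNA computing offers a feasible solution to these challenges. It allows computation to be performed directly on the data held in the DNA storage architecture, without an expensive conversion to a conventional binary format. This direct computation greatly reduces time and cost while improving computational efficiency.
    Against this background, the main goal of this research is to propose an efficient and complete DNA computer framework that includes a DNA storage system, which involves developing a computer system that fully exploits the advantages of DNA storage while addressing its limitations. The system will integrate the DNA storage architecture with a DNA computing unit to provide efficient data storage and computation.
    In our approach, we will explore how to best manage the data in the DNA storage architecture, including data placement and classification. We will design and optimize the organization of the DNA storage architecture to ensure fast and efficient data access. We will also study how to avoid contamination and errors while DNA data is being transported.
    To achieve this goal, we will combine DNA computing with digital microfluidics to automate the processing and computation of DNA data. This will help improve the efficiency and accuracy of the DNA computer system. The solution has the potential to transform data storage and computing, and it opens a new path for managing and processing large-scale digital data.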


    With technological advancements, the demand for data storage continues to increase and may surpass the capabilities of traditional storage media in the future. As a result, deoxyribonucleic acid (DNA) storage has emerged as an appealing alternative because of its high density, long-term durability, and stability. DNA storage offers an ideal solution to meet the escalating data demands. However, several challenges persist in DNA storage, notably the high cost of read and write operations.
    Fortunately, DNA computing technology presents a possible solution to these challenges. By enabling computation directly within DNA storage, it eliminates expensive conversions to traditional binary formats. This direct computation offers notable time and cost savings and improves computational efficiency.
    The primary objective of this research is to propose an efficient and comprehensive DNA computing framework that incorporates a DNA storage system. The goal is to develop a computing system that maximizes the benefits of DNA storage while addressing its limitations. Integrating the DNA storage architecture with the DNA computing unit enables efficient data storage and computation.
    Our method investigates optimal data management strategies within DNA storage, encompassing data placement and classification. We will design and optimize the organizational structure of the DNA storage framework to ensure efficient data access. In addition, we will explore methods to mitigate contamination and errors during DNA data transportation.
    To achieve these objectives, we will use digital microfluidics to automate the processing and computation of DNA data. This framework has the potential to significantly enhance the efficiency and accuracy of DNA computing systems. Ultimately, this research may offer an alternative avenue for managing and processing large-scale digital data.

    Contents
    Abstract (Chinese)
    Abstract
    Acknowledgements (Chinese)
    Contents
    List of Figures
    List of Tables
    List of Algorithms
    1 Introduction
    1.1 DNA Storage
    1.2 DNA Computing
    2 Background and Motivation
    2.1 DNA Basics
    2.2 DNA-related technology
    2.2.1 DNA synthesis and encoding
    2.2.2 DNA sequencing and decoding
    2.2.3 PCR and Retrieve
    2.2.4 OE-PCR for Insertion and Deletion
    2.3 DNA storage operations with modern technology
    2.3.1 A Brief History of DNA-based storage system
    2.3.2 A DNA-Based Storage System: the Four Core Operations
    2.4 DNA computing
    2.4.1 Advancements in DNA Computing: From Logic Gates to Complex Circuits
    2.4.2 DNA Transport with Digital Microfluidics
    2.5 Motivation
    3 Methodology
    3.1 A Write-Reduction Encoding and Accessing Technology for a Hybrid-Electronic DNA Storage System
    3.2 Key Store
    3.2.1 VERA: Version Editing Recovery Approach
    3.2.2 Index scheme in Key store
    3.2.3 Additional usage of key store
    3.3 Data preprocessor
    3.4 Working procedure of WREATH
    3.5 DCSS: a DNA computer with a storage system
    3.5.1 Host System Interaction
    3.5.2 DNA Storage Process
    3.5.3 DNA Computing Unit
    3.5.4 Working procedure of DNA Computer
    4 Performance Evaluation
    4.1 DNA Storage
    4.1.1 Experimental setting
    4.1.2 Experiment Results
    4.2 DNA Computer System
    4.2.1 Experiment setting
    4.2.2 Results
    5 Conclusion
    Bibliography
    6 Publication List

