
Graduate Student: Wang, Yen-Hsiang (王彥翔)
Thesis Title: Beyond Reproducibility: Toward Computational Resilience in Scientific Research (超越再現性:邁向科學研究中的計算韌性)
Advisor: Ray, Soumya (雷松亞)
Committee Members: Yang, Lee-Wei (楊立威); Danks, Nicholas Patrick (王尼克)
Degree: Master
Department: College of Technology Management, Institute of Service Science
Year of Publication: 2022
Academic Year of Graduation: 110
Language: English
Number of Pages: 53
Keywords: computational reproducibility, computational resilience, code reusability, code extensibility, verifiability, distributability, code readability


Abstract:
    The nexus of big data and advanced computing power has facilitated increasingly complex software for scientific discovery across scientific fields, resulting in complex computational research. At the same time, the challenges of computational research are growing, and they are not limited to computational reproducibility. Challenges such as the complexity of the software stack and the fragility of scientific software make it difficult for reviewers to reproduce and verify research in their own computational environments, and for other scientists in the community to reuse and extend code.
    We examine how computational research, from concept formation to research publication, can improve its inherent ability to respond to disruptive challenges by moving toward computational resilience (computational reproducibility, computational replicability, operational robustness, code reusability, and code extensibility). We then propose supporting principles (openness, transparency, verifiability, distributability, and code readability) and their relevant techniques; these principles express community values that are important in themselves and describe how to build good computational research. By reviewing 36 of the 264 studies published in Management Science, those whose software was primarily written in R or Python, we find that computational resilience issues do exist in practice. Finally, we present a roadmap and design toward computational resilience for scientists, helping them evaluate the status and direction of their research artifacts, improve them, and organize them in a clean package structure.
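    To make the verifiability principle and the "clean package structure" concrete, the following minimal sketch pairs an illustrative directory layout with a small "scientific test" in Python, one of the two languages the reviewed studies primarily used. Everything here is a hypothetical illustration under assumed names (my-study/, estimate_effect, the data values, and the expected estimate are not taken from the thesis); it shows the general idea of re-running an analysis and checking a reported result, not the thesis's actual artifacts.

    # Illustrative layout for packaging research artifacts (assumed names):
    #   my-study/
    #     data/raw/              original, unmodified inputs
    #     data/derived/          outputs regenerated by the analysis code
    #     code/analysis.py       the analysis itself
    #     tests/test_results.py  scientific tests such as the one below
    #     requirements.txt       pinned dependencies, for distributability

    import math

    def estimate_effect(control, treated):
        """Toy analysis: mean difference between two samples."""
        return sum(treated) / len(treated) - sum(control) / len(control)

    def test_reproduces_reported_estimate():
        # Hypothetical stand-ins; a real test would load data/raw/ instead.
        control = [1.0, 1.2, 0.9, 1.1]
        treated = [1.5, 1.7, 1.4, 1.6]
        reported_estimate = 0.5  # illustrative value "from the paper"
        assert math.isclose(estimate_effect(control, treated),
                            reported_estimate, rel_tol=1e-9)

    if __name__ == "__main__":
        test_reproduces_reported_estimate()
        print("Recomputed estimate matches the reported value.")

    A test of this shape can run automatically whenever the package changes, which supports operational robustness: a reviewer can obtain the package, install the pinned dependencies, and confirm the reported numbers without contacting the authors.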

Table of Contents:
    Abstract
    摘要 (Abstract in Chinese)
    Table of Contents
    Chapter 1 — Introduction
    Chapter 2 — Challenges in Computational Research
        2.1 What is computational research?
        2.2 Reproducibility and Replicability Crisis in Research
        2.3 Fragility of Scientific Software — Cases of Computational Mistakes
        2.4 Layers of Complexity — Software Stack
        2.5 Needs to go beyond computational reproducibility
    Chapter 3 — Computational Resilience in Research
        3.1 Dimensions of Computational Resilience
        3.2 Research Artifacts Lifecycle in the Research Publication
        3.3 Supporting Principles for Computational Resilience
    Chapter 4 — Explore Computational Resilience in Management Science
        4.1 Data Collection — Studies from Management Science
        4.2 Issues of Computational Resilience among the Studies
    Chapter 5 — Roadmap and Design Towards Computational Resilience
        5.1 Roadmap to Computational Resilience in Research
        5.2 Packaging Research Artifacts for Computational Resilience
    Chapter 6 — Conclusion
    Chapter 7 — Limitations and Future Research
    Reference
