Graduate Student: 王彥翔 Wang, Yen-Hsiang
Thesis Title: 超越再現性:邁向科學研究中的計算韌性 Beyond Reproducibility: Toward Computational Resilience in Scientific Research
Advisor: 雷松亞 Ray, Soumya
Committee Members: 楊立威 Yang, Lee-Wei; 王尼克 Danks, Nicholas Patrick
Degree: Master
Department: College of Technology Management, Institute of Service Science
Publication Year: 2022
Graduation Academic Year: 110
Language: English
Pages: 53
Keywords: computational reproducibility, computational resilience, code reusability, code extensibility, verifiability, distributability, code readability

Abstract:
The nexus of big data and advanced computing power has enabled increasingly complex software for scientific discovery across fields, resulting in complex computational research. At the same time, the challenges of computational research are growing, and they are not limited to computational reproducibility. Challenges such as the complexity of software stacks and the fragility of scientific software make it difficult for reviewers to reproduce and verify research in their own computational environments, and for other scientists in the community to reuse and extend code.
We examine how computational research, from concept formation to research publication, can improve its inherent ability to respond to disruptive challenges by moving toward computational resilience (computational reproducibility, computational replicability, operational robustness, code reusability, and code extensibility). We then propose supporting principles (openness, transparency, verifiability, distributability, and code readability), along with relevant techniques, to articulate community values that are important in themselves and that describe how to build good computational research. By reviewing 36 of the 264 studies published in Management Science, whose scientific software was written primarily in R or Python, we also find that computational resilience issues do exist in practice. Finally, we present a roadmap and design toward computational resilience for scientists, helping them evaluate the status and direction of their research artifacts, improve them, and organize them in a clean package structure.
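As one concrete illustration of the verifiability principle named above, consider the following minimal Python sketch. It is a hypothetical example, not code from the thesis; the function name run_analysis and the output file result_with_environment.json are invented for illustration. It pairs a deterministic toy analysis with a machine-readable record of the computational environment, so a reviewer re-running the script can compare both the numeric result and the environment that produced it.

# Illustrative sketch only: records the computational environment beside the
# result so that an independent re-run can be checked against both.
import json
import platform
import random
import sys

def run_analysis(seed: int = 42) -> float:
    """Toy deterministic 'analysis': a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(1000)) / 1000

if __name__ == "__main__":
    result = run_analysis()
    record = {
        "result": result,
        "seed": 42,
        "python": sys.version,
        "platform": platform.platform(),
    }
    # A machine-readable environment record lets reviewers verify a re-run.
    with open("result_with_environment.json", "w") as f:
        json.dump(record, f, indent=2)
    print(json.dumps(record, indent=2))

In the same spirit, pinning dependencies (for example, with a requirements.txt file or an renv lockfile) and shipping such records with the published artifact are common techniques supporting the distributability principle.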