NLP Data Collection: New Methods and New Issues | National Tsing Hua University Theses and Dissertations Repository


Graduate student:	Shmueli, Boaz (徐博彥)
Thesis title:	NLP Data Collection: New Methods and New Issues (自然語言處理的數據收集：新方法和新挑戰)
Advisors:	Ku, Lun-Wei (古倫維); Soumya, Ray (雷松亞)
Oral defense committee:	Chen, Hsin-Hsi (陳信希); Li, Cheng-Te (李政德); Chen, Yi-Shin (陳宜欣); Shen, Chih-Ya (沈之涯)
Degree:	Doctor
Department:	College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication:	2022
Academic year of graduation:	110
Language:	English
Number of pages:	83
Keywords (translated from Chinese):	natural language processing, computational linguistics, affective computing, sarcasm detection, emotion recognition, emotion detection, reaction GIF, crowdsourcing ethics, GIF
Keywords (English):	natural language processing, computational linguistics, NLP, sentiment analysis, affective computing, sarcasm detection, emotion recognition, emotion detection, GIF, reaction GIF, crowdsourcing ethics, AI ethics


Natural Language Processing (NLP) – computer systems that “understand” and “generate” text – has seen tremendous progress in recent years, mostly as a result of advances in machine learning. NLP applications, such as machine translation and automated personal assistants (e.g., Siri), have become ubiquitous in modern life. Many of the machine learning algorithms powering such applications require large, high-quality datasets for training. Our work focuses on new methods and new issues related to the collection and labeling of such datasets.

We propose new methods for the automatic collection of data for affective computing, the study and development of systems that can process, classify, and synthesize mental states (emotions, feelings, moods). We first present a method for collecting sarcasm data; such data is needed to build sarcasm detectors, which recognize sarcastic intent (and sarcasm perception) in human communication. Our method is based on the careful analysis of users' text-based reactions and interactions on social media, and offers unique advantages over existing methods for collecting sarcasm data. One important advantage is the ability to automatically collect both intended and perceived sarcasm. Another is that the labeling is context- and culture-aware, which helps ensure a high-quality dataset.
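As a rough illustration of how a text-based reaction could yield a sarcasm label, the sketch below scans a reply for a cue phrase and uses the grammatical person of its subject to separate intended from perceived sarcasm. The cue pattern, the function name `label_from_reaction`, and the person-based rule are assumptions made for this example; the abstract does not specify the actual patterns used in the thesis.

```python
import re
from typing import Optional

# Hypothetical cue pattern (an assumption, not taken from the thesis):
# a reply stating that someone "was being sarcastic".
CUE = re.compile(
    r"\b(?P<subject>i|you|he|she|they)\s+(?:was|were)\s+(?:just\s+)?being\s+sarcastic\b",
    re.IGNORECASE,
)


def label_from_reaction(reply_text: str) -> Optional[str]:
    """Return 'intended', 'perceived', or None if the reply contains no cue."""
    match = CUE.search(reply_text)
    if match is None:
        return None
    # First person: the author comments on their own earlier message.
    # Second or third person: an observer reports how they read someone else.
    subject = match.group("subject").lower()
    return "intended" if subject == "i" else "perceived"


if __name__ == "__main__":
    print(label_from_reaction("Relax, I was just being sarcastic"))
    print(label_from_reaction("I think she was being sarcastic"))
    print(label_from_reaction("Nice weather today"))
```

In a full pipeline, the label would be attached to the earlier message that the cue refers to, not to the reply itself; that lookup is omitted here.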

Moving from sarcasm to emotions, we present a novel method for collecting texts and labeling them with the reactions they induce. We highlight the distinction between induced emotions and perceived emotions, a distinction mostly missing from the NLP literature. We find that most existing datasets are labeled with perceived emotions. Datasets with induced emotions are of utmost importance but are more difficult to collect, and our method fills this gap. The method is based on the novel use of reaction GIFs – the short, mute animations used ubiquitously in social media as reactions to texts. By carefully analyzing online interactions on social networks, we are able to capture texts and their induced reactions. In addition, we show how the labels in the dataset can be augmented with induced sentiment and induced emotions. The method can capture data from various platforms that use reaction GIFs, and can be applied to different downstream tasks, including multi-modal emotion detection and emotion recognition in dialogues. We used the new methods to collect a large sarcasm dataset and a large reaction dataset, both of which are available to the research community. Along with our methods, they open up new directions for research and applications in affective computing.
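The pairing of a text with its induced reaction can be sketched as follows: every reply that carries a reaction GIF labels the text it replies to, and the reaction category is then mapped to a coarser sentiment. The `Tweet` fields, the category names, and the `REACTION_SENTIMENT` mapping are hypothetical placeholders chosen for this example, not the schema or label set of the thesis's dataset.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical mapping from reaction category to induced sentiment (an assumption).
REACTION_SENTIMENT: Dict[str, str] = {
    "applause": "positive",
    "hug": "positive",
    "eye_roll": "negative",
    "facepalm": "negative",
}


@dataclass
class Tweet:
    tweet_id: str
    text: str
    reply_to: Optional[str] = None      # id of the tweet being replied to
    gif_category: Optional[str] = None  # reaction category of an attached GIF


def collect_pairs(tweets: List[Tweet]) -> List[dict]:
    """Pair each text with the reaction category of a GIF reply to it."""
    by_id = {t.tweet_id: t for t in tweets}
    samples = []
    for reply in tweets:
        # Only replies that carry a reaction GIF provide a label.
        if reply.reply_to is None or reply.gif_category is None:
            continue
        parent = by_id.get(reply.reply_to)
        if parent is None:
            continue  # the original text was not collected
        samples.append({
            "text": parent.text,
            "reaction": reply.gif_category,
            "sentiment": REACTION_SENTIMENT.get(reply.gif_category),
        })
    return samples


if __name__ == "__main__":
    demo = [
        Tweet("1", "I just got the job!"),
        Tweet("2", "", reply_to="1", gif_category="applause"),
    ]
    print(collect_pairs(demo))
```

Because the label comes from the person reacting rather than from a third-party annotator, a sample of this kind reflects an induced response rather than a perceived one.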

Finally, we turn our attention to new issues related to the manual collection of NLP data, which is often done using crowdsourcing platforms such as Amazon Mechanical Turk. We explore ethical issues pertaining to the employment of crowdworkers for the collection and annotation of NLP datasets. We find that NLP crowdsourcing work is growing exponentially, yet most existing ethical research on it is limited in scope, focusing on labor-related issues such as compensation and working conditions. We find that the Final Rule, the common framework on which ethics committees (e.g., IRBs) are based, is not suited for online data collection platforms. We highlight various harms and risks related to the NLP tasks performed by crowdworkers, and debunk a few myths related to the IRB process. This has vast implications for both researchers and workers. As part of this work, we study the current employment of IRBs in NLP research. An important question answered in this research is: “Are crowdworkers human subjects?” The research also identifies common scenarios in which crowdworkers performing NLP tasks are at risk of harm, including psychological harm such as addiction. This contribution fills an important gap in the NLP ethics literature and serves to reopen the discussion regarding the ethical employment of crowdworkers. Our work can serve as a framework for researchers designing and reviewing crowdsourced work for NLP and related machine learning domains.

Introduction
1 What is NLP?
2 Affective Computing
3 Machine Learning
4 Supervised Learning and Text Annotation
5 Thesis Outline

Reactive Supervision: A New Method For Collecting Sarcasm Data
1 Introduction
2 Reactive Supervision
2.1 Method
2.2 Advantages
3 SPIRS Dataset
4 Experiments and Analysis
4.1 Sarcasm Detection
4.2 Detection with Conversation Context
4.3 Perspective Classification
5 Conclusion

Happy Dance, Slow Clap: Using Reaction GIFs to Predict Induced Affect on Twitter
1 Introduction
2 Automatic Supervision using GIFs
2.1 The Method
2.2 Category Clustering
3 ReactionGIF Dataset
4 Baselines
5 Discussion
6 Conclusion

Beyond Fair Pay: Ethical Implications of NLP Crowdsourcing
1 Revisiting the Ethics of Crowdsourcing
2 The Rise and Rise of NLP Crowdsourcing
2.1 Categorizing Crowdsourced Tasks
2.2 Surveys and Gamification
3 The Rules and Institutions of Research Ethics
3.1 The Genesis of Modern Research Ethics
3.2 The Belmont Principles and IRBs
3.3 Are IRBs Universal?
4 Do NLP Tasks Constitute Research Involving Human Subjects?
4.1 Are Crowdsourcing Tasks Research?
4.2 Are Crowdworkers Human Subjects?
5 Dispelling IRB Misconceptions
5.1 Researchers Cannot Exempt Themselves from an IRB Application
5.2 Worker IDs Constitute IPI
5.3 Obtaining Anonymous Data Does Not Automatically Absolve from IRB Review
5.4 Payment to Crowdworkers Does Not Exempt Researchers from IRB Review
5.5 Non-Published Research Also Requires IRB Review
6 Risks and Harms for Crowdworkers
6.1 Inducing Psychological Harms
6.2 Exposing Sensitive Information of Workers
6.3 Unwittingly Including Vulnerable Populations
6.4 Breaching Anonymity and Privacy
6.5 Triggering Addictive Behaviour
7 Ways Forward

Conclusion
1 Contributions
2 Limitations
3 Future Directions

Bibliography

A Reactive Supervision
A.1 Search Pattern Production
A.2 Author Sequence Distribution
A.3 Tweet Position Distribution

B ReactionGIF
B.1 Dataset Samples
                                

Gavin Abercrombie, Valerio Basile, Sara Tonelli, Verena Rieser, and Alexandra Uma, editors. Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022. European Language Resources Association, Marseille, France, June 2022. ISBN 979-10-95546-98-6. URL https://aclanthology.org/2022.nlperspectives-1.

Amazon Mechanical Turk. Amazon Mechanical Turk Pricing. https://web.archive.org/web/20201126061244/https://requester.mturk.com/pricing,
2020. Accessed: 2020-11-26.

American Association for Public Opinion Research. IRB FAQs for Survey Researchers.
https://www.aapor.org/Standards-Ethics/Institutional-Review-Boards/

IRB-FAQs-for-Survey-Researchers.aspx,
2020. Accessed: 2020-11-11.

Saeideh Bakhshi, David A. Shamma, Lyndon Kennedy, Yale Song, Paloma de Juan, and Joseph ’Jofish’ Kaye. Fast, cheap, and good: Why animated GIFs engage us. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, CHI ’16, page 575–586, New York, NY, USA, 2016. Association for Computing Machinery. ISBN 9781450333627. doi: 10.1145/2858036.2858532. URL https://doi.org/10.1145/2858036.2858532.

David Bamman and Noah A Smith. Contextualized Sarcasm Detection on Twitter. In Ninth International AAAI Conference on Web and Social Media, 2015. URL https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10538/10445.

Luca Belli, Sofia Ira Ktena, Alykhan Tejani, Alexandre Lung-Yut-Fong, Frank Port-man, Xiao Zhu, Yuanpu Xie, Akshay Gupta, Michael M. Bronstein, Amra Delic, Gabriele Sottocornola, Vito Walter Anelli, Nazareno Andrade, Jessie Smith, and Wenzhe Shi. Privacy-preserving recommender systems challenge on Twitter’s home timeline. CoRR, abs/2004.13715, 2020. URL https://arxiv.org/abs/2004.13715.

Emily M. Bender, Dirk Hovy, and Alexandra Schofield. Integrating ethics into the NLP curriculum. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts, pages 6–9, Online, July 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-tutorials.2. URL https://www.aclweb.org/anthology/2020.acl-tutorials.2.

Adrian Benton, Glen Coppersmith, and Mark Dredze. Ethical research protocols for social media health research. In Proceedings of the First ACL Workshop on Ethics in Natural Language Processing, pages 94–102, 2017.

Christopher M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag, Berlin, Heidelberg, 2006. ISBN 0387310738.

Laura-Ana-Maria Bostan and Roman Klinger. An analysis of annotated corpora for emotion classification in text. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2104–2119, Santa Fe, New Mexico, USA, August 2018. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/C18-1179.

Allan M. Brandt. Racism and research: The case of the Tuskegee syphilis study. The Hastings Center Report, 8(6):21–29, 1978. ISSN 00930334, 1552146X. doi: 10.2307/ 3561468. URL www.jstor.org/stable/3561468.

Sven Buechel and Udo Hahn. Readers vs. writers vs. texts: Coping with different perspectives of text understanding in emotion annotation. In Proceedings of the 11th Linguistic Annotation Workshop, pages 1–12, Valencia, Spain, April 2017. Association for Computational Linguistics. doi: 10.18653/v1/W17-0801. URL https://www.
aclweb.org/anthology/W17-0801.

Chris Callison-Burch and Mark Dredze. Creating speech and language data with Amazon’s Mechanical Turk. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk, pages 1–12, Los Angeles, June 2010. Association for Computational Linguistics. URL https://aclanthology.org/W10-0701.

Rafael A. Calvo and Sunghwan Mac Kim. Emotions in text: Dimensional and categorical models. Computational Intelligence, 29(3):527–543, 2013. doi: https: //doi.org/10.1111/j.1467-8640.2012.00456.x. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-8640.2012.00456.x.

Arthur L. Caplan. When evil intrudes. The Hastings Center Report, 22(6):29–32, 1992. ISSN 00930334, 1552146X. doi: 10.2307/3562946. URL www.jstor.org/stable/3562946.

Alexander M. Capron. Legal and regulatory standards of informed consent in research. In Ezekiel J. Emanuel, Christine Grady, Robert A. Crouch, Reidar K. Lie, Franklin G. Miller, and David Wendler, editors, The Oxford Textbook of Clinical Research Ethics, pages 613–632. Oxford University Press, New York, NY, 2008.

Carnegie-Mellon University, Language Technologies Institute. Human subjects research determination worksheet. https://web.archive.org/web/20201111132614/http://demo.clab.cs.cmu.edu/11737fa20/slides/multilingual-21-annotation.pdf,
2020. Accessed: 2020-11-11.

Committee on Publication Ethics. https://publicationethics.org, 2020. Accessed: 2020-11-22.

Council for International Organizations of Medical Sciences. International Ethical Guidelines for Biomedical Research Involving Human Subjects. https://cioms.ch/wp-content/uploads/2016/08/International_Ethical_Guidelines_for_Biomedical_Research_Involving_Human_Subjects.pdf, 2002. Accessed: 2021-04-06.

Florian Daniel, Pavel Kucherbaev, Cinzia Cappiello, Boualem Benatallah, and Mohammad Allahbakhsh. Quality control in crowdsourcing: A survey of quality attributes, assessment techniques, and assurance actions. ACM Comput. Surv., 51(1), January 2018. ISSN 0360-0300. doi: 10.1145/3148148. URL https://doi.org/10.1145/3148148.

Dmitry Davidov, Oren Tsur, and Ari Rappoport. Semi-supervised Recognition of Sarcastic Sentences in Twitter and Amazon. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning, CoNLL ’10, pages 107–116, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics. ISBN 978-1-932432-83-1. URL http://dl.acm.org/citation.cfm?id=1870568.1870582. event-place: Uppsala, Sweden.

Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan Cowen, Gaurav Ne-made, and Sujith Ravi. GoEmotions: A dataset of fine-grained emotions. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4040–4054, Online, July 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-main.372. URL https://www.aclweb.org/anthology/2020.acl-main.372.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pretraining of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. doi: 10.18653/v1/N19-1423. URL https://www.aclweb.org/anthology/N19-1423.

Tao Ding and Shimei Pan. Personalized emphasis framing for persuasive message generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1432–1441, Austin, Texas, November 2016. Association for Computational Linguistics. doi: 10.18653/v1/D16-1150. URL https://www.aclweb.org/anthology/D16-1150.

Paul Ekman. An argument for basic emotions. Cognition and Emotion, 6(3-4): 169–200, 1992. doi: 10.1080/02699939208411068. URL https://doi.org/10.1080/02699939208411068.

Elena Filatova. Irony and Sarcasm: Corpus Generation and Analysis Using Crowd-sourcing. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), pages 392–398, Istanbul, Turkey, May 2012. European Language Resources Association (ELRA). URL http://www.lrec-conf.org/proceedings/lrec2012/pdf/661_Paper.pdf.

Final Rule. Protection of Human Subjects, Part 46. In US Department of Health and Human Services, Code of Federal Regulations (CFR) Title 45, 2018.

Karën Fort, Gilles Adda, and K Bretonnel Cohen. Amazon mechanical turk: Gold mine or coal mine? Computational Linguistics, 37(2):413–420, 2011.
Alf Gabrielsson. Emotion perceived and emotion felt: Same or different? Musicae Scientiae, 5(Special Issue: Current Trends in the Study of Music and Emotion): 123–147, 2001. doi: 10.1177/10298649020050S105. URL https://doi.org/10.1177/10298649020050S105.

Mingkun Gao, Wei Xu, and Chris Callison-Burch. Cost optimization in crowdsourcing translation: Low cost translations made even cheaper. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 705–713, Denver, Colorado, May– June 2015. Association for Computational Linguistics. doi: 10.3115/v1/N15-1072. URL https://www.aclweb.org/anthology/N15-1072.

Raymond W Gibbs. On the psycholinguistics of sarcasm. Journal of Experimental Psychology: General, 115(1):3, 1986. URL https://psycnet.apa.org/doiLanding?doi=10.1037%2F0096-3445.115.1.3.

Richard Gillespie. Research on human subjects: An historical overview. Bioethics News, 8(2):s4–s15, 1989. ISSN 1836-6716. doi: 10.1007/BF03351158.

Igor Gontcharov. Qualitative ethics in a positivist frame: The Canadian experience 1998–2014. In The SAGE Handbook of Qualitative Research Ethics, pages 231–246. SAGE Publications Ltd, 2018. doi: 10.4135/9781526435446.n16.

Roberto González-Ibáñez, Smaranda Muresan, and Nina Wacholder. Identifying Sarcasm in Twitter: A Closer Look. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers -Volume 2, HLT ’11, pages 581–586, Stroudsburg, PA, USA, 2011. Association for Computational Linguistics. ISBN 978-1-932432-88-6. URL http://dl.acm.org/citation.cfm?id=2002736.2002850.
event-place: Portland, Oregon.

Mark A Graber and Abraham Graber. Internet-based crowdsourcing and research ethics: the case for IRB review. Journal of medical ethics, 39(2):115–118, 2013.

Mary L. Gray and Siddharth Suri. Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Houghton Mifflin Harcourt, Boston, MA, 2019.

Xiaochuang Han and Yulia Tsvetkov. Fortifying toxic speech detectors against veiled toxicity. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 7732–7739, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.emnlp-main.622. URL https://www.aclweb.org/anthology/2020.emnlp-main.622.

Marti A. Hearst. Automatic acquisition of hyponyms from large text corpora. In COLING 1992 Volume 2: The 15th International Conference on Computational Linguistics, 1992. URL https://www.aclweb.org/anthology/C92-2082.

Dirk Hovy and Shannon L. Spruit. The social impact of natural language processing. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 591–598, Berlin, Germany, August 2016. Association for Computational Linguistics. doi: 10.18653/v1/P16-2096. URL https://www.aclweb.org/anthology/P16-2096.

Chao-Chun Hsu and Lun-Wei Ku. SocialNLP 2018 EmotionX challenge overview: Recognizing emotions in dialogues. In Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media, pages 27–31, Melbourne, Australia, July 2018. Association for Computational Linguistics. doi: 10.18653/v1/W18-3505. URL https://www.aclweb.org/anthology/W18-3505.

Ursula Huws. Online labour exchanges or ’crowdsourcing’: Implications for occupational safety and health. European Agency for Safety and Health at Work (EU-OSHA), 2015.

Nancy Ide and James Pustejovsky. Handbook of Linguistic Annotation. Springer Publishing Company, Incorporated, 1st edition, 2017. ISBN 9402408797.

Mark Israel. Research Ethics and Integrity for Social Scientists: Beyond Regulatory Compliance. SAGE Publications, Thousand Oaks, CA, 2nd edition, 2015.

Andrew Conway Ivy. The history and ethics of the use of human subjects in medical experiments. Science, 108(2792):1–5, 1948. ISSN 00368075, 10959203. doi: 10.1126/ science.108.2792.1.

Molly Jackman and Lauri Kanerva. Evolving the IRB: Building robust review for industry research. Wash. & Lee L. Rev. Online, 72:445, 2015.

Albert R. Jonsen. The ethics of research with human subjects: A short history. In Albert R. Jonsen,

Robert M. Veatch, and LeRoy Walters, editors, Source Book in Bioethics: A Documentary History, pages 3–10. Georgetown University Press, Washington, DC, 1998. ISBN 9780878406852.

Albert R. Jonsen. On the origins and future of the Belmont report. In James F. Childress, Eric M.

Meslin, and Harold T. Shapiro, editors, Belmont Revisited: Ethical Principles for Research with Human Subjects, pages 3–11. Georgetown University Press, Washington, DC, 2005. ISBN 978-1589010628.

Aditya Joshi, Pushpak Bhattacharyya, Mark Carman, Jaya Saraswati, and Rajita Shukla. How Do Cultural Differences Impact the Quality of Sarcasm Annotation?: A Case Study of Indian Annotators and American Text. In Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, pages 95–99, Berlin, Germany, 2016. Association for Computational Linguistics. doi: 10.18653/v1/W16-2111. URL http://aclweb.org/anthology/W16-2111.

Brendan Jou, Subhabrata Bhattacharya, and Shih-Fu Chang. Predicting viewer perceived emotions in animated GIFs. In Proceedings of the 22nd ACM International Conference on Multimedia, MM ’14, page 213–216, New York, NY, USA, 2014. Association for Computing Machinery. ISBN 9781450330633. doi: 10.1145/2647868.2656408. URL https://doi.org/10.1145/2647868.2656408.

Daniel Jurafsky and James H. Martin. Speech and Language Processing (2nd Edition). Prentice-Hall, Inc., USA, 2009. ISBN 0131873210.

Robert L. Klitzman. The Ethics Police?: The Struggle to Make Human Research Safe. Oxford University Press, Oxford, 2015. ISBN 978-0199364602.

A Kumaran, Melissa Densmore, and Shaishav Kumar. Online gaming for crowd-sourcing phrase-equivalents. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 1238–1247, 2014.

Matthew Lease, Jessica Hullman, Jeffrey Bigham, Michael Bernstein, Juho Kim, Walter Lasecki, Saeideh Bakhshi, Tanushree Mitra, and Robert Miller. Mechanical Turk is not anonymous. Available at SSRN 2228728, 2013.

Susan E. Lederer. Subjected to Science: Human Experimentation in America Before the Second World War. Johns Hopkins University Press, Baltimore, MD, 1995.

Jochen L. Leidner and Vassilis Plachouras. Ethical by design: Ethics best practices for natural language processing. In Proceedings of the First ACL Workshop on Ethics in Natural Language Processing, pages 30–40, Valencia, Spain, April 2017. Association for Computational Linguistics. doi: 10.18653/v1/W17-1604. URL https://www.aclweb.org/anthology/W17-1604.

Y. Li, Y. Song, L. Cao, J. Tetreault, L. Goldberg, A. Jaimes, and J. Luo. TGIF: A new dataset and benchmark on animated GIF description. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4641–4650, Los Alamitos, CA, USA, jun 2016. IEEE Computer Society. doi: 10.1109/CVPR.2016.502. URL https://doi.ieeecomputersociety.org/10.1109/CVPR.2016.502.

Christine Liebrecht, Florian Kunneman, and Antal van den Bosch. The perfect solution for detecting sarcasm in tweets #not. In Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 29–37, Atlanta, Georgia, June 2013. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/W13-1605.

Bing Liu. Sentiment analysis : mining opinions, sentiments, and emotions. 2015. ISBN 9781107017894.

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach, 2019.

Winter Mason and Siddharth Suri. Conducting behavioral research on Amazon’s Mechanical Turk. Behavior Research Methods, 44(1):1–23, June 2011. doi: 10.3758/ s13428-011-0124-6. URL https://doi.org/10.3758/s13428-011-0124-6.

Albert Mehrabian. Pleasure-arousal-dominance: A general framework for describing and measuring individual differences in temperament. Current Psychology, 14(4):261–292, 1996. URL https://doi.org/10.1007/BF02686918.

Michelle N. Meyer. There oughta be a law: When does(n’t) the U.S. Common Rule apply? The Journal of Law, Medicine & Ethics, 48(1_suppl):60–73, 2020. doi: 10.1177/1073110520917030. URL https://doi.org/10.1177/1073110520917030. PMID: 32342740.

Mike Mintz, Steven Bills, Rion Snow, and Daniel Jurafsky. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 1003–1011, Suntec, Singapore, August 2009. Association for Computational Linguistics. URL https://aclanthology.org/P09-1113.

NAACL. Ethics FAQ. https://web.archive.org/web/20201129202126/https://2021.naacl.org/ethics/faq/, 2020. Accessed: 2020-11-29.

Nikita Nangia, Clara Vania, Rasika Bhalerao, and Samuel R. Bowman. CrowSpairs: A challenge dataset for measuring social biases in masked language models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1953–1967, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.emnlp-main.154. URL https://www.aclweb.org/anthology/2020.emnlp-main.154.

National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont report: ethical principles and guidelines for the protection of human subjects of research. The Commission, Bethesda, MD, 1978.

Vlad Niculae and Cristian Danescu-Niculescu-Mizil. Conversational markers of constructive discussions. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 568–578, San Diego, California, June 2016. Association for Computational Linguistics. doi: 10.18653/v1/N16-1070. URL https://www.aclweb.org/anthology/N16-1070.

Haruna Ogawa, Hitoshi Nishikawa, Takenobu Tokunaga, and Hikaru Yokono. Gamification platform for collecting task-oriented dialogue data. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 7084–7093, Marseille, France, May 2020. European Language Resources Association. ISBN 979-10-95546-34-4. URL https://www.aclweb.org/anthology/2020.lrec-1.876.

Emily Öhman, Kaisla Kajava, Jörg Tiedemann, and Timo Honkela. Creating a dataset for multilingual fine-grained emotion-detection using gamification-based annotation. In Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 24–30, Brussels, Belgium, October 2018. Association for Computational Linguistics. doi: 10.18653/v1/W18-6205. URL https://www.aclweb.org/anthology/W18-6205.

Silviu Oprea and Walid Magdy. Exploring author context for detecting intended vs perceived sarcasm. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2854–2859, Florence, Italy, July 2019. Association for Computational Linguistics. doi: 10.18653/v1/P19-1275. URL https://www.aclweb.org/anthology/P19-1275.

Silviu Oprea and Walid Magdy. iSarcasm: A Dataset of Intended Sarcasm. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2020. URL https://www.aclweb.org/anthology/2020.acl-main.118.pdf.

Michael Owen. Ethical review of social and behavioral science research. In Elliott C. Kulakowski and Lynne U. Chronister, editors, Research Administration and Management, pages 543–556. Jones and Bartlett Publishers, Sudbury, MA, 2006.

Carla Parra Escartín, Wessel Reijers, Teresa Lynn, Joss Moorkens, Andy Way, and Chao-Hong Liu. Ethical considerations in NLP shared tasks. In Proceedings of the First ACL Workshop on Ethics in Natural Language Processing, pages 66–73, Valencia, Spain, April 2017. Association for Computational Linguistics. doi: 10.18653/v1/ W17-1608. URL https://www.aclweb.org/anthology/W17-1608.

Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, Doha, Qatar, October 2014. Association for Computational Linguistics. doi: 10.3115/v1/D14-1162. URL https://www.aclweb.org/anthology/D14-1162.

Verónica Pérez-Rosas and Rada Mihalcea. Experiments in open domain deception detection. In Proceedings of the 2015 Conference on Empirical Methods in Natural
Language Processing, pages 1120–1125, Lisbon, Portugal, September 2015. Association for Computational Linguistics. doi: 10.18653/v1/D15-1133. URL https://www.aclweb.org/anthology/D15-1133.

Rosalind W. Picard. Affective Computing. MIT Press, Cambridge, MA, 1997. ISBN 978-0-262-16170-1.
Chris Pool and Malvina Nissim. Distant supervision for emotion detection using Facebook reactions. In Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES), pages 30–39, Osaka, Japan, December 2016. The COLING 2016 Organizing Committee. URL https://www.aclweb.org/anthology/W16-4304.

Daniel Preotiuc-Pietro, H. Andrew Schwartz, Gregory Park, Johannes Eichstaedt, Margaret Kern, Lyle Ungar, and Elisabeth Shulman. Modelling valence and arousal in Facebook posts. In Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 9–15, San Diego, California, June 2016. Association for Computational Linguistics. doi: 10.18653/v1/W16-0404. URL https://www.aclweb.org/anthology/W16-0404.

Tomáš Ptácek, Ivan Habernal, and Jun Hong. Sarcasm Detection on Czech and English Twitter. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 213–223, Dublin, Ireland, August 2014. Dublin City University and Association for Computational Linguistics. URL https://www.aclweb.org/anthology/C14-1022.pdf.

James Pustejovsky and Amber Stubbs. Natural Language Annotation for Machine Learning: A guide to corpus-building for applications. O’Reilly Media, Inc., 2012.

David B Resnik. The ethics of research with human subjects: Protecting people, advancing science, promoting trust, volume 74. Springer, Cham, 2018.

Ellen Riloff, Ashequl Qadir, Prafulla Surve, Lalindra De Silva, Nathan Gilbert, and Ruihong Huang. Sarcasm as Contrast between a Positive Sentiment and Negative Situation. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 704–714, Seattle, Washington, USA, October 2013. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/D13-1066.

David J. Rothman. Ethics and human experimentation. New England Journal of Medicine, 317(19):1195–1199, 1987. ISSN 0028-4793. doi: 10.1056/NEJM198711053171906.

Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, and Yejin Choi. Social bias frames: Reasoning about social and power implications of language. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5477–5490, Online, July 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-main.486. URL https://www.aclweb.org/anthology/2020.acl-main.486.

Zachary M Schrag. Ethical imperialism: Institutional review boards and the social sciences, 1965–2009. The Johns Hopkins University Press, Baltimore, MD, 2010. ISBN 0801899141.

Cansu Sen, Thomas Hartvigsen, Biao Yin, Xiangnan Kong, and Elke Rundensteiner. Human attention maps for text classification: Do humans and neural networks focus on the same words? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4596–4608, Online, July 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-main.419. URL https://www.aclweb.org/anthology/2020.acl-main.419.

Armin Seyeditabari, Narges Tabari, and Wlodek Zadrozny. Emotion detection in text: a review. arXiv preprint arXiv:1806.00674, 2018.

Adil E. Shamoo and David B. Resnik. Responsible Conduct of Research. Oxford University Press, New York, NY, 2nd edition, 2009. ISBN 9780199376025.

Ashish Sharma, Adam Miner, David Atkins, and Tim Althoff. A computational approach to understanding empathy expressed in text-based mental health support. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5263–5276, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.emnlp-main.425. URL https://www.aclweb.org/anthology/2020.emnlp-main.425.

Boaz Shmueli, Lun-Wei Ku, and Soumya Ray. Reactive supervision: A new method for collecting sarcasm data. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2553–2559, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.emnlp-main.201. URL https://aclanthology.org/2020.emnlp-main.201.

Boaz Shmueli, Jan Fell, Soumya Ray, and Lun-Wei Ku. Beyond fair pay: Ethical implications of NLP crowdsourcing. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3758–3769, Online, June 2021a. Association for Computational Linguistics. doi: 10.18653/v1/2021.naacl-main.295. URL https://aclanthology.org/2021.naacl-main.295.

Boaz Shmueli, Soumya Ray, and Lun-Wei Ku. Happy dance, slow clap: Using reaction GIFs to predict induced affect on Twitter. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 395–401, Online, August 2021b. Association for Computational Linguistics. URL https://aclanthology.org/2021.acl-short.50.

Miriah Steiger, Timir J Bharucha, Sukrit Venkatagiri, Martin J Riedl, and Matthew Lease. The psychological well-being of content moderators. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, CHI ’21, New York, NY, USA, 2021. Association for Computing Machinery. doi: 10.1145/3411764.3445092. URL https://doi.org/10.1145/3411764.3445092.

Carlo Strapparava and Rada Mihalcea. Learning to identify emotions in text. In Proceedings of the 2008 ACM Symposium on Applied Computing, SAC ’08, pages 1556–1560, New York, NY, USA, 2008. Association for Computing Machinery. ISBN 9781595937537. doi: 10.1145/1363686.1364052. URL https://doi.org/10.1145/1363686.1364052.

Stephanie Strassel, David Graff, Nii Martey, and Christopher Cieri. Quality control in large annotation projects involving multiple judges: The case of the TDT corpora. In Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), Athens, Greece, May 2000. European Language Resources Association (ELRA). URL http://www.lrec-conf.org/proceedings/lrec2000/pdf/212.pdf.

Reid Swanson, Stephanie Lukin, Luke Eisenberg, Thomas Corcoran, and Marilyn Walker. Getting reliable annotations for sarcasm in online dialogues. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pages 4250–4257, Reykjavik, Iceland, May 2014. European Language Resources Association (ELRA). URL http://www.lrec-conf.org/proceedings/lrec2014/pdf/1063_Paper.pdf.

Yi Tay, Anh Tuan Luu, Siu Cheung Hui, and Jian Su. Reasoning with Sarcasm by Reading In-Between. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1010–1020, Melbourne, Australia, July 2018. Association for Computational Linguistics. doi: 10.18653/v1/P18-1093. URL https://www.aclweb.org/anthology/P18-1093.

Leimin Tian, Michal Muszynski, Catherine Lai, Johanna D. Moore, Theodoros Kostoulas, Patrizia Lombardo, Thierry Pun, and Guillaume Chanel. Recognizing induced emotions of movie audiences: Are induced and perceived emotions the same? In 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII), pages 28–35, 2017. doi: 10.1109/ACII.2017.8273575.

Garreth W. Tigwell and David R. Flatla. Oh that’s what you meant! Reducing emoji misunderstanding. In Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services Adjunct, MobileHCI ’16, pages 859–866, New York, NY, USA, 2016. Association for Computing Machinery. ISBN 9781450344135. doi: 10.1145/2957265.2961844. URL https://doi.org/10.1145/2957265.2961844.

Jackson Tolins and Patrawat Samermit. GIFs as embodied enactments in text-mediated conversation. Research on Language and Social Interaction, 49(2):75–91, 2016. doi: 10.1080/08351813.2016.1164391. URL https://doi.org/10.1080/08351813.2016.1164391.

Twitter. Developer Agreement and Policy. https://developer.twitter.com/developer-terms/agreement-and-policy, 2020. (Accessed on 02/01/2021).

UNESCO. Universal Declaration on Bioethics and Human Rights. https://unesdoc.unesco.org/ark:/48223/pf0000146180, 2006. Accessed: 2021-04-06.

University of Michigan, Research Ethics and Compliance. Class assignments & IRB approval. https://web.archive.org/web/20210325081932/https://research-compliance.umich.edu/human-subjects/human-research-protection-program-hrpp/hrpp-policies/class-assignments-irb-approval, 2021. Accessed: 2021-03-25.

University of Virginia, Human Research Protection Program. Subject compensation. https://web.archive.org/web/20200611193644/https://research.virginia.edu/irb-hsr/subject-compensation, 2020. Accessed: 2020-11-19.

University of Washington, Office of Research. Human subjects research determination worksheet. https://web.archive.org/web/20201111071801/https://www.washington.edu/research/wp/wp-content/uploads/WORKSHEET_Human_Subjects_Research_Determination_v2.30_2020.11.02.pdf, 2020. Accessed: 2020-11-11.

University of Winthrop, The Office of Grants and Sponsored Research Development. Research v/s course assignment. https://web.archive.org/web/20210325082350/https://www.winthrop.edu/uploadedFiles/grants/Research-vs-Course-Assignment.pdf, 2021. Accessed: 2021-03-25.

Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, and Jason Weston. Learning to speak and act in a fantasy text adventure game. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 673–683, Hong Kong, China, November 2019. Association for Computational Linguistics. doi: 10.18653/v1/D19-1062. URL https://www.aclweb.org/anthology/D19-1062.

Chao Yang, Shimei Pan, Jalal Mahmud, Huahai Yang, and Padmini Srinivasan. Using personal traits for brand preference prediction. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 86–96, Lisbon, Portugal, September 2015. Association for Computational Linguistics. doi: 10.18653/v1/D15-1009. URL https://www.aclweb.org/anthology/D15-1009.
