Student: Chu, Hsin-Chi (朱欣祈)
Thesis Title: Slow Thinking Enables Task-Uncertain Lifelong and Sequential Few-Shot Learning (以慢思促成任務不確定之終身學習與有序小樣本學習)
Advisor: Wu, Shan-Hung (吳尚鴻)
Committee Members: Chen, Hwann-Tzong (陳煥宗); Chien, Jen-Tzung (簡仁宗); Peng, Wen-Hsiao (彭文孝)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of Publication: 2019
Academic Year of Graduation: 107
Language: English
Number of Pages: 27
Keywords: Lifelong Learning, Sequential Few-Shot Learning, Memory Augmented Neural Network, Run-time Adaptation, Reservoir Sampling
Abstract (Chinese): Lifelong machine learning focuses on adapting a model to newly arriving tasks without forgetting previously learned ones, while few-shot learning aims to learn the current task from only a very small number of examples. Both research areas play a crucial role in the generalization ability required for artificial general intelligence. However, existing studies in both fields often impose assumptions during training that depart from practical applications. In lifelong learning, the number of tasks at test time is assumed to be known, so the model size can be tailored to the test phase in advance; in few-shot learning, it is commonly assumed that a large number of training tasks is always available for the model to learn from. Humans, however, achieve both lifelong and few-shot learning without these assumptions. This thesis relaxes them and formulates two problems, task-uncertain lifelong learning and sequential few-shot learning. Inspired by how humans learn, we propose the STL (Slow Thinking to Learn) model, which exploits the interaction between past information stored in a non-parametric memory and a Slow Thinker module: at run time, the model spends extra time adapting to the stored content of the corresponding task, allowing it to solve both problems under the lifelong-learning setting. In addition, this thesis proposes a dynamic clustering and expanding scheme that consolidates the storage of the STL model, saving memory space while preserving the effect of reservoir sampling.


Abstract (English): Lifelong machine learning focuses on adapting to novel tasks without forgetting old ones, whereas few-shot learning strives to learn a single task from a small amount of data. Both research areas are crucial for artificial general intelligence; however, existing studies often assume impractical settings when training models. In lifelong learning, the nature (or quantity) of the tasks arriving at inference time is assumed to be known at training time. In few-shot learning, it is commonly assumed that a large number of tasks is available during training. Humans, on the other hand, can perform these learning tasks without either assumption. Inspired by how the human brain works, we propose a novel model, called Slow Thinking to Learn (STL), that makes sophisticated (and slightly slower) predictions by iteratively considering interactions between the current and previously seen tasks at run time. We also propose a fixed-size storage method that improves the storage efficiency of STL, keeping the required memory space constant throughout training. STL has two specialized modules, the Slow Predictor (SP) and the Fast Learners (FLs), which are responsible for lifelong and few-shot predictions, respectively, yet are designed to complement each other so that both can be trained properly without the above assumptions. Experimental results empirically demonstrate the effectiveness of STL in these more realistic lifelong and few-shot learning settings.
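The abstracts describe two mechanisms: a fixed-size external memory maintained with reservoir sampling, and a "slow thinking" step that adapts to the stored content relevant to the current input at run time before predicting. The following Python sketch only illustrates these two ideas under stated assumptions; `ReservoirMemory`, `slow_predict`, the linear readout, and all parameter names are hypothetical placeholders, not the thesis's actual STL implementation.

```python
import random
import numpy as np


class ReservoirMemory:
    """Fixed-size episodic memory maintained with reservoir sampling
    (Algorithm R, Vitter 1985): every example seen so far has an equal
    chance of residing in the buffer, while memory usage stays constant."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []           # stored (embedding, label) pairs
        self.n_seen = 0           # total number of examples offered so far
        self.rng = random.Random(seed)

    def add(self, embedding, label):
        self.n_seen += 1
        if len(self.items) < self.capacity:
            self.items.append((embedding, label))
        else:
            j = self.rng.randrange(self.n_seen)   # uniform in [0, n_seen)
            if j < self.capacity:
                self.items[j] = (embedding, label)

    def nearest(self, query, k=5):
        """Return the k stored items closest to `query` (Euclidean distance)."""
        dists = [np.linalg.norm(query - e) for e, _ in self.items]
        order = np.argsort(dists)[:k]
        return [self.items[i] for i in order]


def slow_predict(memory, query, w, lr=0.1, adapt_steps=5, k=5):
    """Illustrative 'slow thinking' step (not the thesis's SP/FL modules):
    before predicting on `query`, spend extra inference time fitting a
    linear readout `w` to the retrieved neighbours, then predict with the
    adapted weights."""
    neighbours = memory.nearest(query, k=k)
    w = w.copy()
    for _ in range(adapt_steps):
        for e, y in neighbours:
            err = w @ e - y            # gradient of 0.5 * (w·e - y)^2 w.r.t. w
            w -= lr * err * e
    return w @ query                   # prediction with the adapted readout


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mem = ReservoirMemory(capacity=50)
    true_w = rng.normal(size=8)
    for _ in range(1000):              # stream of (x, y) pairs
        x = rng.normal(size=8)
        mem.add(x, float(true_w @ x))
    x_new = rng.normal(size=8)
    print(slow_predict(mem, x_new, w=np.zeros(8)))
```

With Algorithm R, each of the n examples seen so far remains in the buffer with probability capacity/n, so the footprint stays constant while the stored sample remains unbiased; the adaptation loop mirrors the idea of trading extra inference time for specialization on the retrieved task content.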

Table of Contents:
1. Introduction
2. Slow Thinking to Learn (STL)
3. Reservoir Sampling with Dynamic Clustering and Expanding (RS-DCE)
4. Further Related Work
5. Experimental Evaluation
  5.1 Sequential Few-Shot Learning
  5.2 RS-DCE Memory for Task-Uncertain Lifelong Learning
  5.3 Inference Time
6. Conclusion
Supplementary Materials
7. Related Work
8. Technical Details
  8.1 Solving Eq. (2)
  8.2 Empirical Loss of FLs
  8.3 Training
References

Full-Text Availability: campus network access from 2024/08/24; the full text is not authorized for public release off campus or through the National Central Library (Taiwan NDLTD system).