
Author: Shi, Zhe-Wei (石哲瑋)
Thesis Title: Semantic Communication Systems Based on Deep Learning Techniques (基於深度學習技術的語義通訊系統)
Advisor: Chang, Cheng-Shang (張正尚)
Committee Members: Lee, Chia-Han (李佳翰); Lee, Duan-Shin (李端興); Ueng, Yeong-Luh (翁詠祿)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Communications Engineering (通訊工程研究所)
Publication Year: 2023
Graduation Academic Year: 111
Language: English
Pages: 52
Keywords: semantic communication, deep learning, natural language processing, communication systems, binary representation


    With the emergence of new applications such as unmanned factories, intelligent connected
    vehicles, and virtual reality, a vast amount of data is being generated, requiring
    low latency and large-scale connectivity over limited spectrum resources. However, traditional
    communication systems primarily focus on transmitting symbols or bits with high
    accuracy, which cannot meet the demands of these emerging applications for low latency,
    high capacity, and high precision communication. This has led to the development of
    intelligent communication systems that consider semantic meaning to improve communication
    accuracy and efficiency. In this thesis, we propose a semantic-based communication
    system that combines two well-known pre-trained models to enhance the noise robustness
    and generalization capability of the semantic neural network and further improve
    communication accuracy and efficiency. Additionally, we utilize feature binarization to
    reduce the dimensionality of features, thus saving transmission costs and allowing for a
    more natural form of semantic communication in practical communication environments.
    Finally, we experimentally evaluate the limitations and drawbacks of existing evaluation
    methods on real sentence transmission and propose a new evaluation method to assess the
    semantic similarity between two sentences. Our research findings contribute to accelerating
    the development of semantic communication technology, providing stable, efficient,
    and accurate communication guarantees for various emerging applications.
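The feature-binarization step described in the abstract can be illustrated with a minimal straight-through-estimator sketch in plain Python. The function names and the gradient-clipping rule here are illustrative assumptions, not the thesis's actual implementation (which would operate on neural-network tensors inside an autograd framework):

```python
def binarize(z, threshold=0.0):
    # Forward pass: map each real-valued feature to a single bit (+1 / -1),
    # so a d-dimensional feature costs d bits to transmit instead of d floats.
    return [1.0 if x >= threshold else -1.0 for x in z]

def ste_grad(upstream, z, clip=1.0):
    # Backward pass (straight-through estimator): the sign function has zero
    # gradient almost everywhere, so training treats it as the identity and
    # passes the upstream gradient through wherever |z| <= clip.
    return [g if abs(x) <= clip else 0.0 for g, x in zip(upstream, z)]

features = [0.7, -0.2, 1.5, -0.9]
bits = binarize(features)                         # [1.0, -1.0, 1.0, -1.0]
grads = ste_grad([1.0, 1.0, 1.0, 1.0], features)  # [1.0, 1.0, 0.0, 1.0]
```

The identity surrogate is what makes end-to-end training possible despite the hard quantization in the forward pass.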

Contents

List of Figures
List of Tables
1 Introduction
2 Related Work
  2.1 Semantic Communication
  2.2 Natural Language Processing
3 Problem Definition
  3.1 Semantic Communication System
  3.2 Problem Description
  3.3 Notation
4 Methodology
  4.1 Encoder-Decoder Architecture
  4.2 Pre-trained Model Methods
    4.2.1 Transformer Encoder + Pre-trained BART
    4.2.2 Pre-trained BERT + Fully Connected Neural Network
  4.3 Representation Quantization Methods
    4.3.1 Straight-Through Estimator (STE)
    4.3.2 Vector-Quantized Variational AutoEncoder (VQVAE)
  4.4 Performance Metrics
    4.4.1 BLEU
    4.4.2 Sentence Similarity
    4.4.3 ChatGPT
5 Experiments
  5.1 Datasets
  5.2 Experiment Settings
  5.3 Baselines
  5.4 Numerical Results
    5.4.1 Pre-trained Models
    5.4.2 Quantization
    5.4.3 ChatGPT Score
6 Conclusions
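The Sentence Similarity metric listed under Section 4.4.2 presumably scores how close the recovered sentence is to the transmitted one in an embedding space. A generic cosine-similarity sketch with toy vectors is shown below; in a real system the embeddings would come from a pre-trained sentence encoder, which is not shown here:

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two embedding vectors; 1.0 means the
    # recovered sentence's embedding points in the same direction as the
    # original's, i.e. the two sentences are semantically near-identical.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" of a transmitted and a recovered sentence.
sent = [0.2, 0.1, 0.9]
recovered = [0.19, 0.12, 0.88]
score = cosine_similarity(sent, recovered)  # close to 1.0
```

Unlike BLEU, which counts exact n-gram overlap, an embedding-based score can reward paraphrases that preserve meaning while using different words.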

