
Author: Shi, Zhe-Wei (石哲瑋)
Thesis Title: Semantic Communication Systems Based on Deep Learning Techniques (基於深度學習技術的語義通訊系統)
Advisor: Chang, Cheng-Shang (張正尚)
Committee Members: Lee, Chia-Han (李佳翰); Lee, Duan-Shin (李端興); Ueng, Yeong-Luh (翁詠祿)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Communications Engineering (通訊工程研究所)
Publication Year: 2023
Graduation Academic Year: 111
Language: English
Pages: 52
Keywords: semantic communication, deep learning, natural language processing, communication systems, binary representation


    With the emergence of new applications such as unmanned factories, intelligent connected
    vehicles, and virtual reality, a vast amount of data is being generated, requiring
    low latency and large-scale connectivity over limited spectrum resources. However, traditional
    communication systems primarily focus on transmitting symbols or bits with high
    accuracy, which cannot meet the demands of these emerging applications for low latency,
    high capacity, and high precision communication. This has led to the development of
    intelligent communication systems that consider semantic meaning to improve communication
    accuracy and efficiency. In this thesis, we propose a semantic-based communication
    system that combines two well-known pre-trained models to enhance the noise robustness
    and generalization capability of the semantic neural network and further improve
    communication accuracy and efficiency. Additionally, we utilize feature binarization to
    reduce the dimensionality of features, thus saving transmission costs and allowing for a
    more natural form of semantic communication in practical communication environments.
    Finally, we experimentally evaluate the limitations and drawbacks of existing evaluation
    methods on real sentence transmission and propose a new evaluation method to assess the
    semantic similarity between two sentences. Our research findings contribute to accelerating
    the development of semantic communication technology, providing stable, efficient,
    and accurate communication guarantees for various emerging applications.
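The feature-binarization step described in the abstract can be illustrated with a minimal straight-through-estimator sketch in plain Python. The function names and the gradient-clipping rule here are illustrative assumptions, not the thesis's actual implementation (which would operate on neural-network tensors inside an autograd framework):

```python
def binarize(z, threshold=0.0):
    # Forward pass: map each real-valued feature to a single bit (+1 / -1),
    # so a d-dimensional feature costs d bits to transmit instead of d floats.
    return [1.0 if x >= threshold else -1.0 for x in z]

def ste_grad(upstream, z, clip=1.0):
    # Backward pass (straight-through estimator): the sign function has zero
    # gradient almost everywhere, so training treats it as the identity and
    # passes the upstream gradient through wherever |z| <= clip.
    return [g if abs(x) <= clip else 0.0 for g, x in zip(upstream, z)]

features = [0.7, -0.2, 1.5, -0.9]
bits = binarize(features)                         # [1.0, -1.0, 1.0, -1.0]
grads = ste_grad([1.0, 1.0, 1.0, 1.0], features)  # [1.0, 1.0, 0.0, 1.0]
```

The identity surrogate is what makes end-to-end training possible despite the hard quantization in the forward pass.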

Contents

List of Figures
List of Tables
1 Introduction
2 Related Work
  2.1 Semantic Communication
  2.2 Natural Language Processing
3 Problem Definition
  3.1 Semantic Communication System
  3.2 Problem Description
  3.3 Notation
4 Methodology
  4.1 Encoder-Decoder Architecture
  4.2 Pre-trained Model Methods
    4.2.1 Transformer Encoder + Pre-trained BART
    4.2.2 Pre-trained BERT + Fully Connected Neural Network
  4.3 Representation Quantization Methods
    4.3.1 Straight-Through Estimator (STE)
    4.3.2 Vector-Quantized Variational AutoEncoder (VQVAE)
  4.4 Performance Metrics
    4.4.1 BLEU
    4.4.2 Sentence Similarity
    4.4.3 ChatGPT
5 Experiments
  5.1 Datasets
  5.2 Experiment Settings
  5.3 Baselines
  5.4 Numerical Results
    5.4.1 Pre-trained Models
    5.4.2 Quantization
    5.4.3 ChatGPT Score
6 Conclusions
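The Sentence Similarity metric listed under Section 4.4.2 presumably scores how close the recovered sentence is to the transmitted one in an embedding space. A generic cosine-similarity sketch with toy vectors is shown below; in a real system the embeddings would come from a pre-trained sentence encoder, which is not shown here:

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two embedding vectors; 1.0 means the
    # recovered sentence's embedding points in the same direction as the
    # original's, i.e. the two sentences are semantically near-identical.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" of a transmitted and a recovered sentence.
sent = [0.2, 0.1, 0.9]
recovered = [0.19, 0.12, 0.88]
score = cosine_similarity(sent, recovered)  # close to 1.0
```

Unlike BLEU, which counts exact n-gram overlap, an embedding-based score can reward paraphrases that preserve meaning while using different words.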

