
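Graduate student: 楊凱超
Yang, Kai-Chao
Thesis title: Efficient and Reliable Video Streaming based on Frame Dependency Design
基於畫面相依性的高效率與高可靠度視訊串流設計
Advisor: 王家祥
Wang, Jia-Shung
Oral defense committee:
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2009
Academic year of graduation: 97
Language: English
Number of pages: 86
Keywords (Chinese): 視訊串流, VCR, 容錯編碼, GOP, H.264/AVC
Keywords (English): Video streaming, VCR, Error resilience, GOP, H.264/AVC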
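  • In a video streaming service, users request different videos from the server, and the server delivers them using streaming techniques such as chaining or broadcasting. The server side generally aims to reduce server load and bandwidth requirements so that it can serve more users. For users, full VCR functionality and stable picture quality are the main requirements. However, user requirements often increase the load on the server and the backbone network. For example, when a user performs a VCR operation such as fast forward or jump, the transmission rate must increase to keep up with the playback speed requested by the decoder. The required transmission rate is usually proportional to the playback speed, which places a heavy load on the server and the network. Moreover, as the network load grows, the proportion of lost or delayed packets also rises. When a packet does not arrive in time, the decoder must either wait for it or discard it, which results in delayed frames or degraded quality.
    We investigate how video coding techniques can satisfy users' VCR requests and correctly compensate for lost packets without severely reducing coding efficiency or increasing network load. Based on the motion-compensation dependencies between frames, we re-encode the video with modified frame dependencies so that it efficiently supports VCR operations or error compensation. With the proposed coding structure, VCR operations add almost no load to the network or the server, so the number of users the server can serve simultaneously increases substantially. In addition, we develop another coding structure in which, when a frame is damaged by packet loss or delay, the error does not propagate to adjacent frames. Because the preceding and following frames can still be decoded correctly, more information is available to recover the damaged frame, which minimizes the impact on the viewer.
    The proposed methods are particularly suited to low-bandwidth network environments, such as mobile phones and other handheld devices. Experiments show that, while adding as little network bandwidth as possible, we can efficiently support VCR functionality or recover from errors that have occurred, and also reduce the server load.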


    In a video streaming system, the server sends requested videos to end users by unicasting, broadcasting, chaining, or other streaming techniques. The server's goal is to reduce server load and bandwidth requirements so that it can serve more users. On the client side, smooth quality, low delay, and full VCR functionality are likely the most important requirements. However, satisfying both the user requirements and the server's goal involves a trade-off, because these user requirements usually impose a heavy load on the server and the network. For example, the server has to send a stream at three times the usual transmission rate when a fast-scan operation at speed factor three is requested. According to our simulation, the transmission rate is normally proportional to the requested speed factor. When the network load becomes heavy, packet loss becomes more serious, leading to delay or unstable quality.
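    The proportionality between speed factor and transmission rate can be illustrated with a minimal sketch. It assumes a simplified conventional GOP in which the first frame is an I-frame and every later frame is a P-frame that references its immediate predecessor (no B-frames), so decoding a requested frame requires every earlier frame in the same GOP. The function names and parameters are hypothetical and are not taken from the thesis.

```python
def frames_to_send(num_frames, gop_size, speed):
    """Frames the server must transmit for a fast scan at `speed`.

    Assumption: each GOP is an I-frame followed by P-frames that each
    reference the previous frame, so frame k needs every frame from the
    start of its GOP up to k.
    """
    needed = set()
    for k in range(0, num_frames, speed):          # requested frames
        gop_start = (k // gop_size) * gop_size     # I-frame of k's GOP
        needed.update(range(gop_start, k + 1))     # whole decoding chain
    return sorted(needed)


def rate_multiplier(num_frames, gop_size, speed):
    """Transmitted frames per displayed frame (1.0 means normal play)."""
    requested = len(range(0, num_frames, speed))
    return len(frames_to_send(num_frames, gop_size, speed)) / requested


if __name__ == "__main__":
    for s in (1, 2, 3, 4):
        print(s, rate_multiplier(600, 15, s))
```

    Under these assumptions the multiplier grows roughly in step with the speed factor, falling slightly short of it only because frames after the last requested frame of each GOP need not be sent.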
    In this thesis, several novel video coding structures are proposed to support VCR functionality and error resilience with little loss of coding efficiency. Based on the temporal dependency between frames in a GOP, we restructure the coding dependencies in the GOP to support efficient access to frames and to stop error propagation. With one of the proposed structures, the server does not have to increase the transmission rate of frames to meet the requested playback rate. The number of frames to be transmitted is also significantly reduced, and thus the network load can be decreased. With the other proposed structure, error propagation is significantly reduced: errors in one frame do not propagate to neighboring frames, so the damaged frame can be recovered from its neighbors.
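    To see why restructuring dependencies limits error propagation, the sketch below counts how many frames are affected when one frame of a 15-frame GOP is damaged. It compares a conventional chain, where each frame references its predecessor, with a generic complete binary tree in heap order. The tree layout is an illustrative assumption only; it is not the double-binary tree or extended GOP structure proposed in the thesis, whose details are not given here, and all function names are hypothetical.

```python
def chain(n):
    """Parent list for a chain: frame i references frame i - 1."""
    return [None] + list(range(n - 1))


def binary_tree(n):
    """Parent list for a complete binary tree in heap order (assumed layout)."""
    return [None] + [(i - 1) // 2 for i in range(1, n)]


def influenced(parent, damaged):
    """Number of frames whose reference chain passes through `damaged`."""
    count = 0
    for frame in range(len(parent)):
        node = frame
        # Walk up the references until we hit the damaged frame or the root.
        while node is not None and node != damaged:
            node = parent[node]
        if node == damaged:
            count += 1
    return count


def average_influenced(parent):
    """Mean number of affected frames over all single-frame losses."""
    n = len(parent)
    return sum(influenced(parent, d) for d in range(n)) / n


if __name__ == "__main__":
    print("chain:", average_influenced(chain(15)))
    print("tree: ", average_influenced(binary_tree(15)))
```

    In this toy model a loss affects only the damaged frame's subtree, so shortening the reference chains reduces the average damage. The thesis makes the stronger claim that neighboring frames remain decodable, which this sketch does not attempt to reproduce.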

    Table of Contents
    Acknowledgements
    Chinese Abstract
    Abstract
    Table of Contents
    List of Figures and Tables
    Chapter 1 Introduction
    Chapter 2 Background
        2.1 History of Video Coding Standards
        2.2 VCR Functionality on Streaming Video
            2.2.1 GOP-Skipping-Based Dynamic Transmission Scheme
            2.2.2 Dual Bit-streams with Least-Cost Frame Selection
            2.2.3 VCR Functionality in Staggered Broadcasting
            2.2.4 The Split and Merge (SAM) Protocol
            2.2.5 Single-Rate Multicast Double-Rate Unicast (SRMDRU) Scheme
        2.3 Video Error Resilience
            2.3.1 Interleaved Packetization
            2.3.2 Independent Segment Prediction
            2.3.3 Reference Picture Selection Based on Feedback Information (RPS)
        2.4 Hierarchical Prediction Structures
    Chapter 3 Design of Frame Dependency for VCR Streaming Videos
        3.1 Model of VCR Functionality
            3.1.1 The Playback Sequence
            3.1.2 Optimality of Frame Dependency for VCR Functionality
            3.1.3 Feasibility Model of VCR Functionality
        3.2 Restructuring GOP for VCR Functionality
        3.3 Theoretical Analyses
            3.3.1 Analyses of Redundancy in the Conventional GOP Structure
            3.3.2 Analyses of Redundancy in the Proposed GOP Structure
            3.3.3 Suboptimal GOP Structure
        3.4 Experimental Results
    Chapter 4 Design of Frame Dependency for Error Resilience
        4.1 A Simple Example: Binary Tree Structure
        4.2 Effective Temporal Dependencies: Double-Binary Tree Structure
            4.2.1 Transmission of Double-Binary Tree GOP
            4.2.2 Error Propagation Prevention
            4.2.3 Error Concealment
        4.3 Effective Temporal Dependencies: Extended GOP
            4.3.1 Without Feedback Control: Transmitting All Dependencies
            4.3.2 With Feedback Control: Transmitting Regular Dependencies
        4.4 Experimental Results
            4.4.1 Error Propagation
            4.4.2 Error Concealment Simulations
            4.4.3 Rate-Distortion Comparisons without Transmission Errors
    Chapter 5 Future Works and Discussions
        5.1 Discussions of the Proposed GOP Structures
        5.2 Future Works
            5.2.1 Inclusion of hierarchical B-pictures
            5.2.2 Bit-allocation of the proposed GOP structures
            5.2.3 Scalable video coding
    References

    List of Figures and Tables
    Figure 1-1 A typical streaming video system.
    Figure 1-2 Spatial and temporal error propagation.
    Figure 2-1 An example of the conventional GOP structure.
    Figure 2-2 Support of VCR functionality based on GOP-skipping. The server sends “Dark” GOPs.
    Figure 2-3 Support of VCR functionality based on dual bit-streams (GOP size = 14).
    Figure 2-4 Staggered Broadcasting.
    Figure 2-5 Support of VCR functionality in staggered broadcasting.
    Figure 2-6 An example of interleaved packetization.
    Figure 2-7 An example of RPS in H.263.
    Figure 2-8 Examples of hierarchical prediction structures. (a) hierarchical B-pictures, (b) non-dyadic hierarchical prediction structure, (c) hierarchical prediction structure with zero delay.
    Figure 3-1 The requested (gray) frames in different GOPs are different. The speed factor is set to 4 in this example.
    Figure 3-2 An example of self-referring sequence.
    Figure 3-3 The proposed algorithm for structuring optimal frame dependency.
    Figure 3-4 The illustration of optimal frame dependency.
    Figure 3-5 Comparisons of frame rates using the conventional GOP and the proposed GOP by performing fast-scan operations at the first GOP.
    Figure 3-6 The redundancy caused by performing fast-scan operations at different speed factors using proposed GOP with (a) different reference distances and (b) different GOP size.
    Figure 3-7 Estimated number of frames to be sent for decoding a frame using the proposed method by different speed factors.
    Figure 3-8 Comparisons of Average PSNR in a GOP for sequence “mobile” in CIF format.
    Figure 3-9 Comparison of visual quality using “mobile” at frame 11. (a) The proposed GOP. PSNR = 27.02 dB. (b) The conventional GOP. PSNR = 26.18 dB.
    Figure 3-10 Comparisons of Average PSNR without DCT and Quantization in a GOP for sequence “football” in SIF format.
    Figure 3-11 Average bits needed to decode one requested frame at different speed factors.
    Figure 3-12 The influence of reference distance on RD in the greedy tree GOP, where N = 16 and QP = 28. In foreman sequence, the diamonds from left to right are dref = 1, 2, 3, 4, 5, and 6, respectively. In salesman sequence, the squares from left to right are dref = 2, 3, 1, 4, 5, and 6, respectively.
    Figure 3-13 RD performance of greedy tree GOP structure (dref = 4) and conventional GOP structure in (a) Salesman and (b) Foreman.
    Figure 3-14 The number of frames to be decoded (solid curves) and the number of bits to be received at the client side (dot curves).
    Figure 4-1 The binary tree structure; size of GOP = 15.
    Figure 4-2 Double-binary tree structure: coding dependencies in a GOP.
    Figure 4-3 The diagram of interpolation scheme of the double-binary tree.
    Figure 4-4 Extended GOP structure: coding dependencies in a GOP.
    Figure 4-5 The diagram of interpolation scheme of the extended GOP.
    Figure 4-6 Simulation of block-loss in one frame.
    Figure 4-7 Simulation of channel loss for “Foreman” and “Stefan” in CIF size.
    Figure 4-8 Comparisons of Average PSNR and bit-rate in a GOP for sequence “Foreman” in CIF format.
    Figure 4-9 RD graph with different QP values.
    Table 3-1 Theoretical analyses between different GOP structures.
    Table 3-2 Comparisons of transmitted frames with different GOP structures by performing fast-scan operations in the first GOP.
    Table 3-3 Average number of redundant frames using different GOP size and reference distance in proposed GOP structure.
    Table 3-4 Average number of redundant frames in different GOP structures.
    Table 3-5 Average of PSNR and bit-rate comparisons between the conventional GOP and the proposed GOP for 90 frames.
    Table 4-1 The number of influenced frames when a frame in the GOP is damaged.
    Table 4-2 Average PSNR and bit-rate comparison between the conventional GOP structure and the double-binary tree structure for 150 frames.
    Table 4-3 Average PSNR comparison between the conventional GOP structure and the double-binary tree structure for 150 frames (with only motion compensation).


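    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)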
