A Multi-category Task for Bitrate Interval Prediction with the Target Perceptual Quality

  • Yang, Zhenwei (School of Communication and Information Engineering, Shanghai University)
  • Shen, Liquan (Shanghai Institute for Advanced Communication and Data Science, Shanghai University)
  • Received : 2021.06.26
  • Accepted : 2021.11.27
  • Published : 2021.12.31

Abstract

Video service providers often have to cope with poor user network conditions when transmitting video streams, and they strive to provide users with the best possible video quality under a limited bitrate. This requires accurately determining the target bitrate range of a video for different quality requirements. Several schemes have recently been proposed to meet this requirement, but they do not take visual perception factors into account. In this paper, we propose a new multi-category model that uses machine learning to predict the target bitrate range for a given target visual quality. First, a dataset is constructed for training the multi-category models; it defines the quality score ladders and the corresponding bitrate-interval categories. Second, several types of spatial-temporal features related to the VMAF evaluation metric and to visual factors are extracted and statistically processed for classification. Finally, bitrate prediction models trained on the dataset with a random forest classifier are used to predict the target bitrate interval of an input video for the target video quality. The classification accuracy of the model reaches 0.705, and a video encoded at the bitrate predicted by the model achieves the target perceptual quality.
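The two data-preparation ideas in the abstract, mapping a bitrate to a bitrate-interval category and statistically pooling per-frame features, can be illustrated with a minimal sketch. The interval boundaries, the function names `bitrate_category` and `pool_features`, and the choice of pooling statistics are assumptions made for illustration only; the abstract does not specify the intervals or the statistics the authors actually use.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Hypothetical interval boundaries in kbps (assumption, not from the paper).
BITRATE_EDGES_KBPS = [500, 1000, 2000, 4000]


def bitrate_category(bitrate_kbps):
    """Return the index of the bitrate interval that contains bitrate_kbps.

    Category 0 covers bitrates below the first edge; the last category
    covers bitrates at or above the last edge.
    """
    return bisect_right(BITRATE_EDGES_KBPS, bitrate_kbps)


def pool_features(per_frame_values):
    """Summarise one per-frame feature sequence as fixed-length statistics.

    The statistics chosen here (mean, population std, min, max) are an
    assumed example of "statistical processing" for a classifier input.
    """
    return {
        "mean": mean(per_frame_values),
        "std": pstdev(per_frame_values),
        "min": min(per_frame_values),
        "max": max(per_frame_values),
    }


if __name__ == "__main__":
    # A video whose target-quality bitrate is 1500 kbps falls in category 2.
    print(bitrate_category(1500))
    print(pool_features([1.0, 3.0]))
```

Category labels produced this way, paired with pooled feature vectors, would form the kind of training set on which a multi-class classifier such as a random forest could be fitted.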
