User-created multi-view video generation with portable camera in mobile environment

A Proposal for User-Created Multi-View Video Using a Portable Camera in a Mobile Environment

  • 성보경 (Department of Global Media, Soongsil University)
  • 박준형 (Department of Global Media, Soongsil University)
  • 여지혜 (Department of Global Media, Soongsil University)
  • 고일주 (Department of Global Media, Soongsil University)
  • Published: 2012.03.30

Abstract

Recently, the production and consumption of user-created video have been increasing rapidly. Among such videos, recordings of a single subject captured from multiple viewpoints within a limited space are beginning to appear, driven mainly by the popularization of portable cameras and the mobile web environment. Multi-view has traditionally been studied in visual-representation techniques concerned with point of view; recently its definition has been broadened and applied to various kinds of content authoring. Turning user-created videos into multi-view content can be seen as a proposal for a new user experience in video consumption. In this paper, we show, through an analysis of existing multi-view video content, that user-created videos can be made into multi-view video content even though their attributes differ. To clarify the definition and attributes of multi-view, we classify and analyze existing multi-view content. To solve the time-axis alignment problem that arises in multi-view processing, we propose an audio-matching method composed of feature extraction and comparison: features are extracted using MFCC, the most widely used audio feature, and the clips are compared n-by-n. The result is multi-view video content in which the aligned user-created videos can be consumed according to the user's selection.
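The paper does not include code; the following is a minimal NumPy sketch of the n-by-n comparison step described above, assuming MFCC feature matrices (coefficients × frames) have already been extracted for each clip with an audio library. The function names `best_offset` and `align_all`, the lag search range, and the distance measure (mean frame-wise Euclidean distance) are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def frame_distance(fa, fb):
    # Mean Euclidean distance between corresponding feature frames
    # (columns of the coefficients-by-frames MFCC matrices).
    return np.mean(np.linalg.norm(fa - fb, axis=0))

def best_offset(feat_a, feat_b, max_lag, min_overlap=10):
    """Return the lag k (in frames) such that frame 0 of feat_b
    corresponds to frame k of feat_a, chosen to minimise the mean
    frame-wise distance over the overlapping region."""
    best_k, best_d = 0, np.inf
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            n = min(feat_a.shape[1] - k, feat_b.shape[1])
            if n < min_overlap:
                continue
            d = frame_distance(feat_a[:, k:k + n], feat_b[:, :n])
        else:
            n = min(feat_b.shape[1] + k, feat_a.shape[1])
            if n < min_overlap:
                continue
            d = frame_distance(feat_a[:, :n], feat_b[:, -k:-k + n])
        if d < best_d:
            best_k, best_d = k, d
    return best_k

def align_all(features, max_lag):
    # n-by-n comparison: time offset of every clip relative to every
    # other clip, giving a common time axis for the multi-view content.
    n = len(features)
    offsets = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j:
                offsets[i, j] = best_offset(features[i], features[j], max_lag)
    return offsets
```

With clips of the same event, the recovered pairwise offsets place every user-created video on a shared time axis, after which playback can switch between viewpoints at the user's selection.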
