Implementation of Interactive Media Content Production Framework based on Gesture Recognition

Koh, You-jin; Kim, Tae-Won; Kim, Yong-Goo; Choi, Yoo-Joo

doi:10.5909/JBE.2020.25.4.545

방송공학회논문지 (Journal of Broadcast Engineering), Vol. 25, No. 4, pp. 545-559, 2020, 1226-7953 (pISSN), 2287-9137 (eISSN)

한국방송∙미디어공학회 (The Korean Institute of Broadcast and Media Engineers)


Koh, You-jin (Department of Newmedia, Seoul Media Institute of Technology) ;
Kim, Tae-Won (Department of Newmedia, Seoul Media Institute of Technology) ;
Kim, Yong-Goo (Department of Newmedia, Seoul Media Institute of Technology) ;
Choi, Yoo-Joo (Department of Newmedia, Seoul Media Institute of Technology)

Received: 2020.06.17
Reviewed: 2020.06.23
Published: 2020.07.30

https://doi.org/10.5909/JBE.2020.25.4.545


Abstract

This paper proposes a content creation framework that lets users without programming experience easily create interactive media content that responds to user gestures. Users assign numbers to the gestures they want to use and to the media effects that respond to them, and link the two in a text-based configuration file. The framework is connected to a dynamic projection mapping module that tracks the user's position, so the media effects triggered by a gesture are projected onto the user. To reduce the processing time and memory load of gesture recognition, the user's movement is represented as a grayscale motion history image (MHI), and a convolutional neural network (CNN) model that takes these images as input was designed for gesture recognition. The number of layers and the hyperparameters of the CNN were determined through experiments on recognizing five gestures, and the resulting model was applied to the framework. The gesture recognition experiment yielded a recognition accuracy of 97.96% and a processing speed of 12.04 FPS. In an experiment linking the gestures to three particle effects, the intended media effect was displayed in real time in response to the user's movement.
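The two ideas in the abstract that are easiest to make concrete are the grayscale motion history image and the number-based link between gestures and effects. The following is a minimal sketch of both, not the authors' implementation. It assumes the classic frame-differencing formulation of a motion history image (moving pixels are set to a maximum value `tau`, all others decay by one per frame) and a hypothetical `gesture = effect` line format for the configuration file. The function names, the `tau` and `diff_threshold` parameters, and the file syntax are all illustrative assumptions; the abstract does not specify any of them.

```python
import numpy as np


def update_mhi(mhi, prev_frame, frame, tau=15, diff_threshold=30):
    """Return the next motion history image (float32, values in [0, tau]).

    Assumed formulation: classic frame-differencing MHI, not the paper's exact method.
    """
    # A pixel counts as moving when its grayscale intensity changed by more
    # than diff_threshold between consecutive frames.
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    moving = diff > diff_threshold
    # Moving pixels are reset to tau; all other pixels decay by one step, floored at 0.
    decayed = np.maximum(mhi - 1.0, 0.0)
    return np.where(moving, np.float32(tau), decayed).astype(np.float32)


def mhi_to_gray(mhi, tau=15):
    """Scale an MHI to an 8-bit grayscale image (recent motion is brightest)."""
    return np.round(mhi * (255.0 / tau)).astype(np.uint8)


def parse_gesture_effect_map(text):
    """Parse hypothetical 'gesture_number = effect_number' lines into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        gesture, effect = line.split("=")
        mapping[int(gesture)] = int(effect)
    return mapping
```

In a sketch like this, a single-channel 8-bit image summarizes several frames of motion, so a recognizer only needs one small 2D input per prediction instead of a stack of video frames, which is consistent with the abstract's stated goal of lowering processing time and memory load. The recognized gesture number would then be looked up in the parsed mapping to select the effect to trigger.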
