• Title/Summary/Keyword: Audio processing

Search Result 449, Processing Time 0.029 seconds

Real-time Audio Processing for TCP/IP in Server-Client Model (서버-클라이언트 모델에서의 TCP/IP 기반 실시간 음성 처리)

  • Lee, Hyung-ho;Jeong, Dae-young;Park, Kyung-tae;You, Byung-sek;Kim, Jeong-sig
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2013.10a
    • /
    • pp.619-621
    • /
    • 2013
  • This paper is proposing a real-time audio processing system for TCP/IP with server-client. The server sends the audio data packet which is the same size each time while playing the audio data. And the client plays the received audio data from the server. In general, The receiving speed of audio data packet is faster than processing the audio data. So, the unstable playback is occurred when playing the received audio data at the moment. In order to overcome this problem, the double buffering method is proposed.

  • PDF

A study on the extended fixed-point arithmetic computation for MPEG audio data processing (MPEG Audio 데이터 처리를 위한 확장된 고정소수점 연산처리에 관한 연구)

  • 한상원;공진흥
    • Proceedings of the IEEK Conference
    • /
    • 2000.06b
    • /
    • pp.250-253
    • /
    • 2000
  • In this paper, we Implement a new arithmetic computation for MPEG audio data to overcome the limitations of real number processing in the fixed-point arithmetics, such as: overheads in processing time and power consumption. We aims at efficiently dealing with real numbers by extending the fixed-point arithmetic manipulation for floating-point numbers in MPEG audio data, and implementing the DSP libraries to support the manipulation and computation of real numbers with the fixed-point resources.

  • PDF

A Beamforming-Based Video-Zoom Driven Audio-Zoom Algorithm for Portable Digital Imaging Devices

  • Park, Nam In;Kim, Seon Man;Kim, Hong Kook;Kim, Myeong Bo;Kim, Sang Ryong
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.2 no.1
    • /
    • pp.11-19
    • /
    • 2013
  • A video-zoom driven audio-zoom algorithm is proposed to provide audio zooming effects according to the degree of video-zoom. The proposed algorithm is designed based on a super-directive beamformer operating with a 4-channel microphone array in conjunction with a soft masking process that uses the phase differences between microphones. The audio-zoom processed signal is obtained by multiplying the audio gain derived from the video-zoom level by the masked signal. The proposed algorithm is then implemented on a portable digital imaging device with a clock speed of 600 MHz after different levels of optimization, such as algorithmic level, C-code and memory optimization. As a result, the processing time of the proposed audio-zoom algorithm occupies 14.6% or less of the clock speed of the device. The performance evaluation conducted in a semi-anechoic chamber shows that the signals from the front direction can be amplified by approximately 10 dB compared to the other directions.

  • PDF

A Single-Chip Video/Audio CODEC for Low Bit Rate Application

  • Park, Seong-Mo;Kim, Seong-Min;Kim, Ig-Kyun;Byun, Kyung-Jin;Cha, Jin-Jong;Cho, Han-Jin
    • ETRI Journal
    • /
    • v.22 no.1
    • /
    • pp.20-29
    • /
    • 2000
  • In this paper, we present a design of video and audio single chip encoder/decoder for portable multimedia application. The single-chip called as video audio signal processor (VASP) consists of a video signal processing block and an audio single processing block. This chip has mixed hardware/software architecture to combine performance and flexibility. We designed the chip by partitioning between video and audio block. The video signal processing block was designed to implement hardware solution of pixel input/output, full pixel motion estimation, half pixel motion estimation, discrete cosine transform, quantization, run length coding, host interface, and 16 bits RISC type internal controller. The audio signal processing block is implemented with software solution using a 16 bits fixed point DSP. This chip contains 142,300 gates, 22 Kbits FIFO, 107 kbits SRAM, and 556 kbits ROM, and the chip size is $9.02mm{\times}9.06mm$ which is fabricated using 0.5 micron 3-layer metal CMOS technology.

  • PDF

A Study on the Signal Processing for Content-Based Audio Genre Classification (내용기반 오디오 장르 분류를 위한 신호 처리 연구)

  • 윤원중;이강규;박규식
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.41 no.6
    • /
    • pp.271-278
    • /
    • 2004
  • In this paper, we propose a content-based audio genre classification algorithm that automatically classifies the query audio into five genres such as Classic, Hiphop, Jazz, Rock, Speech using digital sign processing approach. From the 20 seconds query audio file, the audio signal is segmented into 23ms frame with non-overlapped hamming window and 54 dimensional feature vectors, including Spectral Centroid, Rolloff, Flux, LPC, MFCC, is extracted from each query audio. For the classification algorithm, k-NN, Gaussian, GMM classifier is used. In order to choose optimum features from the 54 dimension feature vectors, SFS(Sequential Forward Selection) method is applied to draw 10 dimension optimum features and these are used for the genre classification algorithm. From the experimental result, we can verify the superior performance of the proposed method that provides near 90% success rate for the genre classification which means 10%∼20% improvements over the previous methods. For the case of actual user system environment, feature vector is extracted from the random interval of the query audio and it shows overall 80% success rate except extreme cases of beginning and ending portion of the query audio file.

Serial Transmission of Audio Signals for Multi-channel Speaker Systems (다채널 스피커 시스템을 위한 오디오 신호지 직렬 전송)

  • Kwon, Oh-Kyun;Song, Moon-Vin;Lee, Seung-Won;Lee, Young-Won;Chung, Yun-Mo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.7
    • /
    • pp.387-394
    • /
    • 2005
  • In this paper, we propose a new transmission technique of audio signals for the serial connection of the speakers of multiple-channel audio systems. Analog audio signals from a multi-channel audio system are converted into digital signals with signal processing steps and transferred to each speaker through a serial line. The signal processing steps contain data compression and packet generation in association with audio signal characteristics. Each speaker gets its corresponding digital audio signals from the transmitted packets and converts the signals into analog audio signals to make sounds with the speaker All the proposed functions in this paper are modeled in VHDL. implemented with FPGA chips, and tested for actual multi-channel audio systems.

Audio Context Recognition Using Signal's Reconstructed Phase Space (신호의 복원된 위상 공간을 이용한 오디오 상황 인지)

  • Vinh, La The;Khattak, Asad Masood;Loan, Trinh Van;Lee, Sungyoung;Lee, Young-Ko
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.11a
    • /
    • pp.243-244
    • /
    • 2009
  • So far, many researches have been conducted in the area of audio based context recognition. Nevertheless, most of them are based on existing feature extraction techniques derived from linear signal processing such as Fourier transform, wavelet transform, linear prediction... Meanwhile, environmental audio signal may potentially contains non-linear dynamic properties. Therefore, it is a big potential to utilize non-linear dynamic signal processing techniques in audio based context recognition.

Audio Signal Processing and System Design for improved intelligibility in Conference Room (회의실의 명료성(STI) 향상을 위한 오디오신호 처리 및 시스템 설계)

  • Kang, Cheolyong;Lee, Seokjoo;Jo, Kwangyeon;Lee, Seonhee
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.17 no.2
    • /
    • pp.225-232
    • /
    • 2017
  • Recently, the development of digital transmission technology of audio signals and the introduction of audio network equipment using digital transmission technology have been made. As a result, audio network technology and equipment are actively applied to the design and construction of audio systems. The meeting room is a place where a large number of participants exchange opinions and communicate with each other. In addition to using an electric acoustic device such as a microphone and a speaker, it improves the intelligibility of the conference room through an example using an audio network.

A Study on Setting the Minimum and Maximum Distances for Distance Attenuation in MPEG-I Immersive Audio

  • Lee, Yong Ju;Yoo Jae-hyoun;Jang, Daeyoung;Kang, Kyeongok;Lee, Taejin
    • Journal of Broadcast Engineering
    • /
    • v.27 no.7
    • /
    • pp.974-984
    • /
    • 2022
  • In this paper, we introduce the minimum and maximum distance setting methods used in geometric distance attenuation processing, which is one of spatial sound reproduction methods. In general, sound attenuation by distance is inversely proportional to distance, that is 1/r law, but when the relative distance between the user and the audio object is very short or long, exceptional processing might be performed by setting the minimum distance or the maximum distance. While MPEG-I Immersive Audio's RM0 uses fixed values for the minimum and maximum distances, this study proposes effective methods for setting the distances considering the signal gain of an audio object. Proposed methods were verified through simulation of the proposed methods and experiments using RM0 renderer.

A Study on the input butter for efficient processing of MPEG Audio bitstream (MPEG Audio 비트스트림의 효율적 처리를 위한 입력 버퍼에 관한 연구)

  • 임성룡;공진흥
    • Proceedings of the IEEK Conference
    • /
    • 2000.06b
    • /
    • pp.181-184
    • /
    • 2000
  • In this paper, we described a design of the input buffer system for efficiently dealing with MPEG audio bitstream to demux header and side information, audio data. In order to overcome the limitations of fixed-word manipulation in bitstream demuxing, we proposed a new variable length bit retrieval system with FSM sequencer supporting MPEG audio frame format, and serial buffer demuxing audio stream, FIFO circular buffer including header and side information.

  • PDF