Search | Korea Science

A study on speech enhancement using complex-valued spectrum employing Feature map Dependent attention gate (특징 맵 중요도 기반 어텐션을 적용한 복소 스펙트럼 기반 음성 향상에 관한 연구)

Jaehee Jung;Wooil Kim
- The Journal of the Acoustical Society of Korea / v.42 no.6 / pp.544-551 / 2023
Speech enhancement, which aims to improve the perceptual quality and intelligibility of noisy speech, has progressed from methods based on the magnitude spectrum to methods based on the complex-valued spectrum, which can improve both magnitude and phase. This paper studies how to apply an attention mechanism to complex-valued spectrum-based speech enhancement systems to further improve the intelligibility and quality of noisy speech. The attention is based on additive attention and computes the attention weights with the complex-valued spectrum taken into account. In addition, global average pooling is used to account for the importance of each feature map. Complex-valued spectrum-based speech enhancement was performed with the Deep Complex U-Net (DCUNET) model, and the additive attention followed the method proposed in the Attention U-Net model. Experiments on noisy speech in a living room environment showed that the proposed method outperforms the baseline model in terms of Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), and that the improvement is consistent across various background noise environments and low Signal-to-Noise Ratio (SNR) conditions. These results demonstrate the effectiveness of the proposed speech enhancement system in improving the intelligibility and quality of noisy speech.
https://doi.org/10.7776/ASK.2023.42.6.544

A study on skip-connection with time-frequency self-attention for improving speech enhancement based on complex-valued spectrum (복소 스펙트럼 기반 음성 향상의 성능 향상을 위한 time-frequency self-attention 기반 skip-connection 기법 연구)

Jaehee Jung;Wooil Kim
- The Journal of the Acoustical Society of Korea / v.42 no.2 / pp.94-101 / 2023
Deep neural networks composed of encoders and decoders, such as U-Net, that are used for speech enhancement concatenate encoder features to the decoder through skip-connections. Skip-connections help reconstruct the enhanced spectrum and compensate for lost information. However, the encoder and decoder features joined by a skip-connection are incompatible with each other. In this paper, for complex-valued spectrum-based speech enhancement, a Self-Attention (SA) method is applied to the skip-connection to transform the encoder features so that they are compatible with the decoder features. SA is a technique that, when generating an output sequence in a sequence-to-sequence task, uses a weighted average of the input to attend to subsets of the input; it has been shown to eliminate noise effectively when applied to speech enhancement. Three models that use encoder and decoder features to apply SA to the skip-connection are studied. In experiments on the TIMIT database, the proposed methods improve all evaluation metrics compared with the Deep Complex U-Net (DCUNET) using skip-connections only.
https://doi.org/10.7776/ASK.2023.42.2.094

CCQC modal combination rule using load-dependent Ritz vectors

Xiangxiu Li;Huating Chen
- Structural Engineering and Mechanics / v.87 no.1 / pp.57-68 / 2023
The response spectrum method is still an effective approach for the design of buildings with supplemental dampers. In practice, the complex complete quadratic combination (CCQC) rule is always used in the response spectrum method to account for the effect of non-classical damping. The conventional CCQC rule is based on exact complex mode vectors. However, some of the calculated complex mode vectors may not be excited by the external loading, and errors in the structural responses always arise from mode truncation. Load-dependent Ritz (LDR) vectors are associated with the external loading, so LDR vectors that are not excited are excluded automatically. In addition, the contributions of higher modes are implicitly contained in the LDR vectors in terms of static responses. To improve calculation efficiency and accuracy, LDR vectors are introduced into the CCQC rule in the present study. First, a procedure for generating LDR vectors suitable for non-classically damped systems is presented. Unlike conventional LDR vectors, the LDR vectors here are complex-valued and are called complex LDR (CLDR) vectors. Based on the CLDR vectors, the CCQC rule is then rederived and an improved response spectrum method is developed. Finally, the effectiveness of the proposed method is verified on three typical non-classically damped buildings. Numerical results show that, for the same number of vectors, CLDR vectors give better results than complex modes. Since generating CLDR vectors requires less computational cost and storage space, the proposed method offers an attractive alternative, especially for structures with a large number of degrees of freedom.
https://doi.org/10.12989/sem.2023.87.1.057

Identification of Defect Frequencies in Rolling Element Bearing Using Directional Spectra of Vibration Signals (구름 베어링의 결함 주파수 규명을 위한 방향 스펙트럼의 이용)

박종포;이종원
- Journal of KSNVE / v.9 no.2 / pp.393-400 / 1999
Defect frequencies of rolling element bearings are experimentally investigated utilizing the two-sided directional spectra of the complex-valued vibration signals measured from the outer ring of defective bearings. The directional spectra make it possible to discern backward and forward defect frequencies. The experimental results show that the directional zoom spectrum is superior to the conventional spectrum in identification of bearing defect frequencies, in particular the inner race defect frequencies.
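The idea of a two-sided directional spectrum can be illustrated with a small sketch. It assumes, hypothetically, that two orthogonal vibration signals `x` and `y` are combined into the complex signal `p = x + jy`; positive-frequency bins of its DFT then correspond to forward components and negative-frequency bins to backward components. The function name `directional_spectrum` and the synthetic signal are illustrative only, and the zoom processing mentioned in the abstract is not reproduced.

```python
import cmath
import math


def directional_spectrum(x, y):
    """Two-sided DFT amplitudes of the complex signal p[n] = x[n] + 1j*y[n].

    Returns (forward, backward): forward[k] is the amplitude at +k cycles
    per record, backward[k] the amplitude at -k cycles per record.
    """
    n = len(x)
    p = [complex(a, b) for a, b in zip(x, y)]

    def dft_bin(k):
        # Naive DFT of one bin, normalized by the record length.
        return sum(p[m] * cmath.exp(-2j * math.pi * k * m / n)
                   for m in range(n)) / n

    half = n // 2
    forward = [abs(dft_bin(k)) for k in range(half)]
    backward = [abs(dft_bin(-k)) for k in range(half)]
    return forward, backward


# Synthetic example: a forward circular component (amplitude 1.0, bin 5)
# plus a backward circular component (amplitude 0.5, bin 8).
N = 64
x = [math.cos(2 * math.pi * 5 * m / N) + 0.5 * math.cos(2 * math.pi * 8 * m / N)
     for m in range(N)]
y = [math.sin(2 * math.pi * 5 * m / N) - 0.5 * math.sin(2 * math.pi * 8 * m / N)
     for m in range(N)]
forward, backward = directional_spectrum(x, y)
```

A conventional one-sided spectrum of `x` alone would show peaks at both bins without indicating their direction; the two-sided spectrum of the complex signal separates them.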

A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum (효과적인 복소 스펙트럼 기반 음성 향상을 위한 시간과 주파수 영역 손실함수 조합에 관한 연구)

Jung, Jaehee;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.38-44 / 2022
Speech enhancement is performed to improve the intelligibility and quality of noise-corrupted speech. In this paper, speech enhancement performance is compared using different loss functions in the time and frequency domains. This study proposes a combination of loss functions that exploits the advantages of each domain by considering both the details of the spectrum and the speech waveform. Scale-Invariant Source-to-Noise Ratio (SI-SNR) is used as the time-domain loss function, and Mean Squared Error (MSE), calculated over the complex-valued spectrum and the magnitude spectrum, is used in the frequency domain. The phase loss is obtained using the sine function. Speech enhancement results are evaluated using Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI). To confirm the speech enhancement results, the resulting spectrograms are also compared. Experimental results on the TIMIT database show the highest performance when the SI-SNR and magnitude loss functions are combined.
https://doi.org/10.7776/ASK.2022.41.1.038
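A minimal sketch of how a time-domain SI-SNR term and a frequency-domain magnitude MSE term might be combined is given here. The SI-SNR formula is the commonly used zero-mean projection form; the weighting factor `alpha`, the small constant `eps`, and the function names are assumptions for illustration and are not taken from the paper.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length waveforms."""
    n = len(target)
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [v - mean_e for v in estimate]   # zero-mean estimate
    t = [v - mean_t for v in target]     # zero-mean target
    dot = sum(a * b for a, b in zip(e, t))
    t_energy = sum(b * b for b in t) + eps
    s_target = [dot / t_energy * b for b in t]       # projection onto target
    e_noise = [a - s for a, s in zip(e, s_target)]   # residual
    num = sum(s * s for s in s_target)
    den = sum(v * v for v in e_noise) + eps
    return 10.0 * math.log10(num / den + eps)


def magnitude_mse(est_spec, ref_spec):
    """MSE between the magnitudes of two complex spectra (flat lists)."""
    return sum((abs(a) - abs(b)) ** 2
               for a, b in zip(est_spec, ref_spec)) / len(ref_spec)


def combined_loss(est_wave, ref_wave, est_spec, ref_spec, alpha=1.0):
    """Negative SI-SNR plus alpha times magnitude MSE (alpha is hypothetical)."""
    return -si_snr(est_wave, ref_wave) + alpha * magnitude_mse(est_spec, ref_spec)
```

Because SI-SNR is a quality measure to be maximized, it enters the loss with a negative sign, while the MSE term is minimized directly.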

Directional Harmonic Wavelet Analysis (방향성 조화 웨이블렛 해석 기법)

한윤식;이종원
- Journal of KSNVE / v.8 no.5 / pp.957-963 / 1998
A new signal processing technique, the directional harmonic wavelet map (dHWM), is presented to characterize the instantaneous planar motion of a measurement point in a structure from its transient complex-valued vibration signal. It is proven that the directional auto-HWM essentially tracks the shape and directivity of the instantaneous planar motion, whereas the phase of the directional cross-HWM indicates its inclination angle. Finally, the technique is successfully applied to an automobile engine to characterize its transient motion during crank-on/idling/engine-off.

Well-Defined series and parallel D-spectra for preparation for linear time-varying systems (선형 시변 시스템에 대한 잘 정의된 (well-defined) 직렬 및 병렬 D-스펙트럼)

Zhu, j.jim;Lee, Ho-Cheol;Choe, Jae-Won
- Journal of Institute of Control, Robotics and Systems / v.5 no.5 / pp.521-528 / 1999
The nth-order, scalar, linear time-varying (LTV) systems can be treated as operators on a differential ring. Using this differential algebraic structure and a classical result on differential operator factorizations developed by Floquet, novel eigenstructure (eigenvalues, eigenvectors) concepts for linear time-varying systems are proposed. In this paper, necessary and sufficient conditions for the existence of well-defined (free of finite-time singularities) SD- and PD-spectra for SPDOs with complex- and real-valued coefficients are also presented. Three numerical examples are presented to illustrate the proposed concepts.

FPGA Implementation of Unitary MUSIC Algorithm for DoA Estimation (도래방향 추정을 위한 유니터리 MUSIC 알고리즘의 FPGA 구현)

Ju, Woo-Yong;Lee, Kyoung-Sun;Jeong, Bong-Sik
- Journal of the Institute of Convergence Signal Processing / v.11 no.1 / pp.41-46 / 2010
In this paper, a DoA (Direction of Arrival) estimator using the unitary MUSIC algorithm is studied. The complex-valued correlation matrix of the MUSIC algorithm is transformed into a real-valued one using a unitary transform for easier implementation. The eigenvalues and eigenvectors are obtained by the combined Jacobi-CORDIC algorithm. The CORDIC algorithm can be implemented with only ADD and SHIFT operations, and the MUSIC spectrum is computed by a 256-point DFT algorithm. The results of the unitary MUSIC algorithm designed with System Generator for FPGA implementation are entirely consistent with the Matlab results. Its performance is evaluated through hardware co-simulation and resource estimation.
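The step of turning a complex-valued correlation matrix into a real-valued one can be sketched as follows. This uses the standard textbook construction (forward-backward averaging followed by a sparse unitary matrix `Q`) and assumes an even number of sensors; it is not claimed to be the exact transform or fixed-point arithmetic used in the paper's FPGA design, and all function names and the sample data are illustrative.

```python
import math


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def conj_transpose(a):
    """Hermitian (conjugate) transpose of a nested-list matrix."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def unitary_q(m):
    """Q = (1/sqrt(2)) [[I, jI], [J, -jJ]] for an even sensor count m."""
    n = m // 2
    s = 1 / math.sqrt(2)
    q = [[0j] * m for _ in range(m)]
    for i in range(n):
        q[i][i] = s                      # upper-left block: I
        q[i][n + i] = 1j * s             # upper-right block: jI
        q[m - 1 - i][i] = s              # lower-left block: J (exchange)
        q[m - 1 - i][n + i] = -1j * s    # lower-right block: -jJ
    return q


def sample_covariance(snapshots):
    """Average of x x^H over a list of complex snapshot vectors."""
    m = len(snapshots[0])
    return [[sum(x[i] * x[j].conjugate() for x in snapshots) / len(snapshots)
             for j in range(m)] for i in range(m)]


def forward_backward(r):
    """(R + J conj(R) J) / 2, which is centro-Hermitian."""
    m = len(r)
    return [[(r[i][j] + r[m - 1 - i][m - 1 - j].conjugate()) / 2
             for j in range(m)] for i in range(m)]


def real_covariance(r):
    """Q^H R_fb Q; the imaginary parts vanish up to rounding error."""
    q = unitary_q(len(r))
    return matmul(matmul(conj_transpose(q), forward_backward(r)), q)


# Two arbitrary complex snapshots from a hypothetical 4-element array.
snapshots = [[1 + 2j, 0.5 - 1j, -0.3 + 0.7j, 2 - 0.5j],
             [0.2 - 0.4j, 1.5 + 0.1j, -1 - 1j, 0.6 + 0.9j]]
c = real_covariance(sample_covariance(snapshots))
max_imag = max(abs(v.imag) for row in c for v in row)
```

Once the matrix is real and symmetric, the eigendecomposition needs only real arithmetic, which is the simplification the abstract refers to.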

A NOTE ON ∗-PARANORMAL OPERATORS AND RELATED CLASSES OF OPERATORS

Tanahashi, Kotoro;Uchiyama, Atsushi
- Bulletin of the Korean Mathematical Society / v.51 no.2 / pp.357-371 / 2014
We shall show that the Riesz idempotent $E_{\lambda}$ of every *-paranormal operator $T$ on a complex Hilbert space $\mathcal{H}$ with respect to each isolated point ${\lambda}$ of its spectrum ${\sigma}(T)$ is self-adjoint and satisfies $E_{\lambda}\mathcal{H}=\ker(T-{\lambda})=\ker(T-{\lambda})^*$. Moreover, Weyl's theorem holds for *-paranormal operators and, more generally, for operators $T$ satisfying the norm condition $\|Tx\|^n{\leq}\|T^nx\|\,\|x\|^{n-1}$ for all $x{\in}\mathcal{H}$. Finally, for this more general class of operators we find a sufficient condition such that $E_{\lambda}\mathcal{H}=\ker(T-{\lambda})=\ker(T-{\lambda})^*$ holds.
https://doi.org/10.4134/BKMS.2014.51.2.357
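For reference, the Riesz idempotent named in the abstract is usually defined by a contour integral of the resolvent. The disc $D_{\lambda}$ is notation introduced here for illustration and does not appear in the abstract:

$$E_{\lambda}=\frac{1}{2\pi i}\int_{\partial D_{\lambda}}(z-T)^{-1}\,dz,$$

where $D_{\lambda}$ is a closed disc centered at ${\lambda}$ with $D_{\lambda}\cap{\sigma}(T)=\{{\lambda}\}$. In general $E_{\lambda}$ is a bounded idempotent that need not be self-adjoint, which is why self-adjointness is a result worth proving for a given operator class.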

Implementation of a Real-time Multipath Fading Channel Simulator Using a Hybrid DSP-FPGA Architecture (DSP-FPGA 구조를 갖는 다중경로 페이딩 채널 시뮬레이터 구현)

이주현;이찬길
- The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.17-23 / 2004
The mobile radio channel can be simulated as a complex-valued random process with a narrow-band spectrum. This paper describes a real-time implementation of that process using a TMS320C6414 digital signal processor and an XC2VP30 Virtex FPGA. The simulator presented here comprehensively models not only flat fading but also frequency-selective fading mobile channel conditions. To replicate the statistical characteristics of the multipath fading environment with minimum computational burden, multi-rate techniques are employed to resolve practical problems such as a variable sampling rate. The simulator produces accurate and consistent results owing to its digital implementation. It is very flexible and, with a graphical user interface, simple to program for various field conditions in mobile communications.
