• Title/Summary/Keyword: Sub-Feature

The extension of the largest generalized-eigenvalue based distance metric $D_{ij}(\gamma_1)$ in arbitrary feature spaces to classify composite data points

  • Daoud, Mosaab
    • Genomics & Informatics / v.17 no.4 / pp.39.1-39.20 / 2019
  • Analyzing patterns in data points embedded in linear and non-linear feature spaces is a common research problem across areas such as data mining, machine learning, pattern recognition, and multivariate analysis. In this paper, data points are heterogeneous sets of biosequences (composite data points). A composite data point is a set of ordinary data points (e.g., a set of feature vectors). We theoretically extend the derivation of the largest generalized eigenvalue-based distance metric $D_{ij}(\gamma_1)$ to any linear or non-linear feature space. We prove that $D_{ij}(\gamma_1)$ is a metric under any linear or non-linear feature transformation function. We show the sufficiency and efficiency of using the decision rule $\bar{{\delta}}_{{\Xi}i}$ (i.e., the mean of $D_{ij}(\gamma_1)$) in classifying heterogeneous sets of biosequences, compared with the decision rules $\min_{{\Xi}i}$ and $\mathrm{median}_{{\Xi}i}$. We analyze the impact of linear and non-linear transformation functions on classifying and clustering collections of heterogeneous sets of biosequences. We also show empirically how the length of a sequence in a simulated heterogeneous sequence-set affects classification and clustering results in linear and non-linear feature spaces. We propose a new concept: the limiting dispersion map of the existing clusters in heterogeneous sets of biosequences embedded in linear and non-linear feature spaces, which is based on the limiting distribution of nucleotide compositions estimated from real data sets. Finally, empirical conclusions and scientific evidence are drawn from the experiments to support the theoretical results stated in this paper.

Effective Feature Extraction in the Individual Frequency Sub-bands for Speech Recognition (음성인식을 위한 주파수 부대역별 효과적인 특징추출)

  • 지상문
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.4 / pp.598-603 / 2003
  • This paper presents a sub-band feature extraction approach in which the feature extraction method for each frequency sub-band is chosen according to speech recognition accuracy. As in the multi-band paradigm, features are extracted independently in frequency sub-regions of the speech signal. Since the spectral shape is well structured in the low-frequency region, the all-pole model is effective for feature extraction there. In the high-frequency region, however, a nonparametric transform, the discrete cosine transform, is effective for extracting the cepstrum. Using the sub-band-specific feature extraction method, the linguistic information in the individual frequency sub-bands can be extracted effectively for automatic speech recognition. The validity of the proposed method is shown by comparing speech recognition results for our method with those obtained using a full-band feature extraction method.

Feature Extraction by Optimizing the Cepstral Resolution of Frequency Sub-bands (주파수 부대역의 켑스트럼 해상도 최적화에 의한 특징추출)

  • 지상문;조훈영;오영환
    • The Journal of the Acoustical Society of Korea / v.22 no.1 / pp.35-41 / 2003
  • Feature vectors for conventional speech recognition are usually extracted over the full frequency band, so each sub-band contributes equally to the final recognition result. In this paper, feature vectors are extracted independently in each sub-band, and the cepstral resolution of each sub-band feature is controlled to optimize speech recognition. For this purpose, sub-band cepstral vectors of different dimensions are extracted based on the multi-band approach, which extracts a feature vector independently for each sub-band. Speech recognition rates and clustering quality are suggested as the criteria for finding the optimal combination of sub-band vector dimensions. In connected digit recognition experiments using the TIDIGITS database, the proposed method gave a string accuracy of 99.125%, a percent-correct score of 99.775%, and a percent accuracy of 99.705%, which correspond to error rate reductions of 38%, 32%, and 37%, respectively, relative to the baseline full-band feature vector.
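
The idea of giving each sub-band its own cepstral dimension can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`dct2`, `subband_cepstra`), the unnormalized DCT-II, the band edges, and the toy log spectrum are all assumptions made for the example.

```python
import math

def dct2(values, n_coeffs):
    """Return the first n_coeffs unnormalized DCT-II coefficients of values."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]

def subband_cepstra(log_spectrum, band_edges, dims):
    """Concatenate per-band cepstra; band b keeps dims[b] coefficients.

    band_edges holds bin indices, so band b covers
    log_spectrum[band_edges[b]:band_edges[b + 1]].
    """
    if len(dims) != len(band_edges) - 1:
        raise ValueError("need exactly one dimension per sub-band")
    features = []
    for b, dim in enumerate(dims):
        band = log_spectrum[band_edges[b]:band_edges[b + 1]]
        if dim > len(band):
            raise ValueError("cepstral dimension exceeds number of bins in band")
        features.extend(dct2(band, dim))
    return features

# Hypothetical 8-bin log spectrum split into two 4-bin sub-bands:
# the low band keeps 3 coefficients, the high band keeps 2.
log_spectrum = [1.0, 1.0, 1.0, 1.0, 2.0, 0.0, 2.0, 0.0]
features = subband_cepstra(log_spectrum, band_edges=[0, 4, 8], dims=[3, 2])
```

Searching over `dims` (for example, by recognition rate on held-out data) would correspond to the dimension-combination search the abstract describes; that search is not shown here.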

Noise Robust Speaker Identification using Reliable Sub-Band Selection in Multi-Band Approach (신뢰성 높은 서브밴드 선택을 이용한 잡음에 강인한 화자식별)

  • Kim, Sung-Tak;Ji, Mi-Gyeong;Kim, Hoi-Rin
    • Proceedings of the KSPS conference / 2007.05a / pp.127-130 / 2007
  • The conventional feature recombination technique is very effective under band-limited noise, but under broad-band noise it does not produce a notable performance improvement over the full-band system. To cope with this drawback, we introduce a new technique for sub-band likelihood computation in feature recombination and propose a new feature recombination method based on it. Furthermore, reliable sub-band selection based on the signal-to-noise ratio is used to improve the performance of the proposed feature recombination. Experimental results show that the average error reduction rate across various noise conditions is more than 27% compared with the conventional full-band speaker identification system.

Noise Robust Speaker Verification Using Sub-Band Weighting (서브밴드 가중치를 이용한 잡음에 강인한 화자검증)

  • Kim, Sung-Tak;Ji, Mi-Kyong;Kim, Hoi-Rin
    • The Journal of the Acoustical Society of Korea / v.28 no.3 / pp.279-284 / 2009
  • Speaker verification determines whether a claimed speaker is accepted based on the score of the test utterance. In recent years, methods based on Gaussian mixture models and a universal background model have been the dominant approaches for text-independent speaker verification. Systems based on these methods perform very well under laboratory conditions, but in real situations their performance degrades dramatically. To overcome this degradation, the feature recombination method was proposed, but it has the drawback that all sub-band feature vectors are used together to compute the likelihood scores. To deal with this drawback, our previous research proposed a modified feature recombination method that can use each sub-band likelihood score independently. In this paper, we propose a sub-band weighting method based on the sub-band signal-to-noise ratio, combined with the previously proposed modified feature recombination. The proposed method reduces errors by 28% compared with the conventional feature recombination method.
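
Weighting independent sub-band likelihood scores by sub-band SNR might look like the sketch below. The abstract does not give the weighting function, so the rule used here (weights proportional to the SNR in dB above a floor, normalized to sum to 1) is purely an assumption for illustration, as are the names `snr_weights` and `weighted_score`.

```python
def snr_weights(snr_db, floor_db=0.0):
    """Map per-band SNRs (dB) to non-negative weights that sum to 1.

    Assumed rule: a band's weight is proportional to how far its SNR
    exceeds floor_db; bands at or below the floor get zero weight.
    """
    clipped = [max(s - floor_db, 0.0) for s in snr_db]
    total = sum(clipped)
    if total == 0.0:
        # No band is above the floor: fall back to uniform weights.
        return [1.0 / len(snr_db)] * len(snr_db)
    return [c / total for c in clipped]

def weighted_score(subband_loglik, snr_db):
    """Combine per-band log-likelihood scores using SNR-derived weights."""
    weights = snr_weights(snr_db)
    return sum(w * ll for w, ll in zip(weights, subband_loglik))

# Hypothetical scores for three sub-bands; the third band is noisy.
score = weighted_score([-2.0, -1.0, -9.0], snr_db=[10.0, 30.0, -5.0])
```

With this rule the noisy third band contributes nothing to the combined score, which is the qualitative behavior sub-band weighting aims for.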

Optical Proximity Correction using Sub-resolution Assist Feature in Extreme Ultraviolet Lithography (극자외선 리소그라피에서의 Sub-resolution assist feature를 이용한 근접효과보정)

  • Kim, Jung Sik;Hong, Seongchul;Jang, Yong Ju;Ahn, Jinho
    • Journal of the Semiconductor & Display Technology / v.15 no.3 / pp.1-5 / 2016
  • To apply sub-resolution assist features (SRAFs) in extreme ultraviolet lithography, the maximum non-printing SRAF width and the lithography process margin need to be improved. Through simulation, we confirmed that the maximum SRAF width of a 6% attenuated phase shift mask (PSM) is larger than that of a conventional binary intensity mask. The increase in SRAF width is due to the reflectivity of the dark region of the PSM, which consequently improves the process window. Furthermore, the critical dimension error caused by variation in SRAF width and center position is reduced because the diffraction amplitude changes less. Therefore, we speculate that the margin for SRAF application will be improved by using a PSM.

Tor Network Website Fingerprinting Using Statistical-Based Feature and Ensemble Learning of Traffic Data (트래픽 데이터의 통계적 기반 특징과 앙상블 학습을 이용한 토르 네트워크 웹사이트 핑거프린팅)

  • Kim, Junho;Kim, Wongyum;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering / v.9 no.6 / pp.187-194 / 2020
  • This paper proposes a website fingerprinting method using ensemble learning over a Tor network, which guarantees client anonymity and protects personal information. We construct a training problem for website fingerprinting from the traffic packets collected in the Tor network, and compare the performance of the website fingerprinting system using tree-based ensemble models. A training feature vector is prepared from the general information, burst, cell sequence length, and cell order extracted from the traffic sequence, and the features of each website are represented with a fixed length. For experimental evaluation, we define four learning problems (Wang14, BW, CWT, CWH) according to the use of website fingerprinting, and compare the performance with a support vector machine model using CUMUL feature vectors. In the experimental evaluation, the proposed statistical-based training feature representation is superior to the CUMUL feature representation except in the BW case.

Development of Registration Algorithm considering Coordinate Weights for Automobile Sub-Frame Assembly (가중치를 고려한 자동차 서브프레임의 인증 알고리즘 구현)

  • Lee, Kwang-Il;Yang, Seung-Han;Lee, Young-Moon
    • Journal of the Korean Society of Manufacturing Process Engineers / v.3 no.4 / pp.7-12 / 2004
  • Inspection and analysis are essential processes for determining whether a completed product meets its specification. For products with very complicated shapes, it is difficult to compare inspected coordinates directly with designed coordinates, so a process called matching or registration is needed. Registration is done by defining an error between the two sets of coordinates and minimizing it, and it consists of translation, rotation, and scale transformations. The error must be defined so that it expresses the features of the inspected product. In this paper, a registration algorithm is developed to determine the pose of a sub-frame during assembly with the automobile body, by defining the error between the two sets of coordinates in a way that considers the geometric features of the sub-frame.
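
Registration by minimizing a weighted coordinate error can be illustrated in 2D, where the weighted least-squares similarity transform (scale, rotation, translation) has a closed form. This is a generic sketch and not the paper's algorithm: the 2D restriction, the squared-distance error, the function name `register_2d`, and the sample points and weights are assumptions.

```python
import math

def register_2d(measured, design, weights):
    """Weighted least-squares similarity transform mapping measured -> design.

    Returns (scale, theta, (tx, ty)) minimizing
        sum_i w_i * || scale * R(theta) * p_i + t - q_i ||^2
    over measured points p_i, design points q_i, and weights w_i.
    """
    w_sum = sum(weights)
    # Weighted centroids of both point sets.
    pcx = sum(w * p[0] for w, p in zip(weights, measured)) / w_sum
    pcy = sum(w * p[1] for w, p in zip(weights, measured)) / w_sum
    qcx = sum(w * q[0] for w, q in zip(weights, design)) / w_sum
    qcy = sum(w * q[1] for w, q in zip(weights, design)) / w_sum

    a = b = p_norm = 0.0
    for w, p, q in zip(weights, measured, design):
        px, py = p[0] - pcx, p[1] - pcy
        qx, qy = q[0] - qcx, q[1] - qcy
        a += w * (px * qx + py * qy)   # weighted dot products
        b += w * (px * qy - py * qx)   # weighted 2D cross products
        p_norm += w * (px * px + py * py)
    if p_norm == 0.0:
        raise ValueError("measured points are degenerate")

    theta = math.atan2(b, a)
    scale = math.hypot(a, b) / p_norm
    c, s = math.cos(theta), math.sin(theta)
    tx = qcx - scale * (c * pcx - s * pcy)
    ty = qcy - scale * (s * pcx + c * pcy)
    return scale, theta, (tx, ty)

# Hypothetical data: design = 2 * R(90 deg) * measured + (1, 2).
measured = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
design = [(1.0, 2.0), (1.0, 4.0), (-1.0, 2.0)]
scale, theta, (tx, ty) = register_2d(measured, design, weights=[1.0, 2.0, 3.0])
```

Larger weights pull the fit toward the points that matter most, for example coordinates at mounting features; when the data fit exactly, as in this toy case, the weights do not change the result.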


Two Dimensional Slow Feature Discriminant Analysis via L2,1 Norm Minimization for Feature Extraction

  • Gu, Xingjian;Shu, Xiangbo;Ren, Shougang;Xu, Huanliang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.7 / pp.3194-3216 / 2018
  • Slow Feature Discriminant Analysis (SFDA) is a supervised feature extraction method inspired by a biological mechanism. In this paper, a novel method called Two Dimensional Slow Feature Discriminant Analysis via $L_{2,1}$ norm minimization ($2DSFDA-L_{2,1}$) is proposed. $2DSFDA-L_{2,1}$ integrates $L_{2,1}$ norm regularization and a 2D statistically uncorrelated constraint to extract discriminant features. First, $L_{2,1}$ norm regularization promotes row-sparsity in the projection matrix, so that feature selection and subspace learning are performed simultaneously. Second, uncorrelated features with minimum redundancy are effective for classification, so we define a 2D statistically uncorrelated model in which the rows (or columns) are independent. Third, we provide a feasible solution by transforming the proposed $L_{2,1}$ nonlinear model into a linear regression type. Additionally, $2DSFDA-L_{2,1}$ is extended to a bilateral projection version called $BSFDA-L_{2,1}$. The advantage of $BSFDA-L_{2,1}$ is that an image can be represented with far fewer coefficients. Experimental results on three face databases demonstrate that the proposed $2DSFDA-L_{2,1}/BSFDA-L_{2,1}$ can obtain competitive performance.
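
For reference, the $L_{2,1}$ norm of a matrix $W$ with rows $w_i$ is the sum of the Euclidean norms of its rows, $\|W\|_{2,1} = \sum_i \|w_i\|_2$. The short sketch below only computes this standard norm; it does not implement $2DSFDA-L_{2,1}$, and the example matrix is made up.

```python
import math

def l21_norm(matrix):
    """Sum of the Euclidean (L2) norms of the rows of matrix."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)

# Hypothetical projection matrix: the all-zero middle row adds nothing
# to the norm, which is why penalizing this norm favors row-sparsity.
W = [
    [3.0, 4.0],
    [0.0, 0.0],
    [5.0, 12.0],
]
value = l21_norm(W)
```

Because each row enters through its L2 norm, the penalty tends to drive entire rows to zero rather than scattered individual entries, and a zero row in a projection matrix means the corresponding input feature is discarded.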

Damage detection of bridges based on spectral sub-band features and hybrid modeling of PCA and KPCA methods

  • Bisheh, Hossein Babajanian;Amiri, Gholamreza Ghodrati
    • Structural Monitoring and Maintenance / v.9 no.2 / pp.179-200 / 2022
  • This paper proposes a data-driven methodology for online early damage identification under changing environmental conditions. The proposed method relies on two data analysis methods: a feature-based method, and a hybrid of principal component analysis (PCA) and kernel PCA to separate damage from environmental influences. First, spectral sub-band features, namely spectral sub-band centroids (SSCs) and log spectral sub-band energies (LSSEs), are proposed as damage-sensitive features to extract damage information from measured structural responses. Second, hybrid modeling that integrates PCA and kernel PCA is performed on the spectral sub-band feature matrix for data normalization, extracting both linear and nonlinear features for nonlinear process monitoring. After feature normalization has suppressed environmental effects, control charts (Hotelling $T^2$ and SPE statistics) are used for novelty detection and to distinguish damage in structures. The hybrid PCA-KPCA technique is compared with KPCA by applying a support vector machine (SVM) to evaluate its effectiveness in detecting damage. The proposed method is verified through numerical and full-scale studies (a Bridge Health Monitoring (BHM) Benchmark Problem and a cable-stayed bridge in China). The results demonstrate that the proposed method can detect structural damage accurately and reduce false alarms by suppressing the effects and interference of environmental variations.
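
The two spectral sub-band features named above can be sketched under common textbook definitions: the centroid of a band is the power-weighted mean frequency within it, and the log energy is the logarithm of the summed power. The rectangular, non-overlapping bands, the function name `subband_features`, and the toy spectrum are assumptions; the paper may use different filters or exponents.

```python
import math

def subband_features(freqs, power, band_edges):
    """Return (centroids, log_energies), one value per sub-band.

    Band m covers frequencies f with band_edges[m] <= f < band_edges[m + 1].
    """
    centroids, log_energies = [], []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        idx = [i for i, f in enumerate(freqs) if lo <= f < hi]
        energy = sum(power[i] for i in idx)
        if energy <= 0.0:
            raise ValueError("empty or zero-energy sub-band")
        # Power-weighted mean frequency of the band.
        centroids.append(sum(freqs[i] * power[i] for i in idx) / energy)
        log_energies.append(math.log(energy))
    return centroids, log_energies

# Hypothetical 4-bin power spectrum split into two sub-bands.
freqs = [0.0, 1.0, 2.0, 3.0]
power = [1.0, 3.0, 2.0, 2.0]
ssc, lsse = subband_features(freqs, power, band_edges=[0.0, 2.0, 4.0])
```

Stacking these per-band values over successive measurement windows would give a feature matrix of the kind passed on to PCA and kernel PCA; that modeling step is not shown here.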