• Title, Summary, Keyword: density-based clustering


Local Distribution Based Density Clustering for Speaker Diarization (화자분할을 위한 지역적 특성 기반 밀도 클러스터링)

  • Rho, Jinsang;Shon, Suwon;Kim, Sung Soo;Lee, Jae-Won;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.34 no.4 / pp.303-309 / 2015
  • Speaker diarization is the task of determining the speakers for unlabeled data, and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) has been widely used in speaker diarization for its simplicity and computational efficiency. One challenging issue, however, is that if different clusters in a non-spatial dataset are adjacent to each other, over-clustering may occur, which subsequently degrades the performance of DBSCAN. In this paper, we identify the drawbacks of DBSCAN and propose a new density clustering algorithm based on the local distribution property around each object. Variable density criteria for the local density and spreadness of each object are used for effective data clustering. We compare the proposed algorithm to DBSCAN in terms of clustering accuracy. Experimental results confirm that the proposed algorithm achieves higher accuracy than DBSCAN without over-clustering, and that the new approach based on local density and object spreadness is efficient.

A Density-based Clustering Method

  • Ahn, Sung Mahn;Baik, Sung Wook
    • Communications for Statistical Applications and Methods / v.9 no.3 / pp.715-723 / 2002
  • This paper presents a clustering application of a density estimation method that utilizes the Gaussian mixture model. We define a "closeness measure" as a clustering criterion that indicates how close two given Gaussian components are. The closeness measure is defined as the ratio of log likelihood between two Gaussian components. According to simulations using artificial data, the clustering algorithm turned out to be very powerful in that it can correctly determine clusters in complex situations, and very flexible in that it can produce clusters of different sizes based on different threshold values.

Density Based Spatial Clustering Method Considering Obstruction (장애물을 고려한 밀도 기반의 공간 클러스터링 기법)

  • 임현숙;김호숙;용환승;이상호;박승수
    • Journal of Korea Multimedia Society / v.6 no.3 / pp.375-383 / 2003
  • Clustering in spatial mining groups similar objects based on their distance, connectivity, or relative density in space. In the real world, there exist many physical objects such as rivers, lakes, and highways, and their presence may affect the result of clustering. In this paper, we define a distance that handles obstacles and, using it, propose a density-based clustering algorithm called DBSCAN-O. Our experimental results show that DBSCAN-O produces clustering results different from those of the earlier density-based clustering algorithm DBSCAN.


An Enhanced Density and Grid based Spatial Clustering Algorithm for Large Spatial Database (대용량 공간데이터베이스를 위한 확장된 밀도-격자 기반의 공간 클러스터링 알고리즘)

  • Gao, Song;Kim, Ho-Seok;Xia, Ying;Kim, Gyoung-Bae;Bae, Hae-Young
    • The KIPS Transactions:PartD / v.13D no.5 / pp.633-640 / 2006
  • Spatial clustering, which groups similar objects based on their distance, connectivity, or relative density in space, is an important component of spatial data mining. Density-based and grid-based clustering are two main clustering approaches. The former is known for its ability to discover clusters of various shapes and to eliminate noise, while the latter is well known for its high speed. Clustering large data sets has always been a serious challenge for clustering algorithms, because a huge data set makes the clustering process extremely costly. In this paper, we propose an enhanced density-grid based clustering algorithm for large spatial databases that sets a default number of intervals and removes outliers effectively, using a proper measurement to identify areas of high density in the input data space. We use a density threshold DT to recognize dense cells before neighboring dense cells are combined to form clusters. When the proposed algorithm is run on a large dataset, a proper granularity for each dimension of the data space and a density threshold for recognizing dense areas can improve its performance. We combine grid-based and density-based methods not only to increase efficiency but also to find clusters of arbitrary shape. Experimental evaluation on synthetic datasets shows that the proposed method achieves high performance and accuracy.
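
The step this abstract describes (recognize dense cells with a density threshold, then combine neighboring dense cells into clusters) can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name `grid_density_clusters`, the fixed square `cell_size`, the 2-D input, and the 8-connected neighborhood are all assumptions made for illustration, and the paper's handling of interval counts and outliers is not reproduced.

```python
from collections import defaultdict, deque


def grid_density_clusters(points, cell_size, density_threshold):
    """Cluster 2-D points by merging adjacent dense grid cells (illustrative sketch)."""
    # Assign every point to the grid cell that contains it.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)

    # A cell counts as dense when it holds at least `density_threshold` points.
    dense = {c for c, members in cells.items() if len(members) >= density_threshold}

    labels = [-1] * len(points)  # -1 marks points in sparse cells (treated as outliers)
    seen = set()
    cluster_id = 0
    for start in sorted(dense):
        if start in seen:
            continue
        # Breadth-first search over neighboring dense cells (8-connectivity assumed).
        queue = deque([start])
        seen.add(start)
        while queue:
            cx, cy = queue.popleft()
            for idx in cells[(cx, cy)]:
                labels[idx] = cluster_id
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        cluster_id += 1
    return labels
```

Because the work is done per cell rather than per pair of points, the cost of this sketch grows with the number of points plus the number of occupied cells, which is the usual motivation for combining a grid with a density criterion.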

Approximate fuzzy clustering based on a density function (밀도 함수를 이용한 근사적 퍼지 클러스터링)

  • 손세호;권순학;최윤혁
    • Proceedings of the Korean Institute of Intelligent Systems Conference / pp.94-97 / 2000
  • In this paper, we introduce an approximate fuzzy clustering method based on density functions that is simple and computationally efficient. The density functions are defined by the number of data points within a predetermined interval. Numerical examples are presented to show the validity of the proposed clustering method.


Approximate Clustering on Data Streams Using Discrete Cosine Transform

  • Yu, Feng;Oyana, Damalie;Hou, Wen-Chi;Wainer, Michael
    • Journal of Information Processing Systems / v.6 no.1 / pp.67-78 / 2010
  • In this study, a clustering algorithm that uses DCT-transformed data is presented. The algorithm is a grid density-based clustering algorithm that can identify clusters of arbitrary shape. Streaming data are transformed and reconstructed as needed for clustering. Experimental results show that DCT is able to approximate a data distribution efficiently using only a small number of coefficients and preserve the clusters well. The grid-based clustering algorithm works well with DCT-transformed data, demonstrating the viability of DCT for data stream clustering applications.
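
The idea of approximating a data distribution with a small number of DCT coefficients can be sketched as follows. This is an assumption-laden illustration, not the paper's method: it uses a 1-D histogram of bin counts, an unnormalized DCT-II with its matching inverse, and hypothetical helper names `dct` and `idct_truncated`; the streaming and grid-clustering parts are not shown.

```python
import math


def dct(x):
    """Unnormalized DCT-II of a sequence of bin counts."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]


def idct_truncated(coeffs, n):
    """Rebuild n bins from the first len(coeffs) DCT-II coefficients.

    Coefficients that were dropped are treated as zero, so passing fewer
    coefficients gives a smoother approximation of the histogram.
    """
    return [
        coeffs[0] / n
        + 2.0 / n
        * sum(
            coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
            for k in range(1, len(coeffs))
        )
        for i in range(n)
    ]


def approximate_histogram(hist, keep):
    """Keep only the `keep` lowest-frequency coefficients and reconstruct."""
    return idct_truncated(dct(hist)[:keep], len(hist))
```

For example, `approximate_histogram(hist, 4)` stores four numbers instead of one count per bin. The total count is preserved because the zero-frequency coefficient is always kept, and keeping more coefficients cannot increase the squared reconstruction error.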

Plurality Rule-based Density and Correlation Coefficient-based Clustering for K-NN

  • Aung, Swe Swe;Nagayama, Itaru;Tamaki, Shiro
    • IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.183-192 / 2017
  • k-nearest neighbor (K-NN) is a well-known machine learning classification algorithm that classifies a sample according to its nearest training examples in feature space. However, K-NN is a lazy learning method. Therefore, if a K-NN-based system depends heavily on a huge amount of historical data to achieve an accurate prediction for a particular task, its processing time gradually degrades. Many researchers consider only classification accuracy, but estimation speed also plays an essential role in real-time prediction systems. To compensate for this weakness, this paper proposes correlation coefficient-based clustering (CCC), aimed at improving the processing-time performance of K-NN, and plurality rule-based density (PRD), aimed at improving estimation accuracy. For experiments, we used real datasets (breast cancer, breast tissue, heart, and iris) from the University of California, Irvine (UCI) machine learning repository. Moreover, real traffic data collected from Ojana Junction, Route 58, Okinawa, Japan, was also used to demonstrate the efficiency of this method. Using these datasets, we showed better processing-time performance with the new approach than with classical K-NN. In addition, via experiments on real-world datasets, we compared the prediction accuracy of our approach with density peaks clustering based on K-NN and principal component analysis (DPC-KNN-PCA).

Detection of Abnormal Region of Skin using Gabor Filter and Density-based Spatial Clustering of Applications with Noise (가버 필터와 밀도 기반 공간 클러스터링을 이용한 피부의 이상 영역 검출)

  • Jeon, Minseong;Cheoi, Kyungjoo
    • Journal of Korea Multimedia Society / v.21 no.2 / pp.117-129 / 2018
  • In this paper, we suggest a new system that detects abnormal regions of skin. First, an illumination elimination algorithm that uses the LAB color model is applied to the input facial image to obtain a facial image that is robust to illumination, and then a Gabor filter is applied to detect the response to discontinuities. Finally, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is applied to classify areas of wrinkles, dots, and other skin diseases. This method allows the user to check the skin condition in images taken in real life.

A Method of Color Image Segmentation Based on DBSCAN(Density Based Spatial Clustering of Applications with Noise) Using Compactness of Superpixels and Texture Information (슈퍼픽셀의 밀집도 및 텍스처정보를 이용한 DBSCAN기반 칼라영상분할)

  • Lee, Jeonghwan
    • Journal of the Korea Society of Digital Industry and Information Management / v.11 no.4 / pp.89-97 / 2015
  • In this paper, a method of color image segmentation based on DBSCAN (Density Based Spatial Clustering of Applications with Noise) using compactness of superpixels and texture information is presented. The DBSCAN algorithm can generate clusters in large data sets by looking at the local density of data samples, using only two input parameters: the minimum number of data points and the neighborhood distance. Superpixel algorithms group pixels into perceptually meaningful atomic regions, which can be used to replace the rigid structure of the pixel grid. Each superpixel consists of pixels with similar features such as luminance, color, and texture. Superpixels are more efficient than pixels for large-scale image processing. In this paper, superpixels are generated by the popular SLIC (simple linear iterative clustering) method. Superpixel characteristics are described by compactness, uniformity, boundary precision, and recall. Compactness is an important feature for describing superpixel characteristics. Each superpixel is represented by its Lab color values, compactness, and texture information. The DBSCAN clustering method is applied to this feature space to segment a color image. To evaluate the performance of the proposed method, computer simulation is carried out on several outdoor images. The experimental results show that the proposed algorithm can provide good segmentation results on various images.
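
How DBSCAN forms clusters from its two parameters can be seen in a minimal sketch of the standard algorithm. The names `eps` (neighborhood distance) and `min_pts` (minimum number of points) are the conventional ones rather than the paper's, the brute-force neighbor search is chosen for brevity, and the superpixel features (Lab color, compactness, texture) are replaced here by plain coordinate tuples as an assumption.

```python
import math


def dbscan(points, eps, min_pts):
    """Return a cluster id per point, or -1 for noise (illustrative sketch)."""
    def neighbors(i):
        # Indices of all points within distance eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        # Point i is a core point: start a new cluster and grow it.
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is also a core point, keep expanding
        cluster_id += 1
    return labels
```

In a segmentation setting each tuple would be one superpixel's feature vector, so points that share a label would correspond to superpixels merged into one region.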

An Overview of Unsupervised and Semi-Supervised Fuzzy Kernel Clustering

  • Frigui, Hichem;Bchir, Ouiem;Baili, Naouel
    • International Journal of Fuzzy Logic and Intelligent Systems / v.13 no.4 / pp.254-268 / 2013
  • For real-world clustering tasks, the input data is typically not easily separable, either because the data structure is highly complex or because clusters vary in size, density, and shape. Kernel-based clustering has proven to be an effective approach to partitioning such data. In this paper, we provide an overview of several fuzzy kernel clustering algorithms. We focus on methods that optimize a fuzzy C-means-type objective function. We highlight the advantages and disadvantages of each method. In addition to the completely unsupervised algorithms, we also provide an overview of some semi-supervised fuzzy kernel clustering algorithms. These algorithms use partial supervision information to guide the optimization process and avoid local minima. We also provide an overview of the different approaches that have been used to extend kernel clustering to handle very large data sets.