A review on the t-distributed stochastic neighbor embedding

  • Kipoong Kim (Department of Statistics, Seoul National University)
  • Choongrak Kim (Department of Statistics, Pusan National University)
  • Received: 2022.12.05
  • Revised: 2022.12.15
  • Published: 2023.04.30

Abstract

This paper reviews several methods for visualizing high-dimensional data in a low-dimensional space. First, principal component analysis and multidimensional scaling are briefly introduced as linear approaches; then kernel principal component analysis, the self-organizing map, locally linear embedding, Isomap, Laplacian eigenmaps, and local multidimensional scaling are introduced as nonlinear approaches. In particular, t-SNE, which is widely used but relatively unfamiliar in the field of statistics, is described in more detail. We also present a simple example for several methods, including t-SNE. Finally, we review several recent studies pointing out the limitations of t-SNE and discuss the open research problems they present.
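As a minimal sketch of the kind of comparison the abstract describes, the snippet below embeds scikit-learn's bundled handwritten-digits data into two dimensions with both PCA (linear) and t-SNE (nonlinear). It is an illustration, not code from the paper; the dataset choice and parameter values are assumptions. The PCA initialization follows the recommendation of Kobak and Linderman (2021) for preserving global structure.

```python
# Illustrative only: compare a linear (PCA) and a nonlinear (t-SNE)
# 2-D embedding of the same high-dimensional data. Requires scikit-learn.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)  # 1797 samples, 64 features

# Linear reduction: project onto the first two principal components.
X_pca = PCA(n_components=2).fit_transform(X)

# Nonlinear reduction: t-SNE with PCA initialization, which helps
# preserve global structure (Kobak and Linderman, 2021).
X_tsne = TSNE(n_components=2, perplexity=30, init="pca",
              random_state=0).fit_transform(X)

print(X_pca.shape, X_tsne.shape)  # both (1797, 2)
```

Plotting the two embeddings colored by the digit label `y` typically shows t-SNE separating the ten digit classes into much more distinct clusters than PCA, at the cost of distorting between-cluster distances.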


Acknowledgments

This research was supported by a two-year research grant from Pusan National University.

References

  1. Amid E and Warmuth MK (2019). TriMap: Large-Scale dimensionality reduction using triplets, Available from: arXiv:1910.00204
  2. Belkin M and Niyogi P (2003). Laplacian eigenmaps for dimensionality reduction and data representation, Neural Computation, 15, 1373-1396. https://doi.org/10.1162/089976603321780317
  3. Chen L and Buja A (2009). Local multidimensional scaling for nonlinear dimension reduction, graph drawing and proximity analysis, Journal of the American Statistical Association, 104, 209-219. https://doi.org/10.1198/jasa.2009.0111
  4. Kobak D and Berens P (2019). The art of using t-SNE for single-cell transcriptomics, Nature Communications, 10, 5416.
  5. Kobak D and Linderman GC (2021). Initialization is critical for preserving global data structure in both t-SNE and UMAP, Nature Biotechnology, 39, 156-157. https://doi.org/10.1038/s41587-020-00809-z
  6. Kohonen T (1990). The self-organizing map, Proceedings of the IEEE, 78, 1464-1479. https://doi.org/10.1109/5.58325
  7. McInnes L, Healy J, and Melville J (2018). UMAP: Uniform manifold approximation and projection for dimension reduction, Available from: arXiv:1802.03426
  8. Roweis ST and Saul LK (2000). Nonlinear dimensionality reduction by locally linear embedding, Science, 290, 2323-2326. https://doi.org/10.1126/science.290.5500.2323
  9. Schölkopf B, Smola A, and Müller KR (1999). Kernel principal component analysis. In Schölkopf B, Burges C, and Smola A (Eds), Advances in Kernel Methods - Support Vector Learning (pp. 327-352), MIT Press, Cambridge.
  10. Tenenbaum JB, de Silva V, and Langford JC (2000). A global geometric framework for nonlinear dimensionality reduction, Science, 290, 2319-2323. https://doi.org/10.1126/science.290.5500.2319
  11. van der Maaten L and Hinton G (2008). Visualizing data using t-SNE, Journal of Machine Learning Research, 9, 2579-2605.
  12. Wang Y, Huang H, Rudin C, and Shaposhnik Y (2021). Understanding how dimension reduction tools work: An empirical approach to deciphering t-SNE, UMAP, TriMap, and PaCMAP for data visualization, Journal of Machine Learning Research, 22, 1-73.