Integrated Search | Korea Science

Empirical Comparison of Word Similarity Measures Based on Co-Occurrence, Context, and a Vector Space Model

Kadowaki, Natsuki;Kishida, Kazuaki
- Journal of Information Science Theory and Practice
- /
- Vol. 8, No. 2
- /
- pp.6-17
- /
- 2020
Word similarity is often measured to enhance system performance in the information retrieval field and other related areas. This paper reports on an experimental comparison of values for word similarity measures that were computed based on 50 intentionally selected words from a Reuters corpus. There were three targets, including (1) co-occurrence-based similarity measures (for which a co-occurrence frequency is counted as the number of documents or sentences), (2) context-based distributional similarity measures obtained from a latent Dirichlet allocation (LDA), nonnegative matrix factorization (NMF), and Word2Vec algorithm, and (3) similarity measures computed from the tf-idf weights of each word according to a vector space model (VSM). Here, a Pearson correlation coefficient for a pair of VSM-based similarity measures and co-occurrence-based similarity measures according to the number of documents was highest. Group-average agglomerative hierarchical clustering was also applied to similarity matrices computed by individual measures. An evaluation of the cluster sets according to an answer set revealed that VSM- and LDA-based similarity measures performed best.
https://doi.org/10.1633/JISTaP.2020.8.2.1
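The comparison described in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the toy corpus, the word list, the use of a Jaccard coefficient for document-level co-occurrence, and the use of cosine similarity over per-document tf-idf weights are all assumptions made for illustration. It computes both similarity measures for every word pair and then the Pearson correlation between them.

```python
import math
from itertools import combinations

# Toy corpus (hypothetical; the paper used a Reuters corpus and 50 selected words).
docs = [
    "oil price rises as crude supply falls",
    "crude oil supply cut lifts price",
    "bank raises interest rate",
    "central bank holds interest rate as price rises",
]
tokens = [d.split() for d in docs]
words = ["oil", "crude", "price", "bank", "rate"]
n_docs = len(tokens)

def doc_set(w):
    """Indices of the documents that contain word w."""
    return {i for i, t in enumerate(tokens) if w in t}

def jaccard(a, b):
    """Co-occurrence similarity (assumed Jaccard): shared docs / docs with either word."""
    da, db = doc_set(a), doc_set(b)
    return len(da & db) / len(da | db)

def tfidf_vector(w):
    """Represent a word by its tf-idf weight in each document (VSM view)."""
    idf = math.log(n_docs / len(doc_set(w)))
    return [t.count(w) * idf for t in tokens]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Similarity of every word pair under each measure, then their correlation.
pairs = list(combinations(words, 2))
cooc_sims = [jaccard(a, b) for a, b in pairs]
vsm_sims = [cosine(tfidf_vector(a), tfidf_vector(b)) for a, b in pairs]
r = pearson(cooc_sims, vsm_sims)
print(f"Pearson r between co-occurrence and VSM similarities: {r:.3f}")
```

On this toy data the two measures are strongly correlated, which mirrors the direction of the reported finding but says nothing about its magnitude on the real corpus.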

Intellectual Structure of the Altmetrics Field: A Co-Word Analysis

이가베;이효맹;이현창;신성윤
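- Korea Institute of Information and Communication Engineering: Conference Proceedings
- /
- Korea Institute of Information and Communication Engineering 2017 Fall Conference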
- /
- pp.148-150
- /
- 2017
In recent years, "altmetrics", born of social media and the academic community, have become a metric source for measuring the academic impact of scientific literature. This study undertook a co-word analysis of author keywords in "Altmetrics" articles from the Web of Science database from 2012 to 2017 and used a co-occurrence matrix to cluster the words. An "Altmetrics" co-occurrence network map was derived, and the research hotspots were analyzed.
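As a rough illustration of the co-word step described in this abstract, the sketch below counts how often each pair of author keywords appears together in the same article. The keyword lists and the function name `coword_counts` are hypothetical; the study's actual data came from the Web of Science and are not reproduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author-keyword lists, one list per article.
articles = [
    ["altmetrics", "social media", "citation"],
    ["altmetrics", "twitter", "social media"],
    ["altmetrics", "citation", "bibliometrics"],
]

def coword_counts(keyword_lists):
    """Count, for each unordered keyword pair, the number of articles containing both."""
    counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts

counts = coword_counts(articles)
for (a, b), n in counts.most_common():
    print(f"{a} / {b}: {n}")
```

The resulting pair counts are the off-diagonal entries of a symmetric co-occurrence matrix, which can then be passed to a clustering or network-drawing tool.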

A Study on Multi-frequency Keyword Visualization based on Co-occurrence

이현창;신성윤
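- Korea Institute of Information and Communication Engineering: Conference Proceedings
- /
- Korea Institute of Information and Communication Engineering 2018 Spring Conference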
- /
- pp.103-104
- /
- 2018
Interest in data analysis has recently grown along with the importance of big data. In particular, as social media data and academic research communities become more active, analyzing them becomes more important. In this study, co-word analysis was conducted on altmetrics articles collected from 2012 to 2017. A co-occurrence network map was derived from the keywords, and the emphasized keywords were extracted.

A Study on Multi-frequency Keyword Visualization based on Co-occurrence

이현창;신성윤
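- Korea Institute of Information and Communication Engineering: Conference Proceedings
- /
- Korea Institute of Information and Communication Engineering 2018 Spring Conference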
- /
- pp.424-425
- /
- 2018
Interest in data analysis has recently grown along with the importance of big data. In particular, as social media data and academic research communities become more active, analyzing them becomes more important. In this study, co-word analysis was conducted on altmetrics articles collected from 2012 to 2017. A co-occurrence network map was derived from the keywords, and the emphasized keywords were extracted.

A Statistical Model for Choosing the Best Translation of Prepositions

심광섭
- Journal of the Korean Society for Language and Information: Language and Information
- /
- Vol. 8, No. 1
- /
- pp.101-116
- /
- 2004
This paper proposes a statistical model for the translation of prepositions in English-Korean machine translation. In the proposed model, statistical information acquired from unlabeled Korean corpora is used to choose the best translation from several possible translations. Such information includes functional word-verb co-occurrence information, functional word-verb distance information, and noun-postposition co-occurrence information. The model was evaluated with 443 sentences, each of which has a prepositional phrase, and we attained 71.3% accuracy.

Keyword Visualization Based on the Number of Occurrences

이현창;신성윤
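- Korea Institute of Information and Communication Engineering: Conference Proceedings
- /
- Korea Institute of Information and Communication Engineering 2019 Spring Conference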
- /
- pp.484-485
- /
- 2019
Interest in data analysis has recently grown along with the importance of big data. In particular, as social media data and academic research communities become more active, analyzing them becomes more important. In this study, co-word analysis was conducted on altmetrics articles collected from 2012 to 2017. A co-occurrence network map was derived from the keywords, and the emphasized keywords were extracted.

Keyword Visualization Based on the Number of Occurrences

이현창;신성윤
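- Korea Institute of Information and Communication Engineering: Conference Proceedings
- /
- Korea Institute of Information and Communication Engineering 2021 Fall Conference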
- /
- pp.565-566
- /
- 2021
Interest in data analysis has recently grown along with the importance of big data. In particular, as social media data and academic research communities become more active, analyzing them becomes more important. In this study, co-word analysis was conducted on altmetrics articles collected from 2012 to 2017. A co-occurrence network map was derived from the keywords, and the emphasized keywords were extracted.

The Analysis of Knowledge Structure using Co-word Method in Quality Management Field

박만희
- Journal of the Korean Society for Quality Management
- /
- Vol. 44, No. 2
- /
- pp.389-408
- /
- 2016
Purpose: This study was designed to analyze changes in knowledge structures and trends in research topics in the quality management field. Methods: The network structure and knowledge structure of the words were visualized as maps using co-word analysis, cluster analysis, and a strategic diagram. Results: The results are summarized as follows. First, the word network derived from the co-occurrence matrix had 106 nodes and 5,314 links, and its density was 0.95. The average betweenness centrality of the word network was 2.37, and both its average closeness centrality and average eigenvector centrality were 0.01. Second, by applying optimal cluster-number criteria and the K-means algorithm to the word co-occurrence matrix, the 106 words were grouped into seven clusters: standard & efficiency, product design, reliability, control chart, quality model, 6 sigma, and service quality. Conclusion: According to the strategic diagram analysis over time, the traditional quality management research topics of reliability, 6 sigma, and control charts, located in the third quadrant, were found to be declining in research importance. Research topics related to product design and customer satisfaction remained important throughout the analysis periods. Management innovation was an emerging research topic, and the scope of research on process models extended to topics involving system performance. Service quality, located in the first quadrant, was identified as the key research topic.
https://doi.org/10.7469/JKSQM.2016.44.2.389
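The density reported in this abstract can be checked with a short calculation. The sketch assumes the word network is an undirected simple graph (no self-loops or parallel links), which the abstract does not state explicitly; under that assumption density is the number of links divided by the number of possible node pairs, N(N-1)/2.

```python
def density(n_nodes: int, n_links: int) -> float:
    """Density of an undirected simple graph: links / possible node pairs."""
    possible_pairs = n_nodes * (n_nodes - 1) / 2
    return n_links / possible_pairs

# Figures taken from the abstract: 106 nodes and 5,314 links.
d = density(106, 5314)
print(round(d, 2))  # 0.95, since 5314 / 5565 is about 0.9549
```

With 106 nodes there are 5,565 possible pairs, so 5,314 links give a density of about 0.955, consistent with the reported value of 0.95.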

Research trends related to childhood and adolescent cancer survivors in South Korea using word co-occurrence network analysis

Kang, Kyung-Ah;Han, Suk Jung;Chun, Jiyoung;Kim, Hyun-Yong
- Child Health Nursing Research
- /
- Vol. 27, No. 3
- /
- pp.201-210
- /
- 2021
Purpose: This study analyzed research trends related to childhood and adolescent cancer survivors (CACS) using word co-occurrence network analysis on studies registered in the Korean Citation Index (KCI). Methods: This word co-occurrence network analysis study explored major research trends by constructing a network based on relationships between keywords (semantic morphemes) in the abstracts of published articles. Research articles published in the KCI over the past 10 years were collected using the Biblio Data Collector tool included in the NetMiner Program (version 4), using "cancer survivors", "adolescent", and "child" as the main search terms. After pre-processing, analyses were conducted on centrality (degree and eigenvector), cohesion (community), and topic modeling. Results: For centrality, the top 10 keywords included "treatment", "factor", "intervention", "group", "radiotherapy", "health", "risk", "measurement", "outcome", and "quality of life". In terms of cohesion and topic analysis, three categories were identified as the major research trends: "treatment and complications", "adaptation and support needs", and "management and quality of life". Conclusion: The keywords from the three main categories reflected interdisciplinary identification. Many studies on adaptation and support needs were identified in our analysis of nursing literature. Further research on managing and evaluating the quality of life among CACS must also be conducted.
https://doi.org/10.4094/chnr.2021.27.3.201

Development Tendency of Altmetrics Research: Using Social Network Analysis and Co-word Analysis

이현창;이가배;신성윤
- Journal of the Korea Institute of Information and Communication Engineering
- /
- Vol. 21, No. 11
- /
- pp.2089-2094
- /
- 2017
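Altmetrics are quantitative indicators intended to complement traditional citation-based metrics. Research on altmetrics has taken on growing importance over the past few years as a complement to traditional informetrics. This paper aims to identify the current state and trends of altmetrics research. A total of 187 papers were analyzed, showing a steady rise in altmetrics research since 2005. Social network analysis and co-word analysis are used to construct an author collaboration network and a keyword co-occurrence network. Hierarchical clustering identified four groups of altmetrics research, and the results may be very useful for future altmetrics research.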
https://doi.org/10.6109/jkiice.2017.21.11.2089

Search results: 104 items, processing time: 0.043 seconds