Search | Korea Science

Empirical Comparison of Word Similarity Measures Based on Co-Occurrence, Context, and a Vector Space Model

Kadowaki, Natsuki;Kishida, Kazuaki
- Journal of Information Science Theory and Practice / v.8 no.2 / pp.6-17 / 2020
Word similarity is often measured to enhance system performance in the information retrieval field and other related areas. This paper reports on an experimental comparison of values for word similarity measures that were computed based on 50 intentionally selected words from a Reuters corpus. Three groups of measures were compared: (1) co-occurrence-based similarity measures (for which a co-occurrence frequency is counted as the number of documents or sentences), (2) context-based distributional similarity measures obtained from latent Dirichlet allocation (LDA), nonnegative matrix factorization (NMF), and the Word2Vec algorithm, and (3) similarity measures computed from the tf-idf weights of each word according to a vector space model (VSM). The Pearson correlation coefficient was highest for the pair consisting of the VSM-based similarity measure and the co-occurrence-based similarity measure counted by the number of documents. Group-average agglomerative hierarchical clustering was also applied to the similarity matrices computed by the individual measures. An evaluation of the cluster sets against an answer set revealed that the VSM- and LDA-based similarity measures performed best.
https://doi.org/10.1633/JISTaP.2020.8.2.1

Intellectual Structure of the Altmetrics field: A Co-Word Analysis (Co-word를 이용한 알트메트리얼 필리트의 지적 구조 연구)

Li, Jiapei;Li, Xiaomeng;Lee, HyunChang;Shin, SeongYoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.148-150 / 2017
In recent years, "altmetrics", born of social media and the academic community, have become a metric source for measuring the academic impact of scientific literature. This study undertook a co-word analysis of author keywords in "Altmetrics" articles from the Web of Science database from 2012 to 2017 and used a co-occurrence matrix to cluster the words. An "Altmetrics" co-occurrence network map was derived and the research hotspots were analyzed.

A Study on Multi-frequency Keyword Visualization based on Co-occurrence (다중빈도 키워드 가시화에 관한 연구)

Lee, HyunChang;Shin, SeongYoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.103-104 / 2018
Recently, interest in data analysis has increased as big data has grown in importance. In particular, as social media data and academic research communities become more active, such analysis becomes more important. In this study, co-word analysis was conducted on altmetrics articles collected from 2012 to 2017. A co-occurrence network map was derived from the keywords, and the emphasized keywords were extracted.

A Study on Multi-frequency Keyword Visualization based on Co-occurrence (다중빈도 키워드 가시화에 관한 연구)

Lee, HyunChang;Shin, SeongYoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.424-425 / 2018
Recently, interest in data analysis has increased as big data has grown in importance. In particular, as social media data and academic research communities become more active, such analysis becomes more important. In this study, co-word analysis was conducted on altmetrics articles collected from 2012 to 2017. A co-occurrence network map was derived from the keywords, and the emphasized keywords were extracted.

A Statistical Model for Choosing the Best Translation of Prepositions (통계 정보를 이용한 전치사 최적 번역어 결정 모델)

심광섭
- Language and Information / v.8 no.1 / pp.101-116 / 2004
This paper proposes a statistical model for the translation of prepositions in English-Korean machine translation. In the proposed model, statistical information acquired from unlabeled Korean corpora is used to choose the best translation from several possible translations. Such information includes functional word-verb co-occurrence information, functional word-verb distance information, and noun-postposition co-occurrence information. The model was evaluated with 443 sentences, each of which has a prepositional phrase, and we attained 71.3% accuracy.

Keyword Visualization based on the number of occurrences (출현회수에 따른 키워드 가시화 연구)

Lee, HyunChang;Shin, SeongYoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.484-485 / 2019
Recently, interest in data analysis has increased as big data has grown in importance. In particular, as social media data and academic research communities become more active, such analysis becomes more important. In this study, co-word analysis was conducted on altmetrics articles collected from 2012 to 2017. A co-occurrence network map was derived from the keywords, and the emphasized keywords were extracted.

Keyword Visualization based on the Number of Occurrences (키워드 빈도수에 따른 시각화 연구)

Lee, HyunChang;Shin, SeongYoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.565-566 / 2021
Recently, interest in data analysis has increased as big data has grown in importance. In particular, as social media data and academic research communities become more active, such analysis becomes more important. In this study, co-word analysis was conducted on altmetrics articles collected from 2012 to 2017. A co-occurrence network map was derived from the keywords, and the emphasized keywords were extracted.

The Analysis of Knowledge Structure using Co-word Method in Quality Management Field (동시단어분석을 이용한 품질경영분야 지식구조 분석)

Park, Man-Hee
- Journal of Korean Society for Quality Management / v.44 no.2 / pp.389-408 / 2016
Purpose: This study was designed to analyze the behavioral change of knowledge structures and the trends of research topics in the quality management field. Methods: The network structure and knowledge structure of the words were visualized in map form using co-word analysis, cluster analysis, and a strategic diagram. Results: The results obtained in this study are summarized as follows. First, the word network derived from the co-occurrence matrix had 106 nodes and 5,314 links, and its density was 0.95. The average betweenness centrality of the word network was 2.37. In addition, the average closeness centrality and average eigenvector centrality of the word network were both 0.01. Second, by applying optimal criteria for deciding the clusters and the K-means algorithm to the word co-occurrence matrix, the 106 words were grouped into seven clusters: standard & efficiency, product design, reliability, control chart, quality model, 6 sigma, and service quality. Conclusion: According to the strategic diagram analysis over time, the traditional research topics of the quality management field related to reliability, 6 sigma, and control charts, located in the third quadrant, were found to have declined in importance. Research topics related to product design and customer satisfaction were found to be important throughout the analysis periods. The research topic related to management innovation was in an emerging state, and the scope of research topics related to process models was extended to research topics involving system performance. The research topic related to service quality, located in the first quadrant, was identified as the key research topic.
https://doi.org/10.7469/JKSQM.2016.44.2.389

Research trends related to childhood and adolescent cancer survivors in South Korea using word co-occurrence network analysis

Kang, Kyung-Ah;Han, Suk Jung;Chun, Jiyoung;Kim, Hyun-Yong
- Child Health Nursing Research / v.27 no.3 / pp.201-210 / 2021
Purpose: This study analyzed research trends related to childhood and adolescent cancer survivors (CACS) using word co-occurrence network analysis on studies registered in the Korean Citation Index (KCI). Methods: This word co-occurrence network analysis study explored major research trends by constructing a network based on relationships between keywords (semantic morphemes) in the abstracts of published articles. Research articles published in the KCI over the past 10 years were collected using the Biblio Data Collector tool included in the NetMiner Program (version 4), using "cancer survivors", "adolescent", and "child" as the main search terms. After pre-processing, analyses were conducted on centrality (degree and eigenvector), cohesion (community), and topic modeling. Results: For centrality, the top 10 keywords included "treatment", "factor", "intervention", "group", "radiotherapy", "health", "risk", "measurement", "outcome", and "quality of life". In terms of cohesion and topic analysis, three categories were identified as the major research trends: "treatment and complications", "adaptation and support needs", and "management and quality of life". Conclusion: The keywords from the three main categories reflected interdisciplinary identification. Many studies on adaptation and support needs were identified in our analysis of nursing literature. Further research on managing and evaluating the quality of life among CACS must also be conducted.
https://doi.org/10.4094/chnr.2021.27.3.201

Development Tendency of Altmetrics Research: Using Social Network Analysis and Co-word Analysis (소셜네트워크 분석과 Co-word 분석을 사용한 Altmetric 연구 개발동향)

Lee, Hyun-Chang;Li, Jiapei;Shin, Seong-Yoon
- Journal of the Korea Institute of Information and Communication Engineering / v.21 no.11 / pp.2089-2094 / 2017
Altmetrics are measurement indexes and quantitative data that complement the traditional citation-based indicators. Altmetrics research has acquired greater importance in the past few years, partly because it complements traditional bibliometrics. This paper aims to reveal the research status and trends in altmetrics research. A total of 187 articles from 2005 to 2017 were obtained and analyzed, illustrating a steady rise (S-mode) in altmetrics research since 2005. Using social network analysis and co-word analysis, the author cooperation network and keyword co-occurrence network were developed. The core scientists and eight international research groups were identified, reflecting that researchers in this field have a low degree of cooperation. Four topics of altmetrics research were identified by hierarchical clustering. The results can be useful for advanced research on altmetrics.
https://doi.org/10.6109/jkiice.2017.21.11.2089
