Mining Semantically Similar Tags from Delicious

Yi, Kwan

doi:10.3743/KOSIM.2009.26.2.127

정보관리학회지 (Journal of the Korean Society for Information Management)

Vol. 26, No. 2 / pp. 127-147 / 2009 / 1013-0799 (pISSN) / 2586-2073 (eISSN)

한국정보관리학회 (Korean Society for Information Management)

딜리셔스에서 유사태그 추출에 관한 연구

이관

Yi, Kwan (School of Library and Information Science, University of Kentucky)

Published: 2009.06.30

https://doi.org/10.3743/KOSIM.2009.26.2.127

Abstract

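Handling synonyms in natural language has long been a considerable obstacle to communication between humans and computers, and the obstacle can become more severe in Web 2.0 applications, which rest on users' arbitrary choice of words, particularly in social tagging. This study addresses the problem of automatic synonym extraction in a representative Web 2.0 application. More specifically, it proposes a method (FolkSim) for extracting similar tags from Delicious, the most widely used social bookmarking application. To evaluate the proposed method, its results are compared with those of a method (CosSim) that extracts similar tags using the classical vector model developed for measuring document similarity. Evidence was observed that FolkSim produces better results in several respects. An alternative is also proposed for cases in which FolkSim cannot produce similar tags.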

The synonym issue is an inherent barrier in human-computer communication, and it is more challenging in Web 2.0 applications, especially social tagging applications. In an effort to address the issue, this study tests the feasibility of a Web 2.0 application as a potential source of synonyms. It investigates a way of identifying similar tags in a popular collaborative tagging application, Delicious. Specifically, we propose an algorithm (FolkSim) for measuring the similarity of social tags from Delicious. We compared FolkSim to a cosine-based similarity method (CosSim) and observed that the top-ranked tags on the similar-tag lists generated by FolkSim tend to be among the best similar tags available among the candidates. The FolkSim lists also appear to be relatively better than the ones created by CosSim. We further observed that a tag's folksonomy and its similar-tag list resemble each other to a certain degree, so the folksonomy may serve as an alternative outcome, especially when the FolkSim-based list is unavailable or infeasible.
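The cosine-based baseline mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how tags are represented as vectors, so the code below assumes, purely for illustration, that each tag is a sparse vector of counts over the URLs it was applied to. The function names (`cosine_similarity`, `rank_similar_tags`) and the toy data are hypothetical, and the sketch does not implement FolkSim, whose definition is not given here.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors given as dicts."""
    # Only keys present in both vectors contribute to the dot product.
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)

def rank_similar_tags(target, tag_vectors):
    """Return (tag, score) pairs for all other tags, most similar first."""
    scores = [
        (other, cosine_similarity(tag_vectors[target], vector))
        for other, vector in tag_vectors.items()
        if other != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data (assumption): tag -> {bookmarked URL: usage count}.
tag_vectors = {
    "python": {"url1": 3, "url2": 1},
    "programming": {"url1": 2, "url2": 2, "url3": 1},
    "recipes": {"url4": 5},
}

ranking = rank_similar_tags("python", tag_vectors)
```

With this toy data, tags that share bookmarked URLs with the target rank above tags that share none. A real comparison would depend on the vector representation and weighting actually used in the study.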
