Topics and Trends in Metadata Research

  • Oh, Jung Sun
  • Park, Ok Nam
  • Received : 2017.12.07
  • Accepted : 2018.07.09
  • Published : 2018.12.30


Although the body of research on metadata has grown substantially, the field itself has seldom been analyzed systematically. This study attempts to fill that gap by examining the metadata literature of the past 20 years. Combining topic modeling, a text mining technique, with network analysis, we analyzed 2,713 scholarly papers on metadata published between 1995 and 2014 and identified the main topics and trends in metadata research. Topic modeling yielded 20 topics, and the most prominent of these were reviewed in detail. In addition, changes in topic composition over time, in terms of both the relative topic proportions and the structure of the topic networks, were traced to identify past and emerging research trends. The results show that a number of core themes in metadata research have become established over the past decades, and that the field has advanced by embracing and responding to dynamic changes in information environments as well as new developments in the professional field.


topic modeling; metadata research; research trends; library and information science

