Survey of Temporal Information Extraction

  • Lim, Chae-Gyun (School of Computing, Korea Advanced Institute of Science and Technology)
  • Jeong, Young-Seob (Dept. of Big Data Engineering, Soonchunhyang University)
  • Choi, Ho-Jin (School of Computing, Korea Advanced Institute of Science and Technology)
  • Received : 2019.01.03
  • Accepted : 2019.04.18
  • Published : 2019.08.31

Abstract

Documents contain information that can be used in various applications, such as question answering (QA), information retrieval (IR), and recommendation systems. To use this information, it is necessary to develop methods for extracting it from documents written in natural language. There are several kinds of information (e.g., temporal information, spatial information, and semantic role information), and each kind is extracted with different methods. This paper surveys existing studies on methods for extracting temporal information and discusses several related issues: the task boundary of temporal information extraction, the history of annotation languages and shared tasks, research issues, applications that use temporal information, and evaluation metrics. Although the history of temporal information extraction tasks is not long, many studies have tried various methods. This paper reports which approach is known to be better for extracting each particular part of the temporal information and also suggests directions for future research.

Acknowledgement

Grant : Development of Knowledge Evolutionary WiseQA Platform Technology for Human Knowledge Augmented Services

Supported by : Institute for Information & communications Technology Planning & Evaluation (IITP), National Research Foundation of Korea (NRF)
