• Title/Summary/Keyword: dictionary

Computerized Sound Dictionary of Korean and English

  • Kim, Jong-Mi
    • Speech Sciences / v.8 no.1 / pp.33-52 / 2001
  • A bilingual sound dictionary of Korean and English has been created to provide a broad sound reference for cross-linguistic, dialectal, native language (L1)-transferred biological and allophonic variations. The paper demonstrates that the pronunciation dictionary of the lexicon is inadequate for sound reference because of the preponderance of unmarked sounds. The audio registry consists of a three-way comparison of 1) English speech from native English speakers, 2) Korean speech from Korean speakers, and 3) English speech from Korean speakers. Several sub-dictionaries have been created as foundation research for independent development: 1) a pronunciation dictionary of the Korean lexicon in a keyboard-compatible phonetic transcription, 2) a sound dictionary of L1-interfered language, and 3) an audible dictionary of Korean sounds. The dictionary was designed to facilitate the exchange of the speech signal and its corresponding text data on various media, particularly CD-ROM. The methodology and findings of the construction are discussed.

Encoding Dictionary Feature for Deep Learning-based Named Entity Recognition

  • Ronran, Chirawan;Unankard, Sayan;Lee, Seungwoo
    • International Journal of Contents / v.17 no.4 / pp.1-15 / 2021
  • Named entity recognition (NER) is a crucial task for NLP, which aims to extract information from texts. To build NER systems, deep learning (DL) models are learned with dictionary features by mapping each word in the dataset to dictionary features and generating a unique index. However, this technique might generate noisy labels, which pose significant challenges for the NER task. In this paper, we proposed DL-dictionary features, and evaluated them on two datasets, including the OntoNotes 5.0 dataset and our new infectious disease outbreak dataset named GFID. We used (1) a Bidirectional Long Short-Term Memory (BiLSTM) character and (2) pre-trained embedding to concatenate with (3) our proposed features, named the Convolutional Neural Network (CNN), BiLSTM, and self-attention dictionaries, respectively. The combined features (1-3) were fed through BiLSTM - Conditional Random Field (CRF) to predict named entity classes as outputs. We compared these outputs with other predictions of the BiLSTM character, pre-trained embedding, and dictionary features from previous research, which used the exact matching and partial matching dictionary technique. The findings showed that the model employing our dictionary features outperformed other models that used existing dictionary features. We also computed the F1 score with the GFID dataset to apply this technique to extract medical or healthcare information.
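
The exact-matching dictionary technique that the abstract attributes to previous research can be sketched as follows. This is a minimal illustration, not the authors' proposed CNN, BiLSTM, or self-attention dictionary features: the gazetteer contents, the entity type names, the BIO-style tags, and the function `exact_match_features` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical toy gazetteer: lower-cased surface form -> entity type.
GAZETTEER: Dict[str, str] = {
    "seoul": "GPE",
    "world health organization": "ORG",
    "ebola": "DISEASE",
}

# Give every dictionary tag a unique integer index; 0 means "no match".
TAG_TO_INDEX: Dict[str, int] = {"O": 0}
for entity_type in sorted(set(GAZETTEER.values())):
    for prefix in ("B", "I"):
        TAG_TO_INDEX[f"{prefix}-{entity_type}"] = len(TAG_TO_INDEX)

# Longest gazetteer entry, measured in tokens.
MAX_SPAN = max(len(key.split()) for key in GAZETTEER)


def exact_match_features(tokens: List[str]) -> List[int]:
    """Return one dictionary-feature index per token (longest exact match)."""
    tags = ["O"] * len(tokens)
    lowered = [token.lower() for token in tokens]
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-word entries win.
        for span in range(min(MAX_SPAN, len(tokens) - i), 0, -1):
            phrase = " ".join(lowered[i:i + span])
            if phrase in GAZETTEER:
                entity_type = GAZETTEER[phrase]
                tags[i] = f"B-{entity_type}"
                for j in range(i + 1, i + span):
                    tags[j] = f"I-{entity_type}"
                i += span
                matched = True
                break
        if not matched:
            i += 1
    return [TAG_TO_INDEX[tag] for tag in tags]


sentence = ["The", "World", "Health", "Organization", "reported", "Ebola", "cases"]
features = exact_match_features(sentence)
```

In a pipeline like the one described, such an index sequence would be embedded and concatenated with the character and word representations. Because matching is purely lexical, an ambiguous surface form receives its dictionary tag regardless of context, which is one way the noisy labels mentioned in the abstract can arise.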

ERP Application Development Using Business Data Dictionary (데이터사전을 이용한 ERP애플리케이션 개발)

  • Minsu Jang;Joo-Chan Sohn;Jong-Myoung Baik
    • The Journal of Society for e-Business Studies / v.7 no.1 / pp.141-152 / 2002
  • A data dictionary is a collection of metadata that describes the data produced and consumed while performing business processes. It is an essential element of business process standardization and automation, and it plays a fundamental role in ERP application management and customization. A data dictionary also facilitates B2B processes by enabling painless integration of business processes between enterprises. We implemented data dictionary support in SEA+, a component-based scalable ERP system developed at ETRI, and found it to be a plausible feature of a business information system. We discovered that a data dictionary promotes semantic, rather than syntactic, data management, which can strengthen the viability of the tool in the coming age of more metadata-oriented computing. We envision the business data dictionary as a firm foundation for adapting business knowledge, applications, and processes to a semantic-web-based enterprise infrastructure.

ERP Application Development Using Business Data Dictionary

  • Jang, Min-Su;Sohn, Joo-Chan;Baik, Jong-Myoung
    • Proceedings of the CALSEC Conference / 2001.08a / pp.483-491 / 2001
  • A data dictionary is a collection of metadata about the data defined, produced, and consumed while performing business processes. It is an essential element of business process standardization and automation, and it plays a fundamental role in ERP application management and customization. Finally, a data dictionary helps B2B by gracefully integrating intra-enterprise and inter-enterprise business processes. This paper gives some indications of the importance of the data dictionary in ERP and B2B, and introduces the data dictionary support of SEA+, a component-based scalable ERP package system.

Statistical Information of Korean Dictionary to Construct an Enormous Electronic Dictionary (대용량 전자사전 구축을 위한 국어 대사전의 통계 정보)

  • Kim, Cheol-Su;Kim, Yang-Beom
    • The Journal of the Korea Contents Association / v.7 no.6 / pp.60-68 / 2007
  • Language information processing has various application areas, such as information retrieval, morphological analysis, spell checking, voice recognition, and character recognition. In these areas, an electronic dictionary is essential. This study examines basic statistical information on the Korean dictionary and on the construction of an electronic dictionary. The targets of analysis were the number of registered words in the Korean dictionary, the number of entries registered in the electronic dictionary, the number of used syllables, the number of different syllables, the average length of an entry, the distribution of parts of speech, and the number of nodes used to construct the electronic dictionary with a Trie, excluding words that include an archaic word or incomplete syllables. The total number of entries in the electronic dictionary is 361,980, the number of used syllables is 1,289,659, the average entry length is 3.56, and the number of different syllables is 2,463. This information should be useful in constructing an electronic dictionary and in processing Korean information.
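
The statistics listed in this abstract can be computed with a short sketch like the one below. It assumes that each Hangul syllable is stored as a single precomposed Unicode character (NFC form) and that the Trie branches on one syllable per level; the abstract does not describe the authors' data layout, and the four sample entries and all function names are hypothetical. As a consistency check on the reported figures, 1,289,659 used syllables divided by 361,980 entries gives about 3.56, matching the stated average entry length.

```python
from typing import Dict, List


class TrieNode:
    """One Trie node; children are keyed by a single syllable."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_entry = False


def build_trie(entries: List[str]) -> TrieNode:
    """Insert every entry into a Trie, one syllable per level."""
    root = TrieNode()
    for entry in entries:
        node = root
        for syllable in entry:
            node = node.children.setdefault(syllable, TrieNode())
        node.is_entry = True
    return root


def count_nodes(root: TrieNode) -> int:
    """Count all Trie nodes except the root."""
    total = 0
    stack = [root]
    while stack:
        node = stack.pop()
        total += len(node.children)
        stack.extend(node.children.values())
    return total


def dictionary_stats(entries: List[str]) -> Dict[str, float]:
    """Compute entry count, syllable counts, average length, and Trie size."""
    unique = sorted(set(entries))
    if not unique:
        raise ValueError("at least one entry is required")
    used = sum(len(entry) for entry in unique)
    return {
        "entries": len(unique),
        "used_syllables": used,
        "distinct_syllables": len({s for entry in unique for s in entry}),
        "average_length": used / len(unique),
        "trie_nodes": count_nodes(build_trie(unique)),
    }


# Hypothetical sample entries, not taken from the paper's data.
stats = dictionary_stats(["사전", "사전학", "사람", "전자"])
```

Shared prefixes are why the node count (6 here) is smaller than the number of used syllables (9): entries that begin with the same syllables reuse the same path from the root.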

Component Implementation of Electronic Dictionary (전자사전 컴포넌트의 구현)

  • Choe, Seong-Un
    • The KIPS Transactions:PartD / v.8D no.5 / pp.587-592 / 2001
  • Many applications are being developed to automate office work, and the electronic dictionary (e-dictionary) is one of the main components of office suites. Several requirements are proposed for efficient e-dictionaries: 1) fast search time, 2) data compatibility with other e-dictionaries to deal with words and obsolete words, and 3) reusable components for developing new customized e-dictionaries with minimal development time and cost. We propose a data format with which any e-dictionary can exchange data with others. We also develop a System Dictionary component and a Customer Dictionary component to enable plug-and-play component reuse. Our e-dictionary achieves fast search time by efficiently managing Trie and B-tree index structures for the dictionary components.

A construction of dictionary for Korean Text to Sign Language Translation (한글문장-수화 번역기를 위한 사전구성)

  • 권경혁;민홍기
    • Proceedings of the IEEK Conference / 1998.10a / pp.841-844 / 1998
  • A Korean text to sign language translator could be used by deaf and hard-of-hearing people to learn letters and to converse with hearing people. This paper describes some useful dictionaries for developing a Korean text to sign language translator: a base sign language dictionary, a compound sign language dictionary, and a resemble sign language dictionary. Because Korean sign language consists of only about 6,000 words, additional dictionaries are required to match them to written Korean. We design the base sign language dictionary, which is composed of the basic symbols and moving pictures of Korean sign language, and propose a definition of the compound sign language dictionary, which is composed of symbols from the base sign language dictionary. In addition, the resemble sign language dictionary offers sign symbols and letters that are used with the same meaning in conversation. With these methods, we can search sign language quickly during Korean text to sign language translation and save storage space. They also let us address the shortage of sign language words encountered during translation.

Text Compression by Word and Etymology Dictionary (단어, 어원 Dictionary에 의한 Text 압축)

  • Lee, Jae-Young;Sung, Koeng-Mo;Lee, Chong-Kak
    • Proceedings of the KIEE Conference / 1988.07a / pp.607-611 / 1988
  • In this paper, a text compression method is proposed that reduces the mean number of bits per character by means of a word and etymology dictionary. The dictionary consists of 256 words and 512 etymologies with 10-bit codes. Using this dictionary, a mean rate of 3.44 bits per character is achieved.
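
A fixed-length dictionary coder of this general kind can be sketched as follows. The abstract gives only the dictionary size (256 words plus 512 etymologies, 768 entries in total) and the 10-bit code length, so everything else here is an assumption: that the 256 code values left over in the 10-bit space (768 to 1023) encode literal single-byte characters, that matching is greedy longest-first, and that the four sample entries stand in for a real dictionary. All names are hypothetical.

```python
from typing import Dict, List

CODE_BITS = 10       # code length stated in the abstract
LITERAL_BASE = 768   # assumption: codes 768..1023 carry literal characters


def build_codebook(entries: List[str]) -> Dict[str, int]:
    """Assign codes 0..767 to dictionary entries in list order."""
    if len(entries) > LITERAL_BASE:
        raise ValueError("at most 768 dictionary entries fit below LITERAL_BASE")
    return {entry: code for code, entry in enumerate(entries)}


def encode(text: str, codebook: Dict[str, int]) -> List[int]:
    """Greedy longest-match encoding; unmatched characters become literals."""
    max_len = max((len(entry) for entry in codebook), default=0)
    codes: List[int] = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in codebook:
                codes.append(codebook[chunk])
                i += length
                break
        else:
            # No dictionary entry starts here: emit one literal character.
            value = ord(text[i])
            if value > 255:
                raise ValueError("literals are limited to single-byte characters")
            codes.append(LITERAL_BASE + value)
            i += 1
    return codes


def decode(codes: List[int], codebook: Dict[str, int]) -> str:
    """Invert encode() using the same codebook."""
    reverse = {code: entry for entry, code in codebook.items()}
    return "".join(
        reverse[code] if code < LITERAL_BASE else chr(code - LITERAL_BASE)
        for code in codes
    )


def bits_per_character(text: str, codes: List[int]) -> float:
    """Mean number of coded bits spent on each source character."""
    return CODE_BITS * len(codes) / len(text)


# Hypothetical four-entry dictionary, far smaller than the paper's 768 entries.
codebook = build_codebook(["the ", "tion", "compress", "ing "])
sample = "the compression"
codes = encode(sample, codebook)
rate = bits_per_character(sample, codes)
```

Each matched entry spends 10 bits on several characters, while each literal spends 10 bits on one, so the mean rate depends on how much of the text the dictionary covers. In this toy case five codes cover fifteen characters, about 3.33 bits per character.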

Science and Technology Terminology Dictionary Building Process and Workbench Development in Defense Area (국방과학기술 전문용어 사전 구축을 위한 프로세스 및 워크벤치 개발)

  • Choi, Jung-Whoan;Park, Jeong-Ho;Kim, Kyung-Sun;Kim, Pyung
    • The Journal of the Korea Contents Association / v.12 no.8 / pp.420-428 / 2012
  • To improve business efficiency, it is important to standardize the meaning of terminology. Accordingly, terminology dictionaries are being actively built and used in various fields. In the defense area, publishing a defense terminology dictionary is useful for information exchange among the armed services and for distributing standardized terminology. The Defense Agency for Technology and Quality (DTaQ) publishes a terminology dictionary of defense science and technology on a three-year cycle. DTaQ is trying to standardize the construction process of the terminology dictionary and to improve service efficiency by using the terminology dictionary in the defense area. The proposed method is based on the results of a previous study on the standardization of terminology dictionaries. We suggest practical steps including the terminology dictionary construction process, the composition and roles of the organization, definition of headwords, selection of target documents from which terminology candidates are extracted, terminology extraction, generation of terminology candidate groups, workbench registration, and construction and validation of the terminology dictionary. A thesaurus and a workbench were developed to use and support the terminology dictionary effectively.

A New Approach of Domain Dictionary Generation

  • Xi, Su Mei;Cho, Young-Im;Gao, Qian
    • International Journal of Fuzzy Logic and Intelligent Systems / v.12 no.1 / pp.15-19 / 2012
  • A domain dictionary generation algorithm based on a pseudo feedback model is presented in this paper. The algorithm can increase the precision of domain dictionary generation. The generation of a domain dictionary is regarded as a domain term retrieval process: assume that the top N strings in the original retrieval result set are relevant to C, append these strings to the dictionary, and retrieve again. The process is iterated until a predefined number of domain terms has been generated. Experiments on a corpus show that the precision of the pseudo feedback model based algorithm is much higher than that of existing algorithms.
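
The iterative loop described in this abstract can be sketched as below. The abstract does not specify the retrieval model, so the scoring rule used here (a candidate earns credit from every document it shares with terms already in the dictionary) is a stand-in chosen for illustration, not the authors' method. The function name, the seed set, and the toy corpus are likewise hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def generate_domain_dictionary(
    documents: List[List[str]],
    seeds: Iterable[str],
    top_n: int,
    target_size: int,
) -> List[str]:
    """Grow a domain dictionary by repeated pseudo feedback rounds."""
    dictionary = sorted(set(seeds))
    known = set(dictionary)
    while len(dictionary) < target_size:
        # Retrieval step (assumed scoring): credit each unknown term with the
        # number of known domain terms in the documents where it appears.
        scores: Counter = Counter()
        for doc in documents:
            terms = set(doc)
            hits = len(terms & known)
            if hits == 0:
                continue
            for term in terms - known:
                scores[term] += hits
        if not scores:
            break  # nothing new can be retrieved
        # Feedback step: treat the top N results as relevant and append them.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        for term, _ in ranked[:top_n]:
            if len(dictionary) >= target_size:
                break
            dictionary.append(term)
            known.add(term)
    return dictionary


# Hypothetical toy corpus: three medical documents and one financial one.
corpus = [
    ["virus", "infection", "vaccine"],
    ["vaccine", "antibody", "immune"],
    ["stock", "market", "price"],
    ["antibody", "immune", "protein"],
]
result = generate_domain_dictionary(corpus, {"virus"}, top_n=2, target_size=5)
```

Starting from the single seed, the first round retrieves "infection" and "vaccine"; the second round reaches "antibody" and "immune" through the newly added "vaccine". The financial document never shares a term with the dictionary, so its words are not retrieved.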