• Title, Summary, Keyword: 계량언어학 (quantitative linguistics)

The deduction of objective linguistic information using statistical methods - The grouping of the possibility of interdisciplinary research (통계적 방법을 활용한 객관적 언어정보 도출 - 학제적 연구의 가능성 모색)

  • Choi, Kyoung-Ho; Lee, Yong-Wook
    • Journal of the Korean Data and Information Science Society, v.22 no.1, pp.49-55, 2011
  • Attempts at unification through consilience are under way in many fields, and interdisciplinary research is one instance. Linguistic informatics, or quantitative linguistics, is an interdisciplinary field combining statistics and linguistics, but it has been studied chiefly by linguists. From a statistical standpoint, the results of that research need some supplementing. This study shows that statistical methods can compensate for insufficient objectivity in linguistic studies, and examines ways to improve the completeness of interdisciplinary research on statistics and linguistics. It also shows that introducing and applying statistical methods can help derive objective linguistic information in linguistic studies.

Quantitative Study on Korean Morphemes in Journal Editorials (한국어 형태소의 계량언어학적 연구 -신문 사설을 중심으로-)

  • Bae, Hee-Sook; Shi, Jeong-Kon; Paik, Hae-Seung; Choi, Key-Sun
    • Annual Conference on Human and Language Technology, pp.17-24, 2001
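  • In corpus-based language research, balance is a very important issue. Balancing a corpus requires taking into account the linguistic characteristics of the different types of corpora. However, type-by-type studies of Korean corpora using quantitative linguistic methods are still scarce. This study builds a corpus from newspaper editorials, a major part of the news media, and examines its linguistic characteristics. Following the typical method of quantitative linguistics, we first address the quantification work and then survey and analyze the data obtained through careful quantification.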

Quantitative Linguistic Analysis on Literary Works

  • Choi, Kyung-Ho
    • Journal of the Korean Data and Information Science Society, v.18 no.4, pp.1057-1064, 2007
  • From the viewpoint of natural language processing, quantitative linguistic analysis is a linguistic study that relies on statistical methods; it is a form of mathematical linguistics that attempts to discover various linguistic characteristics by interpreting linguistic facts quantitatively. This study introduces a quantitative linguistic analysis method for literary works that uses a computer and statistical methods. It also introduces the use of SynKDP, a synthesized Korean data process, and shows the relation between the distribution of the linguistic units used by the hero of the novel Sassinamjunggi and the thematic analysis of literary works.

A Measure of Productivity in Derivational Morphology (파생어의 생산성 측정)

  • Cha, Joon-Kyung; Kang, Beom-Mo
    • Annual Conference on Human and Language Technology, pp.282-289, 1995
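  • This study measures the degree of productivity of derived words using a corpus-based quantitative method not previously used in Korean morphology, and uses the results to describe productivity in Korean derivational morphology. By giving a numerical value for the productivity of each affix, it allows relative productivity to be compared more precisely. The productivity measure is the one proposed in Baayen (1989): the ratio of the number of words with a given affix that occur exactly once in the corpus ($n_1$) to the total number of words with that affix in the corpus ($N$), that is, $P=n_1/N$. Based on corpora of 2 million and 10 million eojeol (space-delimited word units), we measured the productivity of representative Korean derivational suffixes: the noun-forming suffixes '-이', '-음', '-기'; the adjective-forming suffixes '-스럽-', '-롭-', '-답-'; and the verb-forming suffixes '-거리-', '-대-', '-이-'. The corpus-based approach adopted here can be regarded as an advance over measuring the productivity of derived words from existing dictionaries.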

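The productivity measure $P=n_1/N$ described in the abstract above can be illustrated with a minimal Python sketch. The function name `productivity` and the toy word list are hypothetical and are not taken from the paper; the sketch assumes the input is a list of corpus tokens that have already been identified as carrying one particular affix.

```python
from collections import Counter

def productivity(tokens):
    """Return (n1, N, P) for tokens that all carry the same affix.

    n1 = number of word types occurring exactly once (hapax legomena)
    N  = total number of tokens with the affix
    P  = n1 / N
    """
    total = len(tokens)
    if total == 0:
        raise ValueError("no tokens for this affix")
    counts = Counter(tokens)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1, total, n1 / total

# Hypothetical toy sample of '-스럽-' derivatives (not real corpus data).
sample = [
    "사랑스럽다", "자랑스럽다", "사랑스럽다", "걱정스럽다",
    "자랑스럽다", "사랑스럽다", "어른스럽다", "조심스럽다",
]
n1, N, P = productivity(sample)
print(f"n1={n1}, N={N}, P={P:.3f}")
```

Identifying which tokens actually contain a given affix requires morphological analysis, which this sketch leaves out.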

Corpus-Linguistical Analysis of Newspaper Articles (신문 기사의 코퍼스 언어학적 분석)

  • Song, Kyung-Hwa; Kang, Beom-Mo
    • Annual Conference on Human and Language Technology, pp.7-14, 2006
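  • Aiming at an empirical linguistic analysis of newspaper articles, this study quantifies and analyzes, from several angles, the large corpus of newspaper articles built by the <21st Century Sejong Project>. Newspaper articles are divided into headline, lead, and body, and each part is analyzed according to its characteristics using the morphologically analyzed corpus, the morpho-semantically analyzed corpus, and the syntactically parsed corpus. The significance of this study lies in its quantitative method based on a large corpus of newspaper articles. As empirical research distinct from earlier intuition-based approaches, this method should make it possible to verify theories of newspaper writing and to discover new linguistic phenomena in newspaper articles.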

Application of Statistical Methods in Quantitative Linguistics Study

  • Choi, Kyung-Ho; Hwang, Yong-Joo
    • Journal of the Korean Data and Information Science Society, v.18 no.2, pp.269-278, 2007
  • Starting from the study of quantitative linguistics, quantitative methods have become a necessary tool in a variety of fields. Knowledge of statistical methods is therefore a requisite for linguists. Unfortunately, linguists still have difficulty acquiring statistical knowledge, so they need help from statisticians, who should take an active role. Accordingly, this study emphasizes that statisticians should take more interest in the field of quantitative linguistics. It also demonstrates that statistical methods make the analysis in linguistic research more objective and scientific.

Aspects of Language Use in Newspaper Articles: A Corpus Linguistic Perspective (신문 기사의 언어 사용 양상: 코퍼스언어학적 접근)

  • Song, Kyung-Hwa; Kang, Beom-Mo
    • Korean Journal of Cognitive Science, v.17 no.4, pp.255-269, 2006
  • The purpose of this study is to analyze newspaper articles from a corpus-linguistic point of view. We used a large corpus of newspaper articles built by the <21st Century Sejong Project> and counted occurrences of certain expressions. A newspaper article is divided into the headline, the lead, and the body. We tried to work out how to measure the characteristics of indication and compression that are typical of headlines. We then focused on the differences between the headline and the lead. Finally, we analyzed the sentence structure and measured the ratio of the frequency of common nouns in the body. This study verifies the existing stylistic theories of newspapers and shows new aspects of language use in newspaper articles. Texts like newspaper articles are the results of human language processing, and they in turn affect the development of the cognitive ability of language.

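The common-noun ratio mentioned in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is assumed to be a list of (morpheme, tag) pairs, the tag `NNG` is assumed to mark common nouns as in Sejong-style tagged corpora, and the function name `common_noun_ratio` and the sample sentence are hypothetical rather than taken from the paper.

```python
def common_noun_ratio(tagged_tokens, noun_tag="NNG"):
    """Share of tokens whose POS tag equals noun_tag (assumed: common noun)."""
    total = len(tagged_tokens)
    if total == 0:
        return 0.0
    nouns = sum(1 for _, tag in tagged_tokens if tag == noun_tag)
    return nouns / total

# Hypothetical tagged body sentence (illustrative, not real corpus data).
body = [
    ("정부", "NNG"), ("는", "JX"), ("대책", "NNG"), ("을", "JKO"),
    ("발표", "NNG"), ("하", "XSV"), ("였", "EP"), ("다", "EF"),
]
print(f"common-noun ratio: {common_noun_ratio(body):.3f}")
```

The same function could be applied separately to headline, lead, and body tokens to compare the three parts.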

Applying Randomization Tests to Collocation Analyses in Large Corpora (언어의 공기관계 분석을 위한 임의화검증의 응용)

  • Yang Kyung-Sook; Kim HeeYoung
    • The Korean Journal of Applied Statistics, v.18 no.3, pp.583-595, 2005
  • Contingency tables are used to compare counts of n-grams to determine whether an n-gram is a true collocation, meaning that the words that make up the n-gram are highly associated in the text. Several statistical measures are used to identify collocations, including the Kulczynski coefficient, Ochiai coefficient, Fager and McGowan coefficient, Yule coefficient, mutual information, and chi-square. The main problem is that these measures are based on the assumption of a normal or approximately normal distribution of the variables being sampled. While this assumption is valid in most instances, it is not valid when comparing the rates of occurrence of rare events, and texts are composed mostly of rare events. In this paper we briefly review some statistics for testing the association of two words, and propose randomization tests to evaluate the significance level in collocation analyses of large corpora. A related graph can be used to compare different test statistics that can be used to analyze the same contingency table.
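
As a rough illustration of the idea in the abstract above, the sketch below builds a 2x2 contingency table for a word pair from the bigrams of a text, computes the chi-square statistic, and estimates its significance by randomly permuting the second-word labels instead of relying on a distributional assumption. This is a generic Monte Carlo permutation test, not necessarily the procedure proposed in the paper; all function names and the toy token list are hypothetical.

```python
import random

def chi_square_2x2(o11, o12, o21, o22):
    """Pearson chi-square statistic for a 2x2 table (0.0 if a margin is empty)."""
    n = o11 + o12 + o21 + o22
    denom = (o11 + o12) * (o21 + o22) * (o11 + o21) * (o12 + o22)
    if denom == 0:
        return 0.0
    return n * (o11 * o22 - o12 * o21) ** 2 / denom

def table(x, y):
    """Cross-tabulate two equally long lists of booleans into 2x2 cell counts."""
    o11 = sum(1 for a, b in zip(x, y) if a and b)
    o12 = sum(1 for a, b in zip(x, y) if a and not b)
    o21 = sum(1 for a, b in zip(x, y) if not a and b)
    o22 = len(x) - o11 - o12 - o21
    return o11, o12, o21, o22

def randomization_test(tokens, w1, w2, n_perm=2000, seed=0):
    """Return (observed chi-square, Monte Carlo p-value) for the bigram (w1, w2)."""
    bigrams = list(zip(tokens, tokens[1:]))
    x = [first == w1 for first, _ in bigrams]    # first word is w1?
    y = [second == w2 for _, second in bigrams]  # second word is w2?
    observed = chi_square_2x2(*table(x, y))
    rng = random.Random(seed)
    y_perm = y[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any association between x and y
        if chi_square_2x2(*table(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy text (illustrative only).
tokens = "new york is big new york is old new york".split()
stat, p_value = randomization_test(tokens, "new", "york")
print(stat, p_value)
```

Any other association measure over the same table, such as mutual information, could be substituted for `chi_square_2x2` without changing the permutation loop.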

A Quantitative Linguistic Study on the Functional load of Phonemes in Standard Korean (한국어 음소의 기능부담량 - 계량 언어학적 연구)

  • Jin Nam-Taek
    • MALSORI, no.25_26, pp.65-92, 1993
  • Not all linguistic units are of equal importance in the functioning of language. The present study aims to examine the functional load of phonemes in standard Korean. To achieve this goal, I analysed continuous texts selected from elementary school textbooks on a personal computer. The total number of syllables studied in this thesis is 101,637. The characteristics of the Korean syllable structures are as follows. 1) In a syllable head, /n/ occurs most frequently. 2) Syllables with an onset are much more frequent than those with no onset (85% : 15%). 3) In a syllable head, obstruents are preferred (57%) because their consonantal strength is great. 4) In a syllable nucleus, /a/ occurs most frequently. 5) The rate of occurrence of the monophthongs is 90.2%, and that of the diphthongs is 9.8%. In particular, the three basic vowels (/i, a, u/) occur at a rate of 46.6%. 6) In a syllable coda, /n/ occurs most frequently. 7) Open syllables are favored (open syllables 68.7%, closed syllables 31.3%).

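Syllable-position counts of the kind listed in the abstract above can be approximated with standard Unicode arithmetic on precomposed Hangul syllables. The sketch below is an assumption-laden illustration, not the study's method: it works on spelling rather than pronunciation, so sound changes across syllable boundaries are ignored, and it treats the written onset letter ㅇ as "no onset". The function names and the sample word are hypothetical.

```python
from collections import Counter

# Jamo tables in Unicode order for precomposed syllables (U+AC00..U+D7A3).
ONSETS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
NUCLEI = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
CODAS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals

def decompose(ch):
    """Split one precomposed Hangul syllable into (onset, nucleus, coda)."""
    idx = ord(ch) - 0xAC00
    if not 0 <= idx < 19 * 21 * 28:
        return None  # not a precomposed Hangul syllable
    return ONSETS[idx // 588], NUCLEI[(idx % 588) // 28], CODAS[idx % 28]

def syllable_stats(text):
    """Count orthographic onsets, nuclei, codas, and open/closed syllables."""
    onset, nucleus, coda = Counter(), Counter(), Counter()
    open_count = closed_count = 0
    for ch in text:
        parts = decompose(ch)
        if parts is None:
            continue
        o, n, c = parts
        if o != "ㅇ":  # written ㅇ in onset position marks an empty onset
            onset[o] += 1
        nucleus[n] += 1
        if c:
            coda[c] += 1
            closed_count += 1
        else:
            open_count += 1
    return {"onset": onset, "nucleus": nucleus, "coda": coda,
            "open": open_count, "closed": closed_count}

stats = syllable_stats("한국어")
print(stats["open"], stats["closed"], stats["onset"].most_common(1))
```

A phonemic analysis like the one in the study would first need a grapheme-to-phoneme conversion step, which is outside the scope of this sketch.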