• Title/Summary/Keyword: Large-scale database

GOMS: Large-scale ontology management system using graph databases

  • Lee, Chun-Hee;Kang, Dong-oh
    • ETRI Journal
    • /
    • v.44 no.5
    • /
    • pp.780-793
    • /
    • 2022
  • Large-scale ontology management is one of the main issues in putting ontology data to practical use. Although many approaches based on relational database management systems (RDBMSs) or object-oriented DBMSs (OODBMSs) have been proposed for building large-scale ontology management systems, they have several limitations because ontology data structures are intrinsically different from the traditional data structures of RDBMSs or OODBMSs. In addition, users have difficulty working with ontology data because many terms (ontology nodes) in large-scale ontology data match a given string keyword. Therefore, in this study, we propose a graph database-based ontology management system (GOMS) to manage large-scale ontology data efficiently. GOMS uses a graph DBMS and provides new query templates to help users find key concepts or instances. Furthermore, to run queries with multiple joins and path conditions efficiently, we propose GOMS encoding as a filtering tool and develop hash-based join processing algorithms in the graph DBMS. Finally, we show experimentally that GOMS can process various types of queries efficiently.
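
A minimal sketch of the hash-based join idea mentioned in the abstract: build a hash table on one edge list, then probe it with the other to answer a two-hop path query. The edge lists, labels, and query below are invented for illustration and are not the GOMS implementation.

    from collections import defaultdict

    def hash_join(left_edges, right_edges):
        """Join (src, dst) edge lists on left.dst == right.src."""
        # Build phase: hash one relation on the join key.
        buckets = defaultdict(list)
        for src, dst in left_edges:
            buckets[dst].append(src)
        # Probe phase: stream the other relation and emit matching two-hop paths.
        for mid, dst in right_edges:
            for src in buckets.get(mid, []):
                yield (src, mid, dst)

    # Hypothetical ontology edges: a -[subClassOf]-> b -[relatedTo]-> x
    sub_class_of = [("a", "b"), ("c", "b")]
    related_to = [("b", "x")]
    print(list(hash_join(sub_class_of, related_to)))  # [('a', 'b', 'x'), ('c', 'b', 'x')]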

The Speech Database for Large Scale Word Recognizer (Large scale word recognizer를 위한 음성 database - POW)

  • 임연자
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1995.06a
    • /
    • pp.291-294
    • /
    • 1995
  • This paper describes the POW algorithm and the POW set for a large scale word recognizer obtained by running the algorithm. To build a speech database for a large scale word recognizer, every possible phonological phenomenon must be included in the POW set. In addition, the distribution of phonological phenomena in the POW set should be similar to that of the population from which it is drawn. To this end, we propose a new algorithm for extracting a POW set with the following three properties: 1. It must include every phonological phenomenon that occurs in the population. 2. It must consist of a minimal set of words. 3. The distribution of phonological phenomena in the POW set must be similar to that of the population. We extracted the 5,000 most frequent word forms (eojeols) from a Korean text corpus of about three million eojeols and derived a Korean POW set from them.
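
The selection procedure described in the abstract resembles a covering problem; a hedged sketch of a greedy variant is below. The toy phenomenon labels are invented, and the paper's actual POW algorithm additionally matches the population's distribution of phenomena, which this sketch does not attempt.

    def greedy_pow(words, phenomena_of, population_phenomena):
        """Greedily pick words until every population phenomenon is covered."""
        uncovered = set(population_phenomena)
        selected = []
        while uncovered:
            # Take the word that covers the most still-uncovered phenomena.
            best = max(words, key=lambda w: len(phenomena_of[w] & uncovered))
            gained = phenomena_of[best] & uncovered
            if not gained:
                break  # nothing left that any remaining word can cover
            selected.append(best)
            uncovered -= gained
        return selected

    # Toy example with invented phenomenon labels.
    phenomena_of = {
        "w1": {"nasalization", "tensification"},
        "w2": {"liaison"},
        "w3": {"tensification"},
    }
    print(greedy_pow(list(phenomena_of), phenomena_of,
                     {"nasalization", "tensification", "liaison"}))  # ['w1', 'w2']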

An Efficient Face Recognition using Feature Filter and Subspace Projection Method

  • Lee, Minkyu;Choi, Jaesung;Lee, Sangyoun
    • Journal of International Society for Simulation Surgery
    • /
    • v.2 no.2
    • /
    • pp.64-66
    • /
    • 2015
  • Purpose: In this paper, we propose a cascaded feature filter and subspace projection method for rapid face recognition on large-scale, high-dimensional face databases. Materials and Methods: Relevant features are selected from the large feature set using the Fast Correlation-Based Filter method. After feature selection, the selected features are projected into a discriminant subspace using Principal Component Analysis or Linear Discriminant Analysis. This cascade reduces time complexity without significant degradation of performance. Results: In our experiments, the ORL database and the Extended Yale Face Database B were used for evaluation. On the ORL database, processing was approximately 30 times faster than the typical approach with a recognition rate of 94.22%, and on the Extended Yale Face Database B, processing was approximately 300 times faster with a recognition rate of 98.74%. Conclusion: The recognition rate and time complexity of the proposed method make it suitable for real-time face recognition on large-scale, high-dimensional face databases.
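
The cascade described in the abstract (filter-based feature selection, then subspace projection, then matching) can be sketched roughly as follows. FCBF is not available in scikit-learn, so a simple univariate filter stands in for it, and the arrays are random placeholders rather than ORL or Yale B images.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 1024))    # 200 "face images", 32x32 pixels flattened
    y = rng.integers(0, 10, size=200)   # 10 subjects (random placeholder labels)

    model = make_pipeline(
        SelectKBest(f_classif, k=200),       # stage 1: cheap feature filter (stand-in for FCBF)
        PCA(n_components=40),                # stage 2: project into a low-dimensional subspace
        KNeighborsClassifier(n_neighbors=1)  # stage 3: nearest-neighbour matching in the subspace
    )
    model.fit(X, y)
    print(model.predict(X[:5]))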

Development of the design methodology for large-scale database based on MongoDB

  • Lee, Jun-Ho;Joo, Kyung-Soo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.11
    • /
    • pp.57-63
    • /
    • 2017
  • The recent surge of big data is characterized by continuous generation, very large volume, and unstructured formats. Existing relational database technologies are inadequate for handling such big data because of their limited processing speed and the significant cost of expanding storage. Thus, big data processing technologies, normally based on distributed file systems, distributed database management, and parallel processing, have arisen as core technologies for implementing big data repositories. In this paper, we propose a design methodology for large-scale databases based on MongoDB, extending the information engineering methodology built on the E-R data model.
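
A hedged illustration of the central schema decision when an E-R model is carried over to MongoDB: represent a one-to-many relationship either by embedding child documents or by referencing a separate collection. The collections and fields below are invented examples, not the paper's methodology itself.

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # assumes a local MongoDB instance
    db = client["demo"]

    # Embedding: order items live inside the order document
    # (read together with the order, bounded in number).
    db.orders.insert_one({
        "_id": 1,
        "customer": "kim",
        "items": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}],
    })

    # Referencing: reviews grow without bound and are accessed on their own,
    # so they are stored separately and point back to the product.
    db.products.insert_one({"_id": "A-100", "name": "keyboard"})
    db.reviews.insert_one({"product_id": "A-100", "stars": 5, "text": "good"})

    print(db.orders.find_one({"customer": "kim"})["items"])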

An Experimental Study on the Warehouse Mock-up Fire Test (창고 모델 실물화재 특성에 대한 실험적 연구)

  • Kweon, Oh-Sang;Yoo, Yong-Ho;Kim, Heung-Youl
    • Fire Science and Engineering
    • /
    • v.24 no.4
    • /
    • pp.47-54
    • /
    • 2010
  • This study analyzes the damage caused by warehouse fires by building a fire characteristic database of combustibles and running a real-scale fire test on a warehouse mock-up. Fire tests of combustibles were carried out with a Room Corner Tester (RCT) to build a database for predicting the fire growth of stored goods. A mock-up (3 m × 3 m × 2.4 m) of a clothing warehouse was built, and a real-scale fire test was conducted with a Large Scale Calorimeter (LSC) based on the fire characteristic database. The mock-up was constructed from two types of sandwich panel (glass wool and EPS foam). In the mock-up tests, test 1 (glass wool sandwich panel) and test 2 (EPS foam sandwich panel) reached maximum heat release rates (HRR) of 5 MW and 11 MW, respectively.

DNS and Analysis on the Interscale Interactions of the Turbulent Flow past a Circular Cylinder for Large Eddy Simulation (원형 실린더를 지나는 난류 유동장의 직접수치해석과 큰 에디모사를 위한 스케일 간 상호작용 연구)

  • Kim, Taek-Keun;Park, No-Ma;Yoo, Jung-Yul
    • Proceedings of the KSME Conference
    • /
    • 2004.04a
    • /
    • pp.1801-1806
    • /
    • 2004
  • The stochastic nature of the subgrid-scale stress causes a predictability problem in large eddy simulation (LES), whereby the LES solution field decorrelates from the field obtained by filtering direct numerical simulation (DNS). To evaluate the predictability limit in an a priori sense, information on the interplay between the resolved scales and the subgrid scales (SGS) is required. In this study, the inter-scale interaction is analyzed by applying tophat and cutoff filters to a DNS database of flow over a circular cylinder at a Reynolds number of 3900. The effect of the filter shape on the interpretation of the correlation between scales is investigated, and a critique is given of the use of the tophat filter for SGS analysis with DNS databases. It is shown that the correlation between the Karman vortex and the SGS kinetic energy decreases drastically when the cutoff filter is used, which implies that small-scale universality holds even in the presence of the large-scale coherent structure.
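
The two filters compared in the abstract can be illustrated on a one-dimensional periodic signal: a tophat (box) filter applied in physical space versus a sharp spectral cutoff. This is only a toy numpy sketch; the filter width and cutoff wavenumber are arbitrary choices, not values from the paper.

    import numpy as np

    n = 256
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    u = np.sin(x) + 0.3 * np.sin(20.0 * x)   # a large scale plus a small scale

    # Tophat filter: periodic moving average of width w.
    w = 9
    padded = np.concatenate([u[-(w // 2):], u, u[:w // 2]])
    u_tophat = np.convolve(padded, np.ones(w) / w, mode="valid")

    # Cutoff filter: zero every Fourier mode above wavenumber k_c.
    k = np.fft.fftfreq(n, d=x[1] - x[0]) * 2.0 * np.pi
    u_hat = np.fft.fft(u)
    u_hat[np.abs(k) > 10.0] = 0.0
    u_cutoff = np.real(np.fft.ifft(u_hat))

    # The "subgrid" part is whatever each filter removes.
    print(np.std(u - u_tophat), np.std(u - u_cutoff))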

A Range Query Method using Index in Large-scale Database Systems (대규모 데이터베이스 시스템에서 인덱스를 이용한 범위 질의 방법)

  • Kim, Chi-Yeon
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.7 no.5
    • /
    • pp.1095-1101
    • /
    • 2012
  • As the amount of data increases explosively, large-scale database systems have emerged to store, retrieve, and manipulate it. In this environment there are several issues, such as consistency, availability, and fault tolerance. In this paper, we address an efficient range-query method for large-scale database systems in which data management services are separated from transaction management services. A previous study proposed using partitions to preserve the independence of the two modules and to resolve the phantom problem, but that method was efficient only when the range query was specified on a key. We therefore present a new method that improves efficiency when a range query is specified on the key attribute as well as on other attributes. The proposed method guarantees the independence of the separated modules and alleviates range-query overhead by using a partial index.
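
A toy sketch of the partition-plus-partial-index idea in the abstract: each partition keeps a small sorted index over its own rows, and a range query first prunes partitions whose key ranges cannot intersect the query interval. The partition layout and attribute below are assumptions for illustration, not the paper's algorithm.

    import bisect

    class Partition:
        def __init__(self, rows):
            self.rows = sorted(rows)               # (key, value) pairs sorted by key
            self.keys = [k for k, _ in self.rows]  # partial index local to this partition

        def range_scan(self, lo, hi):
            i = bisect.bisect_left(self.keys, lo)
            j = bisect.bisect_right(self.keys, hi)
            return self.rows[i:j]

    def range_query(partitions, lo, hi):
        out = []
        for p in partitions:
            # Prune partitions whose key range cannot overlap [lo, hi].
            if p.keys[-1] < lo or p.keys[0] > hi:
                continue
            out.extend(p.range_scan(lo, hi))
        return out

    parts = [Partition([(1, "a"), (4, "b")]), Partition([(9, "c"), (12, "d")])]
    print(range_query(parts, 3, 10))  # [(4, 'b'), (9, 'c')]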

Comparison of DBMS Performance for processing Small Scale Database (소용량 데이터베이스 처리를 위한 DBMS의 성능 비교)

  • Jang, Si-Woong
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2008.10a
    • /
    • pp.139-142
    • /
    • 2008
  • While many comparisons of DBMS performance for processing large-scale databases are available as benchmark results, there are few such comparisons for small-scale databases. Therefore, in this study, we compared and analyzed the performance of a commercial DBMS and public DBMSs on a small-scale database. The results show that while Oracle performs poorly on update and insert operations because of the rollback overhead it incurs for data safety, MySQL and MS-SQL perform well without this additional overhead.
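
Comparisons of this kind rest on micro-benchmarks of basic operations; below is a minimal sketch of such a timing loop. It uses sqlite3 purely so the sketch runs anywhere, whereas the paper itself compares Oracle, MySQL, and MS-SQL.

    import sqlite3
    import time

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")

    t0 = time.perf_counter()
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     [(i, f"row{i}") for i in range(10_000)])
    conn.commit()
    t1 = time.perf_counter()
    conn.executemany("UPDATE t SET val = ? WHERE id = ?",
                     [(f"new{i}", i) for i in range(10_000)])
    conn.commit()
    t2 = time.perf_counter()

    print(f"insert: {t1 - t0:.3f}s  update: {t2 - t1:.3f}s")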

Comparison of DBMS Performance for processing Small Scale Database (소용량 데이터베이스 처리를 위한 DBMS의 성능 비교)

  • Jang, Si-Woong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.11
    • /
    • pp.1999-2004
    • /
    • 2008
  • While many comparisons of DBMS performance for processing large-scale databases are available as benchmark results, there are few such comparisons for small-scale databases. Therefore, in this study, we compared and analyzed the performance of a commercial DBMS and public DBMSs on a small-scale database. The results show that while Oracle performs poorly on update and insert operations because of the rollback overhead it incurs for data safety, MySQL and MS-SQL perform well without this additional overhead.

A Data Mining Procedure for Unbalanced Binary Classification (불균형 이분 데이터 분류분석을 위한 데이터마이닝 절차)

  • Jung, Han-Na;Lee, Jeong-Hwa;Jun, Chi-Hyuck
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.36 no.1
    • /
    • pp.13-21
    • /
    • 2010
  • Predicting contract cancellation by customers is essential for insurance companies, but it is a difficult problem because the customer database is large and the target (cancelled) customers make up only a small proportion of it. This paper proposes a new data mining approach to binary classification that handles large-scale unbalanced data. Over-sampling, clustering, regularized logistic regression, and boosting are incorporated in the proposed approach. The approach was applied to a real data set from the insurance domain, and the results were compared with those of other classification techniques.
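
A hedged sketch of part of the pipeline the abstract lists: randomly over-sample the rare class, then fit a regularized logistic regression. The clustering and boosting steps are omitted and the data are synthetic, so this conveys only the flavour of the approach, not the paper's procedure.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    # Synthetic unbalanced data: ~3% positives stand in for cancelled customers.
    X, y = make_classification(n_samples=5000, n_features=20,
                               weights=[0.97, 0.03], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    # Naive random over-sampling of the minority class.
    rng = np.random.default_rng(0)
    minority = np.where(y_tr == 1)[0]
    extra = rng.choice(minority, size=(y_tr == 0).sum() - minority.size, replace=True)
    X_bal = np.vstack([X_tr, X_tr[extra]])
    y_bal = np.concatenate([y_tr, y_tr[extra]])

    # L2-regularized logistic regression on the balanced sample.
    clf = LogisticRegression(C=1.0, max_iter=1000).fit(X_bal, y_bal)
    print(classification_report(y_te, clf.predict(X_te)))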