Fragment Combination From DNA Sequence Data Using Fuzzy Reasoning Method

퍼지 추론기법을 이용한 DNA 염기 서열의 단편결합

  • 김광백 (Department of Computer Engineering, Silla University)
  • 박현정 (Division of Architecture, Silla University)
  • Published : 2006.12.30


In this paper, we propose a method that addresses a defect of conventional contig assembly programs: failure to combine DNA fragments. In the proposed method, a very long DNA sequence is divided into prototype fragments of about 700 bases, the length an automatic sequence analyzer can read at one time, and a matching ratio is calculated by comparing a standard prototype with 3 fragmented clones of about 700 bases generated by the PCR method. The Compute Agreement algorithm reduces the time needed to calculate the matching ratio. For every prototype, two candidate fragments for combination are extracted according to the degree of overlap of the calculated fragment pairs, and the degree of combination is then decided by a fuzzy reasoning method that uses the matching ratio of each extracted fragment, the A, C, G, T membership degrees of each DNA sequence, and the previous frequencies of A, C, G, and T. The DNA sequence combination is completed by repeatedly combining the selected optimal test fragments until no fragment remains. For the experiments, fragments of about 700 bases were generated from sequences of 10,000 bases and 100,000 bases extracted from 'PCC6803', a complete protein genome. In experiments applying random notations to these fragments, the proposed method was faster than the FAP program, and combination failure, the defect of conventional contig assembly programs, did not occur.
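
The overlap-and-combine loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the Compute Agreement algorithm or the fuzzy rule base, so this sketch substitutes a plain suffix/prefix matching ratio and a greedy choice of the longest acceptable overlap. The function names (`matching_ratio`, `best_overlap`, `greedy_assemble`) and the thresholds `min_overlap` and `min_ratio` are hypothetical, chosen only for illustration.

```python
def matching_ratio(left: str, right: str, overlap: int) -> float:
    """Fraction of agreeing bases when the last `overlap` bases of `left`
    are aligned with the first `overlap` bases of `right`."""
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter fragment length")
    tail, head = left[-overlap:], right[:overlap]
    agree = sum(a == b for a, b in zip(tail, head))
    return agree / overlap


def best_overlap(left: str, right: str, min_overlap: int = 3, min_ratio: float = 0.9):
    """Return (overlap_length, ratio) for the longest suffix/prefix overlap
    whose matching ratio reaches `min_ratio`, or (0, 0.0) if none does.
    Both thresholds are assumptions, not values from the paper."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        ratio = matching_ratio(left, right, k)
        if ratio >= min_ratio:
            return k, ratio
    return 0, 0.0


def greedy_assemble(fragments, min_overlap: int = 3, min_ratio: float = 0.9):
    """Repeatedly merge the pair with the longest acceptable overlap until
    no pair can be combined. A greedy stand-in for the fuzzy decision step."""
    contigs = list(fragments)
    while len(contigs) > 1:
        best = None  # (overlap_length, ratio, left_index, right_index)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                k, r = best_overlap(a, b, min_overlap, min_ratio)
                if k and (best is None or (k, r) > best[:2]):
                    best = (k, r, i, j)
        if best is None:
            break  # remaining fragments share no acceptable overlap
        k, _, i, j = best
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (i, j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Toy fragments cut from one short sequence; real fragments are about 700 bases.
    print(greedy_assemble(["AGGCTA", "ACGTTGCA", "TGCAAGGC"], min_ratio=1.0))
```

In the method the abstract describes, the choice between candidate fragments would be made by fuzzy reasoning over the matching ratios, the base membership degrees, and the previous base frequencies, rather than by the simple longest-overlap rule used here.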

