• Title/Summary/Keyword: Genome sequences

Characteristics of Microsatellites in the Transcript Sequences of the Laccaria bicolor Genome

  • Li, Shuxian;Zhang, Xinye;Yin, Tongming
    • Journal of Microbiology and Biotechnology / v.20 no.3 / pp.474-479 / 2010
  • In this paper, we analyzed the microsatellites in the transcript sequences of the whole Laccaria bicolor genome. Our results revealed that, apart from the triplet repeats, the length diversification and richness of the detected microsatellites correlated positively with their repeat motif lengths, a pattern distinct from the variation trends observed for transcriptional microsatellites in the genomes of higher plants. We also compared the microsatellites detected in the genic and nongenic regions of the L. bicolor genome. Subsequently, SSR primers were designed for the transcriptional microsatellites in the L. bicolor genome. These SSR primers provide desirable genetic resources to the ectomycorrhizae community, and this study provides deep insight into the characteristics of the microsatellite sequences in the L. bicolor genome.
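
As an illustration of what detecting microsatellites (SSRs) in transcript sequences can involve, the following is a minimal sketch that scans a sequence for perfect tandem repeats with a regular expression. The function name `find_microsatellites` and the minimum repeat counts are assumptions made for this example; the abstract does not state the authors' tool or thresholds.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_microsatellites(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(thresholds.items()):
        # One captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "ATAT"), so each locus is reported once.
            is_compound = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if is_compound:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "AT" * 7 + "TT" + "AGC" * 5 + "T"
    for start, motif, count in find_microsatellites(example):
        print(start, motif, count)
```

Grouping the hits by motif length would give the kind of per-class counts and length distributions that the abstract compares between repeat types.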

ChimerDB - Database of Chimeric Sequences in the GenBank

  • Kim, Namshin;Shin, Seokmin;Cho, Kwang-Hwi;Lee, Sanghyuk
    • Genomics & Informatics / v.2 no.2 / pp.61-66 / 2004
  • Fusion proteins resulting from chimeric sequences are excellent targets for therapeutic drug development. We developed a database of chimeric sequences by examining the genomic alignment of mRNA and EST sequences in GenBank. We identified 688 chimeric mRNA and 20,998 chimeric EST sequences. Including EST sequences greatly expands the scope of chimeric sequences, even though it inevitably introduces many artifacts. Chimeric sequences are clustered according to the ECgene ID so that the user can easily find chimeric sequences related to a specific gene. Alignments of chimeric sequences are displayed as custom tracks in the UCSC genome browser. ChimerDB, available at http://genome.ewha.ac.kr/ECgene/ChimerDB/, should be a valuable resource for finding drug targets to treat cancers.
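
To make the idea of finding chimeric transcripts from genomic alignments concrete, here is a minimal sketch that flags a transcript when consecutive aligned blocks land on different chromosomes or very far apart on the same one. The `Block` type, the function name, and the 1 Mb distance cutoff are all assumptions for illustration; the abstract does not describe ChimerDB's actual criteria.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """One aligned segment of a transcript (hypothetical record layout)."""
    q_start: int   # start on the transcript (query)
    q_end: int     # end on the transcript
    chrom: str     # chromosome the segment aligns to
    g_start: int   # genomic start
    g_end: int     # genomic end


def is_candidate_chimera(blocks: List[Block], max_gap: int = 1_000_000) -> bool:
    """Return True if adjacent blocks map to different or distant loci."""
    ordered = sorted(blocks, key=lambda b: b.q_start)
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom != right.chrom:
            return True
        # Genomic distance between the two segments (negative if they overlap).
        gap = max(left.g_start, right.g_start) - min(left.g_end, right.g_end)
        if gap > max_gap:  # assumed cutoff, not taken from the paper
            return True
    return False


if __name__ == "__main__":
    fused = [Block(0, 400, "chr9", 1000, 1400), Block(400, 900, "chr22", 5000, 5500)]
    normal = [Block(0, 400, "chr1", 1000, 1400), Block(400, 900, "chr1", 3000, 3500)]
    print(is_candidate_chimera(fused), is_candidate_chimera(normal))
```

A real pipeline would also need to filter the EST artifacts the abstract mentions, which this sketch does not attempt.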

Genome data mining for everyone

  • Lee, Gir-Won;Kim, Sang-Soo
    • BMB Reports / v.41 no.11 / pp.757-764 / 2008
  • The genomic sequences of a huge number of species have been determined. Typically, these genome sequences and the associated annotation data are accessed through Internet-based genome browsers that offer a user-friendly interface. Intelligent use of the data should expedite biological knowledge discovery. Such activity is collectively called data mining and involves queries that can be simple, complex, and even combinational. Various tools have been developed to make genome data mining available to computational and experimental biologists alike. In this mini-review, some tools that have proven successful will be introduced along with examples taken from published reports.

Identification of chromosomal translocation causing inactivation of the gene encoding anthocyanidin synthase in white pomegranate (Punica granatum L.) and development of a molecular marker for genotypic selection of fruit colors

  • Jeong, Hyeon-ju;Park, Moon-Young;Kim, Sunggil
    • Horticulture, Environment, and Biotechnology : HEB / v.59 no.6 / pp.857-864 / 2018
  • Previous studies have not detected transcripts of the gene encoding anthocyanidin synthase (ANS) in white pomegranates (Punica granatum L.) and suggest that a large insertion in the coding region of the ANS gene might be the causal mutation. To elucidate the identity of the putative insertion, 3887-bp 5' and 3392-bp 3' partial sequences of the insertion site were obtained by genome walking, and a gene coding for an expansin-like protein was identified in these genome-walked sequences. An identical protein (GenBank accession OWM71963) isolated from pomegranate was identified by a BLAST search. Based on the information for OWM71963, a 5.8-Mb scaffold sequence containing the genes coding for the expansin-like protein and ANS was identified. The scaffold sequence, assembled from a red pomegranate cultivar, also contained all genome-walked sequences. Analysis of the positions and orientations of these genes and genome-walked sequences revealed that the 27,786-bp region, including the 88-bp 5' partial sequences of the ANS gene, might be translocated into an approximately 22-kb upstream region in an inverted orientation. Borders of the translocated region were confirmed by PCR amplification and sequencing. Based on the translocation mutation, a simple PCR codominant marker was developed for efficient genotyping of the ANS gene. This molecular marker could serve as a useful tool for selecting desirable plants at young seedling stages in pomegranate breeding programs.

Functional Annotation and Analysis of Korean Patented Biological Sequences Using Bioinformatics

  • Lee, Byung Wook;Kim, Tae Hyung;Kim, Seon Kyu;Kim, Sang Soo;Ryu, Gee Chan;Bhak, Jong
    • Molecules and Cells / v.21 no.2 / pp.269-275 / 2006
  • A recent report of the Korean Intellectual Property Office (KIPO) showed that the number of biological sequence-based patents is rapidly increasing in Korea. We present biological features of Korean patented sequences through bioinformatic analysis. The analysis is divided into two steps. The first is an annotation step in which the patented sequences were annotated with the Reference Sequence (RefSeq) database. The second is an association step in which the patented sequences were linked to genes, diseases, pathways, and biological functions. We used the Entrez Gene, Online Mendelian Inheritance in Man (OMIM), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO) databases. Through the association analysis, we found that nearly 2.6% of human genes were associated with Korean patenting, compared to 20% of human genes in U.S. patents. The association between the biological functions and the patented sequences indicated that genes whose products act as hormones on defense responses in the extracellular environment were the most highly targeted for patenting. The analysis data are available at http://www.patome.net.

Divergent long-terminal-repeat retrotransposon families in the genome of Paragonimus westermani

  • Bae, Young-An;Kong, Yoon
    • Parasites, Hosts and Diseases / v.41 no.4 / pp.221-231 / 2003
  • To gain information on retrotransposons in the genome of Paragonimus westermani, PCR was carried out with degenerate primers, specific to protease and reverse transcriptase (rt) genes of long-terminal-repeat (LTR) retrotransposons. The PCR products were cloned and sequenced, after which 12 different retrotransposon-related sequences were isolated from the trematode genome. These showed various degrees of identity to the polyprotein of divergent retrotransposon families. A phylogenetic analysis demonstrated that these sequences could be classified into three different families of LTR retrotransposons, namely, Xena, Bel, and Gypsy families. Of these, two mRNA transcripts were detected by reverse transcriptase-PCR, showing that these two elements preserved their mobile activities. The genomic distributions of these two sequences were found to be highly repetitive. These results suggest that there are diverse retrotransposons including the ancient Xena family in the genome of P. westermani, which may have been involved in the evolution of the host genome.

Compiling Multicopy Single-Stranded DNA Sequences from Bacterial Genome Sequences

  • Yoo, Wonseok;Lim, Dongbin;Kim, Sangsoo
    • Genomics & Informatics / v.14 no.1 / pp.29-33 / 2016
  • A retron is a bacterial retroelement that encodes an RNA gene and a reverse transcriptase (RT). The former, once transcribed, works as a template primer for reverse transcription by the latter. The resulting DNA is covalently linked to the upstream part of the RNA; this chimera is called multicopy single-stranded DNA (msDNA), which is extrachromosomal DNA found in many bacterial species. Based on the conserved features in the eight known msDNA sequences, we developed a detection method and applied it to scan National Center for Biotechnology Information (NCBI) RefSeq bacterial genome sequences. Among 16,844 bacterial sequences possessing a retron-type RT domain, we identified 48 unique types of msDNA. Currently, the biological role of msDNA is not well understood. Our work will be a useful tool in studying the distribution, evolution, and physiological role of msDNA.

Patome: Database of Patented Bio-sequences

  • Kim, SeonKyu;Lee, ByungWook
    • Genomics & Informatics / v.3 no.3 / pp.94-97 / 2005
  • We have built a database server called Patome, which contains the annotation information for patented bio-sequences from the Korean Intellectual Property Office (KIPO). The aims of Patome are to annotate Korean patent bio-sequences and to provide information on the patent relationships of public database entries. The patent sequences were annotated with the Reference Sequence (RefSeq) or NCBI's nr database. The raw patent data and the annotated data were stored in the database. Annotation information can be used to determine whether a particular RefSeq ID or NCBI nr ID is related to a Korean patent. The Patome infrastructure consists of three components: the database itself, a sequence data loader, and an online database query interface. The database can be queried using submission number, organism, title, applicant name, or accession number. Patome can be accessed at http://www.patome.net. The information will be updated every two months.

Cloning of NotI-linked DNA Detected by Restriction Landmark Genomic Scanning of Human Genome

  • Kim Jeong-Hwan;Lee Kyung-Tae;Kim Hyung-Chul;Yang Jin-Ok;Hahn Yoon-Soo;Kim Sang-Soo;Kim Seon-Young;Yoo Hyang-Sook;Kim Yong-Sung
    • Genomics & Informatics / v.4 no.1 / pp.1-10 / 2006
  • Epigenetic alterations are common features of human solid tumors, though global DNA methylation has been difficult to assess. Restriction Landmark Genomic Scanning (RLGS) is one technology for examining epigenetic alterations at several thousand NotI sites in the promoter regions of a tumor genome. To obtain sequence information for the NotI sequences in an RLGS gel, we cloned 1,161 unique NotI-linked clones, comprising about 60% of the spots in the soluble region of the RLGS profile, and performed BLAT searches on the UCSC genome server, May 2004 Freeze. Of these, 1,023 (88%) unique sequences matched CpG islands of the human genome, showing a large bias of RLGS toward identifying potential genes or CpG islands. The cloned NotI loci occurred at a high frequency (71%) within CpG islands near the 5' ends of known genes rather than within CpG islands near the 3' ends or in intragenic regions, making RLGS a potent tool for the identification of gene-associated methylation events. By mixing RLGS gels with all NotI-linked clones, we addressed 151 NotI sequences onto a standard RLGS gel and compared them with previous reports from several types of tumors. We hope our sequence information will be useful for identifying novel epigenetic targets in any type of tumor genome.

NOGSEC: A NOnparametric method for Genome SEquence Clustering

  • 이영복;김판규;조환규
    • Korean Journal of Microbiology / v.39 no.2 / pp.67-75 / 2003
  • One large topic in comparative genomics is predicting functional annotation by classifying protein sequences. Computational approaches to function prediction include protein structure prediction, sequence alignment, and domain or binding site prediction. This paper concerns another computational approach: searching for sets of homologous sequences in a sequence similarity graph. Methods based on similarity graphs do not need prior knowledge about the sequences, but they depend largely on the researcher's subjective threshold settings. In this paper, we propose a genome sequence clustering method based on iterative testing and graph decomposition, together with a simple method for calculating a strict threshold that has biochemical meaning. The proposed method was applied to known bacterial genome sequences, and the results were compared with those of the BAG algorithm. The resulting clusters lack some completeness, but the confidence level is very high and the method does not need user-defined thresholds.
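
For context, the threshold-dependent baseline that the abstract contrasts itself with can be sketched as follows: keep only similarity edges at or above a user-chosen cutoff and report the connected components as clusters. This is not the NOGSEC procedure, whose iterative testing and threshold calculation are not detailed in the abstract; the function name and the input layout (a dict of pairwise scores) are assumptions for the example.

```python
from collections import defaultdict


def cluster_by_threshold(similarities, threshold):
    """Cluster sequence IDs as connected components of a thresholded graph.

    similarities: dict mapping (id_a, id_b) to a similarity score.
    threshold: user-defined cutoff (the subjective setting NOGSEC aims to avoid).
    """
    adjacency = defaultdict(set)
    nodes = set()
    for (a, b), score in similarities.items():
        nodes.update((a, b))
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    clusters = []
    for node in sorted(nodes):
        if node in seen:
            continue
        # Depth-first traversal collects one connected component.
        component = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        clusters.append(component)
    return clusters


if __name__ == "__main__":
    scores = {("p1", "p2"): 0.9, ("p2", "p3"): 0.8,
              ("p3", "p4"): 0.2, ("p4", "p5"): 0.95}
    print(cluster_by_threshold(scores, 0.5))
    print(cluster_by_threshold(scores, 0.1))
```

Running it with two different cutoffs shows the sensitivity the abstract describes: the same graph yields two clusters at 0.5 and a single cluster at 0.1.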