Development and application of workbench for analysis and visualization of whole genome sequence

Jeong-Hyeon Choi, Hee Jeong Jin, Cheol Min Kim, Chul Hun L. Chang, Hwan Gue Cho

Research output: Contribution to journal › Article › peer-review

Abstract

An increasing number of genome sequencing projects has led to explosive growth in the number of whole genome sequences. Furthermore, the number of studies on the functions of individual genes has also increased rapidly. However, in-memory algorithms are not applicable to the analysis of whole genome sequences, since the size of an individual whole genome ranges from several million base pairs to hundreds of billions of base pairs. To manipulate such huge sequence data effectively, it is necessary to use an indexed data structure for external memory. In this paper, we introduce the development and application of a workbench for the analysis and visualization of whole genome sequences using the string B-tree, which is suitable for the analysis of huge data. The system consists of two main parts: the analysis query part and the visualization part. The query system supports various transactions such as pattern matching, k-occurrence, and k-mer analysis. The visualization system helps biologists easily understand whole genome structure and specificity through several viewers: a whole genome sequence viewer, an annotation viewer, a CGR (Chaos Game Representation) viewer, a k-mer viewer, an RWP (Random Walk Plot) viewer, and a map viewer. Using our workbench, we can find relationships among organisms, support gene prediction in a genome, and study the function of junk DNA. In this paper, we apply our workbench to investigating specific sequences such as avoided sequences, common sequences, and classifiable sequences.
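
To make the k-mer analysis, avoided-sequence, and CGR ideas mentioned in the abstract concrete, the following is a minimal in-memory sketch in Python. It is not the authors' implementation: the workbench described here relies on a string B-tree in external memory, whereas this example simply scans a short string. The function names (`kmer_counts`, `avoided_kmers`, `cgr_points`) are hypothetical, "avoided" is interpreted here as "k-mers that never occur", and the CGR corner assignment (A, C, G, T at the corners of the unit square) is the conventional one, assumed rather than taken from the paper.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_counts(seq, k):
    """Count every k-mer over A/C/G/T in seq (windows with other symbols are skipped)."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(base in ALPHABET for base in kmer):
            counts[kmer] += 1
    return counts


def avoided_kmers(seq, k):
    """Return the sorted k-mers that never occur in seq (assumed meaning of 'avoided')."""
    counts = kmer_counts(seq, k)
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return sorted(kmer for kmer in all_kmers if kmer not in counts)


def cgr_points(seq):
    """Chaos Game Representation: move halfway toward the corner of each base in turn."""
    # Conventional corner layout (an assumption, not stated in the abstract).
    corners = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in corners:
            continue  # skip ambiguous symbols such as N
        cx, cy = corners[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    example = "ACGTACGT"
    print(kmer_counts(example, 2))
    print(avoided_kmers(example, 2))
    print(cgr_points(example)[:3])
```

For genome-scale input, the counting step would have to be backed by an external-memory index such as the string B-tree named in the abstract; the sketch only illustrates what the queries compute.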

Original language: English (US)
Pages (from-to): 205-217
Number of pages: 13
Journal: Korean Journal of Genetics
Volume: 24
Issue number: 2
State: Published - Jun 2002
Externally published: Yes

Keywords

  • Avoided sequence
  • Chaos game representation
  • Classifiable sequence
  • Common sequence
  • Genome
  • Random walk plot
  • Sequence analysis
  • Workbench
  • k-mer analysis

ASJC Scopus subject areas

  • Genetics
