Bioinformatics, 2008-2009

(1) Bioinformatics is a new subject of genetic data collection, analysis and dissemination to the research community (Dr. Hwa A. Lim, 1987).

Bioinformatics refers to database-like activities, involving persistent sets of data that are maintained in a consistent state over essentially indefinite periods of time (Dr. Hwa A. Lim, 1994) (Luscombe, 2001).

(2) Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. The ultimate goal of the field is to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned. Biology in the 21st century is being transformed from a purely lab-based science to an information science as well.
(From NCBI's science primer, www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html)
Biology may be viewed as the study of transmission of information: from mother cell to daughter cell, from one cell or tissue type to another, from one generation to the next, and from one species to another. This informational viewpoint is termed bioinformatics (Eisenberg et al., 2006).

1952  Sanger
1955  Sanger
1962  50,000~100,000 /
1965  Margaret Dayhoff
1970  Needleman-Wunsch
1981  Smith-Waterman
1990  BLAST

Insulin Chain A, residues 8-10: ASV, TSI, AGV (Brown et al., 1955). Made by GeneDoc.
Dayhoff
1. 1978, Margaret Dayhoff: 34 protein superfamilies, 71 groups of closely related sequences
2. Sequences at least 85% identical; 1572 observed changes
3. PAM: Accepted Point Mutation
4. (Markov Model)
5. PAM1: ~1% of positions changed

1980s: DNA databases
1. 1974 George I. Bell, DNA; GenBank 1982~1992
2. 1980 EMBL
3. 1984 DDBJ
4. RefSeq
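The Markov-model item in the Dayhoff list above implies that a matrix for n PAM units can be obtained by raising the PAM1 transition matrix to the n-th power. The following is only a minimal sketch of that idea: it assumes a toy four-letter alphabet with uniform substitution probabilities rather than Dayhoff's 20x20 amino-acid data, and `toy_pam1`, `mat_mul` and `mat_pow` are hypothetical helper names.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(matrix, n):
    """Raise a square matrix to a non-negative integer power by repeated squaring."""
    size = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    base = [row[:] for row in matrix]
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result


def toy_pam1(alphabet_size=4, change=0.01):
    """Toy PAM1: each residue changes with probability ~1%, spread evenly
    over the other letters (an assumption, not Dayhoff's observed counts)."""
    off_diagonal = change / (alphabet_size - 1)
    return [[1.0 - change if i == j else off_diagonal for j in range(alphabet_size)]
            for i in range(alphabet_size)]


PAM1 = toy_pam1()
PAM250 = mat_pow(PAM1, 250)  # 250 successive PAM1 steps of the Markov chain

for row in PAM250:
    print(" ".join(f"{p:.3f}" for p in row))
```

Each row of the result is still a probability distribution, and the diagonal shrinks as n grows, which is why higher-numbered PAM matrices suit more distant sequence comparisons.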
1. Entrez, D. Lipman

http://www.ncbi.nlm.nih.gov/sites/gquery

1. 1970 Gibbs AJ, McIntyre GA
2. 1970 Needleman-Wunsch
3. 1981 Smith-Waterman
4. FASTA & BLAST
5. ClustalW/X, POA, MUSCLE

DNA
AGCTAGGA
GACTAGGC
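The two dynamic-programming entries in the list above, Needleman-Wunsch (global) and Smith-Waterman (local), differ mainly in whether a cell score may be reset to zero. The sketch below computes scores only, with no traceback. The scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration, and the example sequences are the short ones used on the alignment slides.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score: both sequences are aligned end to end."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # leading gaps in b
    for j in range(1, cols):
        score[0][j] = j * gap  # leading gaps in a
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment score: best-scoring pair of substrings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The extra 0 lets an alignment restart anywhere.
            score[i][j] = max(0, diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best


global_score = needleman_wunsch_score("GATCTA", "GATCA")
local_score = smith_waterman_score("ACGCCTG", "AGCCTGA")
print(global_score, local_score)
```

Under these assumed scores, the global alignment of GATCTA and GATCA pays for one gap, while the local score for ACGCCTG comes from the matching GCCTG stretch regardless of what flanks it.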
Needleman-Wunsch
GATCTA
GATCA

Global vs. local alignment

ACTGTTCCGAA 100kbp AGCCTGA 100kbp ACTACTG
ACGCCTG

ACTGTTCCGAA 100kbp AGCCTGA 100kbp ACTACTG
AC------GCC------TG

ACTGTTCCGAA 100kbp A-GCCTGA 100kbp ACTACTG
                   ACGCCTG

RNA
1. RNA
3. RNA

1. Ortholog ( ), Paralog ( )
3. (Neighbor Joining), (Maximum Parsimony), (Maximum Likelihood), (MCMC)

Ortholog vs. Paralog
Xenolog:
Experimentally very hard to answer.

[Figure: an ancestral gene; speciation gives orthologs, gene duplication gives paralogs.]

v-jun vs. human AP-1/c-jun
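The ortholog/paralog distinction above (speciation versus gene duplication from an ancestral gene) can be stated as a rule on a gene tree: two genes are orthologs if their last common ancestor node is a speciation, and paralogs if it is a duplication. The sketch below assumes the tree is already annotated with event labels. The tree shape and the gene names (`human_A`, `mouse_B`, and so on) are hypothetical and not taken from the slides.

```python
def leaves(node):
    """Return the gene names under a node; a leaf is a plain string."""
    if isinstance(node, str):
        return [node]
    _event, left, right = node
    return leaves(left) + leaves(right)


def classify_pairs(node, relations=None):
    """Label every gene pair by the event at their last common ancestor."""
    if relations is None:
        relations = {}
    if isinstance(node, str):
        return relations
    event, left, right = node
    label = "orthologs" if event == "speciation" else "paralogs"
    # Genes on opposite sides of this node have this node as their last common ancestor.
    for gene_a in leaves(left):
        for gene_b in leaves(right):
            relations[frozenset((gene_a, gene_b))] = label
    classify_pairs(left, relations)
    classify_pairs(right, relations)
    return relations


# Hypothetical tree: an ancestral gene duplicates, then both copies pass through a speciation.
gene_tree = ("duplication",
             ("speciation", "human_A", "mouse_A"),
             ("speciation", "human_B", "mouse_B"))

relations = classify_pairs(gene_tree)
for pair, label in sorted(relations.items(), key=lambda item: sorted(item[0])):
    print(" / ".join(sorted(pair)), "->", label)
```

Note that under this rule human_A and mouse_B are paralogs even though they come from different species, because the duplication predates the speciation.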
3. http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/milestones.html

2. Perl/Python, PHP + MySQL, Java
3. MP, NJ, ML

(1)
1. RNA: riboswitch RNAs
   (1) RNA?
   (2) <23,000 (~50,000) (alternative splicing)
   (3) RNA

MicroRNA
Riboswitch RNAs

RNA Processing & Alternative Splicing
[Figure: a pre-mRNA with exons, introns, an alternative exon, 5' cap, poly(A) signal and termination site; the steps shown are transcription & 5' capping, cleavage & polyadenylation, and splicing.]
Alternative splicing was regarded as a rare event (happening in < 5% of human genes) until the late 1990s.
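One way to see how alternative splicing lets a limited number of genes yield more transcripts is to count the isoforms of a gene with optional (cassette) exons. The sketch below is a simplification that assumes each alternative exon is included or skipped independently; the exon names and the `enumerate_isoforms` helper are hypothetical.

```python
from itertools import product


def enumerate_isoforms(exons):
    """exons: list of (name, is_alternative) in genomic order.
    Constitutive exons are always kept; alternative exons may be kept or skipped."""
    options = [(True, False) if is_alternative else (True,)
               for _name, is_alternative in exons]
    isoforms = []
    for keep_flags in product(*options):
        isoforms.append(tuple(name for (name, _alt), keep in zip(exons, keep_flags) if keep))
    return isoforms


# Hypothetical gene: five exons, two of them alternative.
gene = [("E1", False), ("E2", True), ("E3", False), ("E4", True), ("E5", False)]

isoforms = enumerate_isoforms(gene)
for isoform in isoforms:
    print("-".join(isoform))
```

With k independent alternative exons this gives 2^k isoforms, so even two cassette exons turn one gene into four possible mRNAs.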
[Figure: the TBC (Tre-2/Bub2/Cdc16) domain family, with members labeled TBC-1 to TBC-13.]

TBC, TBC-2

Gene        Swissprot ID  Chro.  Strand  Start     End
MmUsp6nl    Q80XC3        2      ++      6308329   6359011
HsUsp6nl    Q92738        10     +-      11544449  11609570
HsUSP6      P35125        17     ++      4974387   5016994
HsTbc1d3    Q8IZP1        17     +-      33541609  33550662
HsTbc1d3p2  Q6PD72        17     +-      57698795  57706258
HsMGC51025  Q86UD7        17     ++      15579388  15586064
FOXP2:

Out of Africa
1. ( ) 100,000~200,000
2. ~45,000

53 (16,587 bp)
(3)
3. ~10%
   (3) histone code

1. <-> SUMO
2. (thrombin), caspase

Sumoylation, Phosphorylation, Palmitoylation, Ubiquitination, Acetylation

Histone modification
Same genome but different epigenome

(4)
4. DNA -> RNA ->
   (3) Amyloid-like fibers
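The DNA -> RNA -> flow on the slide above can be illustrated with a small transcription and translation sketch. It assumes the final step lost from the slide is protein, it uses the standard genetic code, and the input sequence is a made-up coding strand rather than anything from the slides.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_dna):
    """Coding-strand DNA to mRNA: same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read codons from position 0 and stop at the first stop codon."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3].replace("U", "T")
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


mrna = transcribe("ATGGCCATTGTAATGGGCCGCTGA")  # hypothetical coding sequence
protein = translate(mrna)
print(mrna)
print(protein)
```

The sketch stops at the linear amino-acid chain; how that chain folds, or misfolds into amyloid-like fibers, is not something this code addresses.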
3. -> S/T, Y
4. HMM, SVM, Bayesian, ANN
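Assuming "S/T, Y" above refers to serine/threonine and tyrosine residues as candidate phosphorylation sites, a common first step before applying a classifier such as an HMM, SVM, Bayesian model or ANN is to cut a fixed-width sequence window around each candidate residue. The sketch below shows only that window extraction. The `candidate_sites` helper, the window size and the example peptide are hypothetical, and no prediction model is implied.

```python
def candidate_sites(protein, residues="STY", flank=3, pad="-"):
    """Return (1-based position, residue, window) for every S, T or Y.
    Windows are padded with `pad` at the sequence ends so all have equal length."""
    padded = pad * flank + protein + pad * flank
    sites = []
    for index, residue in enumerate(protein):
        if residue in residues:
            # In the padded string the residue sits at index + flank, the window's center.
            window = padded[index:index + 2 * flank + 1]
            sites.append((index + 1, residue, window))
    return sites


sites = candidate_sites("MKSPTAYR")  # hypothetical peptide
for position, residue, window in sites:
    print(position, residue, window)
```

Each window could then be encoded numerically and passed to whichever classifier is being evaluated.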