Bioinformatics tools for - University of Connecticut

Bioinformatics tools for viral quasispecies reconstruction from next-generation sequencing data and vaccine optimization

PD: Ion Măndoiu, UConn
Co-PDs: Mazhar Khan, UConn; Rachel O'Neill, UConn; Alex Zelikovsky, GSU

Outline
  Background & aims of the project
  Bioinformatics tools for quasispecies spectrum reconstruction from NGS reads
  Experimental validation on IBV data
  Summary and ongoing work

Infectious Bronchitis Virus (IBV)
  Group 3 coronavirus
  Biggest single cause of economic loss in US poultry farms
  Young chickens: coughing, tracheal rales, dyspnea
  Broiler chickens: reduced growth rate
  Layers: egg production drops 5-50%; thin-shelled eggs, watery albumen

  Worldwide distribution, with dozens of serotypes in circulation
  Co-infection with multiple serotypes is not uncommon, creating conditions for recombination
  [Figures: IBV-infected egg defects; IBV-infected embryo vs. normal embryo]

IBV Vaccination
  Broadly used, most commonly with attenuated live vaccine
  Short-lived protection
  Layers need to be re-vaccinated multiple times during their lifespan

  Vaccines might undergo selection in vivo and regain virulence [Hilt, Jackwood, and McKinley 2008]

RNA Virus Replication
  High mutation rate (~10^-4)
  Source: Lauring & Andino, PLoS Pathogens 2011

Evolution of IBV
  Quasispecies identified by cloning and Sanger sequencing in both IBV-infected poultry and commercial vaccines [Jackwood, Hilt, and Callison 2003; Hilt, Jackwood, and McKinley 2008]

How Are Quasispecies Contributing to Virus Persistence and Evolution?
  Variants differ in:
    Virulence
    Ability to escape immune response
    Resistance to antiviral therapies
    Tissue tropism
  Source: Lauring & Andino, PLoS Pathogens 2011

Project Aims
  Develop bioinformatics tools for accurate reconstruction of quasispecies sequences and their frequencies from next-generation reads
  Study quasispecies persistence and evolution of IBV in commercial layer flocks following vaccination
  Use results of this study to optimize vaccine development and vaccination protocols

Next Generation Sequencing

  Illumina HiSeq 2000: up to 6 billion PE reads/run; 35-100 bp read length
  Roche/454 FLX Titanium: 400-600 million bases/run; read length up to 1,000 bp
  Ion Torrent PGM: 1-10M reads/run; read length up to 400 bp
  SOLiD 4/5500: 1.4-2.4 billion PE reads/run; 35-50 bp read length
  http://www.economist.com/node/16349358

Shotgun vs. Amplicon Reads
  Shotgun reads: starting positions distributed ~uniformly
  Amplicon reads: predefined start/end positions covering fixed overlapping windows
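The contrast between the two read types can be illustrated with a small simulation. This is only a sketch under assumed values (a 1,000 bp region, 100 bp shotgun reads, 300 bp amplicons overlapping by 50 bp). The function names are hypothetical and none of these numbers come from the project.

```python
import random


def shotgun_reads(genome_len, read_len, n_reads, rng):
    """Shotgun: each read starts at a (roughly) uniformly random position."""
    last_start = genome_len - read_len
    starts = (rng.randint(0, last_start) for _ in range(n_reads))
    return [(s, s + read_len) for s in starts]


def amplicon_windows(genome_len, amp_len, overlap):
    """Amplicon: fixed windows; consecutive windows share `overlap` bases."""
    step = amp_len - overlap
    windows = []
    start = 0
    while start + amp_len < genome_len:
        windows.append((start, start + amp_len))
        start += step
    # The last window is placed flush with the end of the region,
    # so it may overlap its predecessor by more than `overlap`.
    windows.append((genome_len - amp_len, genome_len))
    return windows


def amplicon_reads(windows, n_reads, rng):
    """Every amplicon read spans exactly one of the predefined windows."""
    return [rng.choice(windows) for _ in range(n_reads)]


rng = random.Random(0)  # fixed seed so the run is repeatable
shotgun = shotgun_reads(1000, 100, 500, rng)
windows = amplicon_windows(1000, 300, 50)
amplicon = amplicon_reads(windows, 500, rng)

print(len({s for s, _ in shotgun}), "distinct shotgun start positions")
print(len({s for s, _ in amplicon}), "distinct amplicon start positions")
```

Shotgun reads spread over many start positions, while amplicon reads can only start where a window starts. This is why reads from neighboring amplicons can be linked only through the bases in their shared overlap.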

Reconstruction from Shotgun Reads: ViSpA
  Input: shotgun reads
  Flow:
    1. Read Error Correction
    2. Read Alignment
    3. Preprocessing of Aligned Reads
    4. Read Graph Construction
    5. Contig Assembly
    6. Frequency Estimation
  Output: quasispecies sequences w/ frequencies
  User-specified parameters:
    (A) Number of mismatches
    (B) Mutation rate

Reconstruction from Amplicon Reads: VirA
  Input: error-corrected read data (SAM/BAM); reference in FASTA format
  Flow:
    1. Estimate Amplicons
    2. Amplicon Read Graph
    3. Max-Bandwidth Paths
    4. Frequency Estimation
  Output: viral population variants with frequencies

Amplicon Sequencing Challenges
  Multiple reads from consecutive amplicons may match over their overlap
  Distinct quasispecies may be indistinguishable in an amplicon interval

IBV Genome
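Looking back at the "Max-Bandwidth Paths" step of the VirA flow: taking "bandwidth" in the usual graph sense, a max-bandwidth path is the source-to-sink path whose smallest edge weight is as large as possible. The sketch below uses a widest-path variant of Dijkstra's algorithm on a toy amplicon read graph. The graph layout, node names, and read-count weights are hypothetical, and this is not claimed to be how VirA itself computes these paths.

```python
import heapq


def max_bandwidth_path(graph, source, sink):
    """Return (bandwidth, path) maximizing the minimum edge weight.

    graph maps node -> list of (neighbor, weight) pairs.
    Returns (0, []) when the sink cannot be reached.
    """
    best = {source: float("inf")}  # best bottleneck found so far per node
    parent = {source: None}
    heap = [(-best[source], source)]  # max-heap via negated bandwidth
    while heap:
        neg_bw, node = heapq.heappop(heap)
        bw = -neg_bw
        if bw < best.get(node, 0):
            continue  # stale entry: a wider path to this node is known
        if node == sink:
            break
        for nxt, weight in graph.get(node, []):
            cand = min(bw, weight)  # bottleneck if we extend through nxt
            if cand > best.get(nxt, 0):
                best[nxt] = cand
                parent[nxt] = node
                heapq.heappush(heap, (-cand, nxt))
    if sink not in best:
        return 0, []
    path = []
    node = sink
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[sink], path[::-1]


# Hypothetical graph: three consecutive amplicons, two distinct reads each.
# An edge weight stands for read support linking two overlapping reads.
toy_graph = {
    "source": [("a1_v1", 60), ("a1_v2", 40)],
    "a1_v1": [("a2_v1", 55), ("a2_v2", 5)],
    "a1_v2": [("a2_v2", 35)],
    "a2_v1": [("a3_v1", 50)],
    "a2_v2": [("a3_v2", 30)],
    "a3_v1": [("sink", 50)],
    "a3_v2": [("sink", 30)],
}

bandwidth, path = max_bandwidth_path(toy_graph, "source", "sink")
print(bandwidth, path)
```

In this toy graph the best-supported chain runs through the first read of each amplicon, and its weakest link sets the bandwidth. The second challenge listed on the slide still applies: when distinct quasispecies are identical within an amplicon, no path choice can tell them apart there.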

  [Figure source: Rev. Bras. Cienc. Avic. vol.12 no.2, Campinas, Apr./June 2010]

RT-PCR of S1 using redesigned primers

Experiment 1
  M42 sample; 53 plasmid clones; 10-clone pool
  Clone pool composition: C1 20%, C2 20%, C3 15%, C4 15%, C5 10%, C6 10%, C7 4%, C8 4%, C9 1%, C10 1%
  454 reads (M42 sample) -> assembled quasispecies V1, V2, V3, ..., Vn
  454 reads (clone pool) -> assembled quasispecies PV1, PV2, PV3, ..., PVk

Evaluated Reconstruction Flows

Reads Statistics & Coverage
  Number of reads
  Sample           Uncorrected  SAET corrected  ShoRAH corrected  KEC corrected
  M42 isolate            53062           53062             50858          48945
  M42 clone pool         21040           21040             19439          17122

Reads Validation

  How well we predicted Sanger clones
  How well our prediction is
  Average Prediction Error

Neighbor-Joining Tree for M42 Sanger Clones & ViSpA Quasispecies

Experiment 2

Reads Statistics & Coverage
  Number of reads
  Sample           Uncorrected  SAET corrected  ShoRAH corrected  KEC corrected
  M41 Vaccine            92113           92113             87883          85311
  Field #1               38502           38502             33685          32521
  Field #2              132513          132513            123370         111686
  Field #3               76906           76906             71408          64507
  Field #4               44467           44467             41653          37295

Neighbor-Joining Tree for Sanger clones and ViSpA Reconstructed Sequences

Summary
  Developed software tools for quasispecies reconstruction from both shotgun and amplicon next-generation reads
  Code and executables freely available at
    http://alla.cs.gsu.edu/~software/VISPA/vispa.html
    http://alan.cs.gsu.edu/vira/
  ViSpA plugin developed for users of Ion Torrent, available on the Ion Community

  Experimental results on both simulated and real data show improved accuracy tradeoffs compared to previous methods
  Tools are applicable to quasispecies studies of other viruses

Ongoing Work
  Deployment of ViSpA and VirA on Galaxy servers maintained at UConn and GSU
  Tool validation on Ion Torrent reads
  Comparison of shotgun and amplicon based reconstruction methods
  Combining long and short read technologies
  Quasispecies persistence studies using longitudinal sampling

Tool Validation for Ion Torrent reads

  Shotgun IBV reads generated using an Ion 316 chip
  2,384,007 reads (1,177,740 after SAET correction)
  Mean length 203.58 bp
  ViSpA results: 23 quasispecies with estimated frequency > 0.5%, 2,200 total

Longitudinal Sampling
  Amplicon / shotgun sequencing
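As a quick check on the Ion Torrent validation figures above, the share of reads that survive SAET correction and the approximate sequencing yield can be worked out directly. This is a minimal sketch: the helper name is hypothetical, and it assumes the reported mean length of 203.58 bp refers to the uncorrected reads, which the slide does not state.

```python
def retained_fraction(before, after):
    """Fraction of reads kept by a correction step."""
    if before <= 0:
        raise ValueError("read count before correction must be positive")
    return after / before


# Counts reported on the slide.
raw_reads = 2_384_007
saet_corrected_reads = 1_177_740
mean_length_bp = 203.58  # assumption: measured on the uncorrected reads

fraction_kept = retained_fraction(raw_reads, saet_corrected_reads)
approx_total_bases = raw_reads * mean_length_bp

print(f"reads kept after SAET correction: {fraction_kept:.1%}")
print(f"approximate sequenced bases: {approx_total_bases / 1e6:.1f} Mb")
```

Roughly half of the reads are kept here, whereas the 454 tables report identical uncorrected and SAET-corrected counts.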

Contributors
  Bassam Tork, Ekaterina Nenastyeva, Alex Artyomenko, Serghei Mangul, Nicholas Mancuso, Alexander Zelikovsky
  University of Maryland: Irina Astrovskaya, Ph.D.
  University of Connecticut: Rachel O'Neill, Ph.D.; Mazhar Khan, Ph.D.; Hongjun Wang, Ph.D.; Craig Obergfell; Andrew Bligh
