Overcoming the Current Limitations of Next-Generation Sequencing with New Methods for Local Assembly of Genomes and High-Specificity Rare Mutation Detection
The relatively low cost of Next-Generation Sequencing (NGS) has enabled researchers to generate large amounts of sequencing data in order to identify disease-causing mutations and to assemble simple genomes. However, NGS has inherent limitations due to its short DNA read lengths and high error rate. The short read lengths prevent the assembly of genomes with long stretches of repetitive DNA, and the high error rate prevents the accurate detection of rare mutations in heterogeneous populations such as tumors and microbiomes. I have co-developed new NGS methods to overcome these challenges. To increase the effective read length of NGS, short reads can be assembled locally and de novo into long contigs using Paired-End Restriction-site Associated DNA Sequencing (RAD-PE-Seq). With the RAD-PE method, I sequenced a stickleback fosmid and generated contigs with an N50 length of 480 nucleotides. To eliminate false-positive mutations caused by the high error rate of NGS, the Paired-End Low Error Sequencing (PELE-Seq) method was developed, which applies numerous quality control measures during sequencing library preparation and data analysis to remove sequencing errors. Control testing of PELE-Seq demonstrates that the method completely eliminates false-positive mutations at sequencing read depths below 20,000X coverage, compared to a ~20% false-positive rate obtained with previous methods. The high accuracy of PELE-Seq allows for the detection of ultra-rare mutations in a genome, which was previously impossible with NGS. This dissertation includes previously published and unpublished co-authored material.
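
The abstract reports contig quality as an N50 length. For readers unfamiliar with the statistic, the following is a minimal sketch of the standard N50 definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled length. The function name `n50` and the example contig lengths are illustrative assumptions, not data or code from the dissertation.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths (standard definition).

    Hypothetical helper for illustration; not taken from the dissertation.
    """
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")

    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembled length defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running sum always reaches half")


# Made-up contig lengths in nucleotides, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
example_n50 = n50(example_lengths)
```

With these made-up lengths the total is 1,500 nucleotides, and the two longest contigs (500 + 400) are the first to cover half of it, so the N50 is 400.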