DNA sequencing and analysis using next-generation sequencing (NGS) has become a powerful tool in biomedical research. It has given us the ability to understand and diagnose genetic disorders, investigate genetic predispositions to cancer, and explore the relationship between genotype and phenotype. Advances in technology now make it possible to use sequencing as a rapid means of diagnosing illness and formulating a treatment. However, data analysis and interpretation remain a significant challenge in this field.

The WAVES™ DNA sequencing Analysis Kit supports whole-genome, exome, and targeted resequencing data produced by NGS systems. Our variant analysis pipeline allows you to identify the genetic variation present in your samples and explore that variation with an intuitive variant filtering tool. The pipeline begins with quality assessment, trimming, and filtering of the data, followed by genome alignment. Variants are then identified, evaluated, and annotated. Results can be filtered and displayed in the platform, viewed within the UCSC Genome Browser, or exported as a table or in VCF format.

Maverix Analytic Platform analysis overview for DNA-seq Analysis

Analysis Configuration

Analyzing DNA sequencing data on WAVES™ starts with uploading and describing your data. To simplify the input of your sample information, a template can be created in our system and filled out in a program like Microsoft Excel or a basic text editor. You can easily classify hundreds or thousands of samples with unique labels and identifiers, and add phenotypic or other non-genomic information to each sample.
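As a rough illustration of what a filled-out sample template might look like once it is saved as tab-delimited text, the sketch below reads such a file and checks that every sample has a unique identifier. The column names (sample_id, label, phenotype) are hypothetical and are not the actual WAVES™ template fields.

```python
import csv
import io


def load_sample_sheet(handle):
    """Read a tab-delimited sample sheet into {sample_id: {column: value}}.

    Assumes a header row with a 'sample_id' column (a hypothetical name).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    samples = {}
    for row in reader:
        sample_id = (row.get("sample_id") or "").strip()
        if not sample_id:
            raise ValueError("every row needs a sample_id")
        if sample_id in samples:
            raise ValueError(f"duplicate sample_id: {sample_id}")
        # Keep all other columns (labels, phenotypes, ...) as free-form text.
        samples[sample_id] = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None and key != "sample_id"
        }
    return samples


# Invented example content, standing in for a file exported from a spreadsheet.
example_sheet = (
    "sample_id\tlabel\tphenotype\n"
    "S1\ttumor\taffected\n"
    "S2\tnormal\tunaffected\n"
)
samples = load_sample_sheet(io.StringIO(example_sheet))
```

Rejecting duplicate identifiers up front keeps later per-sample results from being silently merged.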

Quality Assessment

Sequencing data is assessed for quality and adapter sequence content. Raw reads are first trimmed to remove adapters and low-quality bases and then filtered by length. FastQC is used to assess the data both before and after trimming and filtering.
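The trimming and filtering steps can be sketched in a few lines of Python. This is a simplified illustration, not the trimming tool used on the platform: it assumes Phred+33 quality encoding, exact adapter matches only, and arbitrary example thresholds (min_q and min_len).

```python
def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx], qual[:idx]
    return seq, qual


def trim_low_quality_3prime(seq, qual, min_q=20):
    """Remove bases from the 3' end until one meets the quality threshold."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def process_read(seq, qual, adapter, min_q=20, min_len=30):
    """Trim adapter, trim low-quality 3' bases, then apply the length filter.

    Returns the trimmed (seq, qual) pair, or None if the read is too short.
    """
    seq, qual = trim_adapter(seq, qual, adapter)
    seq, qual = trim_low_quality_3prime(seq, qual, min_q)
    if len(seq) < min_len:
        return None
    return seq, qual


# Invented read: 10 bases followed by a commonly used Illumina adapter prefix.
ADAPTER = "AGATCGGAAGAGC"
read_seq = "ACGTACGTAC" + ADAPTER
read_qual = "I" * 8 + "#" * 2 + "I" * len(ADAPTER)

trimmed = process_read(read_seq, read_qual, ADAPTER, min_q=20, min_len=5)
discarded = process_read(read_seq, read_qual, ADAPTER, min_q=20, min_len=30)
```

Here the adapter is removed first, the two low-quality bases at the new 3' end are trimmed, and the read is kept or discarded depending on the minimum length chosen.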

Read Alignment

Reads are mapped to the reference genome using BWA-MEM. The resulting BAM files can be downloaded or viewed in the UCSC Genome Browser.
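For readers who want to reproduce a comparable alignment step outside the platform, the sketch below builds a typical paired-end BWA-MEM command and pipes it into samtools for sorting. The use of samtools, the thread count, and all file names are assumptions for illustration; the parameters used by WAVES™ are not stated here. The reference must already be indexed with bwa index.

```python
import shlex


def build_alignment_commands(reference, reads_1, reads_2, out_bam, threads=4):
    """Return argument lists for 'bwa mem' and 'samtools sort'.

    Nothing is executed here; the lists can be passed to subprocess.Popen.
    """
    bwa_cmd = ["bwa", "mem", "-t", str(threads), reference, reads_1, reads_2]
    # '-' tells samtools sort to read the SAM stream from standard input.
    sort_cmd = ["samtools", "sort", "-@", str(threads), "-o", out_bam, "-"]
    return bwa_cmd, sort_cmd


def as_shell_pipeline(bwa_cmd, sort_cmd):
    """Render the two commands as a single, safely quoted shell pipeline."""
    return " | ".join(shlex.join(cmd) for cmd in (bwa_cmd, sort_cmd))


# Hypothetical file names.
bwa_cmd, sort_cmd = build_alignment_commands(
    "ref.fa", "sample_R1.fq.gz", "sample_R2.fq.gz", "sample.sorted.bam"
)
pipeline = as_shell_pipeline(bwa_cmd, sort_cmd)
```

Building the commands as lists rather than a hand-written string avoids quoting problems when file names contain spaces.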

Variant Calling and Annotation

SNPs and indels are identified using one of four variant callers (GATK, FreeBayes, Platypus, or VarScan), and a VCF file is produced. SnpEff is used to predict the potential effect of each variant on gene function, and the VCF file is annotated with this information and the dbSNP ID. Conservation scores, allele frequencies from large public studies, and presence in disease-related databases are also available via the filtering tool on the platform.
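An exported, SnpEff-annotated VCF can also be filtered outside the platform. The sketch below is a minimal example, not the platform's filtering tool: it reads the standard VCF columns, pulls the effect, impact, and gene name from the SnpEff ANN field, and keeps variants above a quality threshold with a HIGH or MODERATE predicted impact. The example records, gene names, and thresholds are invented for illustration.

```python
HIGH_PRIORITY = {"HIGH", "MODERATE"}


def parse_info(info):
    """Split a VCF INFO string into a dict; flags without '=' map to ''."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def parse_variant(line):
    """Parse one VCF data line, keeping the first three ANN sub-fields of interest."""
    chrom, pos, vid, ref, alt, qual, _filter, info = line.rstrip("\n").split("\t")[:8]
    effects = []
    # ANN entries are comma-separated; sub-fields are '|'-separated:
    # Allele | Annotation | Annotation_Impact | Gene_Name | ...
    for entry in parse_info(info).get("ANN", "").split(","):
        parts = entry.split("|")
        if len(parts) >= 4:
            effects.append({"effect": parts[1], "impact": parts[2], "gene": parts[3]})
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,  # dbSNP ID (rs number) when known, '.' otherwise
        "ref": ref,
        "alt": alt,
        "qual": None if qual == "." else float(qual),
        "effects": effects,
    }


def keep_variant(variant, min_qual=30.0, impacts=HIGH_PRIORITY):
    """Keep variants that pass the quality cutoff and have a prioritized impact."""
    if variant["qual"] is None or variant["qual"] < min_qual:
        return False
    return any(e["impact"] in impacts for e in variant["effects"])


# Invented records for illustration only.
example_lines = [
    "chr1\t1000\trs0000001\tG\tA\t87.5\tPASS\tDP=40;ANN=A|missense_variant|MODERATE|GENE1|GENE1|transcript|TX1|protein_coding",
    "chr1\t2000\t.\tC\tT\t55.0\tPASS\tDP=32;ANN=T|synonymous_variant|LOW|GENE1|GENE1|transcript|TX1|protein_coding",
    "chr2\t3000\t.\tT\tTA\t12.0\tPASS\tDP=9;ANN=TA|frameshift_variant|HIGH|GENE2|GENE2|transcript|TX2|protein_coding",
]
kept = [v for v in map(parse_variant, example_lines) if keep_variant(v)]
```

With these example records, only the first variant is kept: the second has a LOW predicted impact, and the third falls below the quality cutoff.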