Sequencing Pipelines

 

Sequencing Data Generation

Last updated 8.29.18
 
Sequencing for the Discovery and Extension Phases was conducted by three Large Scale Sequencing/Analysis Centers (LSACs): 1) the Human Genome Sequencing Center at Baylor College of Medicine, 2) Broad Institute, and 3) Genome Institute at Washington University. The three LSACs generated BAM files of paired-end reads mapped to build 37 using the settings in the table below. Discovery phase data mapped to build 37 are available by request through dbGaP (accession phs000572). The BAM files are collected from dbGaP, and variants are then called by Broad Institute and Baylor College of Medicine to produce project-level VCFs (these are intermediate files and are not available to the public). The ADSP Quality Control Work Group performs QC and concordance checks on the two project-level VCF datasets and combines them into an overall ADSP VCF file.
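To illustrate what a concordance check between two call sets can look like, here is a minimal sketch. It is not the ADSP Quality Control Work Group's actual procedure, which this page does not describe. The function name, the key layout (sample, chromosome, position), and the toy genotypes are all hypothetical and chosen only for illustration.

```python
def genotype_concordance(calls_a, calls_b):
    """Fraction of shared (sample, chrom, pos) keys with identical genotypes.

    Returns None when the two call sets share no keys.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None
    matches = sum(1 for key in shared if calls_a[key] == calls_b[key])
    return matches / len(shared)


# Hypothetical toy data standing in for two project-level call sets.
# Keys are (sample, chrom, pos); values are genotype strings.
calls_center_1 = {
    ("S1", "1", 1000): "0/1",
    ("S1", "1", 2000): "1/1",
    ("S2", "1", 1000): "0/0",
}
calls_center_2 = {
    ("S1", "1", 1000): "0/1",
    ("S1", "1", 2000): "0/1",  # the one discordant genotype
    ("S2", "1", 1000): "0/0",
}

rate = genotype_concordance(calls_center_1, calls_center_2)
print(f"Concordance over shared sites: {rate:.3f}")
```

In this toy example two of the three shared genotypes agree. A real check would also need to handle sites called in only one dataset, multi-allelic records, and missing genotypes, none of which are covered here.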
 
The Genome Center for Alzheimer's Disease (GCAD) remapped all Discovery and Extension phase data to build 38 using the settings in the table below. The whole-genome data are available by request through the NIAGADS Data Sharing Service (DSS). More information about the available data can be found on the DSS ADSP study page.
 
The GCAD Quality Control team, with assistance from the ADSP Quality Control Work Group, performs QC and concordance checks and generates a whole-genome project-level VCF file.
 

Sequencing Pipeline Tools and Parameters

 
| Program | Baylor | Broad | WashU | GCAD (UPenn) |
| --- | --- | --- | --- | --- |
| Data source | dbGaP, Build 37 | dbGaP, Build 37 | dbGaP, Build 37 | NIAGADS DSS, Build 38 |
| CASAVA | 1.8.3 | N/A | 1.8.2 | N/A |
| Reference | GRCh37-lite | GRCh37 (1kg version) | GRCh37-lite | GRCh38 1000G with HLA and Decoy genomes (GRCh38 with Alts) |
| Aligner | BWA 0.6.2 | BWA 0.5.9-tpx | BWA 0.5.9 | BWA 0.7.15 |
| Aligner Parameters | defaults; -t 8 | defaults; -t N -q 5 | defaults; -t 4 -q 5 | without -M, with -K 100000000 and -Y |
| Sort/Dupe/Mates | Picard 1.93 | Picard (latest) | Picard 1.46 | Picard 2.8.1 |
| Merge | Picard 1.41 | Picard (latest) | Picard 1.46 | Picard 2.8.1 |
| GATK indels | v2.5-2; 1kG, Mills, dbSNP137 | v2.6-14 | v2.4; 1kG, Mills, dbSNP137 | v3.7; 1kG, Mills, dbSNP138 |
| GATK recal | v2.5-2 | v2.6-14 | v2.4 | v3.7 |
| ReduceReads | no | yes | no | no |
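
The aligner parameters above can be read as command-line options. The sketch below records them per center and assembles an argument list without running anything. The BWA subcommands are an assumption: the table does not name them, but `-q` is a `bwa aln` option, while `-K`, `-Y`, and `-M` are `bwa mem` options. The function name and file names are hypothetical, and Broad's thread count is left as the literal placeholder `N` because the table does not give a value.

```python
# Aligner settings transcribed from the table above.
# ASSUMPTION: the "subcommand" values (aln / mem) are inferred from the
# options listed; the source table does not state them.
ALIGNER_SETTINGS = {
    "Baylor": {"bwa_version": "0.6.2", "subcommand": "aln",
               "options": ["-t", "8"]},
    "Broad": {"bwa_version": "0.5.9-tpx", "subcommand": "aln",
              "options": ["-t", "N", "-q", "5"]},  # "N" is unspecified in the table
    "WashU": {"bwa_version": "0.5.9", "subcommand": "aln",
              "options": ["-t", "4", "-q", "5"]},
    "GCAD": {"bwa_version": "0.7.15", "subcommand": "mem",
             "options": ["-K", "100000000", "-Y"]},  # note: no -M
}


def bwa_command(center, reference, inputs):
    """Return the argv list for one BWA alignment step (nothing is executed)."""
    settings = ALIGNER_SETTINGS[center]
    return ["bwa", settings["subcommand"], *settings["options"],
            reference, *inputs]


# Hypothetical file names, for illustration only.
print(" ".join(bwa_command("GCAD", "ref.fa", ["reads_1.fq", "reads_2.fq"])))
print(" ".join(bwa_command("WashU", "ref.fa", ["reads_1.fq"])))
```

This covers only the alignment call itself. Under the `bwa aln` assumption, a paired-end run would also need a separate pairing step to produce SAM output, and read-group options and output redirection are omitted.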

 

Whole-Exome Target Regions

 
Broad Institute used the Illumina Rapid Capture Exome (ICE) kit (target regions available for download).
Baylor and WashU used Nimblegen's VCRome v2.1 (target regions available for download).
 
