Sequencing Pipelines


Sequencing Data Generation

The three Large Scale Sequencing/Analysis Centers (LSACs) that sequence ADSP data are the Human Genome Sequencing Center at Baylor College of Medicine, the Broad Institute, and the Genome Institute at Washington University. The three LSACs generated paired-end sequencing BAM files (currently available in dbGaP, phs000572) using the settings below.
The BAM files are collected from dbGaP and then called as project-level VCFs by the Broad Institute and Baylor College of Medicine (these are intermediate files and are not available to the public). The ADSP Quality Control Work Group combines the two project-level VCF datasets, performs QC and concordance checks, and produces an overall ADSP VCF file.
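As a rough illustration of what a concordance check between two project-level call sets involves, the sketch below compares genotypes for sample/variant pairs present in both sets. It is not the ADSP Quality Control Work Group's actual procedure: the `CallSet` structure, the function names, and the example calls are all hypothetical, and a real comparison would read VCF files and handle multi-allelic sites, missing calls, and filters.

```python
from typing import Dict, Tuple

# Key: (sample, chrom, pos, ref, alt); value: genotype string such as "0/1".
# This in-memory layout is an assumption for illustration, not an ADSP format.
CallSet = Dict[Tuple[str, str, int, str, str], str]


def normalize_gt(gt: str) -> Tuple[str, ...]:
    """Ignore phasing and allele order, so "1|0" and "0/1" compare equal."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def concordance(a: CallSet, b: CallSet) -> Tuple[int, int, float]:
    """Return (matching, shared, rate) over genotypes present in both sets."""
    shared = a.keys() & b.keys()
    matching = sum(1 for k in shared if normalize_gt(a[k]) == normalize_gt(b[k]))
    rate = matching / len(shared) if shared else float("nan")
    return matching, len(shared), rate


# Made-up example calls; not real ADSP data.
calls_a = {
    ("S1", "1", 1000, "A", "G"): "0/1",
    ("S1", "1", 2000, "C", "T"): "1/1",
    ("S2", "1", 1000, "A", "G"): "0/0",
}
calls_b = {
    ("S1", "1", 1000, "A", "G"): "1|0",
    ("S1", "1", 2000, "C", "T"): "0/1",
    ("S2", "2", 500, "G", "A"): "0/1",
}

matching, shared, rate = concordance(calls_a, calls_b)
print(matching, shared, rate)
```

Only genotypes called in both sets enter the rate; calls unique to one set would need to be reported separately.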

Sequencing Pipeline Tools and Parameters

Program              Baylor                          Broad                   WashU
CASAVA               1.8.3                           N/A                     1.8.2
Reference            GRCh37-lite                     GRCh37 (1kg version)    GRCh37-lite
Aligner              BWA 0.6.2                       BWA 0.5.9-tpx           BWA 0.5.9
Aligner Parameters   defaults; -t 8                  defaults; -t N -q 5     defaults; -t 4 -q 5
Sort/Dupe/Mates      Picard 1.93                     Picard (latest)         Picard 1.46
Merge                Picard 1.41                     Picard (latest)         Picard 1.46
GATK indels          v2.5-2; 1kG, Mills, dbSNP137    v2.6-14                 v2.4; 1kG, Mills, dbSNP137
GATK recal           v2.5-2                          v2.6-14                 v2.4
ReduceReads          no                              yes                     no
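To make the aligner parameters in the table above concrete, here is a minimal sketch that turns each center's flags into a BWA command line. The table does not name the BWA subcommand, so `bwa aln` is an assumption (in `bwa aln`, `-t` sets the thread count and `-q` the read-trimming quality threshold). The function name and file names are hypothetical, and Broad's thread count is listed only as "N", so it is left as a caller-supplied value.

```python
from typing import Dict, List

# Per-center aligner flags copied from the table.
# "{threads}" stands in for Broad's unspecified "-t N".
ALIGNER_FLAGS: Dict[str, List[str]] = {
    "Baylor": ["-t", "8"],
    "Broad": ["-t", "{threads}", "-q", "5"],
    "WashU": ["-t", "4", "-q", "5"],
}


def bwa_aln_command(center: str, reference: str, fastq: str, threads: int = 1) -> List[str]:
    """Build an argument list for `bwa aln` (assumed subcommand) for one center."""
    flags = [flag.format(threads=threads) for flag in ALIGNER_FLAGS[center]]
    return ["bwa", "aln", *flags, reference, fastq]


# Hypothetical file names; the command is only printed, not executed.
for center in ALIGNER_FLAGS:
    print(center, " ".join(bwa_aln_command(center, "ref.fa", "reads_1.fq", threads=16)))
```

Keeping the flags in one table-like mapping makes the differences between centers easy to audit, for example that Baylor's settings include no `-q` trimming threshold.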


Whole-Exome Target Regions

The Broad Institute used the Illumina Rapid Capture Exome (ICE) kit (download target regions).
Baylor and WashU used NimbleGen's VCRome v2.1 kit (download target regions).
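Because the centers used different capture kits, it can be useful to test whether a variant position falls inside a given kit's target regions. The sketch below assumes the target regions are distributed as BED files (0-based, half-open intervals), which the page does not state; the function names and example intervals are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Intervals = Dict[str, List[Tuple[int, int]]]


def load_targets(bed_lines: Iterable[str]) -> Intervals:
    """Parse BED lines into sorted, merged (start, end) intervals per chromosome."""
    raw = defaultdict(list)
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        chrom, start, end = line.split()[:3]
        raw[chrom].append((int(start), int(end)))
    merged: Intervals = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:  # overlapping or adjacent: extend the last interval
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def in_target(targets: Intervals, chrom: str, pos_1based: int) -> bool:
    """True if a 1-based position (as in VCF) lies inside a target interval."""
    intervals = targets.get(chrom, [])
    pos = pos_1based - 1  # convert to BED's 0-based coordinates
    i = bisect_right(intervals, (pos, float("inf"))) - 1  # last interval starting at or before pos
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


# Made-up intervals standing in for a kit's target file.
targets = load_targets(["1\t100\t200", "1\t150\t300", "2\t0\t10"])
print(in_target(targets, "1", 101), in_target(targets, "1", 301))
```

Merging overlapping intervals up front lets each lookup use a single binary search.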
