
Instructions for ADCs

Instructions for Access to ADSP hg37-Mapped Data through dbGaP for Subjects in your Alzheimer’s Disease Center (ADC)

Last updated 8/19/15



This document provides information on how to request access to Alzheimer’s Disease Sequencing Project (ADSP) data for subjects from your Alzheimer’s Disease Center (ADC). Information and instructions in this document are for your reference only, and do not guarantee approval to access or match individual ADSP subjects.  

Many samples included in the ADSP whole genome sequencing (WGS) and whole exome sequencing (WES) study were contributed by the National Institute on Aging (NIA) funded ADCs. The ADSP has released WGS data for 578 subjects and WES data for 10,914 subjects to the research community through dbGaP. Please see the ADSP website for more details on the study.

When applying for the data, every effort should be made to comply with the NIH Genomic Data Sharing Policy, taking special note of the following provision in the policy:

4.  Non-Identification

Approved Users agree not to use the requested datasets, either alone or in concert with any other information, to identify or contact individual participants from whom data and/or DNA samples were collected. This provision does not apply to research investigators operating with specific IRB approval, pursuant to 45 CFR 46, to contact individuals within datasets or to obtain and use identifying information under an IRB approved research protocol. All investigators conducting human subjects research within the scope of 45 CFR 46 must comply with the requirements contained therein.

Requests for ADSP data are reviewed by the ADSP Data Access Committee (DAC).  The relevant IRB(s) and NIH reviewing body have the final say on the approval of your research plan and dbGaP Data Access Request (DAR).

The relevant IRB approval should indicate that the issue of matching sequence data to subject data from your ADC has been taken into consideration, based upon the participants’ consent.  Specifically, the IRB approval face sheet and the DAR that investigators provide when they request ADSP data must state that the IRB has taken into account the issue of matching.  Applications not indicating this approval will be returned to applicants.

By following the procedure for matching sequence data to subject data as described in this document, you will ascertain data only for subjects from your own ADC. The information below applies to the ADSP WGS and WES and related data. If your research does not require matching of ADC subjects, please follow the Application Process for ADSP Sequencing Data page at the ADSP website to submit your dbGaP Data Access Request.

Specific information on the application process

  1. To access ADSP data, all investigators must submit a Data Access Request (DAR) to dbGaP for review. Only approved investigators will have access to subject-level ADSP data.
  2. To ensure proper de-identification of individual subjects in the study, each ADSP subject is assigned a unique ADSP ID.
  3. In order for ADCs to link ADSP data to their own Center’s clinical records, the ADSP data need to be matched. This is done using the original patient IDs (PTIDs) from the contributing ADC.
  4. The dbGaP Approved User Code of Conduct expects that investigators will obtain IRB approval in order to match the ADSP subjects from their Center (and only from their Center). The IRB approval face sheet and the DAR that investigators provide when they request ADSP data must state that the IRB has taken into account the issue of matching. Applications not indicating this approval will be returned to applicants.
  5. Use of ADSP data is restricted to the research activities outlined in the approved dbGaP DAR. Investigators wishing to do any additional studies with the data must apply separately for approval to use the data.
  6. The dbGaP Approved User Code of Conduct prohibits approved investigators from sharing genetic data from dbGaP except with the specific individuals who are listed in the same DAR; these are usually data users from the same institution as that signing the DAR. If you wish to add individuals to the study, a new application must be submitted.
  7. Neither the ADSP nor dbGaP can distribute modified ADSP data to the ADCs.
  8. Each individual ADC wanting to access ADSP data needs to extract its own subset from the full ADSP data set for the matching process.

Overview of the Required Steps

The rest of this document outlines steps required to access and match ADC subject data with ADSP sequence data.

All subjects in the ADSP dataset are assigned a unique ADSP subject ID that can be used for matching. The information for mapping ADSP IDs to the original patient IDs from the contributing ADCs is available from the National Alzheimer’s Coordinating Center (NACC). Please contact Duane Beekly at NACC to obtain this key. For each ADC, NACC will provide ID mapping information ONLY for subjects from that specific ADC.

Each ADC should follow the 5 steps below to access and match ADSP data to its subjects.

  1. Submit research protocol to your institutional IRB for approval.
  2. Obtain ADSP ADC Patient ID mapping information from NACC.
  3. Submit Data Access Request to dbGaP for approval.
  4. Specify subjects from the ADC and download the corresponding ADSP data from dbGaP via the ADSP portal.
  5. Identify ADC subjects by replacing ADSP IDs with corresponding ADC Patient IDs using mapping information from NACC.

Step 1. Prepare research protocol for IRB approval

ADSP sequence data are de-identified when they are released. Subjects cannot be matched without having explicit knowledge of the subject’s identity, such as mapping information for the original subjects or extensive genetic knowledge from the subject or his/her relatives. The dbGaP Genomic Data User Code of Conduct states that:

  1. Investigator(s) will use requested datasets solely in connection with the research project described in the approved Data Access Request for each dataset.
  2. Investigator(s) will make no attempt to identify or contact individual participants from whom these data were collected without appropriate approvals from the relevant IRBs.

If you plan to ascertain data for subjects whose DNA has been sequenced by the ADSP, you will need a full IRB review and explicit permission from the IRB to do so. If you only plan to work on de-identified data, an expedited review may be sufficient. The IRB from your institute has the authority to approve or deny your proposal.

What information should be included in your IRB protocol? A typical IRB protocol should describe the research plan, subjects being studied, and a benefit/risk analysis as justification of the research plan.

1. Which subjects sequenced by the ADSP have data that will be matched in this protocol?

If you follow the matching procedure described in this document, you will only match subjects from your own ADC.

2. How will the subjects be matched?

Your IRB protocol should describe the workflow to match relevant subjects in the ADSP data. The following paragraph describes the procedure planned by NIAGADS, NCRAD, and NACC and outlined in Steps 1, 4, and 5. If your workflow is different, you should edit or prepare your own description in your IRB protocol.

For XYZ study, we will receive the key to map ADSP IDs and SRA accession numbers to the original patient IDs (PTIDs) from the National Alzheimer’s Coordinating Center (NACC). We will receive the mapping key only for subjects recruited from our own ADC. The key has been prepared by the National Cell Repository for Alzheimer's Disease (NCRAD) and the NIA Genetics of Alzheimer's Disease Data Storage Site (NIAGADS). We will download ADSP individual level data from dbGaP, then use the mapping key to match subjects to our PTIDs in the sequencing data, genotype data, pedigrees, and phenotype data files. Only subjects from our ADC will be matched.

3. Why is matching necessary? What is the risk for the subjects who are being matched? (Benefit versus Risk of the research project)

The purpose of matching the ADSP subjects from your ADC is to link the ADSP data with clinical records on your subjects. If your research can be carried out without identification of the ADSP subjects, you should avoid matching the subjects altogether. If matching is necessary, you should provide a sound scientific justification for the proposed research activities.

What information should be provided by the IRB?

In addition to the standard information to be provided by the IRB for dbGaP applications, the IRB approval face sheet must state that the IRB has taken into account the issue of matching.

What about incidental findings? Incidental findings are genetic findings that are irrelevant to the purpose and scope of the proposed analysis. Sometimes these incidental findings may have clinical implications (e.g. genetic variants that confer a high predisposition to disease, such as certain BRCA1 mutations that may cause breast cancer) and/or ethical ramifications (e.g. previously unknown relatedness or non-relatedness to other individuals). The ADSP Memorandum of Understanding (MOU) states that incidental findings will not be revealed to the subjects.

Step 2. Obtain ADSP ID to ADC Patient ID mapping from NACC

The Sequence Read Archive (SRA) assigns a unique SRR (sequencing run) ID to each sample/sequencing experiment. Raw data (sequencing reads and alignment information) are stored in the SRA file format, which was developed by SRA to store sequencing data more efficiently. To download the raw sequencing data for a particular sample, you need to know the corresponding SRR ID for the sample.

NACC will provide each ADC with the ID mapping information for the subjects contributed by that specific ADC.

Mapping information can be downloaded from NACC using your center log-in information. Please contact Duane Beekly if you have questions. If you do not have a NACC web account, please send an email to NACC and ask for an ADSP account.

For each sample, the mapping information contains the following:

  1. The ADSP Subject ID for the subject being sequenced.
  2. The ADSP Sample ID for the DNA sample being sequenced. The Sample ID contains the corresponding Subject ID as part of the full ID text.
  3. The SRR run ID assigned to the sequencing data.
  4. The original Patient ID (PTID) assigned by the contributing ADC.
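
The four fields above lend themselves to a simple lookup table keyed by SRR ID. The following is a minimal sketch, assuming the key is delivered as a comma-separated file; the column names (`adsp_subject_id`, `adsp_sample_id`, `srr_id`, `ptid`) and the IDs in the example are hypothetical placeholders, not the actual NACC file layout. It also checks the property described in item 2, that each Sample ID contains its Subject ID.

```python
import csv
import io

# Hypothetical column names; the real key file from NACC may use different ones.
COLUMNS = ("adsp_subject_id", "adsp_sample_id", "srr_id", "ptid")


def load_mapping(handle):
    """Read mapping rows from an open CSV file and index them by SRR run ID."""
    by_srr = {}
    for row in csv.DictReader(handle):
        missing = [c for c in COLUMNS if not row.get(c)]
        if missing:
            raise ValueError(f"row is missing fields: {missing}")
        # The Sample ID is expected to contain the Subject ID as a substring.
        if row["adsp_subject_id"] not in row["adsp_sample_id"]:
            raise ValueError(f"sample/subject mismatch: {row['adsp_sample_id']}")
        if row["srr_id"] in by_srr:
            raise ValueError(f"duplicate SRR ID: {row['srr_id']}")
        by_srr[row["srr_id"]] = row
    return by_srr


# Made-up example data standing in for the key file.
example = io.StringIO(
    "adsp_subject_id,adsp_sample_id,srr_id,ptid\n"
    "SUBJ0001,SUBJ0001-S1,SRR0000001,PT-101\n"
    "SUBJ0002,SUBJ0002-S1,SRR0000002,PT-102\n"
)
mapping = load_mapping(example)
srr_to_ptid = {srr: row["ptid"] for srr, row in mapping.items()}
```

Indexing by SRR ID is one reasonable choice because the downloaded sequence files are named by SRR ID; an index by ADSP Subject ID can be built the same way for phenotype tables.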

Step 3. Apply for access to ADSP data at dbGaP

Please see the dbGaP instructions for preparing the Data Access Request (DAR).

  1. Make sure the research plan in dbGaP is the same as the protocol approved by your institutional IRB: use of ADSP data is restricted to the research activities outlined in the approved DAR.
  2. (Special requirement for ADSP data) The applicant should provide the following additional information in the application.
    • IRB approval in compliance with the NIH Genomic Data Sharing Policy. The Investigator must submit a current IRB approval for the proposed dbGaP project with the specific statement that the IRB has taken into account the issue of matching.
    • Derived/Secondary Data Return Plan. The Investigator must describe the derived/secondary data that will be returned to the NIA Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS). See Sample-Data-Return-NIAGADS (Word document).
    • National Institute on Aging (NIA) Genomic Data Sharing Plan. This document must be signed by the Investigator and his/her Institutional Signing Official. See NIA Genomics Sharing Policy (Word document).
    • The NIAGADS Data Distribution Agreement. This document must be signed by the Investigator and his/her Institutional Signing Official.

Step 4. Download ADSP data.

Once you have received approval from dbGaP, you can access ADSP data by logging in to dbGaP using your eRA Commons account. We recommend you use the ADSP portal to quickly specify and download ADC samples.

1. If you do not have an eRA Commons account, contact your department or institutional office for research to set up your eRA Commons account at NIH.

2. Go to the ADSP portal: go to the ADSP website, then click the green “Access Data” button at the bottom of the page.

3. The browser will display a popup message reminding you that you will be redirected to the NIH iTrust website for authentication. Click the “NIH Login” button, then log in using your eRA Commons account and password at the NIH iTrust webpage. Successful authentication will direct you back to the ADSP portal web page.

4. You can use the ADSP portal to select the samples you want to download. If you want to quickly download all samples from your ADC, follow these two steps:

   1. Make a text file with the SRR IDs corresponding to the samples you want to download. Each line in the text file contains exactly one SRR ID.

   2. Drag the text file and drop it onto the ADSP portal web page. The ADSP portal will automatically enter all SRR IDs into your shopping cart.
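
The text file of SRR IDs described above could be produced with a short script rather than by hand. This is a minimal sketch, assuming the SRR IDs have already been read from the NACC mapping information; the function name `write_srr_list`, the file name, and the IDs are hypothetical. It writes exactly one SRR ID per line, skipping blanks and duplicates.

```python
import tempfile
from pathlib import Path


def write_srr_list(srr_ids, path):
    """Write unique, non-empty SRR IDs to a text file, one per line.

    Returns the number of IDs written.
    """
    # dict.fromkeys removes duplicates while keeping the original order.
    cleaned = dict.fromkeys(s.strip() for s in srr_ids)
    unique = [s for s in cleaned if s]
    Path(path).write_text("".join(f"{s}\n" for s in unique))
    return len(unique)


# Demonstration in a temporary directory with made-up SRR IDs.
with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp) / "srr_ids.txt"
    count = write_srr_list(
        ["SRR0000001", " SRR0000002 ", "", "SRR0000001"], list_path
    )
    written = list_path.read_text()
```

In practice the output path would point to a location you can reach from the browser, so the file can be dragged onto the portal page.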

5. Select the shopping cart tab and click “Check out”. The ADSP portal will save your selection in a cart file.

6. Use SRA Toolkit to download the files specified in your shopping cart. See the SRA Toolkit documentation for instructions on how to set up SRA Toolkit and use your shopping cart file.

To learn more about how to use the ADSP data portal, select DATA PORTAL USER GUIDE from the DATA ACCESS menu on the ADSP Website.

Step 5. Match ADC subjects in ADSP data

BAM/SRA files

SRA files are named using the SRR IDs for sequenced subjects. You can rename the files using the ID mapping information from NACC via your ADC.
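
Renaming can be scripted once the mapping is loaded. The sketch below assumes the downloaded files sit in one directory and are named `<SRR ID>.<extension>`; the function names, file names, and IDs are hypothetical. It first builds a rename plan so the changes can be reviewed, then applies it, and it leaves files with no entry in the mapping untouched.

```python
import tempfile
from pathlib import Path


def plan_renames(directory, srr_to_ptid):
    """Pair each file named after a known SRR ID with its PTID-based name."""
    plan = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # Split at the first dot so "SRR0000001.sra" keeps its extension.
        srr, dot, rest = path.name.partition(".")
        ptid = srr_to_ptid.get(srr)
        if ptid is not None:
            plan.append((path, path.with_name(ptid + dot + rest)))
    return plan


def apply_renames(plan):
    """Rename files, refusing to overwrite an existing target."""
    for old, new in plan:
        if new.exists():
            raise FileExistsError(f"target already exists: {new}")
        old.rename(new)


# Demonstration in a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("SRR0000001.sra", "SRR0000009.sra"):
        (Path(tmp) / name).touch()
    plan = plan_renames(tmp, {"SRR0000001": "PT-101"})
    apply_renames(plan)
    names_after = sorted(p.name for p in Path(tmp).iterdir())
```

Separating the plan from its application makes it easy to print the pairs and check them against the key before any file is changed.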

VCF files

Currently VCF files for the ADSP WES data are not available. We will provide instructions when the VCF files become available.

Phenotype Files

Phenotype data from dbGaP are stored as tables, either as text files or as Excel spreadsheets. Column labels and coding schemes are available at dbGaP or the ADSP website. You can re-label the subjects in the phenotype data using the ID mapping information from NACC via your ADC.
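
For text-format phenotype tables, re-labeling amounts to replacing the value in the subject ID column. The following is a minimal sketch, assuming a tab-delimited table whose subject ID column is named `SUBJID`; the column name, the function name, and the data are assumptions for illustration, so check the actual column labels at dbGaP. Rows whose subject ID is not in the mapping (subjects from other ADCs) are left out of the output and reported back.

```python
import csv
import io


def relabel_subjects(src, dst, subject_to_ptid, id_column="SUBJID", delimiter="\t"):
    """Copy a phenotype table, replacing ADSP Subject IDs with PTIDs.

    Rows without a mapping entry are skipped; their IDs are returned.
    """
    reader = csv.DictReader(src, delimiter=delimiter)
    if id_column not in (reader.fieldnames or []):
        raise KeyError(f"column not found: {id_column}")
    writer = csv.DictWriter(
        dst, fieldnames=reader.fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    unmatched = []
    for row in reader:
        ptid = subject_to_ptid.get(row[id_column])
        if ptid is None:
            unmatched.append(row[id_column])
            continue
        row[id_column] = ptid
        writer.writerow(row)
    return unmatched


# Made-up table: one subject from this ADC, one from elsewhere.
table = io.StringIO("SUBJID\tAGE\nSUBJ0001\t80\nSUBJ0002\t75\n")
relabeled = io.StringIO()
skipped = relabel_subjects(table, relabeled, {"SUBJ0001": "PT-101"})
result = relabeled.getvalue()
```

Writing to a new file rather than editing in place keeps the original dbGaP download intact for reference.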
