NG00128 - Proteomic profiling identified plasma biomarkers for SARS-CoV-2 infection and severity of COVID-19 patients

To access this data, please log into DSS and submit an application.
Within the application, add this dataset (accession NG00128) in the “Choose a Dataset” section.
Once approved, you will be able to log in and access the data within the DARM portal.

Description

The goal of WU350 cohort is to address the many complexities of the COVID-19 pandemic. Among the 332 COVID-19 cases, ~90% were symptomatic patients, 93.7% were hospitalized, 46.7% with ICU admission, 24.7% on ventilation, and 19.0% died due to COVID-19 (82 ventilated and 63 died; 44 of the deceased had been ventilated prior to death). COVID-19 patients were 59 years old on average, 58.7% men and 67.8% of African American ancestry.

A total of 150 age-, sex-, and race-matched non-COVID-19 samples were used as controls. Controls samples were collected from the Charles F. and Joanne Knight Alzheimer Disease Research Center (Knight-ADRC), at Washington University in St. Louis. The Knight-ADRC is one of 30 ADRCs funded by NIH. The goal of this collaborative research effort is to advance AD research with the ultimate goal of treatment or prevention of AD.

From the 482 individuals, peripheral blood was collected, and plasma was isolated by centrifuge and stored at -80⁰C. The proteomic data in plasma was measured using SomaScan v4.1 7K, a multiplexed, single-stranded DNA aptamer-based platform from SomaLogic (Boulder, CO). Instead of physical units, the readout in relative fluorescent units (RFU) was used to report the protein concentration targeted by 7,055 modified aptamers.

Additional information can be found on the websites:
https://neurogenomics.wustl.edu/
https://covid.proteomics.wustl.edu/

Sample Summary per Data Type

Sample Set	Accession	Data Type	Number of Samples
Knight ADRC & WU350	snd10038	Proteomics	482

Available Filesets

Fileset	Accession	Latest Release	Description
COVID19 - Proteomic and Phenotypic Data	fsa000034	NG00128.v1	Proteomic and Phenotypic Data

View the File Manifest for a full list of files released in this dataset.

COVID-19 cases (N=350) who presented with respiratory illness symptoms and had a physician-ordered positive SARS-CoV-2 test performed at the Barnes Jewish Hospital between 26 March 2020 and 28 August 2020 (Washington University 350 (WU350) cohort). Knight-ADRC cohort collects cognitive data, plasma, CSF and imaging to study the risk factors for Alzheimer’s disease. 150 age, sex and race matched Knight-ADRC cohort participants were used as COVID-19 Controls.

Sample Set	Accession	Number of Subjects
Knight ADRC & WU350	snd10038	482

Consent Level	Number of Subjects
DS-ADRD-IRB-PUB	482

Visit the Data Use Limitations page for definitions of the consent levels above.

Total number of approved DARs: 4

Investigator:
Cruchaga, Carlos
Institution:
Washington University School of Medicine
Project Title:
The Familial Alzheimer Sequencing (FASe) Project
Date of Approval:
March 28, 2023
Request status:
Expired
Research use statements:
Show statements
Technical Research Use Statement:
The goal of this study is to identify new genes and mutations that cause or increase risk for Alzheimer disease (AD), as well as protective factors. Individuals and families were selected from the Knight-ADRC (Washington University) and the NIA-LOAD study. Only families with at least three first-degree affected individuals were included. Families with pathogenic variants in the known AD or FTD genes, or in which APOE4 segregated with disease were excluded. At least two cases and one control were selected per family. Cases had an age at onset (AAO) after 65 yo and controls had a larger age at last assessment than the latest AAO within the family. Whole exome (WES) and whole genome sequencing (WGS) was generated for 1,235 individuals (285 families) that together with data from our collaborators and the ADSP family-based cohort (3,449 individuals and 757 families) will provide enough statistical power to identify new genes for AD. Dr. Tanzi (Harvard Medical School) will provide WGS from 400 families from the NIMH Alzheimer disease genetics initiative study. We will perform single variant and gene-based analyses to identify genes and variants that increase risk for disease in AD families. Single variant analysis will consist of a combination of association and segregation analyses. We will run family-based gene-based methods to identify genes that show and overall enrichment of variants in AD cases. We will also look for protective and modifier variants. To do this we will identify families loaded with AD cases, that also include individuals with a high burden of known risk variants but that do not develop the disease (escapees). We will use the sequence data and the family structure to identify variants that segregate with the escapee phenotype. The most promising variants and genes will be replicated in independent datasets (ADSP case-control, ADNI, Knight-ADRC, NIA-LOAD ). We will perform single variant and gene-based analyses to replicate the initial findings, and survival analysis to replicate the protective variants. We will select the most promising variants/genes for functional studies
Non-Technical Research Use Statement:
Family-based approaches led to the identification of disease-causing Alzheimer’s Disease (AD) variants in the genes encoding APP, PSEN1 and PSEN2. The identification of these genes led to the A?-cascade hypothesis and to the development of drugs that target this pathway. Recently, we have identified rare coding variants in TREM2, ABCA7, PLD3 and SORL1 with large effect sizes for risk for AD, confirming that rare coding variants play a role in the etiology of AD. In this proposal, we will identify rare risk and protective alleles using sequence data from families densely affected by AD. We hypothesize that these families are enriched for genetic risk factors. We already have sequence data from 695 families (2,462 individuals), that combined with the ADSP and the NIMH dataset will lead to a dataset of more than 1,042 families (4,684 individuals). Our preliminary results support the flexibility of this approach and strongly suggest that protective and risk variants with large effect size will be found, which will lead to a better understanding of the biology of the disease.
Investigator:
Greicius, Michael
Institution:
Stanford University School of Medicine
Project Title:
Examining Genetic Associations in Neurodegenerative Diseases
Date of Approval:
May 22, 2023
Request status:
Approved
Research use statements:
Show statements
Technical Research Use Statement:
We are studying the effects of rare (minor allele frequency < 5%) genetic variants on the risk of developing late-onset Alzheimer’s Disease (AD). We are interested in variants that have a protective effect in subjects who are at an increased genetic risk, or variants that lead to multiple dementias. Our aim is to identify any genetic variants that are present in the “case” group but not the “AD control” groups for both types of variants. The raw data we receive will be annotated to identify SNP locations and frequencies using existing databases such as 1,000 Genomes. We will filter the data based on genetic models such as compounded heterozygosity, recessive and dominant models to identify different types of variants.
Non-Technical Research Use Statement:
Current genetic understanding of Alzheimer’s Disease (AD) does not fully explain its heritability. The APOE4 allele is a well-established risk factor for the development of Alzheimer’s Disease (AD). However, some individuals who carry APOE4 remain cognitively healthy until advanced ages. Additionally, the cause of mixed dementia pathology development in individuals remains largely unexplained. We aim to identify genetic factors associated with these “protected” and mixed pathology phenotypes.
Investigator:
Pendergrass, Rion
Institution:
Genentech
Project Title:
Genetic Analyses Using Data from the Alzheimer’s Disease Sequencing Project (ADSP) and related studies
Date of Approval:
August 30, 2023
Request status:
Approved
Research use statements:
Show statements
Technical Research Use Statement:
The purpose of our study is to identify novel genetic factors associated with Alzheimer’s Disease, corticobasal degeneration (CBD) and progressive supranuclear palsy (PSP). This includes identifying genetic factors associated with the risk of these conditions, as well as genetic risk factors associated with age-at-onset (AAO) for these conditions. We will also evaluate genetic associations with sub-phenotypes individuals have within these broad disease categories, such as their Braak staging results which provide insights into the level of severity of Alzheimer’s. Thus we are requesting access to the set of genomic Whole Exome and Whole Genome Sequences (WES and WGS) have just been released through the National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site (DSS NIAGADS). The findings from our genetic association testing have the potential for identification of new therapeutic targets for Alzheimer's Disease, CBD, and PSP. The findings from our studies also have the potential for identification of genetic and phenotypic biomarkers that will be beneficial for subsetting patients in new ways. We will use standard genetic epidemiological methods to handle the WGS and WES data. We will also analyze cell type-specific expression differences in AD to identify biomarkers and disease pathways using standard gene expression analysis methods currently in use. We will also use other multi-omic and other genetic data that has now become available to further understand genetic association results we have found in AD.All data will remain anonymized and securely stored, and only those listed on our application and their staff will have access to these data. We will not share any of the individual level data outside of Genentech nor beyond the researchers on our application. We will adhere to all data use agreement stipulations through the DSS NIAGADS. We have a secure computational environment called Rosalind within Genentech where we will use these data. We have IT security staff that constantly monitor all our research computing, assuring safety and privacy of all of our stored data. We will not collaborate with researchers at other institutions.
Non-Technical Research Use Statement:
Genetic variation and gene expression data allows us to understand more of the genetic contribution to risk and protection from diseases such as Alzheimer’s and dementia. This information also allows us to identify important biological contributors to disease for developing effective treatment strategies, and identifying groups of individuals that would benefit most from new treatments. Our exploration of this relationship between genotype, disease traits, gene expression, and outcomes, through these datasets will allow us to pursue important new findings for disease treatment.
Investigator:
Safo, Sandra
Institution:
University of Minnesota
Project Title:
Innovative Machine and Deep Learning Analyses of Alzheimer's Disease Omics and Phenotypic Data
Date of Approval:
October 27, 2023
Request status:
Approved
Research use statements:
Show statements
Technical Research Use Statement:
AD is the most common cause of dementia and presents a substantial and increasing economic and social burden. Our ability to diagnose and classify AD from cognitive normals (CN), or discriminate among individuals with AD, early mild cognitive impairment [EMCI], or late mild cognitive impairment (LMCI), is essential for the prevention, diagnosis, and treatment of AD. Since individuals with MCI have a high chance of converting to AD, effectively discriminating between those who convert to AD (MCI-C) from those who do not convert (MCINC) is important for early diagnosis of AD. The heterogeneity of AD has motivated attempts to classify distinct subgroups of AD to better inform the underlying physiology. There is evidence to suggest that using data across multiple modalities (e.g. genetics, imaging, metabolomics) has potential to classify AD subgroups better than using single modality. We will apply machine and deep learning methods to gain deeper insight into AD and ADRD pathobiology. We will use datasets that include genomics, genetics, metabolomics, and phenotypic data for this purpose. Data will be divided into discovery and validation sets. On the discovery set, state-of-the-art ML and DL methods for integrative analysis that we and others have developed will be coupled with resampling techniques to determine candidate molecular signatures and pathways discriminating the AD groups considered. Molecular scores will be developed from these candidate biomarkers. The clinical utility of the scores beyond well-known clinical risk factors for AD will be ascertained. We will validate our findings using the validation data. We will visually and quantitatively compare the risk scores across several clinical variables and outcomes. We will use (un)supervised clustering methods to identify molecular clusters, and we will investigate molecular clusters differentiating MCI to AD converters from non-converters. We may explore differences across ethnic subgroups. We will also innovatively apply our multimodal molecular subtyping methods to discover, reproduce, and characterize novel molecular subgroups of AD– this will allow for better risk stratification.
Non-Technical Research Use Statement:
We have been developing novel machine learning (ML) and deep learning (DL) methods that leverage genomics, other omics (including proteomics and metabolomics), clinical and epidemiology data to better understand the pathogenesis of complex diseases. By integrating data from different sources, we have identified molecular signatures contributing to the risk of the development of complex diseases beyond established risk factors. We are proposing to innovatively apply these, and other existing, methods, to data pertaining to Alzheimer’s disease (AD) and Alzheimer’s disease related dementias (ADRD). A deeper understanding of the genes, genetic pathways, and other molecular signatures of AD is essential and could facilitate the identification of potential therapeutic targets for the disease.

Acknowledgment statement for any data distributed by NIAGADS:

Data for this study were prepared, archived, and distributed by the National Institute on Aging Alzheimer’s Disease Data Storage Site (NIAGADS) at the University of Pennsylvania (U24-AG041689), funded by the National Institute on Aging.

Use the study-specific acknowledgement statements below (as applicable):

For investigators using any data from this dataset:

Please cite/reference the use of NIAGADS data by including the accession NG00128.

For investigators using Knight ADRC & WU350 (sa000026) data:

Funding: This work was supported by grants from the National Institutes of Health (R01AG044546 (CC), P01AG003991(CC, JCM), RF1AG053303 (CC), RF1AG058501 (CC), U01AG058922 (CC), and R01AG057777 (OH)), and the Chuck Zuckerberg Initiative (CZI).
The recruitment and clinical characterization of research participants at Washington University were supported by NIH P30AG066444 (JCM), P01AG03991(JCM), and P01AG026276(JCM). O.H. is an Archer Foundation Research Scientist.
This work was supported by access to equipment made possible by the Hope Center for Neurological Disorders, the Neurogenomics and Informatics Center (NGI: https://neurogenomics.wustl.edu/)and the Departments of Neurology and Psychiatry at Washington University School of Medicine.
This study utilized samples obtained from the Washington University School of Medicine’s COVID-19 biorepository, which is supported by: the Barnes-Jewish Hospital Foundation; the Siteman Cancer Center grant P30 CA091842 from the National Cancer Institute of the National Institutes of Health; and the Washington University Institute of Clinical and Translational Sciences grant UL1TR002345 from the National Center for Advancing Translational Sciences of the National Institutes of Health. The content is solely the responsibility of the authors and does not necessarily represent the view of the NIH.

Plasma proteomics of SARS-CoV-2 infection and severity reveals impact on Alzheimer and coronary disease pathways. iScience. 2022.

NG00128 – Proteomic profiling identified plasma biomarkers for SARS-CoV-2 infection and severity of COVID-19 patients

Description

Data Available

Description

Sample Summary per Data Type

Available Filesets

Subject Information

Related Studies

Consent Levels

Approved Users

Acknowledgement

Acknowledgment statement for any data distributed by NIAGADS:

For investigators using any data from this dataset:

For investigators using Knight ADRC & WU350 (sa000026) data:

Related Publications

Cohorts