####################################################
# Readme file for multi-tissue pGWAS/pQTL analyses:
# Oct 2019
####################################################

####################################################
# 1) Overview
We aim to identify protein QTLs (pQTLs) in three tissues after QC. To do so, we test each variant genome-wide for association with protein level (log10-transformed) in an additive linear regression model, adjusting for sex, age, the first two genotype-based principal components (PCs), and genotyping platform (by type, e.g. omni1, omni2.5, NeuroDX). The model was fitted with PLINK2 (v2.00a2LM).

####################################################
# 2) File location
## 2a) tissue1_CSF/
### 2a_i) Protein name keys: ForBOX_CSF_afterQC_featureFile.csv
#### 16 variables in this file:
## reference: http://somalogic.com/wp-content/uploads/2017/07/SSM-066-04-SOMAscan-Assay-1.3k-Annotations-Column-Description.pdf
01)SOMAseqID: An in-house modified SOMAmer sequence identifier unique to a specific SOMAmer reagent.
02)SeqId: The SOMAmer sequence identifier unique to a specific SOMAmer reagent.
03)SomaId: The other SOMAmer identifier unique to a specific SOMAmer reagent, starting with SL.
04)TargetFullName: The full protein name associated with the target protein(s) used in the selection of the SOMAmer reagent.
05)Target: The short protein name associated with the target protein(s) used in the selection of the SOMAmer reagent.
06)UniProt: The specific UniProt ID used to pull in the subsequent annotation columns.
07)EntrezGeneID: Entrez gene ID(s) associated with this protein.
08)EntrezGeneSymbol: Entrez gene symbol(s) associated with this protein.
09)Organism: The source organism of the measured protein (Human/Human papillomavirus type 16/Human papillomavirus type 18/isolate BEN/isolate LW123).
10)Units: Relative fluorescent units (RFU).
11)Type: Protein or Non-Human or Rat Protein.
12)Dilution: Dilution group to which the SOMAmer reagent belongs in the SOMAscan Assay (as a percentage).
13)PlateScale.Reference: Reference value for the plate scaling normalization step.
14)CalReference: Reference value for the calibrator scaling normalization step.
15)ColCheck: SOMAscan official report metrics based upon scale factor.
16)Dilution2: Dilution group to which the SOMAmer reagent belongs in the SOMAscan Assay (as a decimal).

### 2a_ii) Summary statistics files labeled with SOMAseqID: *.glm.linear.gz
#### In total, 713 summary statistics files.
#### 12 variables in this file:
## reference: https://www.cog-genomics.org/plink/2.0/formats#glm_linear
01)#CHROM: Chromosome code
02)POS: Base-pair coordinate
03)ID: Variant ID
04)REF: Reference allele
05)ALT: All alternate alleles, comma-separated
06)A1: Counted allele in regression
07)TEST: Test identifier
08)OBS_CT: Number of samples in the regression
09)BETA: Regression coefficient (for A1 allele)
10)SE: Standard error of beta
11)T_STAT: t-statistic for linear regression
12)P: Asymptotic p-value for T_STAT

## 2b) tissue2_plasma/
### 2b_i) Protein name keys: ForBOX_plasma_afterQC_featureFile.csv
### 2b_ii) Summary statistics files labeled with SOMAseqID: *.glm.linear.gz
#### In total, 931 summary statistics files.

## 2c) tissue3_brain/
### 2c_i) Protein name keys: ForBOX_brain_afterQC_featureFile.csv
### 2c_ii) Summary statistics files labeled with SOMAseqID: *.glm.linear.gz
#### In total, 1079 summary statistics files.
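
####################################################
# Example: reading the files (illustrative sketch)
The two file types described above (the *.glm.linear.gz summary statistics and the *_featureFile.csv protein name keys) can be read with a few lines of Python. The sketch below is not part of the original pipeline and rests on assumptions: the summary statistics are tab-delimited with the 12-column header listed above, the additive genotype term is labeled ADD in the TEST column (PLINK2's usual label), and the feature file is a comma-separated file with a header row containing SOMAseqID and EntrezGeneSymbol. The function names read_glm_linear and read_feature_key are hypothetical, and the demo rows are made-up values, not real results.

```python
import csv
import gzip
import os
import tempfile

NUMERIC_COLUMNS = ("BETA", "SE", "P")


def read_glm_linear(path, test="ADD", p_max=None):
    """Return rows of one .glm.linear.gz file as a list of dicts.

    Keeps rows whose TEST column equals `test` (assumed "ADD" for the
    additive genotype term). Rows with "NA" in BETA, SE or P are skipped.
    When `p_max` is given, only rows with P <= p_max are kept.
    """
    kept = []
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            if row["TEST"] != test:
                continue
            if any(row[col] == "NA" for col in NUMERIC_COLUMNS):
                continue
            for col in NUMERIC_COLUMNS:
                row[col] = float(row[col])
            if p_max is not None and row["P"] > p_max:
                continue
            kept.append(row)
    return kept


def read_feature_key(path):
    """Map SOMAseqID -> EntrezGeneSymbol from a feature-file CSV.

    Assumes a comma-separated file with a header row.
    """
    with open(path, newline="") as handle:
        return {
            row["SOMAseqID"]: row["EntrezGeneSymbol"]
            for row in csv.DictReader(handle)
        }


if __name__ == "__main__":
    # Synthetic demo data (made-up values), written to a temporary directory.
    header = ["#CHROM", "POS", "ID", "REF", "ALT", "A1",
              "TEST", "OBS_CT", "BETA", "SE", "T_STAT", "P"]
    rows = [
        ["1", "1000", "rs_demo1", "A", "G", "G", "ADD", "100",
         "0.25", "0.05", "5.0", "1e-6"],
        ["1", "1000", "rs_demo1", "A", "G", "G", "AGE", "100",
         "0.01", "0.02", "0.5", "0.62"],
    ]
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.glm.linear.gz")
        with gzip.open(demo_path, "wt", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            writer.writerows([header] + rows)
        for hit in read_glm_linear(demo_path, p_max=1e-5):
            print(hit["ID"], hit["A1"], hit["BETA"], hit["SE"], hit["P"])
```

Filtering on TEST matters only if covariate rows were written to the output; this README does not say whether they were, so the filter is kept as a harmless safeguard. The exact file-name pattern that links a summary statistics file to its SOMAseqID is not spelled out here either, so the sketch leaves that lookup to the user.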