Last updated: 2023-10-13

Checks: 6 1

Knit directory: cTWAS_analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown is untracked by Git. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20211220) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 68eacf2. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .ipynb_checkpoints/

Untracked files:
    Untracked:  LDL_LDLR_locus1.pdf
    Untracked:  LDL_TEME199_genetrack.pdf
    Untracked:  LDL_TEME199_locus.pdf
    Untracked:  Proposal plots.R
    Untracked:  RGS14.pdf
    Untracked:  RNF186.pdf
    Untracked:  Rplots.pdf
    Untracked:  SCZ_annotation.xlsx
    Untracked:  SLC8B1.pdf
    Untracked:  analysis/.ipynb_checkpoints/
    Untracked:  analysis/OxHB.Rmd
    Untracked:  analysis/simulation_3tissues_correlated.Rmd
    Untracked:  analysis/simulation_3tissues_uncorrelated.Rmd
    Untracked:  cache/
    Untracked:  code/.ipynb_checkpoints/
    Untracked:  code/OxHb_out/OxHb_Kidney_Cortex.err
    Untracked:  code/OxHb_out/OxHb_Kidney_Cortex.out
    Untracked:  code/OxHb_out/OxHb_Liver.err
    Untracked:  code/OxHb_out/OxHb_Liver.out
    Untracked:  code/OxHb_out/OxHb_Pancreas.err
    Untracked:  code/OxHb_out/OxHb_Pancreas.out
    Untracked:  code/OxHb_out/OxHb_Whole_Blood.err
    Untracked:  code/OxHb_out/OxHb_Whole_Blood.out
    Untracked:  data/.ipynb_checkpoints/
    Untracked:  data/FUMA_output/
    Untracked:  data/GO_Terms/
    Untracked:  data/GTEx_Analysis_v8_eQTL.tar
    Untracked:  data/G_list.RData
    Untracked:  data/IBD_ME/
    Untracked:  data/LB/
    Untracked:  data/LDL/
    Untracked:  data/LDL_E_S/
    Untracked:  data/LDL_E_S_M/
    Untracked:  data/LDL_M/
    Untracked:  data/LDL_S/
    Untracked:  data/OxHb/
    Untracked:  data/PGC3_SCZ_wave3_public.v2.tsv
    Untracked:  data/Predictive_Models/
    Untracked:  data/Supplementary Table 15 - MAGMA.xlsx
    Untracked:  data/Supplementary Table 20 - Prioritised Genes.xlsx
    Untracked:  data/UKBB/
    Untracked:  data/UKBB_SNPs_Info.text
    Untracked:  data/WhiteBlood_E/
    Untracked:  data/WhiteBlood_E_M/
    Untracked:  data/WhiteBlood_E_S_M/
    Untracked:  data/WhiteBlood_E_S_M_PC/
    Untracked:  data/WhiteBlood_M/
    Untracked:  data/WhiteBlood_M_compare/
    Untracked:  data/WhiteBlood_M_enet/
    Untracked:  data/cpg_annot.RData
    Untracked:  data/eqtl/
    Untracked:  data/gencode.v26.GRCh38.genes.gtf
    Untracked:  data/gene_OMIM.txt
    Untracked:  data/gene_pip_0.8.txt
    Untracked:  data/gwas_sumstats/
    Untracked:  data/magma.genes.out
    Untracked:  data/mashr_Heart_Atrial_Appendage.db
    Untracked:  data/mashr_sqtl/
    Untracked:  data/mqtl/
    Untracked:  data/notes.txt
    Untracked:  data/scz_2018.RDS
    Untracked:  data/summary_known_genes_annotations.xlsx
    Untracked:  data/test/
    Untracked:  hist.pdf
    Untracked:  ld_pip.pdf
    Untracked:  submit.sh
    Untracked:  temp_LDR/
    Untracked:  test-B1.snpgwas.txt

Unstaged changes:
    Deleted:    analysis/Atrial_Fibrillation_Heart_Atrial_Appendage.Rmd
    Deleted:    analysis/Atrial_Fibrillation_Heart_Left_Ventricle.Rmd
    Deleted:    analysis/Autism_Brain_Amygdala.Rmd
    Deleted:    analysis/Autism_Brain_Anterior_cingulate_cortex_BA24.Rmd
    Deleted:    analysis/Autism_Brain_Caudate_basal_ganglia.Rmd
    Deleted:    analysis/Autism_Brain_Cerebellar_Hemisphere.Rmd
    Deleted:    analysis/Autism_Brain_Cerebellum.Rmd
    Deleted:    analysis/Autism_Brain_Cortex.Rmd
    Deleted:    analysis/Autism_Brain_Frontal_Cortex_BA9.Rmd
    Deleted:    analysis/Autism_Brain_Hippocampus.Rmd
    Deleted:    analysis/Autism_Brain_Hypothalamus.Rmd
    Deleted:    analysis/Autism_Brain_Nucleus_accumbens_basal_ganglia.Rmd
    Deleted:    analysis/Autism_Brain_Putamen_basal_ganglia.Rmd
    Deleted:    analysis/Autism_Brain_Spinal_cord_cervical_c-1.Rmd
    Deleted:    analysis/Autism_Brain_Substantia_nigra.Rmd
    Deleted:    analysis/BMI_Brain_Amygdala.Rmd
    Deleted:    analysis/BMI_Brain_Amygdala_S.Rmd
    Deleted:    analysis/BMI_Brain_Anterior_cingulate_cortex_BA24.Rmd
    Deleted:    analysis/BMI_Brain_Anterior_cingulate_cortex_BA24_S.Rmd
    Deleted:    analysis/BMI_Brain_Caudate_basal_ganglia.Rmd
    Deleted:    analysis/BMI_Brain_Caudate_basal_ganglia_S.Rmd
    Deleted:    analysis/BMI_Brain_Cerebellar_Hemisphere.Rmd
    Deleted:    analysis/BMI_Brain_Cerebellar_Hemisphere_S.Rmd
    Deleted:    analysis/BMI_Brain_Cerebellum.Rmd
    Deleted:    analysis/BMI_Brain_Cerebellum_S.Rmd
    Deleted:    analysis/BMI_Brain_Cortex.Rmd
    Deleted:    analysis/BMI_Brain_Cortex_S.Rmd
    Deleted:    analysis/BMI_Brain_Frontal_Cortex_BA9.Rmd
    Deleted:    analysis/BMI_Brain_Frontal_Cortex_BA9_S.Rmd
    Deleted:    analysis/BMI_Brain_Hippocampus.Rmd
    Deleted:    analysis/BMI_Brain_Hippocampus_S.Rmd
    Deleted:    analysis/BMI_Brain_Hypothalamus.Rmd
    Deleted:    analysis/BMI_Brain_Hypothalamus_S.Rmd
    Deleted:    analysis/BMI_Brain_Nucleus_accumbens_basal_ganglia.Rmd
    Deleted:    analysis/BMI_Brain_Nucleus_accumbens_basal_ganglia_S.Rmd
    Deleted:    analysis/BMI_Brain_Putamen_basal_ganglia.Rmd
    Deleted:    analysis/BMI_Brain_Putamen_basal_ganglia_S.Rmd
    Deleted:    analysis/BMI_Brain_Spinal_cord_cervical_c-1.Rmd
    Deleted:    analysis/BMI_Brain_Spinal_cord_cervical_c-1_S.Rmd
    Deleted:    analysis/BMI_Brain_Substantia_nigra.Rmd
    Deleted:    analysis/BMI_Brain_Substantia_nigra_S.Rmd
    Deleted:    analysis/BMI_S_results.Rmd
    Deleted:    analysis/Glucose_Adipose_Subcutaneous.Rmd
    Deleted:    analysis/Glucose_Adipose_Visceral_Omentum.Rmd
    Modified:   analysis/index.Rmd
    Deleted:    code/OxHb_out/OxHb_Artery_Aorta.err
    Deleted:    code/OxHb_out/OxHb_Artery_Aorta.out
    Modified:   code/OxHb_out/OxHb_Artery_Coronary.err
    Modified:   code/OxHb_out/OxHb_Artery_Coronary.out
    Modified:   code/OxHb_out/OxHb_Artery_Tibial.out
    Modified:   code/OxHb_out/OxHb_Heart_Atrial_Appendage.err
    Modified:   code/OxHb_out/OxHb_Heart_Atrial_Appendage.out
    Modified:   code/OxHb_out/OxHb_Heart_Left_Ventricle.out
    Modified:   code/OxHb_out/OxHb_Lung.err
    Modified:   code/OxHb_out/OxHb_Lung.out
    Deleted:    code/WhiteBlood_M_ener_out/WhiteBlood_WholeBlood.out
    Deleted:    code/White_Blood_M_out/White_Blood_BreastMammary.err
    Deleted:    code/White_Blood_M_out/White_Blood_BreastMammary.out
    Deleted:    code/White_Blood_M_out/White_Blood_ColonTransverse.err
    Deleted:    code/White_Blood_M_out/White_Blood_ColonTransverse.out
    Deleted:    code/White_Blood_M_out/White_Blood_KidneyCortex.err
    Deleted:    code/White_Blood_M_out/White_Blood_KidneyCortex.out
    Deleted:    code/White_Blood_M_out/White_Blood_Lung.err
    Deleted:    code/White_Blood_M_out/White_Blood_Lung.out
    Deleted:    code/White_Blood_M_out/White_Blood_MuscleSkeletal.err
    Deleted:    code/White_Blood_M_out/White_Blood_MuscleSkeletal.out
    Deleted:    code/White_Blood_M_out/White_Blood_Ovary.err
    Deleted:    code/White_Blood_M_out/White_Blood_Ovary.out
    Deleted:    code/White_Blood_M_out/White_Blood_Prostate.err
    Deleted:    code/White_Blood_M_out/White_Blood_Prostate.out
    Deleted:    code/White_Blood_M_out/White_Blood_Testis.err
    Deleted:    code/White_Blood_M_out/White_Blood_Testis.out
    Deleted:    code/White_Blood_M_out/White_Blood_WholeBlood.err
    Deleted:    code/White_Blood_M_out/White_Blood_WholeBlood.out
    Deleted:    code/run_IBD_ctwas_rss_LDR_ME.R
    Modified:   code/run_OxHb_analysis.sh
    Modified:   code/run_OxHb_ctwas_rss_LDR.R

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


There are no past versions. Publish this analysis with wflow_publish() to start tracking its development.


Simulation 1: Two causal tissues with equal high PVE

Shared effect size parameters

30% PVE and 2.5e-4 prior inclusion for SNPs, 3% PVE and 0.009 prior inclusion for Brain Cerebellum expression, 3% PVE and 0.009 prior inclusion for Brain Hippocampus expression, 3% PVE and 0.009 prior inclusion for Brain Caudate basal ganglia expression. For the cTWAS analysis, each tissue had its own prior inclusion parameter, and the tissues shared a single effect size parameter.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 1

simutags <- paste(1, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8
results_df <- data.frame(simutag=as.character(),
                              n_causal=as.integer(),
                              n_causal_combined=as.integer(),
                              n_detected_pip=as.integer(),
                              n_detected_pip_in_causal=as.integer(),
                              n_detected_comb_pip=as.integer(),
                              n_detected_comb_pip_in_causal=as.integer(),
                              pve_snp=as.numeric(),
                              pve_weight1=as.numeric(),
                              pve_weight2=as.numeric(),
                              pve_weight3=as.numeric(),
                              prior_weight1=as.numeric(),
                              prior_weight2=as.numeric(),
                              prior_weight3=as.numeric(),
                              prior_var_snp=as.numeric(),
                              prior_var_weight1=as.numeric(),
                              prior_var_weight2=as.numeric(),
                              prior_var_weight3=as.numeric(),
                              n_detected_twas=as.integer(),
                              n_detected_twas_in_causal=as.integer(),
                              n_detected_comb_twas=as.integer(),
                              n_detected_comb_twas_in_causal=as.integer())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #load cTWAS results
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of causal genes
  n_causal <- length(true_genes)
  n_causal_combined <- length(true_genes_combined)
  
  #number of gene+tissue combinations with cTWAS PIP > threshold
  n_ctwas_genes <- sum(ctwas_gene_res$susie_pip > PIP_threshold)
  
  #number of cTWAS genes that are causal
  n_causal_detected <- sum(ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold] %in% true_genes)
  
  #collapse gene+tissues to genes and compute combined PIP
  ctwas_gene_res$gene <- sapply(ctwas_gene_res$id, function(x){unlist(strsplit(x,"[|]"))[1]})
  ctwas_gene_res_combined <- aggregate(ctwas_gene_res$susie_pip, by=list(ctwas_gene_res$gene), FUN=sum)
  colnames(ctwas_gene_res_combined) <- c("gene", "pip_combined")
  
  #number of genes with combined PIP > threshold
  n_ctwas_genes_combined <- sum(ctwas_gene_res_combined$pip_combined > PIP_threshold)
  
  #number of cTWAS genes using combined PIP that are causal
  n_causal_detected_combined <- sum(ctwas_gene_res_combined$gene[ctwas_gene_res_combined$pip_combined > PIP_threshold] %in% true_genes_combined)
  
  #collect number of SNPs analyzed by cTWAS
  ctwas_res_s1 <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s1.susieIrss.txt"))
  n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
  rm(ctwas_res_s1)
  
  #load estimated parameters
  load(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s2.susieIrssres.Rd"))
  
  #estimated group prior (all iterations)
  estimated_group_prior_all <- group_prior_rec
  estimated_group_prior_all["SNP",] <- estimated_group_prior_all["SNP",]*thin #adjust parameter to account for thin argument
  
  #estimated group prior variance (all iterations)
  estimated_group_prior_var_all <- group_prior_var_rec
  
  #set group size
  group_size <- c(table(ctwas_gene_res$type), structure(n_snps, names="SNP"))
  group_size <- group_size[rownames(estimated_group_prior_all)]
  
  #estimated group PVE (all iterations)
  estimated_group_pve_all <- estimated_group_prior_var_all*estimated_group_prior_all*group_size/sample_size #check PVE calculation
  
  #multitissue TWAS analysis with bonferroni adjusted threshold for z scores
  #load(paste0(results_dir, runtag, "_simu", simutag, "_LDR_z_gene.Rd"))
  ld_exprfs <- paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_chr", 1:22, ".exprqc.Rd")
  z_gene <- list()
  for (j in 1:length(ld_exprfs)){
    load(ld_exprfs[j])
    z_gene[[j]] <- z_gene_chr
  }
  z_gene <- do.call(rbind, z_gene)
  alpha <- 0.05
  sig_thresh <- qnorm(1-(alpha/nrow(z_gene)/2), lower=T)
  twas_genes <- z_gene$id[abs(z_gene$z)>sig_thresh]
  twas_genes_combined <- unique(sapply(twas_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  n_twas_genes <- length(twas_genes)
  n_twas_genes_combined <- length(twas_genes_combined)
  
  n_twas_genes_in_causal <- sum(twas_genes %in% true_genes)
  n_twas_genes_in_causal_combined <- sum(twas_genes_combined %in% true_genes_combined)

  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal=as.integer(n_causal),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_pip=as.integer(n_ctwas_genes),
                                n_detected_pip_in_causal=as.integer(n_causal_detected),
                                n_detected_comb_pip=as.integer(n_ctwas_genes_combined),
                                n_detected_comb_pip_in_causal=as.integer(n_causal_detected_combined),
                                pve_snp=as.numeric(rev(estimated_group_pve_all["SNP",])[1]),
                                pve_weight1=as.numeric(rev(estimated_group_pve_all["Brain_Cerebellum_harmonized",])[1]),
                                pve_weight2=as.numeric(rev(estimated_group_pve_all["Brain_Hippocampus_harmonized",])[1]),
                                pve_weight3=as.numeric(rev(estimated_group_pve_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_weight1=as.numeric(rev(estimated_group_prior_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_weight2=as.numeric(rev(estimated_group_prior_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_weight3=as.numeric(rev(estimated_group_prior_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_var_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_var_weight1=as.numeric(rev(estimated_group_prior_var_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_var_weight2=as.numeric(rev(estimated_group_prior_var_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_var_weight3=as.numeric(rev(estimated_group_prior_var_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                n_detected_twas=as.integer(n_twas_genes),
                                n_detected_twas_in_causal=as.integer(n_twas_genes_in_causal),
                                n_detected_comb_twas=as.integer(n_twas_genes_combined),
                                n_detected_comb_twas_in_causal=as.integer(n_twas_genes_in_causal_combined))
  
  results_df <- rbind(results_df, results_current)
}

#results using PIP threshold (gene+tissue)
results_df[,c("simutag", "n_causal", "n_detected_pip", "n_detected_pip_in_causal")]
  simutag n_causal n_detected_pip n_detected_pip_in_causal
1     1-1      217             50                       19
2     1-2      238             91                       36
3     1-3      208             66                       28
#mean percent causal using PIP > 0.8
sum(results_df$n_detected_pip_in_causal)/sum(results_df$n_detected_pip)
[1] 0.4009662
#results using combined PIP threshold
results_df[,c("simutag", "n_causal_combined", "n_detected_comb_pip", "n_detected_comb_pip_in_causal")]
  simutag n_causal_combined n_detected_comb_pip n_detected_comb_pip_in_causal
1     1-1               215                  82                            38
2     1-2               236                 126                            53
3     1-3               206                  96                            43
#mean percent causal using combined PIP > 0.8
sum(results_df$n_detected_comb_pip_in_causal)/sum(results_df$n_detected_comb_pip)
[1] 0.4407895
#prior inclusion and mean prior inclusion
results_df[,c(which(colnames(results_df)=="simutag"), setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df))))]
  simutag prior_snp prior_weight1 prior_weight2 prior_weight3
1     1-1  9.030222   0.006660324   0.008893521   0.007731678
2     1-2  7.954818   0.006692140   0.007848940   0.004741320
3     1-3  9.085452   0.010871561   0.004839081   0.009818128
colMeans(results_df[,setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df)))])
    prior_snp prior_weight1 prior_weight2 prior_weight3 
  8.690164043   0.008074675   0.007193847   0.007430375 
#prior variance and mean prior variance
results_df[,c(which(colnames(results_df)=="simutag"), grep("prior_var", names(results_df)))]
  simutag prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3
1     1-1      9.030222          21.23807          21.23807          21.23807
2     1-2      7.954818          29.72272          29.72272          29.72272
3     1-3      9.085452          17.20456          17.20456          17.20456
colMeans(results_df[,grep("prior_var", names(results_df))])
    prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3 
         8.690164         22.721782         22.721782         22.721782 
#PVE and mean PVE
results_df[,c(which(colnames(results_df)=="simutag"), grep("pve", names(results_df)))]
  simutag   pve_snp pve_weight1 pve_weight2 pve_weight3
1     1-1 0.3014274  0.02757065  0.03336901  0.03131224
2     1-2 0.2732905  0.03876950  0.04121489  0.02687282
3     1-3 0.2931384  0.03645626  0.01470825  0.03221051
colMeans(results_df[,grep("pve", names(results_df))])
    pve_snp pve_weight1 pve_weight2 pve_weight3 
 0.28928544  0.03426547  0.02976405  0.03013186 
#TWAS results
results_df[,c(which(colnames(results_df)=="simutag"), grep("twas", names(results_df)))]
  simutag n_detected_twas n_detected_twas_in_causal n_detected_comb_twas
1     1-1             233                        53                  139
2     1-2             376                        75                  227
3     1-3             271                        65                  167
  n_detected_comb_twas_in_causal
1                             52
2                             73
3                             65
sum(results_df$n_detected_comb_twas_in_causal)/sum(results_df$n_detected_comb_twas)
[1] 0.3564728
#store results for figure
plot_df <- data.frame(simutag=results_df$simutag,
                      method="cTWAS G+T",
                      count=results_df$n_detected_pip-results_df$n_detected_pip_in_causal,
                      ifcausal=F)
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G+T",
                            count=results_df$n_detected_pip_in_causal,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G",
                            count=results_df$n_detected_comb_pip-results_df$n_detected_comb_pip_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G",
                            count=results_df$n_detected_comb_pip_in_causal,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="TWAS",
                            count=results_df$n_detected_comb_twas-results_df$n_detected_comb_twas_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="TWAS",
                            count=results_df$n_detected_comb_twas_in_causal,
                            ifcausal=T))

Separate effect size parameters

For the cTWAS analysis, each tissue had its own prior inclusion parameter end effect size parameter.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(1, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8
results_df <- data.frame(simutag=as.character(),
                              n_causal=as.integer(),
                              n_causal_combined=as.integer(),
                              n_detected_pip=as.integer(),
                              n_detected_pip_in_causal=as.integer(),
                              n_detected_comb_pip=as.integer(),
                              n_detected_comb_pip_in_causal=as.integer(),
                              pve_snp=as.numeric(),
                              pve_weight1=as.numeric(),
                              pve_weight2=as.numeric(),
                              pve_weight3=as.numeric(),
                              prior_weight1=as.numeric(),
                              prior_weight2=as.numeric(),
                              prior_weight3=as.numeric(),
                              prior_var_snp=as.numeric(),
                              prior_var_weight1=as.numeric(),
                              prior_var_weight2=as.numeric(),
                              prior_var_weight3=as.numeric())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #load cTWAS results
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of causal genes
  n_causal <- length(true_genes)
  n_causal_combined <- length(true_genes_combined)
  
  #number of gene+tissue combinations with cTWAS PIP > threshold
  n_ctwas_genes <- sum(ctwas_gene_res$susie_pip > PIP_threshold)
  
  #number of cTWAS genes that are causal
  n_causal_detected <- sum(ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold] %in% true_genes)
  
  #collapse gene+tissues to genes and compute combined PIP
  ctwas_gene_res$gene <- sapply(ctwas_gene_res$id, function(x){unlist(strsplit(x,"[|]"))[1]})
  ctwas_gene_res_combined <- aggregate(ctwas_gene_res$susie_pip, by=list(ctwas_gene_res$gene), FUN=sum)
  colnames(ctwas_gene_res_combined) <- c("gene", "pip_combined")
  
  #number of genes with combined PIP > threshold
  n_ctwas_genes_combined <- sum(ctwas_gene_res_combined$pip_combined > PIP_threshold)
  
  #number of cTWAS genes using combined PIP that are causal
  n_causal_detected_combined <- sum(ctwas_gene_res_combined$gene[ctwas_gene_res_combined$pip_combined > PIP_threshold] %in% true_genes_combined)
  
  #collect number of SNPs analyzed by cTWAS
  ctwas_res_s1 <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s1.susieIrss.txt"))
  n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
  rm(ctwas_res_s1)
  
  #load estimated parameters
  load(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s2.susieIrssres.Rd"))
  
  #estimated group prior (all iterations)
  estimated_group_prior_all <- group_prior_rec
  estimated_group_prior_all["SNP",] <- estimated_group_prior_all["SNP",]*thin #adjust parameter to account for thin argument
  
  #estimated group prior variance (all iterations)
  estimated_group_prior_var_all <- group_prior_var_rec
  
  #set group size
  group_size <- c(table(ctwas_gene_res$type), structure(n_snps, names="SNP"))
  group_size <- group_size[rownames(estimated_group_prior_all)]
  
  #estimated group PVE (all iterations)
  estimated_group_pve_all <- estimated_group_prior_var_all*estimated_group_prior_all*group_size/sample_size #check PVE calculation

  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal=as.integer(n_causal),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_pip=as.integer(n_ctwas_genes),
                                n_detected_pip_in_causal=as.integer(n_causal_detected),
                                n_detected_comb_pip=as.integer(n_ctwas_genes_combined),
                                n_detected_comb_pip_in_causal=as.integer(n_causal_detected_combined),
                                pve_snp=as.numeric(rev(estimated_group_pve_all["SNP",])[1]),
                                pve_weight1=as.numeric(rev(estimated_group_pve_all["Brain_Cerebellum_harmonized",])[1]),
                                pve_weight2=as.numeric(rev(estimated_group_pve_all["Brain_Hippocampus_harmonized",])[1]),
                                pve_weight3=as.numeric(rev(estimated_group_pve_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_weight1=as.numeric(rev(estimated_group_prior_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_weight2=as.numeric(rev(estimated_group_prior_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_weight3=as.numeric(rev(estimated_group_prior_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_var_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_var_weight1=as.numeric(rev(estimated_group_prior_var_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_var_weight2=as.numeric(rev(estimated_group_prior_var_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_var_weight3=as.numeric(rev(estimated_group_prior_var_all["Brain_Caudate_basal_ganglia_harmonized",])[1]))
  
  results_df <- rbind(results_df, results_current)
}

#results using PIP threshold (gene+tissue)
results_df[,c("simutag", "n_causal", "n_detected_pip", "n_detected_pip_in_causal")]
  simutag n_causal n_detected_pip n_detected_pip_in_causal
1     1-1      217             58                       17
2     1-2      238            114                       41
3     1-3      208             83                       28
#mean percent causal using PIP > 0.8
sum(results_df$n_detected_pip_in_causal)/sum(results_df$n_detected_pip)
[1] 0.3372549
#results using combined PIP threshold
results_df[,c("simutag", "n_causal_combined", "n_detected_comb_pip", "n_detected_comb_pip_in_causal")]
  simutag n_causal_combined n_detected_comb_pip n_detected_comb_pip_in_causal
1     1-1               215                  64                            31
2     1-2               236                 123                            53
3     1-3               206                  86                            38
#mean percent causal using combined PIP > 0.8
sum(results_df$n_detected_comb_pip_in_causal)/sum(results_df$n_detected_comb_pip)
[1] 0.4468864
#prior inclusion and mean prior inclusion
results_df[,c(which(colnames(results_df)=="simutag"), setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df))))]
  simutag prior_snp prior_weight1 prior_weight2 prior_weight3
1     1-1  9.260675   0.004795843   0.008183196   0.015153959
2     1-2  7.972147   0.008051228   0.006912857   0.004554866
3     1-3  9.160136   0.014422878   0.005114593   0.007250383
colMeans(results_df[,setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df)))])
    prior_snp prior_weight1 prior_weight2 prior_weight3 
  8.797652886   0.009089983   0.006736882   0.008986402 
#prior variance and mean prior variance
results_df[,c(which(colnames(results_df)=="simutag"), grep("prior_var", names(results_df)))]
  simutag prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3
1     1-1      9.260675          40.75581          22.65020          7.843306
2     1-2      7.972147          20.75719          36.76933         33.636322
3     1-3      9.160136          11.06991          15.12579         27.476631
colMeans(results_df[,grep("prior_var", names(results_df))])
    prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3 
         8.797653         24.194304         24.848437         22.985420 
#PVE and mean PVE
results_df[,c(which(colnames(results_df)=="simutag"), grep("pve", names(results_df)))]
  simutag   pve_snp pve_weight1 pve_weight2 pve_weight3
1     1-1 0.2993075  0.03809702  0.03274535  0.02266473
2     1-2 0.2733422  0.03257372  0.04490532  0.02921524
3     1-3 0.2932229  0.03111951  0.01366733  0.03798830
colMeans(results_df[,grep("pve", names(results_df))])
    pve_snp pve_weight1 pve_weight2 pve_weight3 
 0.28862420  0.03393008  0.03043933  0.02995609 
#store results for figure
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G - Ind Var",
                            count=results_df$n_detected_comb_pip-results_df$n_detected_comb_pip_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G - Ind Var",
                            count=results_df$n_detected_comb_pip_in_causal,
                            ifcausal=T))

Individual tissue analyses

For the cTWAS analysis, each tissue was analyzed individually and the results were combined.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(1, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8

results_df <- data.frame(simutag=as.character(),
                         n_causal_combined=as.integer(),
                         n_detected_weight1=as.integer(),
                         n_detected_in_causal_weight1=as.integer(),
                         n_detected_weight2=as.integer(),
                         n_detected_in_causal_weight2=as.integer(),
                         n_detected_weight3=as.integer(),
                         n_detected_in_causal_weight3=as.integer(),
                         n_detected_combined=as.integer(),
                         n_detected_in_causal_combined=as.integer())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #number of causal genes
  n_causal_combined <- length(true_genes_combined)
  
  #load cTWAS results for weight1
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight1.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight1
  ctwas_genes_weight1 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight1 <- length(ctwas_genes_weight1)
  n_causal_detected_weight1 <- sum(ctwas_genes_weight1 %in% true_genes_combined)
  
  #load cTWAS results for weight2
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight2.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight1
  ctwas_genes_weight2 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight2 <- length(ctwas_genes_weight2)
  n_causal_detected_weight2 <- sum(ctwas_genes_weight2 %in% true_genes_combined)
  
  #load cTWAS results for weight3
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight3.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight3
  ctwas_genes_weight3 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight3 <- length(ctwas_genes_weight3)
  n_causal_detected_weight3 <- sum(ctwas_genes_weight3 %in% true_genes_combined)
  
  #combined analysis
  ctwas_genes_combined <- unique(c(ctwas_genes_weight1, ctwas_genes_weight2, ctwas_genes_weight3))
  n_ctwas_genes_combined <- length(ctwas_genes_combined)
  n_causal_detected_combined <- sum(ctwas_genes_combined %in% true_genes_combined)
  
  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_weight1=as.integer(n_ctwas_genes_weight1),
                                n_detected_in_causal_weight1=as.integer(n_causal_detected_weight1),
                                n_detected_weight2=as.integer(n_ctwas_genes_weight2),
                                n_detected_in_causal_weight2=as.integer(n_causal_detected_weight2),
                                n_detected_weight3=as.integer(n_ctwas_genes_weight3),
                                n_detected_in_causal_weight3=as.integer(n_causal_detected_weight3),
                                n_detected_combined=as.integer(n_ctwas_genes_combined),
                                n_detected_in_causal_combined=as.integer(n_causal_detected_combined))
  
  results_df <- rbind(results_df, results_current)
}

#results using weight1
results_df[,c("simutag", colnames(results_df)[grep("weight1", colnames(results_df))])]
  simutag n_detected_weight1 n_detected_in_causal_weight1
1     1-1                 37                           18
2     1-2                 68                           32
3     1-3                 24                           11
#mean percent causal using PIP > 0.8 for weight1
sum(results_df$n_detected_in_causal_weight1)/sum(results_df$n_detected_weight1)
[1] 0.4728682
#results using weight2
results_df[,c("simutag", colnames(results_df)[grep("weight2", colnames(results_df))])]
  simutag n_detected_weight2 n_detected_in_causal_weight2
1     1-1                 42                           21
2     1-2                 49                           19
3     1-3                 42                           19
#mean percent causal using PIP > 0.8 for weight1
sum(results_df$n_detected_in_causal_weight2)/sum(results_df$n_detected_weight2)
[1] 0.443609
#results using weight3
results_df[,c("simutag", colnames(results_df)[grep("weight3", colnames(results_df))])]
  simutag n_detected_weight3 n_detected_in_causal_weight3
1     1-1                 39                           18
2     1-2                 56                           23
3     1-3                  0                            0
#mean percent causal using PIP > 0.8 for weight3
sum(results_df$n_detected_in_causal_weight3)/sum(results_df$n_detected_weight3)
[1] 0.4315789
#results using combined analysis
results_df[,c("simutag", colnames(results_df)[grep("combined", colnames(results_df))])]
  simutag n_causal_combined n_detected_combined n_detected_in_causal_combined
1     1-1               215                  80                            35
2     1-2               236                 124                            49
3     1-3               206                  58                            26
#mean percent causal using PIP > 0.8 for weight3
sum(results_df$n_detected_in_causal_combined)/sum(results_df$n_detected_combined)
[1] 0.4198473
#store results for figure

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight1",
                            count=results_df$n_detected_weight1-results_df$n_detected_in_causal_weight1,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight1",
                            count=results_df$n_detected_in_causal_weight1,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight2",
                            count=results_df$n_detected_weight2-results_df$n_detected_in_causal_weight2,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight2",
                            count=results_df$n_detected_in_causal_weight2,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight3",
                            count=results_df$n_detected_weight3-results_df$n_detected_in_causal_weight3,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight3",
                            count=results_df$n_detected_in_causal_weight3,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Union",
                            count=results_df$n_detected_combined-results_df$n_detected_in_causal_combined,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Union",
                            count=results_df$n_detected_in_causal_combined,
                            ifcausal=T))

Figure

For the cTWAS analysis, each tissue was analyzed individually and the results were combined.

plot_df$method <- factor(plot_df$method, levels=c("TWAS", "cTWAS G+T", "cTWAS G", "cTWAS G - Ind Var", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3"))

library(ggpubr)
Loading required package: ggplot2
colset = c("#ebebeb", "#fb8072")

ggbarplot(plot_df, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "none", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))

plot_df_2 <- plot_df

plot_df_2$method %in% c("TWAS", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3")
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[37]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
plot_df_2$ifcausal <- as.character(plot_df_2$ifcausal+1)
plot_df_2$count[plot_df_2$method=="cTWAS G" & plot_df_2$ifcausal=="2"] <- plot_df_2$count[plot_df_2$method=="cTWAS G" & plot_df_2$ifcausal=="2"] - plot_df_2$count[plot_df_2$method=="cTWAS G+T" & plot_df_2$ifcausal=="2"]

plot_df_2 <- rbind(plot_df_2, data.frame(simutag=simutags,
                                    method="cTWAS G",
                                    count=plot_df_2$count[plot_df_2$method=="cTWAS G+T" & plot_df_2$ifcausal=="2"],
                                    ifcausal="3"))




plot_df_3 <- plot_df_2


plot_df_2 <- plot_df_2[plot_df_2$method %in% c("TWAS", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3"),]
plot_df_2$method <- droplevels(plot_df_2$method)
levels(plot_df_2$method) <- c("TWAS", "cTWAS Union", "cTWAS Primary", "cTWAS Secondary", "cTWAS Null")

plot_df_2$ifcausal[plot_df_2$ifcausal=="1"] <- "FP"
plot_df_2$ifcausal[plot_df_2$ifcausal=="2"] <- "TP"

colset = c("#ebebeb", "#fb8072", "#8dd3c7")

ggbarplot(plot_df_2, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "right", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + guides(fill=guide_legend(title=""))

plot_df_3 <- plot_df_3[plot_df_3$method %in% c("cTWAS G", "cTWAS Union"),]
plot_df_3$method <- droplevels(plot_df_3$method)
levels(plot_df_3$method) <- c("cTWAS Joint", "cTWAS Union")

plot_df_3$method <- relevel(plot_df_3$method, "cTWAS Union")

plot_df_3$ifcausal[plot_df_3$ifcausal=="1"] <- "FP"
plot_df_3$ifcausal[plot_df_3$ifcausal=="2"] <- "TP"
plot_df_3$ifcausal[plot_df_3$ifcausal=="3"] <- "TP and Tissue Identified\nby Joint Analysis"

ggbarplot(plot_df_3, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "right", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + guides(fill=guide_legend(title=""))

Simulation 2: Two causal tissues with equal low PVE

Shared effect size parameters

30% PVE and 2.5e-4 prior inclusion for SNPs, 1% PVE and 0.003 prior inclusion for Brain Cerebellum expression, 1% PVE and 0.003 prior inclusion for Brain Hippocampus expression, 1% PVE and 0.003 prior inclusion for Brain Caudate basal ganglia expression. Each tissue had its own prior inclusion parameter, and the tissues shared a single effect size parameter.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 1

simutags <- paste(2, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8
results_df <- data.frame(simutag=as.character(),
                              n_causal=as.integer(),
                              n_causal_combined=as.integer(),
                              n_detected_pip=as.integer(),
                              n_detected_pip_in_causal=as.integer(),
                              n_detected_comb_pip=as.integer(),
                              n_detected_comb_pip_in_causal=as.integer(),
                              pve_snp=as.numeric(),
                              pve_weight1=as.numeric(),
                              pve_weight2=as.numeric(),
                              pve_weight3=as.numeric(),
                              prior_weight1=as.numeric(),
                              prior_weight2=as.numeric(),
                              prior_weight3=as.numeric(),
                              prior_var_snp=as.numeric(),
                              prior_var_weight1=as.numeric(),
                              prior_var_weight2=as.numeric(),
                              prior_var_weight3=as.numeric(),
                              n_detected_twas=as.integer(),
                              n_detected_twas_in_causal=as.integer(),
                              n_detected_comb_twas=as.integer(),
                              n_detected_comb_twas_in_causal=as.integer())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #load cTWAS results
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of causal genes
  n_causal <- length(true_genes)
  n_causal_combined <- length(true_genes_combined)
  
  #number of gene+tissue combinations with cTWAS PIP > threshold
  n_ctwas_genes <- sum(ctwas_gene_res$susie_pip > PIP_threshold)
  
  #number of cTWAS genes that are causal
  n_causal_detected <- sum(ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold] %in% true_genes)
  
  #collapse gene+tissues to genes and compute combined PIP
  ctwas_gene_res$gene <- sapply(ctwas_gene_res$id, function(x){unlist(strsplit(x,"[|]"))[1]})
  ctwas_gene_res_combined <- aggregate(ctwas_gene_res$susie_pip, by=list(ctwas_gene_res$gene), FUN=sum)
  colnames(ctwas_gene_res_combined) <- c("gene", "pip_combined")
  
  #number of genes with combined PIP > threshold
  n_ctwas_genes_combined <- sum(ctwas_gene_res_combined$pip_combined > PIP_threshold)
  
  #number of cTWAS genes using combined PIP that are causal
  n_causal_detected_combined <- sum(ctwas_gene_res_combined$gene[ctwas_gene_res_combined$pip_combined > PIP_threshold] %in% true_genes_combined)
  
  #collect number of SNPs analyzed by cTWAS
  ctwas_res_s1 <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s1.susieIrss.txt"))
  n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
  rm(ctwas_res_s1)
  
  #load estimated parameters
  load(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s2.susieIrssres.Rd"))
  
  #estimated group prior (all iterations)
  estimated_group_prior_all <- group_prior_rec
  estimated_group_prior_all["SNP",] <- estimated_group_prior_all["SNP",]*thin #adjust parameter to account for thin argument
  
  #estimated group prior variance (all iterations)
  estimated_group_prior_var_all <- group_prior_var_rec
  
  #set group size
  group_size <- c(table(ctwas_gene_res$type), structure(n_snps, names="SNP"))
  group_size <- group_size[rownames(estimated_group_prior_all)]
  
  #estimated group PVE (all iterations)
  estimated_group_pve_all <- estimated_group_prior_var_all*estimated_group_prior_all*group_size/sample_size #check PVE calculation
  
  #multitissue TWAS analysis with bonferroni adjusted threshold for z scores
  ld_exprfs <- paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_chr", 1:22, ".exprqc.Rd")
  z_gene <- list()
  for (j in 1:length(ld_exprfs)){
    load(ld_exprfs[j])
    z_gene[[j]] <- z_gene_chr
  }
  z_gene <- do.call(rbind, z_gene)
  alpha <- 0.05
  sig_thresh <- qnorm(1-(alpha/nrow(z_gene)/2), lower=T)
  twas_genes <- z_gene$id[abs(z_gene$z)>sig_thresh]
  twas_genes_combined <- unique(sapply(twas_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  n_twas_genes <- length(twas_genes)
  n_twas_genes_combined <- length(twas_genes_combined)
  
  n_twas_genes_in_causal <- sum(twas_genes %in% true_genes)
  n_twas_genes_in_causal_combined <- sum(twas_genes_combined %in% true_genes_combined)

  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal=as.integer(n_causal),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_pip=as.integer(n_ctwas_genes),
                                n_detected_pip_in_causal=as.integer(n_causal_detected),
                                n_detected_comb_pip=as.integer(n_ctwas_genes_combined),
                                n_detected_comb_pip_in_causal=as.integer(n_causal_detected_combined),
                                pve_snp=as.numeric(rev(estimated_group_pve_all["SNP",])[1]),
                                pve_weight1=as.numeric(rev(estimated_group_pve_all["Brain_Cerebellum_harmonized",])[1]),
                                pve_weight2=as.numeric(rev(estimated_group_pve_all["Brain_Hippocampus_harmonized",])[1]),
                                pve_weight3=as.numeric(rev(estimated_group_pve_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_weight1=as.numeric(rev(estimated_group_prior_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_weight2=as.numeric(rev(estimated_group_prior_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_weight3=as.numeric(rev(estimated_group_prior_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_var_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_var_weight1=as.numeric(rev(estimated_group_prior_var_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_var_weight2=as.numeric(rev(estimated_group_prior_var_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_var_weight3=as.numeric(rev(estimated_group_prior_var_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                n_detected_twas=as.integer(n_twas_genes),
                                n_detected_twas_in_causal=as.integer(n_twas_genes_in_causal),
                                n_detected_comb_twas=as.integer(n_twas_genes_combined),
                                n_detected_comb_twas_in_causal=as.integer(n_twas_genes_in_causal_combined))
  
  results_df <- rbind(results_df, results_current)
}

#results using PIP threshold (gene+tissue)
results_df[,c("simutag", "n_causal", "n_detected_pip", "n_detected_pip_in_causal")]
  simutag n_causal n_detected_pip n_detected_pip_in_causal
1     2-1       66             50                        8
2     2-2       77             44                        9
3     2-3       65              4                        2
#mean percent causal using PIP > 0.8
sum(results_df$n_detected_pip_in_causal)/sum(results_df$n_detected_pip)
[1] 0.1938776
#results using combined PIP threshold
results_df[,c("simutag", "n_causal_combined", "n_detected_comb_pip", "n_detected_comb_pip_in_causal")]
  simutag n_causal_combined n_detected_comb_pip n_detected_comb_pip_in_causal
1     2-1                66                  68                            11
2     2-2                77                  58                            18
3     2-3                65                   8                             4
#mean percent causal using combined PIP > 0.8
sum(results_df$n_detected_comb_pip_in_causal)/sum(results_df$n_detected_comb_pip)
[1] 0.2462687
#prior inclusion and mean prior inclusion
results_df[,c(which(colnames(results_df)=="simutag"), setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df))))]
  simutag prior_snp prior_weight1 prior_weight2 prior_weight3
1     2-1  7.844281  0.0041775611   0.005708919   0.003852244
2     2-2  7.063924  0.0010609557   0.004332984   0.015471552
3     2-3  8.280000  0.0001010686   0.003099930   0.016913171
colMeans(results_df[,setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df)))])
    prior_snp prior_weight1 prior_weight2 prior_weight3 
  7.729401759   0.001779862   0.004380611   0.012078989 
#prior variance and mean prior variance
results_df[,c(which(colnames(results_df)=="simutag"), grep("prior_var", names(results_df)))]
  simutag prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3
1     2-1      7.844281         15.242827         15.242827         15.242827
2     2-2      7.063924          9.080828          9.080828          9.080828
3     2-3      8.280000          7.075573          7.075573          7.075573
colMeans(results_df[,grep("prior_var", names(results_df))])
    prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3 
         7.729402         10.466410         10.466410         10.466410 
#PVE and mean PVE
results_df[,c(which(colnames(results_df)=="simutag"), grep("pve", names(results_df)))]
  simutag   pve_snp  pve_weight1 pve_weight2 pve_weight3
1     2-1 0.2622874 0.0124115190 0.015373544  0.01119708
2     2-2 0.2393578 0.0018778430 0.006951317  0.02679074
3     2-3 0.2680854 0.0001393845 0.003874968  0.02281981
colMeans(results_df[,grep("pve", names(results_df))])
    pve_snp pve_weight1 pve_weight2 pve_weight3 
0.256576890 0.004809582 0.008733277 0.020269210 
#TWAS results
results_df[,c(which(colnames(results_df)=="simutag"), grep("twas", names(results_df)))]
  simutag n_detected_twas n_detected_twas_in_causal n_detected_comb_twas
1     2-1             164                        17                  107
2     2-2             147                        20                   93
3     2-3             101                        15                   63
  n_detected_comb_twas_in_causal
1                             17
2                             20
3                             15
sum(results_df$n_detected_comb_twas_in_causal)/sum(results_df$n_detected_comb_twas)
[1] 0.1977186
#store results for figure
plot_df <- data.frame(simutag=results_df$simutag,
                      method="cTWAS G+T",
                      count=results_df$n_detected_pip-results_df$n_detected_pip_in_causal,
                      ifcausal=F)
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G+T",
                            count=results_df$n_detected_pip_in_causal,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G",
                            count=results_df$n_detected_comb_pip-results_df$n_detected_comb_pip_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G",
                            count=results_df$n_detected_comb_pip_in_causal,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="TWAS",
                            count=results_df$n_detected_comb_twas-results_df$n_detected_comb_twas_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="TWAS",
                            count=results_df$n_detected_comb_twas_in_causal,
                            ifcausal=T))

Separate effect size parameters

For the cTWAS analysis, each tissue had its own prior inclusion parameter and effect size parameter.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(2, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8
results_df <- data.frame(simutag=as.character(),
                              n_causal=as.integer(),
                              n_causal_combined=as.integer(),
                              n_detected_pip=as.integer(),
                              n_detected_pip_in_causal=as.integer(),
                              n_detected_comb_pip=as.integer(),
                              n_detected_comb_pip_in_causal=as.integer(),
                              pve_snp=as.numeric(),
                              pve_weight1=as.numeric(),
                              pve_weight2=as.numeric(),
                              pve_weight3=as.numeric(),
                              prior_weight1=as.numeric(),
                              prior_weight2=as.numeric(),
                              prior_weight3=as.numeric(),
                              prior_var_snp=as.numeric(),
                              prior_var_weight1=as.numeric(),
                              prior_var_weight2=as.numeric(),
                              prior_var_weight3=as.numeric())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #load cTWAS results
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of causal genes
  n_causal <- length(true_genes)
  n_causal_combined <- length(true_genes_combined)
  
  #number of gene+tissue combinations with cTWAS PIP > threshold
  n_ctwas_genes <- sum(ctwas_gene_res$susie_pip > PIP_threshold)
  
  #number of cTWAS genes that are causal
  n_causal_detected <- sum(ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold] %in% true_genes)
  
  #collapse gene+tissues to genes and compute combined PIP
  ctwas_gene_res$gene <- sapply(ctwas_gene_res$id, function(x){unlist(strsplit(x,"[|]"))[1]})
  ctwas_gene_res_combined <- aggregate(ctwas_gene_res$susie_pip, by=list(ctwas_gene_res$gene), FUN=sum)
  colnames(ctwas_gene_res_combined) <- c("gene", "pip_combined")
  
  #number of genes with combined PIP > threshold
  n_ctwas_genes_combined <- sum(ctwas_gene_res_combined$pip_combined > PIP_threshold)
  
  #number of cTWAS genes using combined PIP that are causal
  n_causal_detected_combined <- sum(ctwas_gene_res_combined$gene[ctwas_gene_res_combined$pip_combined > PIP_threshold] %in% true_genes_combined)
  
  #collect number of SNPs analyzed by cTWAS
  ctwas_res_s1 <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s1.susieIrss.txt"))
  n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
  rm(ctwas_res_s1)
  
  #load estimated parameters
  load(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s2.susieIrssres.Rd"))
  
  #estimated group prior (all iterations)
  estimated_group_prior_all <- group_prior_rec
  estimated_group_prior_all["SNP",] <- estimated_group_prior_all["SNP",]*thin #adjust parameter to account for thin argument
  
  #estimated group prior variance (all iterations)
  estimated_group_prior_var_all <- group_prior_var_rec
  
  #set group size
  group_size <- c(table(ctwas_gene_res$type), structure(n_snps, names="SNP"))
  group_size <- group_size[rownames(estimated_group_prior_all)]
  
  #estimated group PVE (all iterations)
  estimated_group_pve_all <- estimated_group_prior_var_all*estimated_group_prior_all*group_size/sample_size #check PVE calculation

  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal=as.integer(n_causal),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_pip=as.integer(n_ctwas_genes),
                                n_detected_pip_in_causal=as.integer(n_causal_detected),
                                n_detected_comb_pip=as.integer(n_ctwas_genes_combined),
                                n_detected_comb_pip_in_causal=as.integer(n_causal_detected_combined),
                                pve_snp=as.numeric(rev(estimated_group_pve_all["SNP",])[1]),
                                pve_weight1=as.numeric(rev(estimated_group_pve_all["Brain_Cerebellum_harmonized",])[1]),
                                pve_weight2=as.numeric(rev(estimated_group_pve_all["Brain_Hippocampus_harmonized",])[1]),
                                pve_weight3=as.numeric(rev(estimated_group_pve_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_weight1=as.numeric(rev(estimated_group_prior_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_weight2=as.numeric(rev(estimated_group_prior_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_weight3=as.numeric(rev(estimated_group_prior_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_var_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_var_weight1=as.numeric(rev(estimated_group_prior_var_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_var_weight2=as.numeric(rev(estimated_group_prior_var_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_var_weight3=as.numeric(rev(estimated_group_prior_var_all["Brain_Caudate_basal_ganglia_harmonized",])[1]))
  
  results_df <- rbind(results_df, results_current)
}

#results using PIP threshold (gene+tissue)
results_df[,c("simutag", "n_causal", "n_detected_pip", "n_detected_pip_in_causal")]
  simutag n_causal n_detected_pip n_detected_pip_in_causal
1     2-1       66             45                        9
2     2-2       77             46                        6
3     2-3       65             16                        3
#mean percent causal using PIP > 0.8
sum(results_df$n_detected_pip_in_causal)/sum(results_df$n_detected_pip)
[1] 0.1682243
#results using combined PIP threshold
results_df[,c("simutag", "n_causal_combined", "n_detected_comb_pip", "n_detected_comb_pip_in_causal")]
  simutag n_causal_combined n_detected_comb_pip n_detected_comb_pip_in_causal
1     2-1                66                  44                            10
2     2-2                77                  51                            16
3     2-3                65                  21                             7
#mean percent causal using combined PIP > 0.8
sum(results_df$n_detected_comb_pip_in_causal)/sum(results_df$n_detected_comb_pip)
[1] 0.2844828
#prior inclusion and mean prior inclusion
results_df[,c(which(colnames(results_df)=="simutag"), setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df))))]
  simutag prior_snp prior_weight1 prior_weight2 prior_weight3
1     2-1  8.104534  0.0017744642   0.014077209   0.004048987
2     2-2  7.339106  0.0008837711   0.001610797   0.024063238
3     2-3  8.310899  0.0030392077   0.002884784   0.015402053
colMeans(results_df[,setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df)))])
    prior_snp prior_weight1 prior_weight2 prior_weight3 
  7.918179541   0.001899148   0.006190930   0.014504760 
#prior variance and mean prior variance
results_df[,c(which(colnames(results_df)=="simutag"), grep("prior_var", names(results_df)))]
  simutag prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3
1     2-1      8.104534         45.154705          5.192346         14.985668
2     2-2      7.339106         15.491046         39.302892          5.098351
3     2-3      8.310899          1.070797          8.888097          7.451467
colMeans(results_df[,grep("prior_var", names(results_df))])
    prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3 
         7.918180         20.572183         17.794445          9.178496 
#PVE and mean PVE
results_df[,c(which(colnames(results_df)=="simutag"), grep("pve", names(results_df)))]
  simutag   pve_snp  pve_weight1 pve_weight2 pve_weight3
1     2-1 0.2596614 0.0156173316 0.012913227  0.01157039
2     2-2 0.2360318 0.0026684382 0.011184587  0.02339425
3     2-3 0.2673311 0.0006343136 0.004529776  0.02188496
colMeans(results_df[,grep("pve", names(results_df))])
    pve_snp pve_weight1 pve_weight2 pve_weight3 
0.254341409 0.006306694 0.009542530 0.018949868 
#store results for figure
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G - Ind Var",
                            count=results_df$n_detected_comb_pip-results_df$n_detected_comb_pip_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G - Ind Var",
                            count=results_df$n_detected_comb_pip_in_causal,
                            ifcausal=T))

Individual tissue analyses

For the cTWAS analysis, each tissue was analyzed individually and the results were combined.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(2, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8

results_df <- data.frame(simutag=as.character(),
                         n_causal_combined=as.integer(),
                         n_detected_weight1=as.integer(),
                         n_detected_in_causal_weight1=as.integer(),
                         n_detected_weight2=as.integer(),
                         n_detected_in_causal_weight2=as.integer(),
                         n_detected_weight3=as.integer(),
                         n_detected_in_causal_weight3=as.integer(),
                         n_detected_combined=as.integer(),
                         n_detected_in_causal_combined=as.integer())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #number of causal genes
  n_causal_combined <- length(true_genes_combined)
  
  #load cTWAS results for weight1
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight1.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight1
  ctwas_genes_weight1 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight1 <- length(ctwas_genes_weight1)
  n_causal_detected_weight1 <- sum(ctwas_genes_weight1 %in% true_genes_combined)
  
  #load cTWAS results for weight2
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight2.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight1
  ctwas_genes_weight2 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight2 <- length(ctwas_genes_weight2)
  n_causal_detected_weight2 <- sum(ctwas_genes_weight2 %in% true_genes_combined)
  
  #load cTWAS results for weight3
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight3.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight3
  ctwas_genes_weight3 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight3 <- length(ctwas_genes_weight3)
  n_causal_detected_weight3 <- sum(ctwas_genes_weight3 %in% true_genes_combined)
  
  #combined analysis
  ctwas_genes_combined <- unique(c(ctwas_genes_weight1, ctwas_genes_weight2, ctwas_genes_weight3))
  n_ctwas_genes_combined <- length(ctwas_genes_combined)
  n_causal_detected_combined <- sum(ctwas_genes_combined %in% true_genes_combined)
  
  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_weight1=as.integer(n_ctwas_genes_weight1),
                                n_detected_in_causal_weight1=as.integer(n_causal_detected_weight1),
                                n_detected_weight2=as.integer(n_ctwas_genes_weight2),
                                n_detected_in_causal_weight2=as.integer(n_causal_detected_weight2),
                                n_detected_weight3=as.integer(n_ctwas_genes_weight3),
                                n_detected_in_causal_weight3=as.integer(n_causal_detected_weight3),
                                n_detected_combined=as.integer(n_ctwas_genes_combined),
                                n_detected_in_causal_combined=as.integer(n_causal_detected_combined))
  
  results_df <- rbind(results_df, results_current)
}

#results using weight1
results_df[,c("simutag", colnames(results_df)[grep("weight1", colnames(results_df))])]
  simutag n_detected_weight1 n_detected_in_causal_weight1
1     2-1                 32                            8
2     2-2                 22                            3
3     2-3                 19                            3
#mean percent causal using PIP > 0.8 for weight1
sum(results_df$n_detected_in_causal_weight1)/sum(results_df$n_detected_weight1)
[1] 0.1917808
#results using weight2
results_df[,c("simutag", colnames(results_df)[grep("weight2", colnames(results_df))])]
  simutag n_detected_weight2 n_detected_in_causal_weight2
1     2-1                 34                            4
2     2-2                 30                           12
3     2-3                 24                            4
#mean percent causal using PIP > 0.8 for weight1
sum(results_df$n_detected_in_causal_weight2)/sum(results_df$n_detected_weight2)
[1] 0.2272727
#results using weight3
results_df[,c("simutag", colnames(results_df)[grep("weight3", colnames(results_df))])]
  simutag n_detected_weight3 n_detected_in_causal_weight3
1     2-1                 29                            3
2     2-2                  0                            0
3     2-3                 33                            9
#mean percent causal using PIP > 0.8 for weight3
sum(results_df$n_detected_in_causal_weight3)/sum(results_df$n_detected_weight3)
[1] 0.1935484
#results using combined analysis
results_df[,c("simutag", colnames(results_df)[grep("combined", colnames(results_df))])]
  simutag n_causal_combined n_detected_combined n_detected_in_causal_combined
1     2-1                66                  72                            12
2     2-2                77                  46                            13
3     2-3                65                  63                            12
#mean percent causal using PIP > 0.8 for weight3
sum(results_df$n_detected_in_causal_combined)/sum(results_df$n_detected_combined)
[1] 0.2044199
#store results for figure

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight1",
                            count=results_df$n_detected_weight1-results_df$n_detected_in_causal_weight1,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight1",
                            count=results_df$n_detected_in_causal_weight1,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight2",
                            count=results_df$n_detected_weight2-results_df$n_detected_in_causal_weight2,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight2",
                            count=results_df$n_detected_in_causal_weight2,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight3",
                            count=results_df$n_detected_weight3-results_df$n_detected_in_causal_weight3,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight3",
                            count=results_df$n_detected_in_causal_weight3,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Union",
                            count=results_df$n_detected_combined-results_df$n_detected_in_causal_combined,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Union",
                            count=results_df$n_detected_in_causal_combined,
                            ifcausal=T))

Figure

For the cTWAS analysis, each tissue was analyzed individually and the results were combined.

plot_df$method <- factor(plot_df$method, levels=c("TWAS", "cTWAS G+T", "cTWAS G", "cTWAS G - Ind Var", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3"))

library(ggpubr)

colset = c("#ebebeb", "#fb8072") 

ggbarplot(plot_df, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "none", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))

plot_df_2 <- plot_df

plot_df_2$method %in% c("TWAS", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3")
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[37]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
plot_df_2$ifcausal <- as.character(plot_df_2$ifcausal+1)
plot_df_2$count[plot_df_2$method=="cTWAS G" & plot_df_2$ifcausal=="2"] <- plot_df_2$count[plot_df_2$method=="cTWAS G" & plot_df_2$ifcausal=="2"] - plot_df_2$count[plot_df_2$method=="cTWAS G+T" & plot_df_2$ifcausal=="2"]

plot_df_2 <- rbind(plot_df_2, data.frame(simutag=simutags,
                                    method="cTWAS G",
                                    count=plot_df_2$count[plot_df_2$method=="cTWAS G+T" & plot_df_2$ifcausal=="2"],
                                    ifcausal="3"))




plot_df_3 <- plot_df_2


plot_df_2 <- plot_df_2[plot_df_2$method %in% c("TWAS", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3"),]
plot_df_2$method <- droplevels(plot_df_2$method)
levels(plot_df_2$method) <- c("TWAS", "cTWAS Union", "cTWAS Primary", "cTWAS Secondary", "cTWAS Null")

plot_df_2$ifcausal[plot_df_2$ifcausal=="1"] <- "FP"
plot_df_2$ifcausal[plot_df_2$ifcausal=="2"] <- "TP"

colset = c("#ebebeb", "#fb8072", "#8dd3c7")

ggbarplot(plot_df_2, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "right", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + guides(fill=guide_legend(title=""))

plot_df_3 <- plot_df_3[plot_df_3$method %in% c("cTWAS G", "cTWAS Union"),]
plot_df_3$method <- droplevels(plot_df_3$method)
levels(plot_df_3$method) <- c("cTWAS Joint", "cTWAS Union")

plot_df_3$method <- relevel(plot_df_3$method, "cTWAS Union")

plot_df_3$ifcausal[plot_df_3$ifcausal=="1"] <- "FP"
plot_df_3$ifcausal[plot_df_3$ifcausal=="2"] <- "TP"
plot_df_3$ifcausal[plot_df_3$ifcausal=="3"] <- "TP and Tissue Identified\nby Joint Analysis"

ggbarplot(plot_df_3, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "right", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + guides(fill=guide_legend(title=""))

Simulation 3: Three causal tissues with unequal PVE

Shared effect size parameters

30% PVE and 2.5e-4 prior inclusion for SNPs, 3% PVE and 0.009 prior inclusion for Brain Cerebellum expression, 2% PVE and 0.006 prior inclusion for Brain Hippocampus expression, 1% PVE and 0.003 prior inclusion for Brain Caudate basal ganglia expression. For the cTWAS analysis, each tissue had its own prior inclusion parameter, and the tissues shared a single effect size parameter.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(3, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8
results_df <- data.frame(simutag=as.character(),
                              n_causal=as.integer(),
                              n_causal_combined=as.integer(),
                              n_detected_pip=as.integer(),
                              n_detected_pip_in_causal=as.integer(),
                              n_detected_comb_pip=as.integer(),
                              n_detected_comb_pip_in_causal=as.integer(),
                              pve_snp=as.numeric(),
                              pve_weight1=as.numeric(),
                              pve_weight2=as.numeric(),
                              pve_weight3=as.numeric(),
                              prior_weight1=as.numeric(),
                              prior_weight2=as.numeric(),
                              prior_weight3=as.numeric(),
                              prior_var_snp=as.numeric(),
                              prior_var_weight1=as.numeric(),
                              prior_var_weight2=as.numeric(),
                              prior_var_weight3=as.numeric(),
                              n_detected_twas=as.integer(),
                              n_detected_twas_in_causal=as.integer(),
                              n_detected_comb_twas=as.integer(),
                              n_detected_comb_twas_in_causal=as.integer())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #load cTWAS results
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of causal genes
  n_causal <- length(true_genes)
  n_causal_combined <- length(true_genes_combined)
  
  #number of gene+tissue combinations with cTWAS PIP > threshold
  n_ctwas_genes <- sum(ctwas_gene_res$susie_pip > PIP_threshold)
  
  #number of cTWAS genes that are causal
  n_causal_detected <- sum(ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold] %in% true_genes)
  
  #collapse gene+tissues to genes and compute combined PIP
  ctwas_gene_res$gene <- sapply(ctwas_gene_res$id, function(x){unlist(strsplit(x,"[|]"))[1]})
  ctwas_gene_res_combined <- aggregate(ctwas_gene_res$susie_pip, by=list(ctwas_gene_res$gene), FUN=sum)
  colnames(ctwas_gene_res_combined) <- c("gene", "pip_combined")
  
  #number of genes with combined PIP > threshold
  n_ctwas_genes_combined <- sum(ctwas_gene_res_combined$pip_combined > PIP_threshold)
  
  #number of cTWAS genes using combined PIP that are causal
  n_causal_detected_combined <- sum(ctwas_gene_res_combined$gene[ctwas_gene_res_combined$pip_combined > PIP_threshold] %in% true_genes_combined)
  
  #collect number of SNPs analyzed by cTWAS
  ctwas_res_s1 <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s1.susieIrss.txt"))
  n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
  rm(ctwas_res_s1)
  
  #load estimated parameters
  load(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s2.susieIrssres.Rd"))
  
  #estimated group prior (all iterations)
  estimated_group_prior_all <- group_prior_rec
  estimated_group_prior_all["SNP",] <- estimated_group_prior_all["SNP",]*thin #adjust parameter to account for thin argument
  
  #estimated group prior variance (all iterations)
  estimated_group_prior_var_all <- group_prior_var_rec
  
  #set group size
  group_size <- c(table(ctwas_gene_res$type), structure(n_snps, names="SNP"))
  group_size <- group_size[rownames(estimated_group_prior_all)]
  
  #estimated group PVE (all iterations)
  estimated_group_pve_all <- estimated_group_prior_var_all*estimated_group_prior_all*group_size/sample_size #check PVE calculation
  
  #multitissue TWAS analysis with bonferroni adjusted threshold for z scores
  #load(paste0(results_dir, runtag, "_simu", simutag, "_LDR_z_gene.Rd"))
  ld_exprfs <- paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_chr", 1:22, ".exprqc.Rd")
  z_gene <- list()
  for (j in 1:length(ld_exprfs)){
    load(ld_exprfs[j])
    z_gene[[j]] <- z_gene_chr
  }
  z_gene <- do.call(rbind, z_gene)
  alpha <- 0.05
  sig_thresh <- qnorm(1-(alpha/nrow(z_gene)/2), lower=T)
  twas_genes <- z_gene$id[abs(z_gene$z)>sig_thresh]
  twas_genes_combined <- unique(sapply(twas_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  n_twas_genes <- length(twas_genes)
  n_twas_genes_combined <- length(twas_genes_combined)
  
  n_twas_genes_in_causal <- sum(twas_genes %in% true_genes)
  n_twas_genes_in_causal_combined <- sum(twas_genes_combined %in% true_genes_combined)

  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal=as.integer(n_causal),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_pip=as.integer(n_ctwas_genes),
                                n_detected_pip_in_causal=as.integer(n_causal_detected),
                                n_detected_comb_pip=as.integer(n_ctwas_genes_combined),
                                n_detected_comb_pip_in_causal=as.integer(n_causal_detected_combined),
                                pve_snp=as.numeric(rev(estimated_group_pve_all["SNP",])[1]),
                                pve_weight1=as.numeric(rev(estimated_group_pve_all["Brain_Cerebellum_harmonized",])[1]),
                                pve_weight2=as.numeric(rev(estimated_group_pve_all["Brain_Hippocampus_harmonized",])[1]),
                                pve_weight3=as.numeric(rev(estimated_group_pve_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_weight1=as.numeric(rev(estimated_group_prior_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_weight2=as.numeric(rev(estimated_group_prior_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_weight3=as.numeric(rev(estimated_group_prior_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_var_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_var_weight1=as.numeric(rev(estimated_group_prior_var_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_var_weight2=as.numeric(rev(estimated_group_prior_var_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_var_weight3=as.numeric(rev(estimated_group_prior_var_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                n_detected_twas=as.integer(n_twas_genes),
                                n_detected_twas_in_causal=as.integer(n_twas_genes_in_causal),
                                n_detected_comb_twas=as.integer(n_twas_genes_combined),
                                n_detected_comb_twas_in_causal=as.integer(n_twas_genes_in_causal_combined))
  
  results_df <- rbind(results_df, results_current)
}

#results using PIP threshold (gene+tissue)
results_df[,c("simutag", "n_causal", "n_detected_pip", "n_detected_pip_in_causal")]
  simutag n_causal n_detected_pip n_detected_pip_in_causal
1     3-1      137             82                       25
2     3-2      158             57                       16
3     3-3      135             79                       12
#mean percent causal using PIP > 0.8
sum(results_df$n_detected_pip_in_causal)/sum(results_df$n_detected_pip)
[1] 0.2431193
#results using combined PIP threshold
results_df[,c("simutag", "n_causal_combined", "n_detected_comb_pip", "n_detected_comb_pip_in_causal")]
  simutag n_causal_combined n_detected_comb_pip n_detected_comb_pip_in_causal
1     3-1               137                  79                            30
2     3-2               156                  59                            25
3     3-3               134                  84                            21
#mean percent causal using combined PIP > 0.8
sum(results_df$n_detected_comb_pip_in_causal)/sum(results_df$n_detected_comb_pip)
[1] 0.3423423
#prior inclusion and mean prior inclusion
results_df[,c(which(colnames(results_df)=="simutag"), setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df))))]
  simutag prior_snp prior_weight1 prior_weight2 prior_weight3
1     3-1  7.694010   0.009496299   0.015700714  0.0005218101
2     3-2  8.968999   0.010896212   0.002214315  0.0065810326
3     3-3  8.327279   0.011588585   0.003198182  0.0106307116
colMeans(results_df[,setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df)))])
    prior_snp prior_weight1 prior_weight2 prior_weight3 
  8.330095836   0.010660365   0.007037737   0.005911185 
#prior variance and mean prior variance
results_df[,c(which(colnames(results_df)=="simutag"), grep("prior_var", names(results_df)))]
  simutag prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3
1     3-1      7.694010          20.04747          11.83834        102.068952
2     3-2      8.968999          19.77378           1.67682         16.497945
3     3-3      8.327279          15.43960          31.66265          3.725893
colMeans(results_df[,grep("prior_var", names(results_df))])
    prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3 
         8.330096         18.420282         15.059270         40.764263 
#PVE and mean PVE
results_df[,c(which(colnames(results_df)=="simutag"), grep("pve", names(results_df)))]
  simutag   pve_snp pve_weight1  pve_weight2 pve_weight3
1     3-1 0.2386009  0.03710654 0.0328371066 0.010156208
2     3-2 0.3126840  0.04199542 0.0006559645 0.020703762
3     3-3 0.2762783  0.03487410 0.0178897832 0.007552976
colMeans(results_df[,grep("pve", names(results_df))])
    pve_snp pve_weight1 pve_weight2 pve_weight3 
 0.27585440  0.03799202  0.01712762  0.01280432 
#TWAS results
results_df[,c(which(colnames(results_df)=="simutag"), grep("twas", names(results_df)))]
  simutag n_detected_twas n_detected_twas_in_causal n_detected_comb_twas
1     3-1             209                        42                  121
2     3-2             190                        35                  113
3     3-3             183                        33                  104
  n_detected_comb_twas_in_causal
1                             42
2                             36
3                             33
sum(results_df$n_detected_comb_twas_in_causal)/sum(results_df$n_detected_comb_twas)
[1] 0.3284024
#store results for figure
plot_df <- data.frame(simutag=results_df$simutag,
                      method="cTWAS G+T",
                      count=results_df$n_detected_pip-results_df$n_detected_pip_in_causal,
                      ifcausal=F)
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G+T",
                            count=results_df$n_detected_pip_in_causal,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G",
                            count=results_df$n_detected_comb_pip-results_df$n_detected_comb_pip_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G",
                            count=results_df$n_detected_comb_pip_in_causal,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="TWAS",
                            count=results_df$n_detected_comb_twas-results_df$n_detected_comb_twas_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="TWAS",
                            count=results_df$n_detected_comb_twas_in_causal,
                            ifcausal=T))

Separate effect size parameters

For the cTWAS analysis, each tissue had its own prior inclusion parameter and effect size parameter.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(3, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8

results_df <- data.frame(simutag=as.character(),
                              n_causal=as.integer(),
                              n_causal_combined=as.integer(),
                              n_detected_pip=as.integer(),
                              n_detected_pip_in_causal=as.integer(),
                              n_detected_comb_pip=as.integer(),
                              n_detected_comb_pip_in_causal=as.integer(),
                              pve_snp=as.numeric(),
                              pve_weight1=as.numeric(),
                              pve_weight2=as.numeric(),
                              pve_weight3=as.numeric(),
                              prior_weight1=as.numeric(),
                              prior_weight2=as.numeric(),
                              prior_weight3=as.numeric(),
                              prior_var_snp=as.numeric(),
                              prior_var_weight1=as.numeric(),
                              prior_var_weight2=as.numeric(),
                              prior_var_weight3=as.numeric())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #load cTWAS results
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of causal genes
  n_causal <- length(true_genes)
  n_causal_combined <- length(true_genes_combined)
  
  #number of gene+tissue combinations with cTWAS PIP > threshold
  n_ctwas_genes <- sum(ctwas_gene_res$susie_pip > PIP_threshold)
  
  #number of cTWAS genes that are causal
  n_causal_detected <- sum(ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold] %in% true_genes)
  
  #collapse gene+tissues to genes and compute combined PIP
  ctwas_gene_res$gene <- sapply(ctwas_gene_res$id, function(x){unlist(strsplit(x,"[|]"))[1]})
  ctwas_gene_res_combined <- aggregate(ctwas_gene_res$susie_pip, by=list(ctwas_gene_res$gene), FUN=sum)
  colnames(ctwas_gene_res_combined) <- c("gene", "pip_combined")
  
  #number of genes with combined PIP > threshold
  n_ctwas_genes_combined <- sum(ctwas_gene_res_combined$pip_combined > PIP_threshold)
  
  #number of cTWAS genes using combined PIP that are causal
  n_causal_detected_combined <- sum(ctwas_gene_res_combined$gene[ctwas_gene_res_combined$pip_combined > PIP_threshold] %in% true_genes_combined)
  
  #collect number of SNPs analyzed by cTWAS
  ctwas_res_s1 <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s1.susieIrss.txt"))
  n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
  rm(ctwas_res_s1)
  
  #load estimated parameters
  load(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s2.susieIrssres.Rd"))
  
  #estimated group prior (all iterations)
  estimated_group_prior_all <- group_prior_rec
  estimated_group_prior_all["SNP",] <- estimated_group_prior_all["SNP",]*thin #adjust parameter to account for thin argument
  
  #estimated group prior variance (all iterations)
  estimated_group_prior_var_all <- group_prior_var_rec
  
  #set group size
  group_size <- c(table(ctwas_gene_res$type), structure(n_snps, names="SNP"))
  group_size <- group_size[rownames(estimated_group_prior_all)]
  
  #estimated group PVE (all iterations)
  estimated_group_pve_all <- estimated_group_prior_var_all*estimated_group_prior_all*group_size/sample_size #check PVE calculation

  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal=as.integer(n_causal),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_pip=as.integer(n_ctwas_genes),
                                n_detected_pip_in_causal=as.integer(n_causal_detected),
                                n_detected_comb_pip=as.integer(n_ctwas_genes_combined),
                                n_detected_comb_pip_in_causal=as.integer(n_causal_detected_combined),
                                pve_snp=as.numeric(rev(estimated_group_pve_all["SNP",])[1]),
                                pve_weight1=as.numeric(rev(estimated_group_pve_all["Brain_Cerebellum_harmonized",])[1]),
                                pve_weight2=as.numeric(rev(estimated_group_pve_all["Brain_Hippocampus_harmonized",])[1]),
                                pve_weight3=as.numeric(rev(estimated_group_pve_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_weight1=as.numeric(rev(estimated_group_prior_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_weight2=as.numeric(rev(estimated_group_prior_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_weight3=as.numeric(rev(estimated_group_prior_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_var_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_var_weight1=as.numeric(rev(estimated_group_prior_var_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_var_weight2=as.numeric(rev(estimated_group_prior_var_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_var_weight3=as.numeric(rev(estimated_group_prior_var_all["Brain_Caudate_basal_ganglia_harmonized",])[1]))
  
  results_df <- rbind(results_df, results_current)
}

#results using PIP threshold (gene+tissue)
results_df[,c("simutag", "n_causal", "n_detected_pip", "n_detected_pip_in_causal")]
  simutag n_causal n_detected_pip n_detected_pip_in_causal
1     3-1      137             82                       25
2     3-2      158             57                       16
3     3-3      135             79                       12
#mean percent causal using PIP > 0.8
sum(results_df$n_detected_pip_in_causal)/sum(results_df$n_detected_pip)
[1] 0.2431193
#results using combined PIP threshold
results_df[,c("simutag", "n_causal_combined", "n_detected_comb_pip", "n_detected_comb_pip_in_causal")]
  simutag n_causal_combined n_detected_comb_pip n_detected_comb_pip_in_causal
1     3-1               137                  79                            30
2     3-2               156                  59                            25
3     3-3               134                  84                            21
#mean percent causal using combined PIP > 0.8
sum(results_df$n_detected_comb_pip_in_causal)/sum(results_df$n_detected_comb_pip)
[1] 0.3423423
#prior inclusion and mean prior inclusion
results_df[,c(which(colnames(results_df)=="simutag"), setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df))))]
  simutag prior_snp prior_weight1 prior_weight2 prior_weight3
1     3-1  7.694010   0.009496299   0.015700714  0.0005218101
2     3-2  8.968999   0.010896212   0.002214315  0.0065810326
3     3-3  8.327279   0.011588585   0.003198182  0.0106307116
colMeans(results_df[,setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df)))])
    prior_snp prior_weight1 prior_weight2 prior_weight3 
  8.330095836   0.010660365   0.007037737   0.005911185 
#prior variance and mean prior variance
results_df[,c(which(colnames(results_df)=="simutag"), grep("prior_var", names(results_df)))]
  simutag prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3
1     3-1      7.694010          20.04747          11.83834        102.068952
2     3-2      8.968999          19.77378           1.67682         16.497945
3     3-3      8.327279          15.43960          31.66265          3.725893
colMeans(results_df[,grep("prior_var", names(results_df))])
    prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3 
         8.330096         18.420282         15.059270         40.764263 
#PVE and mean PVE
results_df[,c(which(colnames(results_df)=="simutag"), grep("pve", names(results_df)))]
  simutag   pve_snp pve_weight1  pve_weight2 pve_weight3
1     3-1 0.2386009  0.03710654 0.0328371066 0.010156208
2     3-2 0.3126840  0.04199542 0.0006559645 0.020703762
3     3-3 0.2762783  0.03487410 0.0178897832 0.007552976
colMeans(results_df[,grep("pve", names(results_df))])
    pve_snp pve_weight1 pve_weight2 pve_weight3 
 0.27585440  0.03799202  0.01712762  0.01280432 
#store results for figure
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G - Ind Var",
                            count=results_df$n_detected_comb_pip-results_df$n_detected_comb_pip_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G - Ind Var",
                            count=results_df$n_detected_comb_pip_in_causal,
                            ifcausal=T))

Individual tissue analyses

For the cTWAS analysis, each tissue was analyzed individually and the results were combined.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(3, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8

results_df <- data.frame(simutag=as.character(),
                         n_causal_combined=as.integer(),
                         n_detected_weight1=as.integer(),
                         n_detected_in_causal_weight1=as.integer(),
                         n_detected_weight2=as.integer(),
                         n_detected_in_causal_weight2=as.integer(),
                         n_detected_weight3=as.integer(),
                         n_detected_in_causal_weight3=as.integer(),
                         n_detected_combined=as.integer(),
                         n_detected_in_causal_combined=as.integer())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #number of causal genes
  n_causal_combined <- length(true_genes_combined)
  
  #load cTWAS results for weight1
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight1.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight1
  ctwas_genes_weight1 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight1 <- length(ctwas_genes_weight1)
  n_causal_detected_weight1 <- sum(ctwas_genes_weight1 %in% true_genes_combined)
  
  #load cTWAS results for weight2
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight2.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight1
  ctwas_genes_weight2 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight2 <- length(ctwas_genes_weight2)
  n_causal_detected_weight2 <- sum(ctwas_genes_weight2 %in% true_genes_combined)
  
  #load cTWAS results for weight3
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight3.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight3
  ctwas_genes_weight3 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight3 <- length(ctwas_genes_weight3)
  n_causal_detected_weight3 <- sum(ctwas_genes_weight3 %in% true_genes_combined)
  
  #combined analysis
  ctwas_genes_combined <- unique(c(ctwas_genes_weight1, ctwas_genes_weight2, ctwas_genes_weight3))
  n_ctwas_genes_combined <- length(ctwas_genes_combined)
  n_causal_detected_combined <- sum(ctwas_genes_combined %in% true_genes_combined)
  
  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_weight1=as.integer(n_ctwas_genes_weight1),
                                n_detected_in_causal_weight1=as.integer(n_causal_detected_weight1),
                                n_detected_weight2=as.integer(n_ctwas_genes_weight2),
                                n_detected_in_causal_weight2=as.integer(n_causal_detected_weight2),
                                n_detected_weight3=as.integer(n_ctwas_genes_weight3),
                                n_detected_in_causal_weight3=as.integer(n_causal_detected_weight3),
                                n_detected_combined=as.integer(n_ctwas_genes_combined),
                                n_detected_in_causal_combined=as.integer(n_causal_detected_combined))
  
  results_df <- rbind(results_df, results_current)
}

#results using weight1
results_df[,c("simutag", colnames(results_df)[grep("weight1", colnames(results_df))])]
  simutag n_detected_weight1 n_detected_in_causal_weight1
1     3-1                 27                           10
2     3-2                 29                           15
3     3-3                 52                           11
#mean percent causal using PIP > 0.8 for weight1
sum(results_df$n_detected_in_causal_weight1)/sum(results_df$n_detected_weight1)
[1] 0.3333333
#results using weight2
results_df[,c("simutag", colnames(results_df)[grep("weight2", colnames(results_df))])]
  simutag n_detected_weight2 n_detected_in_causal_weight2
1     3-1                 40                           15
2     3-2                 32                           11
3     3-3                 47                           13
#mean percent causal using PIP > 0.8 for weight1
sum(results_df$n_detected_in_causal_weight2)/sum(results_df$n_detected_weight2)
[1] 0.3277311
#results using weight3
results_df[,c("simutag", colnames(results_df)[grep("weight3", colnames(results_df))])]
  simutag n_detected_weight3 n_detected_in_causal_weight3
1     3-1                 35                           10
2     3-2                 33                           10
3     3-3                 24                            7
#mean percent causal using PIP > 0.8 for weight3
sum(results_df$n_detected_in_causal_weight3)/sum(results_df$n_detected_weight3)
[1] 0.2934783
#results using combined analysis
results_df[,c("simutag", colnames(results_df)[grep("combined", colnames(results_df))])]
  simutag n_causal_combined n_detected_combined n_detected_in_causal_combined
1     3-1               137                  76                            26
2     3-2               156                  71                            27
3     3-3               134                  96                            22
#mean percent causal using PIP > 0.8 for weight3
sum(results_df$n_detected_in_causal_combined)/sum(results_df$n_detected_combined)
[1] 0.308642
#store results for figure

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight1",
                            count=results_df$n_detected_weight1-results_df$n_detected_in_causal_weight1,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight1",
                            count=results_df$n_detected_in_causal_weight1,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight2",
                            count=results_df$n_detected_weight2-results_df$n_detected_in_causal_weight2,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight2",
                            count=results_df$n_detected_in_causal_weight2,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight3",
                            count=results_df$n_detected_weight3-results_df$n_detected_in_causal_weight3,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight3",
                            count=results_df$n_detected_in_causal_weight3,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Union",
                            count=results_df$n_detected_combined-results_df$n_detected_in_causal_combined,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Union",
                            count=results_df$n_detected_in_causal_combined,
                            ifcausal=T))

Figure

For the cTWAS analysis, each tissue was analyzed individually and the results were combined.

plot_df$method <- factor(plot_df$method, levels=c("TWAS", "cTWAS G+T", "cTWAS G", "cTWAS G - Ind Var", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3"))

library(ggpubr)

colset = c("#ebebeb", "#fb8072") 

ggbarplot(plot_df, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "none", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))

plot_df_2 <- plot_df

plot_df_2$method %in% c("TWAS", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3")
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[37]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
plot_df_2$ifcausal <- as.character(plot_df_2$ifcausal+1)
plot_df_2$count[plot_df_2$method=="cTWAS G" & plot_df_2$ifcausal=="2"] <- plot_df_2$count[plot_df_2$method=="cTWAS G" & plot_df_2$ifcausal=="2"] - plot_df_2$count[plot_df_2$method=="cTWAS G+T" & plot_df_2$ifcausal=="2"]

plot_df_2 <- rbind(plot_df_2, data.frame(simutag=simutags,
                                    method="cTWAS G",
                                    count=plot_df_2$count[plot_df_2$method=="cTWAS G+T" & plot_df_2$ifcausal=="2"],
                                    ifcausal="3"))




plot_df_3 <- plot_df_2


plot_df_2 <- plot_df_2[plot_df_2$method %in% c("TWAS", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3"),]
plot_df_2$method <- droplevels(plot_df_2$method)
levels(plot_df_2$method) <- c("TWAS", "cTWAS Union", "cTWAS Primary", "cTWAS Secondary", "cTWAS Null")

plot_df_2$ifcausal[plot_df_2$ifcausal=="1"] <- "FP"
plot_df_2$ifcausal[plot_df_2$ifcausal=="2"] <- "TP"

colset = c("#ebebeb", "#fb8072", "#8dd3c7")

ggbarplot(plot_df_2, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "right", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + guides(fill=guide_legend(title=""))

plot_df_3 <- plot_df_3[plot_df_3$method %in% c("cTWAS G", "cTWAS Union"),]
plot_df_3$method <- droplevels(plot_df_3$method)
levels(plot_df_3$method) <- c("cTWAS Joint", "cTWAS Union")

plot_df_3$method <- relevel(plot_df_3$method, "cTWAS Union")

plot_df_3$ifcausal[plot_df_3$ifcausal=="1"] <- "FP"
plot_df_3$ifcausal[plot_df_3$ifcausal=="2"] <- "TP"
plot_df_3$ifcausal[plot_df_3$ifcausal=="3"] <- "TP and Tissue Identified\nby Joint Analysis"

ggbarplot(plot_df_3, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "right", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + guides(fill=guide_legend(title=""))

Simulation 4: Two causal tissues with unequal PVE

Shared effect size parameters

30% PVE and 2.5e-4 prior inclusion for SNPs, 3% PVE and 0.009 prior inclusion for Brain Cerebellum expression, 1% PVE and 0.003 prior inclusion for Brain Hippocampus expression. For the cTWAS analysis, each tissue had its own prior inclusion parameter, and the tissues shared a single effect size parameter.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 1

simutags <- paste(4, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8

results_df <- data.frame(simutag=as.character(),
                              n_causal=as.integer(),
                              n_causal_combined=as.integer(),
                              n_detected_pip=as.integer(),
                              n_detected_pip_in_causal=as.integer(),
                              n_detected_comb_pip=as.integer(),
                              n_detected_comb_pip_in_causal=as.integer(),
                              pve_snp=as.numeric(),
                              pve_weight1=as.numeric(),
                              pve_weight2=as.numeric(),
                              pve_weight3=as.numeric(),
                              prior_weight1=as.numeric(),
                              prior_weight2=as.numeric(),
                              prior_weight3=as.numeric(),
                              prior_var_snp=as.numeric(),
                              prior_var_weight1=as.numeric(),
                              prior_var_weight2=as.numeric(),
                              prior_var_weight3=as.numeric(),
                              n_detected_twas=as.integer(),
                              n_detected_twas_in_causal=as.integer(),
                              n_detected_comb_twas=as.integer(),
                              n_detected_comb_twas_in_causal=as.integer())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #load cTWAS results
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of causal genes
  n_causal <- length(true_genes)
  n_causal_combined <- length(true_genes_combined)
  
  #number of gene+tissue combinations with cTWAS PIP > threshold
  n_ctwas_genes <- sum(ctwas_gene_res$susie_pip > PIP_threshold)
  
  #number of cTWAS genes that are causal
  n_causal_detected <- sum(ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold] %in% true_genes)
  
  #collapse gene+tissues to genes and compute combined PIP
  ctwas_gene_res$gene <- sapply(ctwas_gene_res$id, function(x){unlist(strsplit(x,"[|]"))[1]})
  ctwas_gene_res_combined <- aggregate(ctwas_gene_res$susie_pip, by=list(ctwas_gene_res$gene), FUN=sum)
  colnames(ctwas_gene_res_combined) <- c("gene", "pip_combined")
  
  #number of genes with combined PIP > threshold
  n_ctwas_genes_combined <- sum(ctwas_gene_res_combined$pip_combined > PIP_threshold)
  
  #number of cTWAS genes using combined PIP that are causal
  n_causal_detected_combined <- sum(ctwas_gene_res_combined$gene[ctwas_gene_res_combined$pip_combined > PIP_threshold] %in% true_genes_combined)
  
  #collect number of SNPs analyzed by cTWAS
  ctwas_res_s1 <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s1.susieIrss.txt"))
  n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
  rm(ctwas_res_s1)
  
  #load estimated parameters
  load(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s2.susieIrssres.Rd"))
  
  #estimated group prior (all iterations)
  estimated_group_prior_all <- group_prior_rec
  estimated_group_prior_all["SNP",] <- estimated_group_prior_all["SNP",]*thin #adjust parameter to account for thin argument
  
  #estimated group prior variance (all iterations)
  estimated_group_prior_var_all <- group_prior_var_rec
  
  #set group size
  group_size <- c(table(ctwas_gene_res$type), structure(n_snps, names="SNP"))
  group_size <- group_size[rownames(estimated_group_prior_all)]
  
  #estimated group PVE (all iterations)
  estimated_group_pve_all <- estimated_group_prior_var_all*estimated_group_prior_all*group_size/sample_size #check PVE calculation
  
  #multitissue TWAS analysis with bonferroni adjusted threshold for z scores
  ld_exprfs <- paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_chr", 1:22, ".exprqc.Rd")
  z_gene <- list()
  for (j in 1:length(ld_exprfs)){
    load(ld_exprfs[j])
    z_gene[[j]] <- z_gene_chr
  }
  z_gene <- do.call(rbind, z_gene)
  alpha <- 0.05
  sig_thresh <- qnorm(1-(alpha/nrow(z_gene)/2), lower=T)
  twas_genes <- z_gene$id[abs(z_gene$z)>sig_thresh]
  twas_genes_combined <- unique(sapply(twas_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  n_twas_genes <- length(twas_genes)
  n_twas_genes_combined <- length(twas_genes_combined)
  
  n_twas_genes_in_causal <- sum(twas_genes %in% true_genes)
  n_twas_genes_in_causal_combined <- sum(twas_genes_combined %in% true_genes_combined)

  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal=as.integer(n_causal),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_pip=as.integer(n_ctwas_genes),
                                n_detected_pip_in_causal=as.integer(n_causal_detected),
                                n_detected_comb_pip=as.integer(n_ctwas_genes_combined),
                                n_detected_comb_pip_in_causal=as.integer(n_causal_detected_combined),
                                pve_snp=as.numeric(rev(estimated_group_pve_all["SNP",])[1]),
                                pve_weight1=as.numeric(rev(estimated_group_pve_all["Brain_Cerebellum_harmonized",])[1]),
                                pve_weight2=as.numeric(rev(estimated_group_pve_all["Brain_Hippocampus_harmonized",])[1]),
                                pve_weight3=as.numeric(rev(estimated_group_pve_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_weight1=as.numeric(rev(estimated_group_prior_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_weight2=as.numeric(rev(estimated_group_prior_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_weight3=as.numeric(rev(estimated_group_prior_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_var_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_var_weight1=as.numeric(rev(estimated_group_prior_var_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_var_weight2=as.numeric(rev(estimated_group_prior_var_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_var_weight3=as.numeric(rev(estimated_group_prior_var_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                n_detected_twas=as.integer(n_twas_genes),
                                n_detected_twas_in_causal=as.integer(n_twas_genes_in_causal),
                                n_detected_comb_twas=as.integer(n_twas_genes_combined),
                                n_detected_comb_twas_in_causal=as.integer(n_twas_genes_in_causal_combined))
  
  results_df <- rbind(results_df, results_current)
}

#results using PIP threshold (gene+tissue)
results_df[,c("simutag", "n_causal", "n_detected_pip", "n_detected_pip_in_causal")]
  simutag n_causal n_detected_pip n_detected_pip_in_causal
1     4-1       90             37                       13
2     4-2      105             45                       11
3     4-3      114             69                       23
#mean percent causal using PIP > 0.8
sum(results_df$n_detected_pip_in_causal)/sum(results_df$n_detected_pip)
[1] 0.3112583
#results using combined PIP threshold
results_df[,c("simutag", "n_causal_combined", "n_detected_comb_pip", "n_detected_comb_pip_in_causal")]
  simutag n_causal_combined n_detected_comb_pip n_detected_comb_pip_in_causal
1     4-1                90                  43                            16
2     4-2               105                  53                            18
3     4-3               114                  80                            30
#mean percent causal using combined PIP > 0.8
sum(results_df$n_detected_comb_pip_in_causal)/sum(results_df$n_detected_comb_pip)
[1] 0.3636364
#prior inclusion and mean prior inclusion
results_df[,c(which(colnames(results_df)=="simutag"), setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df))))]
  simutag prior_snp prior_weight1 prior_weight2 prior_weight3
1     4-1  8.550793   0.006690625  0.0019100453  6.268846e-06
2     4-2  8.091337   0.007873435  0.0007204744  6.090243e-03
3     4-3  7.497533   0.015089097  0.0050050189  2.412586e-04
colMeans(results_df[,setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df)))])
    prior_snp prior_weight1 prior_weight2 prior_weight3 
  8.046554392   0.009884385   0.002545180   0.002112590 
#prior variance and mean prior variance
results_df[,c(which(colnames(results_df)=="simutag"), grep("prior_var", names(results_df)))]
  simutag prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3
1     4-1      8.550793          16.11374          16.11374          16.11374
2     4-2      8.091337          17.76466          17.76466          17.76466
3     4-3      7.497533          14.00619          14.00619          14.00619
colMeans(results_df[,grep("prior_var", names(results_df))])
    prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3 
         8.046554         15.961531         15.961531         15.961531 
#PVE and mean PVE
results_df[,c(which(colnames(results_df)=="simutag"), grep("pve", names(results_df)))]
  simutag   pve_snp pve_weight1 pve_weight2  pve_weight3
1     4-1 0.2520260  0.02101356 0.005437443 1.926236e-05
2     4-2 0.2588309  0.02726201 0.002261154 2.063084e-02
3     4-3 0.2491648  0.04119265 0.012384550 6.443591e-04
colMeans(results_df[,grep("pve", names(results_df))])
    pve_snp pve_weight1 pve_weight2 pve_weight3 
0.253340562 0.029822740 0.006694382 0.007098155 
#TWAS results
results_df[,c(which(colnames(results_df)=="simutag"), grep("twas", names(results_df)))]
  simutag n_detected_twas n_detected_twas_in_causal n_detected_comb_twas
1     4-1             135                        22                   89
2     4-2             200                        34                  122
3     4-3             166                        29                  104
  n_detected_comb_twas_in_causal
1                             23
2                             36
3                             30
sum(results_df$n_detected_comb_twas_in_causal)/sum(results_df$n_detected_comb_twas)
[1] 0.2825397
#store results for figure
plot_df <- data.frame(simutag=results_df$simutag,
                      method="cTWAS G+T",
                      count=results_df$n_detected_pip-results_df$n_detected_pip_in_causal,
                      ifcausal=F)
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G+T",
                            count=results_df$n_detected_pip_in_causal,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G",
                            count=results_df$n_detected_comb_pip-results_df$n_detected_comb_pip_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G",
                            count=results_df$n_detected_comb_pip_in_causal,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="TWAS",
                            count=results_df$n_detected_comb_twas-results_df$n_detected_comb_twas_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="TWAS",
                            count=results_df$n_detected_comb_twas_in_causal,
                            ifcausal=T))

Separate effect size parameters

For the cTWAS analysis, each tissue had its own prior inclusion parameter end effect size parameter.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(4, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8

results_df <- data.frame(simutag=as.character(),
                              n_causal=as.integer(),
                              n_causal_combined=as.integer(),
                              n_detected_pip=as.integer(),
                              n_detected_pip_in_causal=as.integer(),
                              n_detected_comb_pip=as.integer(),
                              n_detected_comb_pip_in_causal=as.integer(),
                              pve_snp=as.numeric(),
                              pve_weight1=as.numeric(),
                              pve_weight2=as.numeric(),
                              pve_weight3=as.numeric(),
                              prior_weight1=as.numeric(),
                              prior_weight2=as.numeric(),
                              prior_weight3=as.numeric(),
                              prior_var_snp=as.numeric(),
                              prior_var_weight1=as.numeric(),
                              prior_var_weight2=as.numeric(),
                              prior_var_weight3=as.numeric())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #load cTWAS results
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of causal genes
  n_causal <- length(true_genes)
  n_causal_combined <- length(true_genes_combined)
  
  #number of gene+tissue combinations with cTWAS PIP > threshold
  n_ctwas_genes <- sum(ctwas_gene_res$susie_pip > PIP_threshold)
  
  #number of cTWAS genes that are causal
  n_causal_detected <- sum(ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold] %in% true_genes)
  
  #collapse gene+tissues to genes and compute combined PIP
  ctwas_gene_res$gene <- sapply(ctwas_gene_res$id, function(x){unlist(strsplit(x,"[|]"))[1]})
  ctwas_gene_res_combined <- aggregate(ctwas_gene_res$susie_pip, by=list(ctwas_gene_res$gene), FUN=sum)
  colnames(ctwas_gene_res_combined) <- c("gene", "pip_combined")
  
  #number of genes with combined PIP > threshold
  n_ctwas_genes_combined <- sum(ctwas_gene_res_combined$pip_combined > PIP_threshold)
  
  #number of cTWAS genes using combined PIP that are causal
  n_causal_detected_combined <- sum(ctwas_gene_res_combined$gene[ctwas_gene_res_combined$pip_combined > PIP_threshold] %in% true_genes_combined)
  
  #collect number of SNPs analyzed by cTWAS
  ctwas_res_s1 <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s1.susieIrss.txt"))
  n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
  rm(ctwas_res_s1)
  
  #load estimated parameters
  load(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR.s2.susieIrssres.Rd"))
  
  #estimated group prior (all iterations)
  estimated_group_prior_all <- group_prior_rec
  estimated_group_prior_all["SNP",] <- estimated_group_prior_all["SNP",]*thin #adjust parameter to account for thin argument
  
  #estimated group prior variance (all iterations)
  estimated_group_prior_var_all <- group_prior_var_rec
  
  #set group size
  group_size <- c(table(ctwas_gene_res$type), structure(n_snps, names="SNP"))
  group_size <- group_size[rownames(estimated_group_prior_all)]
  
  #estimated group PVE (all iterations)
  estimated_group_pve_all <- estimated_group_prior_var_all*estimated_group_prior_all*group_size/sample_size #check PVE calculation

  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal=as.integer(n_causal),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_pip=as.integer(n_ctwas_genes),
                                n_detected_pip_in_causal=as.integer(n_causal_detected),
                                n_detected_comb_pip=as.integer(n_ctwas_genes_combined),
                                n_detected_comb_pip_in_causal=as.integer(n_causal_detected_combined),
                                pve_snp=as.numeric(rev(estimated_group_pve_all["SNP",])[1]),
                                pve_weight1=as.numeric(rev(estimated_group_pve_all["Brain_Cerebellum_harmonized",])[1]),
                                pve_weight2=as.numeric(rev(estimated_group_pve_all["Brain_Hippocampus_harmonized",])[1]),
                                pve_weight3=as.numeric(rev(estimated_group_pve_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_weight1=as.numeric(rev(estimated_group_prior_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_weight2=as.numeric(rev(estimated_group_prior_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_weight3=as.numeric(rev(estimated_group_prior_all["Brain_Caudate_basal_ganglia_harmonized",])[1]),
                                prior_var_snp=as.numeric(rev(estimated_group_prior_var_all["SNP",])[1]),
                                prior_var_weight1=as.numeric(rev(estimated_group_prior_var_all["Brain_Cerebellum_harmonized",])[1]),
                                prior_var_weight2=as.numeric(rev(estimated_group_prior_var_all["Brain_Hippocampus_harmonized",])[1]),
                                prior_var_weight3=as.numeric(rev(estimated_group_prior_var_all["Brain_Caudate_basal_ganglia_harmonized",])[1]))
  
  results_df <- rbind(results_df, results_current)
}

#results using PIP threshold (gene+tissue)
results_df[,c("simutag", "n_causal", "n_detected_pip", "n_detected_pip_in_causal")]
  simutag n_causal n_detected_pip n_detected_pip_in_causal
1     4-1       90             34                       13
2     4-2      105             47                       11
3     4-3      114             64                       25
#mean percent causal using PIP > 0.8
sum(results_df$n_detected_pip_in_causal)/sum(results_df$n_detected_pip)
[1] 0.337931
#results using combined PIP threshold
results_df[,c("simutag", "n_causal_combined", "n_detected_comb_pip", "n_detected_comb_pip_in_causal")]
  simutag n_causal_combined n_detected_comb_pip n_detected_comb_pip_in_causal
1     4-1                90                  37                            15
2     4-2               105                  55                            20
3     4-3               114                  67                            26
#mean percent causal using combined PIP > 0.8
sum(results_df$n_detected_comb_pip_in_causal)/sum(results_df$n_detected_comb_pip)
[1] 0.3836478
#prior inclusion and mean prior inclusion
results_df[,c(which(colnames(results_df)=="simutag"), setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df))))]
  simutag prior_snp prior_weight1 prior_weight2 prior_weight3
1     4-1  8.581569   0.006602027  0.0021024441   0.001099093
2     4-2  8.087527   0.008219075  0.0007509542   0.005859247
3     4-3  7.605934   0.013902915  0.0053234702   0.003794378
colMeans(results_df[,setdiff(grep("prior", names(results_df)), grep("prior_var", names(results_df)))])
    prior_snp prior_weight1 prior_weight2 prior_weight3 
  8.091677013   0.009574672   0.002725623   0.003584240 
#prior variance and mean prior variance
results_df[,c(which(colnames(results_df)=="simutag"), grep("prior_var", names(results_df)))]
  simutag prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3
1     4-1      8.581569          16.43984          14.88672          1.800665
2     4-2      8.087527          15.93081          20.36791         19.809020
3     4-3      7.605934          15.17775          12.01800          3.115657
colMeans(results_df[,grep("prior_var", names(results_df))])
    prior_var_snp prior_var_weight1 prior_var_weight2 prior_var_weight3 
         8.091677         15.849465         15.757544          8.241781 
#PVE and mean PVE
results_df[,c(which(colnames(results_df)=="simutag"), grep("pve", names(results_df)))]
  simutag   pve_snp pve_weight1 pve_weight2  pve_weight3
1     4-1 0.2517062  0.02115492 0.005529403 0.0003773922
2     4-2 0.2586432  0.02552098 0.002702181 0.0221324851
3     4-3 0.2482345  0.04112916 0.011302685 0.0022543205
colMeans(results_df[,grep("pve", names(results_df))])
    pve_snp pve_weight1 pve_weight2 pve_weight3 
0.252861290 0.029268354 0.006511423 0.008254733 
#store results for figure
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G - Ind Var",
                            count=results_df$n_detected_comb_pip-results_df$n_detected_comb_pip_in_causal,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS G - Ind Var",
                            count=results_df$n_detected_comb_pip_in_causal,
                            ifcausal=T))

Individual tissue analyses

For the cTWAS analysis, each tissue was analyzed individually and the results were combined.

results_dir <- "/project2/xinhe/shengqian/cTWAS/cTWAS_simulation/simulation_correlated/"
runtag = "ukb-s80.45-3_corr"
configtag <- 2

simutags <- paste(4, 1:3, sep = "-")
thin <- 0.1

sample_size <- 45000
PIP_threshold <- 0.8

results_df <- data.frame(simutag=as.character(),
                         n_causal_combined=as.integer(),
                         n_detected_weight1=as.integer(),
                         n_detected_in_causal_weight1=as.integer(),
                         n_detected_weight2=as.integer(),
                         n_detected_in_causal_weight2=as.integer(),
                         n_detected_weight3=as.integer(),
                         n_detected_in_causal_weight3=as.integer(),
                         n_detected_combined=as.integer(),
                         n_detected_in_causal_combined=as.integer())

for (i in 1:length(simutags)){
  simutag <- simutags[i]
  
  #load genes with true simulated effect
  load(paste0(results_dir, runtag, "_simu", simutag, "-pheno.Rd"))
  true_genes <- unlist(sapply(1:22, function(x){phenores$batch[[x]]$id.cgene}))
  true_genes_combined <- unique(sapply(true_genes, function(x){unlist(strsplit(x, "[|]"))[1]}))
  
  #number of causal genes
  n_causal_combined <- length(true_genes_combined)
  
  #load cTWAS results for weight1
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight1.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight1
  ctwas_genes_weight1 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight1 <- length(ctwas_genes_weight1)
  n_causal_detected_weight1 <- sum(ctwas_genes_weight1 %in% true_genes_combined)
  
  #load cTWAS results for weight2
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight2.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight1
  ctwas_genes_weight2 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight2 <- length(ctwas_genes_weight2)
  n_causal_detected_weight2 <- sum(ctwas_genes_weight2 %in% true_genes_combined)
  
  #load cTWAS results for weight3
  ctwas_res <- data.table::fread(paste0(results_dir, runtag, "_simu", simutag, "_config", configtag, "_LDR_weight3.susieIrss.txt"))
  ctwas_gene_res <- ctwas_res[ctwas_res$type!="SNP",]
  
  #number of genes with cTWAS PIP > threshold in weight3
  ctwas_genes_weight3 <- ctwas_gene_res$id[ctwas_gene_res$susie_pip > PIP_threshold]
  n_ctwas_genes_weight3 <- length(ctwas_genes_weight3)
  n_causal_detected_weight3 <- sum(ctwas_genes_weight3 %in% true_genes_combined)
  
  #combined analysis
  ctwas_genes_combined <- unique(c(ctwas_genes_weight1, ctwas_genes_weight2, ctwas_genes_weight3))
  n_ctwas_genes_combined <- length(ctwas_genes_combined)
  n_causal_detected_combined <- sum(ctwas_genes_combined %in% true_genes_combined)
  
  results_current <- data.frame(simutag=as.character(simutag),
                                n_causal_combined=as.integer(n_causal_combined),
                                n_detected_weight1=as.integer(n_ctwas_genes_weight1),
                                n_detected_in_causal_weight1=as.integer(n_causal_detected_weight1),
                                n_detected_weight2=as.integer(n_ctwas_genes_weight2),
                                n_detected_in_causal_weight2=as.integer(n_causal_detected_weight2),
                                n_detected_weight3=as.integer(n_ctwas_genes_weight3),
                                n_detected_in_causal_weight3=as.integer(n_causal_detected_weight3),
                                n_detected_combined=as.integer(n_ctwas_genes_combined),
                                n_detected_in_causal_combined=as.integer(n_causal_detected_combined))
  
  results_df <- rbind(results_df, results_current)
}

#results using weight1
results_df[,c("simutag", colnames(results_df)[grep("weight1", colnames(results_df))])]
  simutag n_detected_weight1 n_detected_in_causal_weight1
1     4-1                 25                           14
2     4-2                 36                           11
3     4-3                 48                           23
#mean percent causal using PIP > 0.8 for weight1
sum(results_df$n_detected_in_causal_weight1)/sum(results_df$n_detected_weight1)
[1] 0.440367
#results using weight2
results_df[,c("simutag", colnames(results_df)[grep("weight2", colnames(results_df))])]
  simutag n_detected_weight2 n_detected_in_causal_weight2
1     4-1                 22                            6
2     4-2                 29                           13
3     4-3                 32                            9
#mean percent causal using PIP > 0.8 for weight1
sum(results_df$n_detected_in_causal_weight2)/sum(results_df$n_detected_weight2)
[1] 0.3373494
#results using weight3
results_df[,c("simutag", colnames(results_df)[grep("weight3", colnames(results_df))])]
  simutag n_detected_weight3 n_detected_in_causal_weight3
1     4-1                 16                            4
2     4-2                 21                            5
3     4-3                  5                            3
#mean percent causal using PIP > 0.8 for weight3
sum(results_df$n_detected_in_causal_weight3)/sum(results_df$n_detected_weight3)
[1] 0.2857143
#results using combined analysis
results_df[,c("simutag", colnames(results_df)[grep("combined", colnames(results_df))])]
  simutag n_causal_combined n_detected_combined n_detected_in_causal_combined
1     4-1                90                  46                            16
2     4-2               105                  66                            20
3     4-3               114                  72                            28
#mean percent causal using PIP > 0.8 for weight3
sum(results_df$n_detected_in_causal_combined)/sum(results_df$n_detected_combined)
[1] 0.3478261
#store results for figure

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight1",
                            count=results_df$n_detected_weight1-results_df$n_detected_in_causal_weight1,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight1",
                            count=results_df$n_detected_in_causal_weight1,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight2",
                            count=results_df$n_detected_weight2-results_df$n_detected_in_causal_weight2,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight2",
                            count=results_df$n_detected_in_causal_weight2,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight3",
                            count=results_df$n_detected_weight3-results_df$n_detected_in_causal_weight3,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Weight3",
                            count=results_df$n_detected_in_causal_weight3,
                            ifcausal=T))

plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Union",
                            count=results_df$n_detected_combined-results_df$n_detected_in_causal_combined,
                            ifcausal=F))
plot_df <- rbind(plot_df,
                 data.frame(simutag=results_df$simutag,
                            method="cTWAS Union",
                            count=results_df$n_detected_in_causal_combined,
                            ifcausal=T))

Figure

For the cTWAS analysis, each tissue was analyzed individually and the results were combined.

plot_df$method <- factor(plot_df$method, levels=c("TWAS", "cTWAS G+T", "cTWAS G", "cTWAS G - Ind Var", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3"))

library(ggpubr)

colset = c("#ebebeb", "#fb8072") 

ggbarplot(plot_df, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "none", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))

plot_df_2 <- plot_df

plot_df_2$method %in% c("TWAS", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3")
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[37]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
plot_df_2$ifcausal <- as.character(plot_df_2$ifcausal+1)
plot_df_2$count[plot_df_2$method=="cTWAS G" & plot_df_2$ifcausal=="2"] <- plot_df_2$count[plot_df_2$method=="cTWAS G" & plot_df_2$ifcausal=="2"] - plot_df_2$count[plot_df_2$method=="cTWAS G+T" & plot_df_2$ifcausal=="2"]

plot_df_2 <- rbind(plot_df_2, data.frame(simutag=simutags,
                                    method="cTWAS G",
                                    count=plot_df_2$count[plot_df_2$method=="cTWAS G+T" & plot_df_2$ifcausal=="2"],
                                    ifcausal="3"))




plot_df_3 <- plot_df_2


plot_df_2 <- plot_df_2[plot_df_2$method %in% c("TWAS", "cTWAS Union", "cTWAS Weight1", "cTWAS Weight2", "cTWAS Weight3"),]
plot_df_2$method <- droplevels(plot_df_2$method)
levels(plot_df_2$method) <- c("TWAS", "cTWAS Union", "cTWAS Primary", "cTWAS Secondary", "cTWAS Null")

plot_df_2$ifcausal[plot_df_2$ifcausal=="1"] <- "FP"
plot_df_2$ifcausal[plot_df_2$ifcausal=="2"] <- "TP"

colset = c("#ebebeb", "#fb8072", "#8dd3c7")

ggbarplot(plot_df_2, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "right", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + guides(fill=guide_legend(title=""))

plot_df_3 <- plot_df_3[plot_df_3$method %in% c("cTWAS G", "cTWAS Union"),]
plot_df_3$method <- droplevels(plot_df_3$method)
levels(plot_df_3$method) <- c("cTWAS Joint", "cTWAS Union")

plot_df_3$method <- relevel(plot_df_3$method, "cTWAS Union")

plot_df_3$ifcausal[plot_df_3$ifcausal=="1"] <- "FP"
plot_df_3$ifcausal[plot_df_3$ifcausal=="2"] <- "TP"
plot_df_3$ifcausal[plot_df_3$ifcausal=="3"] <- "TP and Tissue Identified\nby Joint Analysis"

ggbarplot(plot_df_3, 
          x = "method", 
          y = "count", 
          add = "mean_se", 
          fill = "ifcausal", 
          legend = "right", 
          ylab="Count", 
          xlab="",
          palette = colset) + grids(linetype = "dashed") + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + guides(fill=guide_legend(title=""))


sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggpubr_0.6.0    ggplot2_3.4.0   workflowr_1.7.0

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0  xfun_0.35         bslib_0.4.1       purrr_1.0.2      
 [5] carData_3.0-4     colorspace_2.0-3  vctrs_0.6.3       generics_0.1.3   
 [9] htmltools_0.5.4   yaml_2.3.6        utf8_1.2.2        rlang_1.1.1      
[13] jquerylib_0.1.4   later_1.3.0       pillar_1.8.1      glue_1.6.2       
[17] withr_2.5.0       DBI_1.1.3         lifecycle_1.0.3   stringr_1.5.0    
[21] ggsignif_0.6.4    munsell_0.5.0     gtable_0.3.1      evaluate_0.19    
[25] labeling_0.4.2    knitr_1.41        callr_3.7.3       fastmap_1.1.0    
[29] httpuv_1.6.7      ps_1.7.2          fansi_1.0.3       highr_0.9        
[33] broom_1.0.2       Rcpp_1.0.9        backports_1.2.1   promises_1.2.0.1 
[37] scales_1.2.1      cachem_1.0.6      jsonlite_1.8.4    abind_1.4-5      
[41] farver_2.1.0      fs_1.5.2          digest_0.6.31     stringi_1.7.8    
[45] rstatix_0.7.2     processx_3.8.0    dplyr_1.0.10      getPass_0.2-2    
[49] rprojroot_2.0.3   grid_4.1.0        cli_3.6.1         tools_4.1.0      
[53] magrittr_2.0.3    sass_0.4.4        tibble_3.1.8      car_3.1-1        
[57] tidyr_1.3.0       whisker_0.4.1     pkgconfig_2.0.3   data.table_1.14.6
[61] assertthat_0.2.1  rmarkdown_2.19    httr_1.4.4        rstudioapi_0.14  
[65] R6_2.5.1          git2r_0.30.1      compiler_4.1.0