Last updated: 2025-06-04

Checks: 6 passed, 1 warning

Knit directory: multigroup_ctwas_analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown is untracked by Git. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20231112) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 4f9814a. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    cv/
    Ignored:    figures/lz/ADHD-ieu-a-1183/
    Ignored:    figures/lz/ASD-ieu-a-1185/
    Ignored:    figures/lz/IBD-ebi-a-GCST004131/single/
    Ignored:    figures/lz/MDD-ieu-b-102/single/
    Ignored:    figures/lz/PD-ieu-b-7/single/
    Ignored:    figures/lz/RA-panukb/single/

Untracked files:
    Untracked:  analysis/realdata_final_combine_qtls_3qtls.Rmd
    Untracked:  analysis/realdata_final_multigroup_validation_silver_3qtls.Rmd
    Untracked:  analysis/realdata_final_multigroup_validation_weighted_pwy_3qtls.Rmd

Unstaged changes:
    Modified:   analysis/index.Rmd
    Deleted:    analysis/realdata_final_genefunctions_3qtls.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


There are no past versions. Publish this analysis with wflow_publish() to start tracking its development.


library(kableExtra)
library(tidyverse)
-- Attaching packages --------------------------------------- tidyverse 1.3.1 --
v ggplot2 3.5.1     v purrr   1.0.2
v tibble  3.2.1     v dplyr   1.1.4
v tidyr   1.3.0     v stringr 1.5.1
v readr   2.1.2     v forcats 0.5.1
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter()     masks stats::filter()
x dplyr::group_rows() masks kableExtra::group_rows()
x dplyr::lag()        masks stats::lag()

Introduction

We validate the genes with SuSiE PIP > 0.8 reported here: https://sq-96.github.io/multigroup_ctwas_analysis/realdata_final_multigroup_summary.html

The basic idea is:

Some biological pathways are related to the traits. Genes within these pathways are more likely to be associated with these traits. Our approach involves aggregating these genes into a collective group. This allows us to assess whether the genes identified by cTWAS are overrepresented in this group.

However, the presence of common genes across multiple pathways presents a challenge to this straightforward aggregation approach. To address this, we propose weighting the pathways, assigning a unique score to each gene. By selecting genes that meet a specific score threshold, we can form a more refined group. We can then evaluate the enrichment of cTWAS-identified genes within this selectively grouped set.

Model

The model is \(y = Xw\), where w is the vector of pathway weights.

y is an n-dimensional vector representing gene-trait associations (n = number of genes), which can be:

  • z-scores computed by MAGMA
  • a binary vector indicating gene-trait relationships.

We tried different settings for the binary vector:

  • Genes meeting the MAGMA FDR 0.05 threshold are labeled as 1.
  • Genes ranked in the top 500 by MAGMA p-value or meeting the FDR 0.05 threshold are labeled as 1.
  • Genes ranked in the top 1000 by MAGMA p-value or meeting the FDR 0.05 threshold are labeled as 1.
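
The three labeling settings can be sketched in base R as follows. This is a minimal illustration on toy p-values, not the code used for the analysis: the helper `make_labels`, the Benjamini-Hochberg correction, and the strict `<` comparison are assumptions, and `top_n = 6` stands in for the top 500 / top 1000 settings.

```r
# Hypothetical MAGMA gene-level p-values (toy data, not from the analysis)
magma_p <- c(1e-8, 1e-7, 1e-6, 0.01, 0.02, 0.2, 0.4, 0.6, 0.8, 0.9)
names(magma_p) <- paste0("gene", seq_along(magma_p))

# FDR via Benjamini-Hochberg (assumption: the correction method is not stated)
magma_fdr <- p.adjust(magma_p, method = "BH")

# Label a gene 1 if it passes the FDR cutoff or, when top_n is given,
# if it ranks within the top_n smallest p-values
make_labels <- function(p, fdr, top_n = NULL, fdr_cutoff = 0.05) {
  keep <- fdr < fdr_cutoff
  if (!is.null(top_n)) {
    keep <- keep | rank(p, ties.method = "min") <= top_n
  }
  as.integer(keep)
}

y_fdr  <- make_labels(magma_p, magma_fdr)             # FDR-only setting
y_top6 <- make_labels(magma_p, magma_fdr, top_n = 6)  # analogue of top 500 / 1000
```

Because the top-N rule is combined with the FDR rule by a logical OR, the truncated settings can only add genes to the FDR-only label set.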

X is an n×m matrix (m = number of pathways) indicating gene membership in specific pathways.
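
As a small illustration of X, the sketch below builds a gene-by-pathway indicator matrix from a named list of gene sets. The pathway and gene names are hypothetical toy data; the real gene sets and how they were loaded are not shown here.

```r
# Hypothetical pathway gene sets (toy data)
pathways <- list(
  pathwayA = c("gene1", "gene2", "gene3"),
  pathwayB = c("gene2", "gene4"),
  pathwayC = c("gene3", "gene4", "gene5")
)
genes <- sort(unique(unlist(pathways)))

# n x m indicator matrix: X[i, j] = 1 if gene i belongs to pathway j
X <- sapply(pathways, function(gs) as.integer(genes %in% gs))
rownames(X) <- genes

# Genes shared by several pathways have row sums greater than 1
shared <- rowSums(X)
```

Row sums above 1 mark the genes that belong to multiple pathways, which is the overlap that motivates weighting the pathways.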

We fitted this model with a different XGBoost objective depending on the type of y.

If y is a z-score vector, it can be fitted using

- XGBoost: regression with squared loss

If y is a binarized vector, the model can be fitted using

- XGBoost: logistic regression for binary classification, output probability

Benchmarks

The model fitting results in pathway weights, from which we predict gene labels \(\hat{y}\). We then categorize genes based on these new labels.

  • For the z-score model, we convert the predicted labels (z-scores) to p-values and then compute the FDR. Genes passing an FDR cutoff of 0.05, 0.1, or 0.2 are considered benchmarks.

  • For the binarized model, genes with a predicted label > 0.5, 0.6, 0.7, or 0.8 are considered benchmarks.
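
The two benchmark rules can be sketched as below. The predictions are hypothetical toy values; the two-sided p-value conversion and the Benjamini-Hochberg FDR are assumptions, since the exact conversion used in the analysis is not stated.

```r
# Hypothetical model predictions (toy data)
z_hat    <- c(geneA = 4.5, geneB = 0.3, geneC = -3.9, geneD = 1.2, geneE = 2.1)
prob_hat <- c(geneA = 0.92, geneB = 0.15, geneC = 0.65, geneD = 0.55, geneE = 0.48)

# z-score model: two-sided p-values, then BH FDR (both are assumptions)
p_hat   <- 2 * pnorm(-abs(z_hat))
fdr_hat <- p.adjust(p_hat, method = "BH")

fdr_cutoffs <- c(0.05, 0.1, 0.2)
benchmark_z <- lapply(fdr_cutoffs, function(cutoff) names(fdr_hat)[fdr_hat < cutoff])
names(benchmark_z) <- paste0("fdr", fdr_cutoffs)

# binarized model: genes whose predicted probability exceeds the cutoff
prob_cutoffs <- c(0.5, 0.6, 0.7, 0.8)
benchmark_bin <- lapply(prob_cutoffs, function(cutoff) names(prob_hat)[prob_hat > cutoff])
names(benchmark_bin) <- paste0("prob", prob_cutoffs)
```

Stricter cutoffs give nested, smaller benchmark sets, which is why the number of benchmark genes shrinks as the probability cutoff increases.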

Testing genes

Genes from the cTWAS results are divided into groups based on their SuSiE PIPs:

  • high (PIP > 0.8)
  • moderate (0.5 < PIP < 0.8)
  • low (PIP < 0.5)
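
A minimal sketch of this grouping with `cut()`, using hypothetical PIP values. How genes exactly at 0.5 or 0.8 are assigned is not specified above; here they fall into the lower group, which is an assumption.

```r
# Hypothetical SuSiE PIPs (toy data)
pip <- c(geneA = 0.95, geneB = 0.62, geneC = 0.10, geneD = 0.81, geneE = 0.49)

# Intervals are (-Inf, 0.5], (0.5, 0.8], (0.8, Inf):
# boundary values go to the lower group (assumption)
pip_group <- cut(pip,
                 breaks = c(-Inf, 0.5, 0.8, Inf),
                 labels = c("low", "moderate", "high"))
names(pip_group) <- names(pip)

group_sizes <- table(pip_group)
```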

Fisher exact test

We use Fisher's exact test to assess whether high-PIP genes are more enriched in our benchmarks than genes in the other groups.

The testing matrix is:

fisher_matrix <- matrix(c("n1","n2","n3","n4"),nrow = 2,ncol = 2)
rownames(fisher_matrix) <- c("#included","#notincluded")
colnames(fisher_matrix) <- c("pip08","other group")

print(fisher_matrix)
             pip08 other group
#included    "n1"  "n3"       
#notincluded "n2"  "n4"       
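
With actual counts in place of the placeholders, the test can be run with `fisher.test()`. The counts below are hypothetical, and the one-sided alternative (testing for enrichment only) is an assumption; the setting used for the tables below is not stated.

```r
# Hypothetical counts (toy data, not from the analysis)
n1 <- 12    # high-PIP genes included in the benchmark set
n2 <- 61    # high-PIP genes not included
n3 <- 52    # comparison-group genes included in the benchmark set
n4 <- 5269  # comparison-group genes not included

fisher_counts <- matrix(c(n1, n2, n3, n4), nrow = 2, ncol = 2,
                        dimnames = list(c("#included", "#notincluded"),
                                        c("pip08", "other group")))

# One-sided test: is the odds ratio for the high-PIP group greater than 1?
res <- fisher.test(fisher_counts, alternative = "greater")
res$p.value
```

The matrix is filled column by column, so n1 and n2 form the `pip08` column and n3 and n4 the `other group` column, matching the layout printed above.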

Pathways

The pathways are from GO Biological Process (GOBP), GO Molecular Function (GOMF), GO Cellular Component (GOCC), and KEGG.

Results

GOBP

Binary, not truncated

df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_prob0.5_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 64 73 5394 5321 73 0.00059 0.00039 1.00000
IBD-ebi-a-GCST004131 115 28 3749 3699 50 0.00205 0.00175 0.44778
aFib-ebi-a-GCST006414 36 53 3540 3489 51 0.00030 0.00022 0.67849
SBP-ukb-a-360 126 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 70 17 2855 2822 31 0.23603 0.22910 1.00000
Height-panukb 15148 208 13763 13584 179 0.00740 0.00737 0.87110
HTN-panukb 108 42 5465 5400 65 0.03689 0.03266 1.00000
PLT-panukb 15219 119 10616 10471 145 0.33397 0.33579 0.04387
RA-panukb 27 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 10362 122 10154 9976 178 0.00056 0.00055 0.69153
ATH_gtexukb 70 21 3276 3227 49 0.03517 0.03336 0.57803
BMI-panukb 13394 53 9215 9114 101 0.75242 0.75246 0.39287
HB-panukb 1112 90 8649 8522 127 0.02761 0.02634 0.84386
T2D-panukb 17 14 2460 2423 37 0.06597 0.06695 0.27451
SCZ-ieu-b-5102 190 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 43 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 6 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 2 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 10 4 726 717 9 0.00047 0.00048 0.07692
NS-ukb-a-230 48 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 18 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 13703 243 11369 11103 266 0.00016 0.00012 0.38708
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_prob0.6_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 24 73 5394 5321 73 0.23611 0.20704 0.61977
IBD-ebi-a-GCST004131 76 28 3749 3699 50 0.00055 0.00046 0.24286
aFib-ebi-a-GCST006414 20 53 3540 3489 51 0.00178 0.00186 0.24300
SBP-ukb-a-360 53 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 35 17 2855 2822 31 1.00000 1.00000 0.53280
Height-panukb 15028 208 13763 13584 179 0.00583 0.00454 0.87110
HTN-panukb 44 42 5465 5400 65 1.00000 1.00000 0.51860
PLT-panukb 6897 119 10616 10471 145 0.00095 0.00070 0.08313
RA-panukb 23 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 1491 122 10154 9976 178 0.00001 0.00000 0.32109
ATH_gtexukb 36 21 3276 3227 49 0.15904 0.16125 0.30000
BMI-panukb 1194 53 9215 9114 101 0.11045 0.10709 1.00000
HB-panukb 233 90 8649 8522 127 0.08272 0.08319 0.23524
T2D-panukb 7 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 97 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 9 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 3 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 2 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 6 4 726 717 9 0.02717 0.02751 0.30769
NS-ukb-a-230 21 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 7 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 1617 243 11369 11103 266 0.00000 0.00000 0.25095
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_prob0.7_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 10 73 5394 5321 73 0.10202 0.07854 1.00000
IBD-ebi-a-GCST004131 46 28 3749 3699 50 0.02292 0.01950 1.00000
aFib-ebi-a-GCST006414 7 53 3540 3489 51 0.00428 0.00440 0.49533
SBP-ukb-a-360 18 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 24 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 14703 208 13763 13584 179 0.00431 0.00337 1.00000
HTN-panukb 15 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 418 119 10616 10471 145 0.77556 0.77460 0.19290
RA-panukb 18 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 283 122 10154 9976 178 0.00360 0.00326 0.41933
ATH_gtexukb 27 21 3276 3227 49 0.13153 0.13339 0.30000
BMI-panukb 39 53 9215 9114 101 0.16312 0.16477 0.34416
HB-panukb 41 90 8649 8522 127 0.04260 0.04374 0.17089
T2D-panukb 3 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 51 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 4 4 726 717 9 0.02178 0.02205 0.30769
NS-ukb-a-230 9 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 1 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 108 243 11369 11103 266 0.10926 0.09895 1.00000
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_prob0.8_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 3 73 5394 5321 73 1.00000 1.00000 1.00000
IBD-ebi-a-GCST004131 19 28 3749 3699 50 0.09909 0.09353 1.00000
aFib-ebi-a-GCST006414 3 53 3540 3489 51 0.04362 0.04423 1.00000
SBP-ukb-a-360 4 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 9 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 7348 208 13763 13584 179 0.00228 0.00224 0.91891
HTN-panukb 1 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 26 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 14 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 36 122 10154 9976 178 0.30149 0.28880 1.00000
ATH_gtexukb 5 21 3276 3227 49 1.00000 1.00000 1.00000
BMI-panukb 0 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 4 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 2 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 4 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 1 4 726 717 9 1.00000 1.00000 1.00000
NS-ukb-a-230 1 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 0 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 0 243 11369 11103 266 1.00000 1.00000 1.00000

Binary, truncated top 500 genes

df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.5_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 33 73 5394 5321 73 0.00000 0.00000 0.74505
IBD-ebi-a-GCST004131 41 28 3749 3699 50 0.00010 0.00010 0.01435
aFib-ebi-a-GCST006414 23 53 3540 3489 51 0.02521 0.02306 1.00000
SBP-ukb-a-360 7 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 48 17 2855 2822 31 1.00000 1.00000 0.53280
Height-panukb 35 208 13763 13584 179 1.00000 1.00000 0.46253
HTN-panukb 10 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 22 119 10616 10471 145 0.20000 0.19337 1.00000
RA-panukb 25 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 25 122 10154 9976 178 0.25839 0.25328 1.00000
ATH_gtexukb 54 21 3276 3227 49 0.00233 0.00212 0.15514
BMI-panukb 11 53 9215 9114 101 0.05033 0.05087 0.34416
HB-panukb 20 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 20 14 2460 2423 37 0.08182 0.07769 0.47765
SCZ-ieu-b-5102 25 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 19 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 6 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 2 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 9 4 726 717 9 0.03254 0.03294 0.30769
NS-ukb-a-230 17 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 26 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 43 243 11369 11103 266 0.05391 0.05041 0.67332
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.6_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 24 73 5394 5321 73 0.00021 0.00010 1.00000
IBD-ebi-a-GCST004131 31 28 3749 3699 50 0.00075 0.00078 0.04306
aFib-ebi-a-GCST006414 11 53 3540 3489 51 0.09888 0.10024 1.00000
SBP-ukb-a-360 3 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 36 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 22 208 13763 13584 179 1.00000 1.00000 1.00000
HTN-panukb 7 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 14 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 23 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 19 122 10154 9976 178 0.20318 0.20638 0.40667
ATH_gtexukb 39 21 3276 3227 49 0.18571 0.18292 0.51304
BMI-panukb 6 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 14 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 15 14 2460 2423 37 0.07129 0.06695 0.47765
SCZ-ieu-b-5102 22 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 3 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 6 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 2 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 5 4 726 717 9 0.02717 0.02751 0.30769
NS-ukb-a-230 11 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 14 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 27 243 11369 11103 266 0.43541 0.43080 1.00000
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.7_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 11 73 5394 5321 73 0.00026 0.00013 1.00000
IBD-ebi-a-GCST004131 17 28 3749 3699 50 0.09909 0.10036 0.35897
aFib-ebi-a-GCST006414 6 53 3540 3489 51 0.05773 0.05855 1.00000
SBP-ukb-a-360 0 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 28 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 12 208 13763 13584 179 1.00000 1.00000 1.00000
HTN-panukb 2 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 8 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 18 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 14 122 10154 9976 178 1.00000 1.00000 1.00000
ATH_gtexukb 29 21 3276 3227 49 0.14814 0.15021 0.30000
BMI-panukb 2 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 11 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 8 14 2460 2423 37 0.04445 0.03957 0.47765
SCZ-ieu-b-5102 8 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 1 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 1 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 3 4 726 717 9 0.01637 0.01657 0.30769
NS-ukb-a-230 5 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 6 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 15 243 11369 11103 266 1.00000 1.00000 1.00000
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.8_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 9 73 5394 5321 73 0.00012 0.00008 0.61977
IBD-ebi-a-GCST004131 5 28 3749 3699 50 0.02934 0.02973 0.35897
aFib-ebi-a-GCST006414 1 53 3540 3489 51 1.00000 1.00000 1.00000
SBP-ukb-a-360 0 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 11 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 5 208 13763 13584 179 1.00000 1.00000 1.00000
HTN-panukb 0 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 1 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 14 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 7 122 10154 9976 178 1.00000 1.00000 1.00000
ATH_gtexukb 13 21 3276 3227 49 1.00000 1.00000 1.00000
BMI-panukb 0 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 4 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 1 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 7 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 0 4 726 717 9 1.00000 1.00000 1.00000
NS-ukb-a-230 1 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 3 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 4 243 11369 11103 266 1.00000 1.00000 1.00000

Binary, truncated top 1000 genes

df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc1000_prob0.5_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 47 73 5394 5321 73 0.00244 0.00178 1.00000
IBD-ebi-a-GCST004131 73 28 3749 3699 50 0.00073 0.00072 0.05315
aFib-ebi-a-GCST006414 23 53 3540 3489 51 0.00078 0.00063 0.61790
SBP-ukb-a-360 20 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 50 17 2855 2822 31 1.00000 1.00000 0.54296
Height-panukb 47 208 13763 13584 179 0.13926 0.13700 1.00000
HTN-panukb 16 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 37 119 10616 10471 145 0.05147 0.04711 1.00000
RA-panukb 27 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 27 122 10154 9976 178 0.26722 0.27125 0.40667
ATH_gtexukb 56 21 3276 3227 49 0.03245 0.03066 0.57803
BMI-panukb 14 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 41 90 8649 8522 127 0.23627 0.23126 1.00000
T2D-panukb 9 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 42 15 3547 3511 36 0.08504 0.08194 0.50588
BIP-ieu-b-5110 45 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 7 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 3 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 9 4 726 717 9 0.03254 0.03294 0.30769
NS-ukb-a-230 40 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 19 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 52 243 11369 11103 266 0.00449 0.00350 0.74270
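# NOTE: this path reads the trunc500 file, so the table below repeats the top-500 result at cutoff 0.6; a trunc1000 counterpart was not loaded here.
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.6_gobp.RDS")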

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 24 73 5394 5321 73 0.00021 0.00010 1.00000
IBD-ebi-a-GCST004131 31 28 3749 3699 50 0.00075 0.00078 0.04306
aFib-ebi-a-GCST006414 11 53 3540 3489 51 0.09888 0.10024 1.00000
SBP-ukb-a-360 3 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 36 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 22 208 13763 13584 179 1.00000 1.00000 1.00000
HTN-panukb 7 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 14 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 23 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 19 122 10154 9976 178 0.20318 0.20638 0.40667
ATH_gtexukb 39 21 3276 3227 49 0.18571 0.18292 0.51304
BMI-panukb 6 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 14 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 15 14 2460 2423 37 0.07129 0.06695 0.47765
SCZ-ieu-b-5102 22 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 3 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 6 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 2 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 5 4 726 717 9 0.02717 0.02751 0.30769
NS-ukb-a-230 11 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 14 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 27 243 11369 11103 266 0.43541 0.43080 1.00000
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc1000_prob0.7_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 22 73 5394 5321 73 0.00221 0.00168 1.00000
IBD-ebi-a-GCST004131 25 28 3749 3699 50 0.01117 0.01146 0.12587
aFib-ebi-a-GCST006414 4 53 3540 3489 51 0.02929 0.02971 1.00000
SBP-ukb-a-360 5 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 22 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 14 208 13763 13584 179 0.16479 0.15398 1.00000
HTN-panukb 0 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 9 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 20 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 11 122 10154 9976 178 0.12317 0.12521 0.40667
ATH_gtexukb 27 21 3276 3227 49 0.13710 0.13903 0.30000
BMI-panukb 4 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 4 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 1 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 7 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 1 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 3 4 726 717 9 0.01637 0.01657 0.30769
NS-ukb-a-230 6 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 1 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 10 243 11369 11103 266 0.19069 0.17710 1.00000
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc1000_prob0.8_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 8 73 5394 5321 73 0.08986 0.07854 1.00000
IBD-ebi-a-GCST004131 15 28 3749 3699 50 0.09234 0.09353 0.35897
aFib-ebi-a-GCST006414 1 53 3540 3489 51 0.01475 0.01496 1.00000
SBP-ukb-a-360 1 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 8 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 5 208 13763 13584 179 0.07227 0.07318 1.00000
HTN-panukb 0 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 3 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 14 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 1 122 10154 9976 178 1.00000 1.00000 1.00000
ATH_gtexukb 6 21 3276 3227 49 1.00000 1.00000 1.00000
BMI-panukb 1 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 0 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 0 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 0 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 0 4 726 717 9 1.00000 1.00000 1.00000
NS-ukb-a-230 1 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 0 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 3 243 11369 11103 266 0.06148 0.04238 1.00000

Z-scores

df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_zscore/results_zfdr0.05_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff FDR 0.05, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff FDR 0.05, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 75 73 5394 5321 73 0.00243 0.00199 1.00000
IBD-ebi-a-GCST004131 40 28 3749 3699 50 0.00000 0.00001 0.00466
aFib-ebi-a-GCST006414 7 53 3540 3489 51 0.00428 0.00317 1.00000
SBP-ukb-a-360 6 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 27 17 2855 2822 31 0.14873 0.15033 0.35417
Height-panukb 15515 208 13763 13584 179 0.00933 0.00934 0.87110
HTN-panukb 9 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 15484 119 10616 10471 145 0.45810 0.45922 0.11976
RA-panukb 15 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 14551 122 10154 9976 178 0.08884 0.08885 0.86845
ATH_gtexukb 37 21 3276 3227 49 0.02026 0.01972 0.21228
BMI-panukb 14562 53 9215 9114 101 0.29857 0.29850 1.00000
HB-panukb 169 90 8649 8522 127 0.38835 0.36547 0.20077
T2D-panukb 4 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 25 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 2 4 726 717 9 0.01094 0.01107 0.30769
NS-ukb-a-230 2 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 0 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 14788 243 11369 11103 266 0.00247 0.00194 0.68161
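
The same highlighting pipeline is repeated for every table in this section. As a minimal sketch, the formatting step could be factored into one helper. The version below uses base R only; the function name `highlight_pvals`, the `alpha` argument, and the plain `<span>` markup (standing in for `cell_spec()`) are assumptions for illustration, not part of the original analysis.

```r
# Hypothetical helper: format every pval_* column to 5 decimals and wrap
# values below alpha in a red HTML span (a stand-in for kableExtra::cell_spec).
highlight_pvals <- function(df, alpha = 0.05) {
  pval_cols <- grep("^pval_", names(df), value = TRUE)
  for (col in pval_cols) {
    p <- df[[col]]
    txt <- sprintf("%.5f", p)
    sig <- !is.na(p) & p < alpha
    txt[sig] <- sprintf('<span style="color: red;">%s</span>', txt[sig])
    df[[col]] <- txt
  }
  df
}

# Toy input with two rows copied from the FDR 0.05 table above.
toy <- data.frame(
  trait = c("LDL-ukb-d-30780_irnt", "SBP-ukb-a-360"),
  pval_08p08m = c(0.00243, 1),
  pval_08p0508 = c(1, 1),
  stringsAsFactors = FALSE
)
out <- highlight_pvals(toy)
print(out)
```

The returned data frame can then be passed to `kable(escape = FALSE, format = "html")` and `kable_styling()` as in the chunks above.
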
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_zscore/results_zfdr0.1_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff FDR 0.1, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff FDR 0.1, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 78 73 5394 5321 73 0.00042 0.00033 0.74505
IBD-ebi-a-GCST004131 45 28 3749 3699 50 0.00001 0.00001 0.00466
aFib-ebi-a-GCST006414 11 53 3540 3489 51 0.00000 0.00000 0.20550
SBP-ukb-a-360 7 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 27 17 2855 2822 31 0.14873 0.15033 0.35417
Height-panukb 15679 208 13763 13584 179 0.01178 0.01185 0.87110
HTN-panukb 16 42 5465 5400 65 0.10175 0.09591 1.00000
PLT-panukb 15534 119 10616 10471 145 0.38775 0.45681 0.11976
RA-panukb 17 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 15063 122 10154 9976 178 0.15725 0.15738 0.73770
ATH_gtexukb 43 21 3276 3227 49 0.02251 0.02198 0.21228
BMI-panukb 15131 53 9215 9114 101 0.47304 0.47293 1.00000
HB-panukb 14479 90 8649 8522 127 0.00077 0.00078 0.08305
T2D-panukb 4 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 32 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 3 4 726 717 9 0.01637 0.01657 0.30769
NS-ukb-a-230 2 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 0 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 15423 243 11369 11103 266 0.01223 0.00977 0.68161
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_zscore/results_zfdr0.2_gobp.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff FDR 0.2, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff FDR 0.2, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 92 73 5394 5321 73 0.00088 0.00073 0.74505
IBD-ebi-a-GCST004131 47 28 3749 3699 50 0.00000 0.00000 0.00147
aFib-ebi-a-GCST006414 14 53 3540 3489 51 0.00000 0.00000 0.20550
SBP-ukb-a-360 7 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 33 17 2855 2822 31 0.16891 0.16058 1.00000
Height-panukb 15751 208 13763 13584 179 0.01482 0.01179 0.87110
HTN-panukb 19 42 5465 5400 65 0.11544 0.10985 1.00000
PLT-panukb 15578 119 10616 10471 145 0.38265 0.38415 0.08045
RA-panukb 17 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 15704 122 10154 9976 178 0.27785 0.23099 0.73851
ATH_gtexukb 44 21 3276 3227 49 0.02251 0.02198 0.21228
BMI-panukb 15275 53 9215 9114 101 0.58372 0.58390 1.00000
HB-panukb 15520 90 8649 8522 127 0.00337 0.00336 0.08305
T2D-panukb 6 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 57 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 3 4 726 717 9 0.01637 0.01657 0.30769
NS-ukb-a-230 2 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 0 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 15680 243 11369 11103 266 0.02287 0.01847 0.68161
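
The count columns appear to be related: in most rows `num_gene_08m` equals `num_gene_05m + num_gene_0508`. Assuming this identity is intended (the column definitions are not given in this section, so this is an inference from the column names and the numbers), a quick check like the following can flag rows that deviate. The data frame is a hand-copied subset of the tables above, and the `diff` and `mismatch` names are illustrative.

```r
# Rows copied from the tables above (these counts are the same at every cutoff).
counts <- data.frame(
  trait = c("LDL-ukb-d-30780_irnt", "IBD-ebi-a-GCST004131",
            "T1D-GCST90014023", "WBC-ieu-b-30"),
  num_gene_08m  = c(5394, 3749, 2855, 11369),
  num_gene_05m  = c(5321, 3699, 2822, 11103),
  num_gene_0508 = c(73, 50, 31, 266),
  stringsAsFactors = FALSE
)

# Difference between the reported total and the sum of the two sub-groups.
counts$diff <- counts$num_gene_08m -
  (counts$num_gene_05m + counts$num_gene_0508)

# Keep only rows where the assumed identity does not hold.
mismatch <- counts[counts$diff != 0, c("trait", "diff")]
print(mismatch)
```

On this subset only T1D-GCST90014023 deviates (2855 versus 2822 + 31 = 2853); whether that reflects the data or the grouping rule cannot be told from the tables alone.
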

GOBP + GOCC + GOMF + KEGG jointly

Binary, not truncated

df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_prob0.5_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 89 73 5394 5321 73 0.00014 0.00010 0.74505
IBD-ebi-a-GCST004131 111 28 3749 3699 50 0.00000 0.00000 0.00266
aFib-ebi-a-GCST006414 34 53 3540 3489 51 0.00030 0.00027 0.36316
SBP-ukb-a-360 112 35 4909 4858 51 0.00732 0.00682 0.39306
T1D-GCST90014023 122 17 2855 2822 31 1.00000 1.00000 0.54296
Height-panukb 16807 208 13763 13584 179 0.00043 0.00043 0.32667
HTN-panukb 199 42 5465 5400 65 1.00000 1.00000 0.15414
PLT-panukb 16707 119 10616 10471 145 0.07674 0.07670 0.48462
RA-panukb 30 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 11994 122 10154 9976 178 0.01093 0.01086 1.00000
ATH_gtexukb 120 21 3276 3227 49 0.04533 0.04657 0.08696
BMI-panukb 14025 53 9215 9114 101 0.41054 0.41050 1.00000
HB-panukb 1462 90 8649 8522 127 0.07592 0.07363 1.00000
T2D-panukb 29 14 2460 2423 37 0.10257 0.10405 0.27451
SCZ-ieu-b-5102 303 15 3547 3511 36 0.36496 0.36237 1.00000
BIP-ieu-b-5110 104 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 6 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 11 4 726 717 9 0.00080 0.00082 0.07692
NS-ukb-a-230 51 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 35 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 14554 243 11369 11103 266 0.00006 0.00005 1.00000
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_prob0.6_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 50 73 5394 5321 73 0.00313 0.00256 1.00000
IBD-ebi-a-GCST004131 71 28 3749 3699 50 0.00002 0.00002 0.02060
aFib-ebi-a-GCST006414 12 53 3540 3489 51 0.00719 0.00581 1.00000
SBP-ukb-a-360 41 35 4909 4858 51 0.00772 0.00788 0.16279
T1D-GCST90014023 81 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 16646 208 13763 13584 179 0.00059 0.00042 0.47319
HTN-panukb 128 42 5465 5400 65 1.00000 1.00000 0.27791
PLT-panukb 7535 119 10616 10471 145 0.00038 0.00027 0.53086
RA-panukb 26 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 1861 122 10154 9976 178 0.00000 0.00000 0.21649
ATH_gtexukb 64 21 3276 3227 49 0.19616 0.19882 0.30000
BMI-panukb 1789 53 9215 9114 101 0.17654 0.17453 0.82876
HB-panukb 319 90 8649 8522 127 0.04774 0.04665 0.49425
T2D-panukb 15 14 2460 2423 37 0.06063 0.06153 0.27451
SCZ-ieu-b-5102 185 15 3547 3511 36 0.22534 0.22065 1.00000
BIP-ieu-b-5110 68 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 4 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 7 4 726 717 9 0.00034 0.00034 0.07692
NS-ukb-a-230 24 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 25 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 2689 243 11369 11103 266 0.00000 0.00000 0.34379
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_prob0.7_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 20 73 5394 5321 73 0.02344 0.01918 1.00000
IBD-ebi-a-GCST004131 38 28 3749 3699 50 0.00117 0.00110 0.12919
aFib-ebi-a-GCST006414 7 53 3540 3489 51 0.07165 0.05855 1.00000
SBP-ukb-a-360 17 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 56 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 16036 208 13763 13584 179 0.00050 0.00050 0.52703
HTN-panukb 76 42 5465 5400 65 1.00000 1.00000 0.51860
PLT-panukb 720 119 10616 10471 145 0.07638 0.07481 0.82302
RA-panukb 24 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 313 122 10154 9976 178 0.03521 0.03415 0.36371
ATH_gtexukb 43 21 3276 3227 49 0.15361 0.15575 0.30000
BMI-panukb 248 53 9215 9114 101 0.00524 0.00499 0.23375
HB-panukb 91 90 8649 8522 127 0.34650 0.35062 0.41475
T2D-panukb 10 14 2460 2423 37 0.03899 0.03957 0.27451
SCZ-ieu-b-5102 104 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 16 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 2 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 4 4 726 717 9 0.01637 0.01657 0.30769
NS-ukb-a-230 10 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 16 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 501 243 11369 11103 266 0.00168 0.00088 0.72005
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_prob0.8_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 6 73 5394 5321 73 1.00000 1.00000 0.49655
IBD-ebi-a-GCST004131 22 28 3749 3699 50 0.12562 0.12057 1.00000
aFib-ebi-a-GCST006414 1 53 3540 3489 51 1.00000 1.00000 1.00000
SBP-ukb-a-360 4 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 24 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 7198 208 13763 13584 179 0.08952 0.08877 0.26242
HTN-panukb 15 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 77 119 10616 10471 145 1.00000 1.00000 0.25445
RA-panukb 21 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 71 122 10154 9976 178 0.05813 0.05995 0.16457
ATH_gtexukb 21 21 3276 3227 49 1.00000 1.00000 1.00000
BMI-panukb 10 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 44 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 7 14 2460 2423 37 0.03351 0.03401 0.27451
SCZ-ieu-b-5102 21 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 5 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 0 4 726 717 9 1.00000 1.00000 1.00000
NS-ukb-a-230 5 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 1 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 123 243 11369 11103 266 0.35855 0.35371 1.00000
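
The `pval_*` columns are named after pairs of gene groups (for example, `pval_08p08m` pairs `num_gene_08p` with `num_gene_08m`), and the result paths contain `fisher`, which suggests each p-value comes from a Fisher exact test on a 2x2 table. The exact test setup is not shown in this section, so the sketch below is only one plausible form: the overlap counts and the one-sided alternative are assumptions chosen for illustration, not values from the analysis.

```r
# Hypothetical 2x2 table: rows = benchmark / non-benchmark genes,
# columns = the two gene groups being compared. The overlap counts (10 and 50)
# are made up; only the group sizes 73 and 5394 come from the LDL row.
n_group_a <- 73
n_group_b <- 5394
bench_a <- 10   # assumed number of benchmark genes in group A
bench_b <- 50   # assumed number of benchmark genes in group B

tab <- matrix(
  c(bench_a, n_group_a - bench_a,
    bench_b, n_group_b - bench_b),
  nrow = 2,
  dimnames = list(c("benchmark", "not_benchmark"), c("group_a", "group_b"))
)

# One-sided test: is the benchmark fraction higher in group A than in group B?
res <- fisher.test(tab, alternative = "greater")
print(res$p.value)
```

With these made-up counts the benchmark fraction is far higher in group A, so the p-value falls well below 0.05 and the cell would be highlighted in red by the formatting step used for the tables.
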

Binary, truncated top 500 genes

df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.5_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 36 73 5394 5321 73 0.00100 0.00074 1.00000
IBD-ebi-a-GCST004131 56 28 3749 3699 50 0.00038 0.00036 0.05315
aFib-ebi-a-GCST006414 26 53 3540 3489 51 0.00025 0.00022 0.36316
SBP-ukb-a-360 9 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 63 17 2855 2822 31 0.21739 0.20525 1.00000
Height-panukb 46 208 13763 13584 179 0.12915 0.12680 1.00000
HTN-panukb 9 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 30 119 10616 10471 145 0.27653 0.27154 1.00000
RA-panukb 29 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 21 122 10154 9976 178 0.22202 0.21599 1.00000
ATH_gtexukb 68 21 3276 3227 49 0.03656 0.03336 0.63237
BMI-panukb 9 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 39 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 19 14 2460 2423 37 0.09742 0.09359 0.47765
SCZ-ieu-b-5102 68 15 3547 3511 36 0.10426 0.09756 1.00000
BIP-ieu-b-5110 26 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 6 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 4 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 11 4 726 717 9 0.00080 0.00082 0.07692
NS-ukb-a-230 13 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 30 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 38 243 11369 11103 266 0.04463 0.04419 0.35238
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.6_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 27 73 5394 5321 73 0.00035 0.00023 1.00000
IBD-ebi-a-GCST004131 45 28 3749 3699 50 0.00279 0.00290 0.04306
aFib-ebi-a-GCST006414 13 53 3540 3489 51 0.00891 0.00740 1.00000
SBP-ukb-a-360 2 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 44 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 31 208 13763 13584 179 0.34322 0.33680 1.00000
HTN-panukb 6 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 21 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 25 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 15 122 10154 9976 178 0.16412 0.16678 0.40667
ATH_gtexukb 44 21 3276 3227 49 0.01918 0.01757 0.57803
BMI-panukb 3 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 14 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 12 14 2460 2423 37 0.06063 0.05609 0.47765
SCZ-ieu-b-5102 37 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 1 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 4 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 6 4 726 717 9 0.02717 0.02751 0.30769
NS-ukb-a-230 8 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 22 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 27 243 11369 11103 266 0.43541 0.44302 0.47741
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.7_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 21 73 5394 5321 73 0.00012 0.00008 0.68090
IBD-ebi-a-GCST004131 25 28 3749 3699 50 0.00051 0.00053 0.04306
aFib-ebi-a-GCST006414 6 53 3540 3489 51 0.00208 0.00213 0.49533
SBP-ukb-a-360 1 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 30 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 17 208 13763 13584 179 0.21347 0.20392 1.00000
HTN-panukb 4 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 13 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 23 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 13 122 10154 9976 178 1.00000 1.00000 1.00000
ATH_gtexukb 31 21 3276 3227 49 0.01322 0.01360 0.08696
BMI-panukb 1 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 12 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 11 14 2460 2423 37 0.05527 0.05061 0.47765
SCZ-ieu-b-5102 13 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 2 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 3 4 726 717 9 0.01637 0.01657 0.30769
NS-ukb-a-230 1 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 15 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 20 243 11369 11103 266 0.34513 0.35168 0.47741
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.8_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 13 73 5394 5321 73 0.01065 0.00918 1.00000
IBD-ebi-a-GCST004131 19 28 3749 3699 50 0.00673 0.00690 0.12587
aFib-ebi-a-GCST006414 1 53 3540 3489 51 1.00000 1.00000 1.00000
SBP-ukb-a-360 0 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 26 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 10 208 13763 13584 179 0.13933 0.14102 1.00000
HTN-panukb 0 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 9 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 21 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 10 122 10154 9976 178 1.00000 1.00000 1.00000
ATH_gtexukb 21 21 3276 3227 49 0.12593 0.12772 0.30000
BMI-panukb 0 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 10 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 8 14 2460 2423 37 0.04445 0.03957 0.47765
SCZ-ieu-b-5102 7 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 1 4 726 717 9 1.00000 1.00000 1.00000
NS-ukb-a-230 0 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 0 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 10 243 11369 11103 266 1.00000 1.00000 1.00000

Binary, truncated top 1000 genes

df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc1000_prob0.5_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.5, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 52 73 5394 5321 73 0.00054 0.00035 1.00000
IBD-ebi-a-GCST004131 71 28 3749 3699 50 0.00100 0.00082 0.44778
aFib-ebi-a-GCST006414 26 53 3540 3489 51 0.00098 0.00081 0.61790
SBP-ukb-a-360 29 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 111 17 2855 2822 31 1.00000 1.00000 0.53280
Height-panukb 49 208 13763 13584 179 0.00478 0.00462 0.37882
HTN-panukb 17 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 44 119 10616 10471 145 0.37442 0.36420 1.00000
RA-panukb 37 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 29 122 10154 9976 178 0.00348 0.00326 0.30765
ATH_gtexukb 105 21 3276 3227 49 0.06320 0.06139 0.57803
BMI-panukb 23 53 9215 9114 101 0.10847 0.09920 1.00000
HB-panukb 77 90 8649 8522 127 0.02120 0.02178 0.17089
T2D-panukb 15 14 2460 2423 37 0.07129 0.07233 0.27451
SCZ-ieu-b-5102 96 15 3547 3511 36 0.01066 0.00979 0.57143
BIP-ieu-b-5110 31 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 9 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 11 4 726 717 9 0.00080 0.00082 0.07692
NS-ukb-a-230 36 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 27 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 67 243 11369 11103 266 0.04480 0.04144 0.71413
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc500_prob0.6_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.6, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 27 73 5394 5321 73 0.00035 0.00023 1.00000
IBD-ebi-a-GCST004131 45 28 3749 3699 50 0.00279 0.00290 0.04306
aFib-ebi-a-GCST006414 13 53 3540 3489 51 0.00891 0.00740 1.00000
SBP-ukb-a-360 2 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 44 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 31 208 13763 13584 179 0.34322 0.33680 1.00000
HTN-panukb 6 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 21 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 25 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 15 122 10154 9976 178 0.16412 0.16678 0.40667
ATH_gtexukb 44 21 3276 3227 49 0.01918 0.01757 0.57803
BMI-panukb 3 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 14 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 12 14 2460 2423 37 0.06063 0.05609 0.47765
SCZ-ieu-b-5102 37 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 1 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 4 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 6 4 726 717 9 0.02717 0.02751 0.30769
NS-ukb-a-230 8 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 22 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 27 243 11369 11103 266 0.43541 0.44302 0.47741
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc1000_prob0.7_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.7, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 25 73 5394 5321 73 0.00378 0.00347 0.61977
IBD-ebi-a-GCST004131 24 28 3749 3699 50 0.14502 0.14033 1.00000
aFib-ebi-a-GCST006414 5 53 3540 3489 51 0.04362 0.04423 1.00000
SBP-ukb-a-360 2 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 59 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 15 208 13763 13584 179 1.00000 1.00000 1.00000
HTN-panukb 8 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 19 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 25 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 14 122 10154 9976 178 0.15407 0.15657 0.40667
ATH_gtexukb 43 21 3276 3227 49 1.00000 1.00000 1.00000
BMI-panukb 7 53 9215 9114 101 0.03383 0.02293 1.00000
HB-panukb 58 90 8649 8522 127 0.16151 0.16369 0.41475
T2D-panukb 7 14 2460 2423 37 0.03899 0.03957 0.27451
SCZ-ieu-b-5102 26 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 7 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 4 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 1 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 3 4 726 717 9 0.01637 0.01657 0.30769
NS-ukb-a-230 12 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 13 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 22 243 11369 11103 266 1.00000 1.00000 0.50004
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary/results_y_pred_xgboost_gene_score_bin_trunc1000_prob0.8_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (XGBoost prediction): 0.8, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 17 73 5394 5321 73 0.00136 0.00117 0.61977
IBD-ebi-a-GCST004131 12 28 3749 3699 50 0.07869 0.07971 0.35897
aFib-ebi-a-GCST006414 0 53 3540 3489 51 1.00000 1.00000 1.00000
SBP-ukb-a-360 0 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 22 17 2855 2822 31 1.00000 1.00000 1.00000
Height-panukb 11 208 13763 13584 179 1.00000 1.00000 1.00000
HTN-panukb 7 42 5465 5400 65 1.00000 1.00000 1.00000
PLT-panukb 13 119 10616 10471 145 1.00000 1.00000 1.00000
RA-panukb 23 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 11 122 10154 9976 178 1.00000 1.00000 1.00000
ATH_gtexukb 27 21 3276 3227 49 1.00000 1.00000 1.00000
BMI-panukb 1 53 9215 9114 101 1.00000 1.00000 1.00000
HB-panukb 37 90 8649 8522 127 1.00000 1.00000 1.00000
T2D-panukb 1 14 2460 2423 37 1.00000 1.00000 1.00000
SCZ-ieu-b-5102 22 15 3547 3511 36 1.00000 1.00000 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 1 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 1 4 726 717 9 1.00000 1.00000 1.00000
NS-ukb-a-230 1 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 1 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 17 243 11369 11103 266 1.00000 1.00000 1.00000
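
The chunks in this section differ only in the truncation size and the prediction cutoff embedded in the file name. As a rough sketch, the paths could be built programmatically, which may help avoid copy-paste slips (the 0.6 chunk under this heading reads a `trunc500` file, so its table matches the top-500 one). The helper name `binary_result_path` and the `res_dir` variable are hypothetical; the directory and file-name pattern are taken from the chunks above.

```r
# Directory as used in the chunks above.
res_dir <- "/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_binary"

# Hypothetical helper: build the result path for a truncation size and cutoff.
binary_result_path <- function(trunc, prob, dir = res_dir) {
  file.path(dir, sprintf(
    "results_y_pred_xgboost_gene_score_bin_trunc%d_prob%.1f_all.RDS",
    as.integer(trunc), prob
  ))
}

# One path per prediction cutoff for the top-1000 truncation.
probs <- c(0.5, 0.6, 0.7, 0.8)
paths <- vapply(probs, function(p) binary_result_path(1000, p), character(1))
print(basename(paths))

# Loading is guarded so the sketch also runs where the files are absent.
tables <- lapply(paths[file.exists(paths)], readRDS)
```

Each loaded table could then go through the same `mutate()` / `kable()` pipeline, with the cutoff inserted into the caption.
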

Zscores

df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_zscore/results_zfdr0.05_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff (FDR 0.05); significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff (FDR 0.05); significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 84 73 5394 5321 73 0.00488 0.00328 0.74505
IBD-ebi-a-GCST004131 39 28 3749 3699 50 0.00010 0.00010 0.01435
aFib-ebi-a-GCST006414 8 53 3540 3489 51 0.00016 0.00011 0.61790
SBP-ukb-a-360 4 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 55 17 2855 2822 31 0.21267 0.20525 0.35417
Height-panukb 17188 208 13763 13584 179 0.00084 0.00059 0.32667
HTN-panukb 17 42 5465 5400 65 0.10862 0.10985 0.39252
PLT-panukb 17038 119 10616 10471 145 0.20991 0.21024 0.63206
RA-panukb 24 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 15965 122 10154 9976 178 0.05923 0.05969 0.83051
ATH_gtexukb 45 21 3276 3227 49 0.00218 0.00227 0.02430
BMI-panukb 16080 53 9215 9114 101 0.05838 0.05837 0.33237
HB-panukb 170 90 8649 8522 127 0.01949 0.01726 1.00000
T2D-panukb 10 14 2460 2423 37 0.05527 0.05061 0.47765
SCZ-ieu-b-5102 102 15 3547 3511 36 0.14524 0.13924 1.00000
BIP-ieu-b-5110 0 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 3 4 726 717 9 0.01637 0.01657 0.30769
NS-ukb-a-230 2 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 7 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 16214 243 11369 11103 266 0.00178 0.00134 0.84747
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_zscore/results_zfdr0.1_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff FDR 0.1, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff FDR 0.1, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 96 73 5394 5321 73 0.00090 0.00057 1.00000
IBD-ebi-a-GCST004131 40 28 3749 3699 50 0.00000 0.00000 0.00466
aFib-ebi-a-GCST006414 14 53 3540 3489 51 0.00003 0.00002 0.36316
SBP-ukb-a-360 5 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 77 17 2855 2822 31 0.22209 0.21008 1.00000
Height-panukb 17228 208 13763 13584 179 0.00082 0.00083 0.32667
HTN-panukb 24 42 5465 5400 65 0.15528 0.15041 1.00000
PLT-panukb 17105 119 10616 10471 145 0.20888 0.20873 0.80531
RA-panukb 24 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 16655 122 10154 9976 178 0.24331 0.24333 1.00000
ATH_gtexukb 48 21 3276 3227 49 0.00266 0.00278 0.02430
BMI-panukb 16569 53 9215 9114 101 0.16964 0.16995 0.49544
HB-panukb 16127 90 8649 8522 127 0.00387 0.00261 0.52833
T2D-panukb 11 14 2460 2423 37 0.06063 0.05609 0.47765
SCZ-ieu-b-5102 141 15 3547 3511 36 0.24179 0.23405 1.00000
BIP-ieu-b-5110 12 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 5 4 726 717 9 0.00022 0.00023 0.07692
NS-ukb-a-230 2 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 7 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 17036 243 11369 11103 266 0.01266 0.00972 0.68641
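The `pval_*` columns above are read from a precomputed RDS file, so the test behind them is not shown on this page. As a minimal sketch, assuming each p-value comes from a two-sided Fisher's exact test on a 2x2 table of benchmark-gene membership in two gene groups (an interpretation inferred from the `fisher_` directory name and the column names, not confirmed by the source), the calculation could look like the following. The function name `fisher_enrichment` and all counts are hypothetical and are not taken from the tables.

```r
# Sketch only: two-sided Fisher's exact test for benchmark-gene enrichment.
# hits_a / hits_b: number of benchmark genes in group a / group b (hypothetical).
# n_a / n_b:       total number of genes in group a / group b.
fisher_enrichment <- function(hits_a, n_a, hits_b, n_b) {
  stopifnot(hits_a >= 0, hits_b >= 0, hits_a <= n_a, hits_b <= n_b)
  tab <- matrix(c(hits_a, n_a - hits_a,
                  hits_b, n_b - hits_b),
                nrow = 2, byrow = TRUE,
                dimnames = list(group  = c("a", "b"),
                                status = c("benchmark", "other")))
  fisher.test(tab)$p.value
}

# Illustrative counts (NOT from this analysis): 10 of 50 genes in group a
# are benchmark genes, versus 50 of 5000 genes in group b.
p_enriched <- fisher_enrichment(10, 50, 50, 5000)

# Equal proportions in both groups: no evidence of enrichment.
p_balanced <- fisher_enrichment(5, 50, 5, 50)
```

Under this reading, a small p-value means the benchmark-gene fraction differs between the two groups being compared.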
df <- readRDS("/project/xinhe/xsun/multi_group_ctwas/25.multi_group_validation_classfier_0602/results/fisher_xgboost_zscore/results_zfdr0.2_all.RDS")

df %>%
  mutate(across(starts_with("pval_"), 
                ~ifelse(. < 0.05, 
                        cell_spec(sprintf("%.5f", .), color = "red"), 
                        sprintf("%.5f", .)))) %>%
  kable(escape = FALSE, format = "html", 
        caption = "Benchmark gene cutoff FDR 0.2, significant ones are highlighted") %>%
  kable_styling(full_width = FALSE)
Benchmark gene cutoff FDR 0.2, significant ones are highlighted
trait num_benchmark_genes num_gene_08p num_gene_08m num_gene_05m num_gene_0508 pval_08p08m pval_08p05m pval_08p0508
LDL-ukb-d-30780_irnt 111 73 5394 5321 73 0.00164 0.00104 0.76457
IBD-ebi-a-GCST004131 46 28 3749 3699 50 0.00001 0.00001 0.00466
aFib-ebi-a-GCST006414 21 53 3540 3489 51 0.00000 0.00000 0.11275
SBP-ukb-a-360 5 35 4909 4858 51 1.00000 1.00000 1.00000
T1D-GCST90014023 114 17 2855 2822 31 0.27645 0.26585 1.00000
Height-panukb 17289 208 13763 13584 179 0.00082 0.00082 0.32667
HTN-panukb 31 42 5465 5400 65 0.19337 0.18917 1.00000
PLT-panukb 17174 119 10616 10471 145 0.26397 0.26439 0.80531
RA-panukb 26 5 332 330 2 1.00000 1.00000 1.00000
RBC-panukb 17288 122 10154 9976 178 0.44575 0.44641 1.00000
ATH_gtexukb 59 21 3276 3227 49 0.00320 0.00334 0.02430
BMI-panukb 16882 53 9215 9114 101 0.16700 0.16695 0.49544
HB-panukb 17123 90 8649 8522 127 0.02295 0.02310 0.73861
T2D-panukb 11 14 2460 2423 37 0.06063 0.05609 0.47765
SCZ-ieu-b-5102 154 15 3547 3511 36 0.26745 0.25699 1.00000
BIP-ieu-b-5110 12 9 2660 2622 38 1.00000 1.00000 1.00000
ASD-ieu-a-1185 0 0 107 103 4 1.00000 1.00000 1.00000
ADHD-ieu-a-1183 0 0 256 249 7 1.00000 1.00000 1.00000
PD-ieu-b-7 5 4 726 717 9 0.00022 0.00023 0.07692
NS-ukb-a-230 3 7 1201 1189 12 1.00000 1.00000 1.00000
MDD-ieu-b-102 14 0 799 796 3 1.00000 1.00000 1.00000
WBC-ieu-b-30 17121 243 11369 11103 266 0.02078 0.01606 0.68641
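The chunks above repeat the same formatting pipeline for each benchmark FDR cutoff. A minimal sketch of how that pipeline could be factored into helpers is shown here; `highlight_pvals` and `render_fisher_table` are hypothetical names introduced for illustration, and the `alpha` argument simply parameterises the 0.05 threshold used in the chunks.

```r
library(dplyr)
library(kableExtra)

# Format every pval_* column to 5 decimals and colour values below `alpha` red.
# (Hypothetical helper; mirrors the mutate() call used in the chunks above.)
highlight_pvals <- function(df, alpha = 0.05) {
  df %>%
    mutate(across(starts_with("pval_"),
                  ~ ifelse(.x < alpha,
                           cell_spec(sprintf("%.5f", .x), color = "red"),
                           sprintf("%.5f", .x))))
}

# Render one table for a given benchmark FDR cutoff label, e.g. "0.1".
# (Hypothetical helper.)
render_fisher_table <- function(df, fdr_cutoff) {
  highlight_pvals(df) %>%
    kable(escape = FALSE, format = "html",
          caption = sprintf(
            "Benchmark gene cutoff FDR %s, significant ones are highlighted",
            fdr_cutoff)) %>%
    kable_styling(full_width = FALSE)
}

# Toy input (made-up values) to show the behaviour without the RDS files.
toy <- data.frame(trait = c("A", "B"),
                  pval_x = c(0.01, 0.50),
                  stringsAsFactors = FALSE)
toy_fmt <- highlight_pvals(toy)
```

With these helpers, each chunk would reduce to a `readRDS()` call on the corresponding `results_zfdr*_all.RDS` file followed by `render_fisher_table(df, "0.1")` or `render_fisher_table(df, "0.2")`.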

sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.5.1    stringr_1.5.1    dplyr_1.1.4      purrr_1.0.2     
 [5] readr_2.1.2      tidyr_1.3.0      tibble_3.2.1     ggplot2_3.5.1   
 [9] tidyverse_1.3.1  kableExtra_1.4.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.12       svglite_2.1.0     lubridate_1.8.0   assertthat_0.2.1 
 [5] rprojroot_2.0.3   digest_0.6.29     utf8_1.2.2        R6_2.5.1         
 [9] cellranger_1.1.0  backports_1.4.1   reprex_2.0.1      evaluate_0.15    
[13] httr_1.4.3        pillar_1.9.0      rlang_1.1.2       readxl_1.4.0     
[17] rstudioapi_0.13   jquerylib_0.1.4   rmarkdown_2.25    munsell_0.5.0    
[21] broom_0.8.0       compiler_4.2.0    httpuv_1.6.5      modelr_0.1.8     
[25] xfun_0.41         pkgconfig_2.0.3   systemfonts_1.0.4 htmltools_0.5.2  
[29] tidyselect_1.2.0  workflowr_1.7.0   fansi_1.0.3       viridisLite_0.4.0
[33] crayon_1.5.1      tzdb_0.4.0        dbplyr_2.1.1      withr_2.5.0      
[37] later_1.3.0       grid_4.2.0        jsonlite_1.8.0    gtable_0.3.0     
[41] lifecycle_1.0.4   DBI_1.2.2         git2r_0.30.1      magrittr_2.0.3   
[45] scales_1.3.0      cli_3.6.1         stringi_1.7.6     fs_1.5.2         
[49] promises_1.2.0.1  xml2_1.3.3        bslib_0.3.1       ellipsis_0.3.2   
[53] generics_0.1.2    vctrs_0.6.5       tools_4.2.0       glue_1.6.2       
[57] hms_1.1.1         fastmap_1.1.0     yaml_2.3.5        colorspace_2.0-3 
[61] rvest_1.0.2       knitr_1.39        haven_2.5.0       sass_0.4.1