Compute simple overlap tests for each combination of disease/phenotype gene set vs. celltype gene set (determined by the top specificity quantiles for each celltype).
gen_overlap(
  gene_data = HPOExplorer::load_phenotype_to_genes(),
  ctd = load_example_ctd(),
  list_name_column = "disease_id",
  gene_column = "gene_symbol",
  list_names = unique(gene_data[[list_name_column]]),
  annotLevel = 1,
  keep_specificity_quantiles = seq(30, 40),
  top_n = NULL,
  long_format = FALSE,
  save_dir = tempdir(),
  force_new = FALSE,
  cores = 1,
  verbose = TRUE
)
gene_data: Data frame of gene list names and genes (see get_gene_lists).
ctd: Cell Type Data list generated using generate_celltype_data.
list_name_column: The name of the gene_data column that contains the gene list names.
gene_column: The name of the gene_data column that contains the genes.
list_names: Character vector of gene list names.
annotLevel: An integer indicating which level of sct_data to analyse (default: 1).
keep_specificity_quantiles: Which cell type specificity quantiles to keep (max quantile is 40).
top_n: Top N genes to keep when grouping by group_vars.
long_format: If TRUE, return results with "union" and "intersection" genes melted into long format. If FALSE (default), genes are collapsed into a list column.
save_dir: Directory to save results to.
force_new: Overwrite previous results in save_dir.
cores: Integer number of cores to run in parallel (e.g. 8).
verbose: Print messages.
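To make keep_specificity_quantiles more concrete, the following base R sketch shows one way genes could be binned into 40 specificity quantiles and filtered to the top bins. It is only an illustration: the simulated scores, the rank-based binning, and the variable names are assumptions, and the package's actual binning procedure may differ.

```r
# Hypothetical specificity scores for 200 genes in one cell type
# (simulated data, for illustration only).
set.seed(1)
specificity <- setNames(runif(200), paste0("GENE", seq_len(200)))

# Assign each gene to one of 40 equally sized, rank-based bins
# (bin 1 = least specific, bin 40 = most specific).
n_bins <- 40
gene_rank <- rank(specificity, ties.method = "first")
quantile_bin <- ceiling(gene_rank * n_bins / length(specificity))

# Keep only genes in the top bins, mirroring
# keep_specificity_quantiles = seq(30, 40).
keep <- seq(30, 40)
celltype_genes <- names(specificity)[quantile_bin %in% keep]
length(celltype_genes)  # 55 genes: 11 bins of 5 genes each
```

With 40 bins, seq(30, 40) retains roughly the top quarter of genes by specificity for each cell type.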
Returns a data.table of all overlap test results.
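As a rough illustration of what a single overlap test between one disease/phenotype gene set and one celltype gene set involves, here is a minimal base R sketch. The gene universe, the set sizes, and the choice of a one-sided Fisher's exact test are all assumptions made for this example; this page does not state which test gen_overlap uses internally, and the column names below are hypothetical.

```r
# Hypothetical inputs: names and sizes are illustrative, not taken from the package.
background <- paste0("GENE", seq_len(1000))  # assumed gene universe
disease_genes <- background[1:50]            # one disease/phenotype gene set
celltype_genes <- background[31:130]         # one cell type's top-quantile genes

# Shared and combined genes between the two sets.
intersection <- intersect(disease_genes, celltype_genes)
union_genes <- union(disease_genes, celltype_genes)

# 2x2 contingency table of set membership across the background.
in_disease <- background %in% disease_genes
in_celltype <- background %in% celltype_genes
contingency <- table(in_disease, in_celltype)

# One-sided Fisher's exact test for enrichment of the overlap
# (an assumed choice of test, not necessarily the package's).
test <- fisher.test(contingency, alternative = "greater")

result <- data.frame(
  intersection_size = length(intersection),
  union_size = length(union_genes),
  jaccard = length(intersection) / length(union_genes),
  odds_ratio = unname(test$estimate),
  pval = test$p.value
)
result
```

Repeating a test like this for every gene list and cell type pair would produce one row per combination, which is the general shape of a table of overlap results.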
NOTE:
This is a faster but less robust version of gen_results.
It also only requires >=1 gene per disease/phenotype, as opposed to >=4.
gene_data <- HPOExplorer::load_phenotype_to_genes()
#> Reading cached RDS file: phenotype_to_genes.txt
#> + Version: v2023-10-09
list_names <- unique(gene_data$disease_id)[seq(3)]
overlap <- gen_overlap(gene_data = gene_data,
                       list_names = list_names)
#> Splitting data.
#>
#> Attaching package: ‘purrr’
#> The following object is masked from ‘package:base’:
#>
#> %||%
#> 2.21393299102783
#>
#> Saving results ==> /tmp/Rtmp0tNWxK/gen_overlap.rds