Compute simple overlap tests for each combination of disease/phenotype gene set vs. cell type gene set (determined by the top specificity quantiles for each cell type).
gen_overlap(
  gene_data = HPOExplorer::load_phenotype_to_genes(),
  ctd = load_example_ctd(),
  list_name_column = "disease_id",
  gene_column = "gene_symbol",
  list_names = unique(gene_data[[list_name_column]]),
  annotLevel = 1,
  keep_specificity_quantiles = seq(30, 40),
  top_n = NULL,
  long_format = FALSE,
  save_dir = tempdir(),
  force_new = FALSE,
  cores = 1,
  verbose = TRUE
)

gene_data: Data frame of gene list names and genes (see get_gene_lists).
ctd: CellTypeDataset generated using generate_celltype_data.
list_name_column: The name of the gene_data column that contains the gene list names.
gene_column: The name of the gene_data column that contains the genes.
list_names: Character vector of gene list names.
annotLevel: An integer indicating which level of sct_data to analyse (default: 1).
keep_specificity_quantiles: Which cell type specificity quantiles to keep (max quantile is 40).
top_n: Top N genes to keep when grouping by group_vars.
long_format: If TRUE, return results with "union" and "intersection" genes melted into long format. If FALSE (default), genes are collapsed into a list column.
save_dir: Directory to save results to.
force_new: If TRUE, ignore previously saved results and recompute.
cores: Integer number of cores to run in parallel (e.g. 8).
verbose: Print messages.

Returns a data.table of all overlap test results.
NOTE:
This is a faster but less robust version of gen_results.
It also only requires >=1 gene per disease/phenotype, as opposed to >=4.
gene_data <- HPOExplorer::load_phenotype_to_genes()
#> Reading cached RDS file: phenotype_to_genes.txt
#> + Version: v2025-05-06
list_names <- unique(gene_data$disease_id)[seq(3)]
overlap <- gen_overlap(gene_data = gene_data,
                       list_names = list_names)
#> Loading ctd_DescartesHuman_example.rds
#> Splitting data.
#> 0.979416370391846
#>
#> Saving results ==> /tmp/RtmpsW0tFt/gen_overlap.rds
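To make the idea of an overlap test concrete, the following is a minimal base R sketch, not the package's implementation. It assumes the test behaves like a one-sided Fisher's exact test on a 2x2 table; the toy genes, the random specificity scores, and the helper overlap_test() are all hypothetical. It bins one cell type's specificity scores into 40 quantiles, keeps quantiles 30 to 40 (mirroring the default keep_specificity_quantiles), and tests a toy disease gene set against the resulting cell type gene set.

```r
# Illustrative sketch only: toy data and helper names are hypothetical,
# and the choice of Fisher's exact test is an assumption.
set.seed(1)

# Background of 200 toy genes with a made-up specificity score for one cell type.
background <- paste0("GENE", seq_len(200))
specificity <- stats::runif(length(background))

# Bin specificity into 40 quantiles (1 = least specific, 40 = most specific).
n_quantiles <- 40
breaks <- stats::quantile(specificity,
                          probs = seq(0, 1, length.out = n_quantiles + 1))
quantile_bin <- cut(specificity, breaks = breaks,
                    labels = FALSE, include.lowest = TRUE)

# Cell type gene set: genes in the top specificity quantiles (30 to 40).
celltype_genes <- background[quantile_bin %in% seq(30, 40)]

# Toy disease gene set: 10 genes inside the cell type set, 5 outside it.
disease_genes <- c(sample(celltype_genes, 10),
                   sample(setdiff(background, celltype_genes), 5))

# Hypothetical helper: one-sided Fisher's exact test for enrichment of set_a in set_b.
overlap_test <- function(set_a, set_b, background) {
  in_a <- background %in% set_a
  in_b <- background %in% set_b
  # 2x2 contingency table with the "in both sets" count in cell [1, 1].
  tab <- table(factor(in_a, levels = c(TRUE, FALSE)),
               factor(in_b, levels = c(TRUE, FALSE)))
  ft <- stats::fisher.test(tab, alternative = "greater")
  list(intersection = background[in_a & in_b],
       union = background[in_a | in_b],
       odds_ratio = unname(ft$estimate),
       p_value = ft$p.value)
}

res <- overlap_test(disease_genes, celltype_genes, background)
length(res$intersection)
res$p_value
```

Repeating such a test for every gene list and every cell type would give one row per combination, which is the general shape of result described above; the actual columns returned by gen_overlap may differ.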