Generate overlap — gen_overlap • MSTExplorer

Compute simple overlap tests for each combination of disease/phenotype gene set vs. cell type gene set (determined by the top specificity quantiles for each cell type).

gen_overlap(
  gene_data = HPOExplorer::load_phenotype_to_genes(),
  ctd = load_example_ctd(),
  list_name_column = "disease_id",
  gene_column = "gene_symbol",
  list_names = unique(gene_data[[list_name_column]]),
  annotLevel = 1,
  keep_specificity_quantiles = seq(30, 40),
  top_n = NULL,
  long_format = FALSE,
  save_dir = tempdir(),
  force_new = FALSE,
  cores = 1,
  verbose = TRUE
)

Arguments

gene_data: data frame of gene list names and genes (see get_gene_lists).
ctd: CellTypeDataset generated using generate_celltype_data.
list_name_column: The name of the gene_data column that has the gene list names.
gene_column: The name of the gene_data column that contains the genes.
list_names: character vector of gene list names.
annotLevel: An integer indicating which level of sct_data to analyse (Default: 1).
keep_specificity_quantiles: Which cell type specificity quantiles to keep (max quantile is 40).
top_n: Top N genes to keep when grouping by group_vars.
long_format: Return results with "union" and "intersection" genes melted into long format (TRUE). Otherwise, genes will be collapsed into a list column (default: FALSE).
save_dir: Directory to save results to.
force_new: Overwrite any previous results saved in save_dir.
cores: Integer number of cores to use in parallel (e.g. 8).
verbose: Print messages.
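
To illustrate what keep_specificity_quantiles might select, here is a minimal base R sketch. It assumes the specificity quantiles are available as a genes x cell types matrix with integer values up to 40. The matrix spec_q, its values, and the cell type names are made up for illustration and are not taken from a real CellTypeDataset.

```r
# Toy stand-in for a specificity-quantile matrix (genes x cell types).
# Assumption: the values below are random and purely illustrative.
set.seed(1)
genes <- paste0("GENE", 1:200)
spec_q <- matrix(
  sample(0:40, size = 200 * 3, replace = TRUE),
  nrow = 200,
  dimnames = list(genes, c("celltypeA", "celltypeB", "celltypeC"))
)

keep_specificity_quantiles <- seq(30, 40)

# For each cell type, keep the genes whose quantile falls in the requested range.
celltype_sets <- lapply(
  colnames(spec_q),
  function(ct) rownames(spec_q)[spec_q[, ct] %in% keep_specificity_quantiles]
)
names(celltype_sets) <- colnames(spec_q)

# Number of genes retained per cell type
lengths(celltype_sets)
```

Narrowing the range (for example seq(38, 40)) would leave smaller, more specific gene sets per cell type in this sketch.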

Value

data.table of all overlap test results.
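
As a rough picture of the two output shapes that long_format toggles between, the base R sketch below builds a toy result by hand. The column names (list_name, celltype, intersection, gene) are hypothetical and may not match the columns gen_overlap actually returns.

```r
# Collapsed form: one row per test, with genes stored in a list column.
# Assumption: column names are illustrative only.
collapsed <- data.frame(
  list_name = c("disease1", "disease2"),
  celltype = c("celltypeA", "celltypeA")
)
collapsed$intersection <- list(c("GENE1", "GENE2"), "GENE3")

# Long form: one row per (test, gene) pair.
n_genes <- lengths(collapsed$intersection)
long <- data.frame(
  list_name = rep(collapsed$list_name, n_genes),
  celltype = rep(collapsed$celltype, n_genes),
  gene = unlist(collapsed$intersection)
)

nrow(collapsed)
nrow(long)
```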

Details

NOTE:
This is a faster but less robust version of gen_results. It also only requires >=1 gene per disease/phenotype, as opposed to >=4.
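
This page does not state which statistical test is used. As one common way to test gene set overlap, the sketch below applies a one-sided Fisher's exact test with base R. The helper overlap_test, the background gene universe, and the toy gene names are all hypothetical and are not part of MSTExplorer.

```r
# Hypothetical helper (not part of MSTExplorer): one-sided Fisher's exact test
# for enrichment of a disease/phenotype gene list among a cell type gene set.
# Assumption: `background` is a vector of unique gene symbols.
overlap_test <- function(list_genes, celltype_genes, background) {
  list_genes <- intersect(unique(list_genes), background)
  celltype_genes <- intersect(unique(celltype_genes), background)
  # 2x2 contingency table over the background, TRUE level first so that
  # cell [1, 1] counts genes present in both sets.
  tab <- table(
    in_list = factor(background %in% list_genes, levels = c(TRUE, FALSE)),
    in_celltype = factor(background %in% celltype_genes, levels = c(TRUE, FALSE))
  )
  ft <- stats::fisher.test(tab, alternative = "greater")
  list(
    intersection = intersect(list_genes, celltype_genes),
    union = union(list_genes, celltype_genes),
    odds_ratio = unname(ft$estimate),
    p_value = ft$p.value
  )
}

background <- paste0("GENE", 1:100)
disease_genes <- paste0("GENE", 1:10)
celltype_genes <- paste0("GENE", 6:25)

res <- overlap_test(disease_genes, celltype_genes, background)
length(res$intersection)
res$p_value

# A single-gene list can still be tested, in line with the >=1 gene requirement.
res_single <- overlap_test("GENE7", celltype_genes, background)
res_single$intersection
```

With very small gene lists a test like this has little statistical power, which is one plausible reading of "less robust" in the note above.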

Examples

gene_data <- HPOExplorer::load_phenotype_to_genes()
#> Reading cached RDS file: phenotype_to_genes.txt
#> + Version: v2024-12-12
list_names <- unique(gene_data$disease_id)[seq(3)]
overlap <- gen_overlap(gene_data = gene_data,
                       list_names = list_names)
#> Loading ctd_DescartesHuman_example.rds
#> Splitting data.
#> 0.972683429718018
#> 
#> Saving results ==> /tmp/RtmpMmVOaL/gen_overlap.rds