Generate a list of background genes for one or more species. The list is cached locally to avoid excessive API calls to g:Profiler.

get_bg(
  species1 = "human",
  species2 = "human",
  method = "gprofiler",
  save_dir = tools::R_user_dir(package = "MultiEWCE", which = "cache"),
  overwrite = FALSE,
  verbose = TRUE,
  ...
)

Arguments

species1

First species.

species2

Second species.

method

R package to use for gene mapping:

  • "gprofiler" : Slower but more species and genes.

  • "homologene" : Faster but fewer species and genes.

  • "babelgene" : Faster but fewer species and genes. Also gives consensus scores for each gene mapping based on several different data sources.

save_dir

Directory to save data to.

overwrite

Should any local files of the same name be overwritten? Default: FALSE.

verbose

Print messages.

...

Arguments passed on to orthogene::create_background

output_species

Species to convert all genes from species1 and species2 to first. Default="human", but can be any species supported by orthogene, including species1 or species2.

as_output_species

Return background gene list as output_species orthologs, instead of the gene names of the original input species.

use_intersect

When species1 and species2 are both different from output_species, this argument will determine whether to use the intersect (TRUE) or union (FALSE) of all genes from species1 and species2.

bg

User supplied background list that will be returned to the user after removing duplicate genes.

gene_map

User-supplied gene_map data table from map_orthologs or map_genes.

non121_strategy

How to handle genes that don't have 1:1 mappings between input_species:output_species. Options include:

  • "drop_both_species" or "dbs" or 1 :
    Drop genes that have duplicate mappings in either the input_species or output_species
    (DEFAULT).

  • "drop_input_species" or "dis" or 2 :
    Only drop genes that have duplicate mappings in the input_species.

  • "drop_output_species" or "dos" or 3 :
    Only drop genes that have duplicate mappings in the output_species.

  • "keep_both_species" or "kbs" or 4 :
    Keep all genes regardless of whether they have duplicate mappings in either species.

  • "keep_popular" or "kp" or 5 :
    Return only the most "popular" interspecies ortholog mappings. This procedure tends to yield a greater number of returned genes but at the cost of many of them not being true biological 1:1 orthologs.

  • "sum","mean","median","min" or "max" :
    When gene_df is a matrix and gene_output="rownames", these options will aggregate many-to-one gene mappings (input_species-to-output_species) after dropping any duplicate genes in the output_species.
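
The first four non121_strategy options can be illustrated on a toy mapping table. This is a minimal base-R sketch of one plausible reading of those options, not orthogene's actual implementation: the gene names are invented, the column names input_gene and ortholog_gene are used here for illustration only, and is_dup() is a hypothetical helper.

```r
# Toy gene map (invented data): Gene2 maps to two orthologs,
# and Gene3 and Gene4 both map to the same ortholog.
gene_map <- data.frame(
  input_gene    = c("Gene1", "Gene2", "Gene2", "Gene3", "Gene4"),
  ortholog_gene = c("GENE1", "GENE2A", "GENE2B", "GENE34", "GENE34"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for every element whose value occurs more than once.
is_dup <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

dup_in  <- is_dup(gene_map$input_gene)     # duplicated within the input species
dup_out <- is_dup(gene_map$ortholog_gene)  # duplicated within the output species

# Assumed behaviour of each strategy on this table:
drop_both   <- gene_map[!dup_in & !dup_out, ]  # "drop_both_species"
drop_input  <- gene_map[!dup_in, ]             # "drop_input_species"
drop_output <- gene_map[!dup_out, ]            # "drop_output_species"
keep_both   <- gene_map                        # "keep_both_species"
```

Under these assumptions, only Gene1 survives "drop_both_species", while "keep_both_species" returns all five rows.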

Value

A vector of background genes.
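
When species1 and species2 both differ from output_species, the contents of the returned vector depend on use_intersect. The sketch below uses base R set operations on two invented gene vectors to show the difference; it does not call orthogene and the gene sets are placeholders, not real mapping results.

```r
# Invented ortholog-converted gene sets for two species (placeholders only).
genes_species1 <- c("BRCA1", "TP53", "APOE", "SNCA")
genes_species2 <- c("TP53", "APOE", "MAPT")

# use_intersect = TRUE: keep only genes found in both species.
bg_intersect <- intersect(genes_species1, genes_species2)

# use_intersect = FALSE: keep genes found in either species.
bg_union <- union(genes_species1, genes_species2)
```

The intersect gives a smaller, stricter background; the union gives a larger one that includes genes seen in only one of the two species.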

Examples

bg <- get_bg()
#> Using cached bg.
#> + Version: 2023-11-14
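
The cached-result message above reflects the caching controlled by save_dir and overwrite. The following is a minimal sketch of a generic cache-or-compute pattern in base R, offered only to illustrate the idea. cache_or_compute() and slow_lookup() are hypothetical names, and this is not MultiEWCE's actual implementation.

```r
# Hypothetical helper (not part of MultiEWCE): return a cached result if one
# exists and overwrite is FALSE; otherwise compute, save, and return it.
cache_or_compute <- function(compute_fun,
                             save_path,
                             overwrite = FALSE,
                             verbose = TRUE) {
  if (file.exists(save_path) && !overwrite) {
    if (verbose) message("Using cached file: ", save_path)
    return(readRDS(save_path))
  }
  if (verbose) message("Computing and caching: ", save_path)
  result <- compute_fun()
  # Create the cache directory on first use.
  dir.create(dirname(save_path), showWarnings = FALSE, recursive = TRUE)
  saveRDS(result, save_path)
  result
}

# Usage with a temporary directory and a stand-in for the slow API call.
save_path <- file.path(tempfile("bg_cache_"), "bg.rds")
slow_lookup <- function() c("TP53", "APOE", "MAPT")
bg_first  <- cache_or_compute(slow_lookup, save_path)  # computes and saves
bg_second <- cache_or_compute(slow_lookup, save_path)  # reads the cached copy
```

Passing overwrite = TRUE in this sketch forces recomputation and replaces the saved file.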