Generate background genes given one or more species. Caches the list to avoid excessive API calls to g:Profiler.
get_bg(
species1 = "human",
species2 = "human",
method = "gprofiler",
save_dir = tools::R_user_dir(package = "MultiEWCE", which = "cache"),
overwrite = FALSE,
verbose = TRUE,
...
)
species1
First species.
species2
Second species.
method
R package to use for gene mapping:
"gprofiler": Slower but more species and genes.
"homologene": Faster but fewer species and genes.
"babelgene": Faster but fewer species and genes. Also gives consensus scores for each gene mapping based on several different data sources.
save_dir
Directory to save data to.
overwrite
Should any local files of the same name be overwritten? Default: FALSE.
verbose
Print messages.
...
Arguments passed on to orthogene::create_background
output_species
Species to convert all genes from species1 and species2 to first. Default="human", but can be any species supported by orthogene, including species1 or species2.
as_output_species
Return background gene list as output_species orthologs, instead of the gene names of the original input species.
use_intersect
When species1 and species2 are both different from output_species, this argument will determine whether to use the intersect (TRUE) or union (FALSE) of all genes from species1 and species2.
bg
User-supplied background list that will be returned to the user after removing duplicate genes.
gene_map
User-supplied gene_map data table from map_orthologs or map_genes.
non121_strategy
How to handle genes that don't have 1:1 mappings between input_species:output_species. Options include:
"drop_both_species" or "dbs" or 1: Drop genes that have duplicate mappings in either the input_species or output_species (DEFAULT).
"drop_input_species" or "dis" or 2: Only drop genes that have duplicate mappings in the input_species.
"drop_output_species" or "dos" or 3: Only drop genes that have duplicate mappings in the output_species.
"keep_both_species" or "kbs" or 4: Keep all genes regardless of whether they have duplicate mappings in either species.
"keep_popular" or "kp" or 5: Return only the most "popular" interspecies ortholog mappings. This procedure tends to yield a greater number of returned genes but at the cost of many of them not being true biological 1:1 orthologs.
"sum", "mean", "median", "min" or "max": When gene_df is a matrix and gene_output="rownames", these options will aggregate many-to-one gene mappings (input_species-to-output_species) after dropping any duplicate genes in the output_species.
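The drop/keep options for non121_strategy can be illustrated on a toy mapping table. This is a minimal base R sketch, not orthogene's implementation: the data, the column names and the helper filter_non121 are hypothetical, and it assumes that "duplicate mappings" in a species means a gene of that species appearing in more than one row of the map.

```r
# Toy gene map (hypothetical data, not real orthogene output):
# one row per input gene -> ortholog pairing.
gene_map <- data.frame(
  input_gene    = c("A1", "A2", "A3", "A3", "A4", "A5"),
  ortholog_gene = c("B1", "B2", "B3", "B4", "B5", "B5"),
  stringsAsFactors = FALSE
)

# TRUE for every element whose value occurs more than once in x.
is_dup <- function(x) x %in% x[duplicated(x)]

# Hypothetical helper mirroring the four drop/keep options by name.
filter_non121 <- function(map, strategy) {
  dup_in  <- is_dup(map$input_gene)     # here: A3 maps to two orthologs
  dup_out <- is_dup(map$ortholog_gene)  # here: B5 is hit by two input genes
  keep <- switch(
    strategy,
    drop_both_species   = !dup_in & !dup_out,
    drop_input_species  = !dup_in,
    drop_output_species = !dup_out,
    keep_both_species   = rep(TRUE, nrow(map)),
    stop("Unknown strategy: ", strategy)
  )
  map[keep, , drop = FALSE]
}

# Number of rows retained under each strategy.
strategies <- c("drop_both_species", "drop_input_species",
                "drop_output_species", "keep_both_species")
n_kept <- vapply(strategies,
                 function(s) nrow(filter_non121(gene_map, s)),
                 integer(1))
print(n_kept)
```

In this toy table the strictest option keeps only the rows that are unambiguous in both columns, which is why the default tends to return the smallest background.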
A vector of background genes.
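How the returned vector depends on use_intersect can be sketched with plain set operations. The gene symbols and the helper combine_backgrounds below are hypothetical, and the sketch assumes both gene sets have already been converted to the same output species.

```r
# Hypothetical gene sets, already expressed in a common output species.
genes_species1 <- c("GENE1", "GENE2", "GENE3")
genes_species2 <- c("GENE2", "GENE3", "GENE4")

# Hypothetical helper: intersect when use_intersect is TRUE, union otherwise.
combine_backgrounds <- function(g1, g2, use_intersect = TRUE) {
  if (use_intersect) intersect(g1, g2) else union(g1, g2)
}

bg_intersect <- combine_backgrounds(genes_species1, genes_species2, TRUE)
bg_union     <- combine_backgrounds(genes_species1, genes_species2, FALSE)

print(bg_intersect)  # genes present in both species
print(bg_union)      # genes present in either species
```

The intersect gives a smaller, more conservative background; the union keeps genes that only one of the two species contributes.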
bg <- get_bg()
#> Useing cached bg.
#> + Version: 2023-11-14
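The caching behaviour controlled by save_dir and overwrite can be approximated with the following base R sketch. It is an illustration under stated assumptions, not the package's actual code: cached_call, slow_lookup and the file name are hypothetical, and a temporary directory stands in for the tools::R_user_dir() location.

```r
# Hypothetical helper: store the result of an expensive call as an RDS file
# and reuse it on later calls unless overwrite = TRUE.
cached_call <- function(fun, file, overwrite = FALSE, verbose = TRUE) {
  if (file.exists(file) && !overwrite) {
    if (verbose) message("Using cached result: ", file)
    return(readRDS(file))
  }
  result <- fun()
  dir.create(dirname(file), showWarnings = FALSE, recursive = TRUE)
  saveRDS(result, file)
  result
}

# A temporary directory stands in for the real cache location here.
cache_file <- file.path(tempdir(), "bg_cache_demo", "bg.rds")

n_calls <- 0
slow_lookup <- function() {
  n_calls <<- n_calls + 1  # count how often the "API" is really queried
  c("GENE1", "GENE2")
}

first  <- cached_call(slow_lookup, cache_file, overwrite = TRUE)  # computes and saves
second <- cached_call(slow_lookup, cache_file)                    # read back from disk
```

With overwrite = FALSE the second call never reaches slow_lookup, which is the point of caching the background list between sessions.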