Map orthologs from one species to another.
map_orthologs(
  genes,
  standardise_genes = FALSE,
  input_species,
  output_species = "human",
  method = c("gprofiler", "homologene", "babelgene"),
  mthreshold = Inf,
  gene_map = NULL,
  input_col = "input_gene",
  output_col = "ortholog_gene",
  verbose = TRUE,
  ...
)
genes: Genes to map. Can be a mixture of any format (HGNC, Ensembl, RefSeq, UniProt, etc.) and will be automatically converted to standardised HGNC symbol format.
standardise_genes: If TRUE and gene_output="columns", a new column "input_gene_standard" containing standardised HGNC symbols identified by gorth will be added to gene_df.
input_species: Name of the input species (e.g. "mouse", "fly"). Use map_species to return a full list of available species.
output_species: Name of the output species (e.g. "human", "chicken"). Use map_species to return a full list of available species.
method: R package to use for gene mapping:
"gprofiler": Slower but more species and genes.
"homologene": Faster but fewer species and genes.
"babelgene": Faster but fewer species and genes. Also gives consensus scores for each gene mapping based on several different data sources.
mthreshold: Maximum number of ortholog names per gene to show. Passed to gorth. Only used when method="gprofiler" (DEFAULT: Inf).
gene_map: A data.frame that maps the current gene names to new gene names. This function's behaviour adapts to different situations as follows:
gene_map=<data.frame>: When a data.frame containing the gene key:value columns (specified by input_col and output_col, respectively) is provided, this will be used to perform aggregation/expansion.
gene_map=NULL and input_species!=output_species: A gene_map is automatically generated by map_orthologs to perform inter-species gene aggregation/expansion.
gene_map=NULL and input_species==output_species: A gene_map is automatically generated by map_genes to perform within-species gene symbol standardization and aggregation/expansion.
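The aggregation/expansion behaviour described for gene_map can be illustrated with a small base R sketch. Everything in it is a toy assumption made for illustration: the gene symbols, the matrix X, and the choice of summing rows that share an ortholog. It does not reproduce the package's internal implementation.

```r
# Toy gene map using the default column names (input_col / output_col).
# All gene symbols are made up for illustration.
gene_map <- data.frame(
    input_gene = c("GeneA", "GeneB", "GeneC", "GeneC"),
    ortholog_gene = c("GENE1", "GENE1", "GENE2", "GENE3"),
    stringsAsFactors = FALSE
)

# Toy matrix whose row names match the values in the input column.
X <- matrix(
    1:6,
    nrow = 3,
    dimnames = list(c("GeneA", "GeneB", "GeneC"), c("s1", "s2"))
)

# Expansion: GeneC maps to two orthologs, so its row is duplicated.
X_expanded <- X[gene_map$input_gene, , drop = FALSE]
rownames(X_expanded) <- gene_map$ortholog_gene

# Aggregation: GeneA and GeneB share GENE1, so their rows are summed
# (summing is an assumption of this sketch, not a package default).
X_mapped <- rowsum(X_expanded, group = gene_map$ortholog_gene)
print(X_mapped)
```

Here a one-to-many mapping adds rows and a many-to-one mapping collapses them, which is why the mapped matrix can have a different number of rows than the input.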
input_col: Column name within gene_map with gene names matching the row names of X.
output_col: Column name within gene_map with gene names that you wish to map the row names of X onto.
verbose: Print messages.
...: Additional arguments to be passed to gorth or homologene.
NOTE: To return only the most "popular" interspecies ortholog mappings, supply mthreshold=1 here AND set method="gprofiler" above. This procedure tends to yield a greater number of returned genes, but at the cost of many of them not being true biological 1:1 orthologs. For more details, please see here.
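To make the effect of capping the number of ortholog names per gene concrete, here is a toy base R emulation. It is only a sketch: `limit_hits` is a hypothetical helper, the gene symbols are invented, and it keeps rows in the order given rather than reproducing however gorth ranks its hits.

```r
# Toy one-to-many hits; all gene symbols are made up for illustration.
toy_hits <- data.frame(
    input_gene = c("GeneA", "GeneA", "GeneB", "GeneC", "GeneC", "GeneC"),
    ortholog_gene = c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5", "GENE6"),
    stringsAsFactors = FALSE
)

# Hypothetical helper (not part of any package): keep at most
# `mthreshold` rows per input gene, in the order the rows appear.
limit_hits <- function(hits, mthreshold = Inf, input_col = "input_gene") {
    # Position of each row within its input gene (1, 2, 3, ...).
    rank_within_gene <- ave(
        seq_len(nrow(hits)), hits[[input_col]], FUN = seq_along
    )
    hits[rank_within_gene <= mthreshold, , drop = FALSE]
}

all_hits <- limit_hits(toy_hits)                  # Inf keeps every row
top_hits <- limit_hits(toy_hits, mthreshold = 1)  # one name per input gene
print(top_hits)
```

With a cap of 1, every input gene still returns a hit, which matches the trade-off noted above: more genes are returned, but the single retained name is not guaranteed to be a true 1:1 ortholog.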
Ortholog map data.frame with at least the columns "input_gene" and "ortholog_gene".
map_orthologs() is a core function within convert_orthologs(), but does not have many of the extra checks, such as non121_strategy and drop_nonorths.
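Because the returned map is not filtered for non-1:1 relationships, a caller may want to inspect or filter it afterwards. The following base R sketch shows one possible filter. `keep_one_to_one` is a hypothetical helper and the gene symbols are invented; it is not the logic behind non121_strategy, only an illustration of what a strict 1:1 filter could look like on a data.frame with the default column names.

```r
# Toy ortholog map with the default column names; symbols are made up.
ortholog_map <- data.frame(
    input_gene = c("GeneA", "GeneB", "GeneC", "GeneC", "GeneD"),
    ortholog_gene = c("GENE1", "GENE1", "GENE2", "GENE3", "GENE4"),
    stringsAsFactors = FALSE
)

# Hypothetical helper (not part of any package): keep only rows where
# the input gene and the ortholog each appear exactly once in the map.
keep_one_to_one <- function(map,
                            input_col = "input_gene",
                            output_col = "ortholog_gene") {
    idx <- seq_len(nrow(map))
    # Number of rows sharing each row's input gene / ortholog gene.
    n_input <- ave(idx, map[[input_col]], FUN = length)
    n_output <- ave(idx, map[[output_col]], FUN = length)
    map[n_input == 1 & n_output == 1, , drop = FALSE]
}

strict_map <- keep_one_to_one(ortholog_map)
print(strict_map)
```

In this toy map only GeneD survives: GeneA and GeneB share an ortholog (many-to-one) and GeneC maps to two orthologs (one-to-many).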
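library(orthogene)

# Load the example mouse expression data.
data("exp_mouse")

# Map the mouse genes (row names) to human orthologs, the default output species.
gene_map <- map_orthologs(
    genes = rownames(exp_mouse),
    input_species = "mouse"
)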
#> Converting mouse ==> human orthologs using: gprofiler
#> Retrieving all organisms available in gprofiler.
#> Using stored `gprofiler_orgs`.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: mmusculus
#> Retrieving all organisms available in gprofiler.
#> Using stored `gprofiler_orgs`.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: hsapiens