Functions to map IDs across ontologies/databases.
map_colors(
dat,
columns = NULL,
as = c("vector", "dict", "name", "function"),
what = "nodes",
preferred_palettes = NULL
)
map_genes_monarch(
dat,
gene_col,
as_graph = methods::is(dat, "tbl_graph"),
map_by_merge = FALSE,
all.x = FALSE
)
map_medgen(dat, input_col, ...)
map_mondo(
dat,
input_col = "id",
output_col = "mondo_id",
to = "mondo",
map_types = NULL,
map_to = NULL,
top_n = NULL,
add_name = TRUE,
add_definitions = TRUE,
all.x = TRUE,
allow.cartesian = FALSE,
save_dir = cache_dir()
)
map_ontology_terms(
ont,
terms = NULL,
to = c("name", "id"),
keep_order = TRUE,
invert = FALSE
)
map_upheno(
pheno_map_method = c("upheno", "monarch"),
gene_map_method = c("monarch"),
subset_db1 = c("HP"),
terms = NULL,
fill_scores = NULL,
show_plot = TRUE,
force_new = FALSE,
save_dir = cache_dir()
)
map_variants(
gr,
build = c("GRCh37", "GRCh38"),
upstream = 2000L,
downstream = 200L,
keep_chr = paste0("chr", c(seq_len(22), "X", "Y")),
ignore.strand = TRUE
)
https://data.monarchinitiative.org/upheno2/current/qc/index
https://data.monarchinitiative.org/upheno2/current/upheno-release/all/index.html
data.table with genes.
Names of columns to map colour palettes to.
A character string specifying the format to convert to.
What should get activated? Possible values are nodes or edges.
Preferred palettes to use for each column.
Name of the gene column in dat.
Return the object as a tbl_graph.
Map orthologs by merging the node data such that the orthologous genes will appear as a new column (TRUE). Otherwise, the orthologs will be added as new nodes to the graph (FALSE).
Logical; if TRUE, rows from x which have no matching row in y are included. These rows will have NAs in the columns that are usually filled with values from y. If FALSE, only rows with data from both x and y are included in the output. Per the usage above, map_genes_monarch() defaults to FALSE and map_mondo() defaults to TRUE.
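The effect of all.x can be seen with a plain base R merge. This is a minimal sketch on toy tables; the column names and values are invented for illustration and are not KGExplorer output, and the package may use a different merge implementation internally.

```r
# Toy tables; the column names and values are illustrative only.
x <- data.frame(gene = c("A", "B", "C"), score = c(1, 2, 3))
y <- data.frame(gene = c("A", "C"), ortholog = c("a1", "c1"))

# all.x = FALSE: only rows with a match in both tables are kept.
inner <- merge(x, y, by = "gene", all.x = FALSE)

# all.x = TRUE: every row of x is kept; unmatched rows get NA in y's columns.
left <- merge(x, y, by = "gene", all.x = TRUE)

print(inner)
print(left)
```

With all.x = TRUE, gene "B" is retained with a missing ortholog value; with all.x = FALSE it is dropped.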
Column name of input IDs.
Arguments passed on to map_mondo
Column name of output IDs.
Character vector of database(s) to map IDs to. When not "mondo", can supply multiple alternative databases to map to (e.g. c("OMIM","Orphanet","DECIPHER")).
Mapping types to include.
Mapping outputs to include (from Mondo IDs to another database's IDs).
Top number of mappings to return per top_by grouping. Set to NULL to skip this step.
Logical; if TRUE, add a Mondo name column.
Logical; if TRUE, add a Mondo definition column.
See allow.cartesian in [.data.table.
Directory to save cached data.
An ontology of class ontology_DAG.
A subset of HPO IDs to include in the final dataset and plots (e.g. c("HP:0001508","HP:0001507")).
Return a named list of the same length and order as terms. If FALSE, return a named list of only the unique terms, sometimes in a different order.
Invert the keys/values of the dictionary, such that the key becomes the values (and vice versa).
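The keep_order and invert behaviour described above can be sketched with a named character vector in base R. Everything below is hypothetical: the lookup table, the placeholder IDs and labels, and the helper map_terms_sketch() are invented for illustration and are not the package's implementation.

```r
# Hypothetical ID -> name lookup (placeholder IDs and labels, not real HPO terms).
id_to_name <- c("X:0001" = "term one", "X:0002" = "term two")

# Hypothetical helper mimicking the documented behaviour; not the package's code.
map_terms_sketch <- function(terms, keep_order = TRUE, invert = FALSE) {
  # keep_order = FALSE: collapse the input to its unique terms.
  if (!keep_order) terms <- unique(terms)
  out <- unname(id_to_name[terms])
  # Assumption: terms that are not IDs in the lookup are already names.
  out[is.na(out)] <- terms[is.na(out)]
  names(out) <- terms
  # invert = TRUE: swap keys and values of the dictionary.
  if (invert) out <- stats::setNames(names(out), out)
  out
}

terms <- c("term two", "X:0001", "X:0001")
ordered  <- map_terms_sketch(terms)
uniq     <- map_terms_sketch(terms, keep_order = FALSE)
inverted <- map_terms_sketch(terms, keep_order = FALSE, invert = TRUE)
```

Here ordered has one entry per input term (duplicates included), uniq has one entry per distinct term, and inverted is keyed by name rather than by the input term.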
Method to use for mapping phenotypes across ontologies.
"upheno": Use uPheno's phenotype-to-phenotype mappings. Contains fewer ontologies but with greater coverage of phenotypes.
"monarch": Use Monarch's phenotype-to-phenotype mappings. Contains more ontologies but with less coverage of phenotypes.
Method to use for mapping genes across species.
"monarch": Use Monarch's gene-to-gene mappings.
Subset of ontologies to include in the plot.
Fill missing scores in the "equivalence_score" and "subclass_score" columns with this value. These columns represent the quality of mapping between two phenotypes on a scale from 0-1.
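As a rough illustration of what filling missing scores means, the base R sketch below replaces NA values in the two score columns with a single value. The toy table, the fill value of 0, and the assumption that NULL leaves missing scores untouched are all illustrative and not taken from the package source.

```r
# Toy mapping table; IDs and scores are illustrative only.
maps <- data.frame(
  id1 = c("P1", "P2", "P3"),
  equivalence_score = c(0.9, NA, 0.4),
  subclass_score = c(NA, 0.7, NA)
)

fill_scores <- 0  # hypothetical choice; scores are on a 0-1 scale
score_cols <- c("equivalence_score", "subclass_score")

filled <- maps
# Assumption: NULL means "leave missing scores as NA".
if (!is.null(fill_scores)) {
  for (col in score_cols) {
    filled[[col]][is.na(filled[[col]])] <- fill_scores
  }
}
print(filled)
```

Observed scores are left unchanged; only the missing entries take the fill value.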
Show the plot.
Force new data to be downloaded and processed.
A GRanges object.
Genome build to use when mapping genomic coordinates.
Single integer values representing the number of base pairs upstream of the 5'-end and downstream of the 3'-end. Used in constructing PromoterVariants() and IntergenicVariants() objects only.
Which chromosomes to keep.
A logical indicating if strand should be ignored when performing overlaps.
Mapped data.
Mapped dat
Character vector
A list containing the data and plot.
map_colors(): map_
map_genes_monarch(): map_
Map Monarch genes
Map Monarch gene IDs to HGNC gene symbols, within or across species.
map_medgen(): map_
Map Medgen.
map_mondo(): map_
Map to/from mondo IDs
map_ontology_terms(): map_
Map ontology terms to an alternative name.
Harmonise a mixed vector of term names (e.g. "Focal motor seizure") and term IDs (e.g. c("HP:0000002","HP:0000003")).
map_upheno(): map_
Map phenotypes across uPheno
Map phenotypes across species within the Unified Phenotype Ontology (uPheno). First, gathers phenotype-phenotype mappings across ontologies. Next, gathers all phenotype-gene associations for each ontology, converts all genes to human HGNC orthologs, and computes the number of overlapping orthologs between all pairs of mapped phenotypes. Finally, plots the results as the proportion of intersecting genes between all pairs of phenotypes.
map_variants(): map_
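The ortholog-overlap step described for map_upheno() can be sketched in base R as follows. The phenotype IDs and gene symbols are placeholders, and the choice of denominator (the number of genes annotated to the first phenotype) is an assumption made for this sketch; the package may normalise differently.

```r
# Placeholder phenotype -> human ortholog gene sets (not real annotations).
genes_by_pheno <- list(
  "HP:A" = c("G1", "G2", "G3", "G4"),
  "MP:B" = c("G2", "G3", "G5")
)

# Genes shared by the mapped pair of phenotypes.
shared <- intersect(genes_by_pheno[["HP:A"]], genes_by_pheno[["MP:B"]])
n_intersect <- length(shared)

# Assumption: proportion is taken relative to the first phenotype's gene set.
prop_intersect <- n_intersect / length(unique(genes_by_pheno[["HP:A"]]))

print(shared)
print(prop_intersect)
```

In this toy case two of the four genes annotated to the first phenotype are shared with the second.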
colors <- map_colors(dat=mtcars, columns=c("cyl","gear"), preferred_palettes="viridis")
#> Using palette: viridis
#> Using palette: okabe
dat <- example_dat("gene")
dt2 <- map_genes_monarch(dat=dat, gene_col="gene")
#> Filtering with `queries`.
#> Files found: 1
#> Importing 1 Monarch files.
#> - 1/1: gene_homology.all
#> Unique species with orthologs: 25
#> Filtered 'subject_db' : 6,503,843 / 6,906,371 rows dropped.
#> Unique orthologs: 314,817
#> 6 / 6 rows remain after gene orthology mapping.
dat <- example_dat(rm_types="gene")
dat2 <- map_mondo(dat = dat, map_to="hpo")
#> Loading required namespace: echogithub
#> Preexisting file detected. Set force_overwrite=TRUE to override this.
#> Mapping id --> mondo_id
#> ℹ All local files already up-to-date!
#> Importing cached file: /github/home/.cache/R/KGExplorer/mondo.owl
#> Adding term metadata.
#> IC_method: IC_offspring
#> Adding ancestor metadata.
#> Getting absolute ontology level for 31,550 IDs.
#> 2412 ancestors found at level 2
#> Translating all terms to names.
#> + Returning a vector of terms (same order as input).
#> Converted ontology to: adjacency
#> Getting absolute ontology level for 31,550 IDs.
#> 4 / 21 (19.05%) mondo_id missing.
#> 4 / 21 (19.05%) mondo_name missing.
#> 14 / 21 (66.67%) mondo_def missing.
ont <- get_ontology("hp")
#> ℹ All local files already up-to-date!
#> Importing cached file: /github/home/.cache/R/KGExplorer/hp-international.owl
#> Adding term metadata.
#> IC_method: IC_offspring
#> Adding ancestor metadata.
#> Getting absolute ontology level for 25,301 IDs.
#> 900 ancestors found at level 2
#> Translating all terms to names.
#> + Returning a vector of terms (same order as input).
#> Converted ontology to: adjacency
#> Getting absolute ontology level for 25,301 IDs.
terms <- c("Focal motor seizure","HP:0000002","HP:0000003")
term_names <- map_ontology_terms(ont=ont, terms=terms)
#> Translating all terms to names.
#> + Returning a vector of terms (same order as input).
term_ids <- map_ontology_terms(ont=ont, terms=terms, to="id")
#> Translating all terms to HPO IDs.
#> + Returning a vector of terms (same order as input).
if (FALSE) {
res <- map_upheno()
}
if(interactive()){
gr <- GenomicRanges::GRanges("1:100-10000")
hits <- map_variants(gr)
}