Functions to map IDs across ontologies/databases.

map_colors(
  dat,
  columns = NULL,
  as = c("vector", "dict", "name", "function"),
  what = "nodes",
  preferred_palettes = NULL
)

map_genes_monarch(
  dat,
  gene_col,
  as_graph = methods::is(dat, "tbl_graph"),
  map_by_merge = FALSE,
  all.x = FALSE
)

map_medgen(dat, input_col, ...)

map_mondo(
  dat,
  input_col = "id",
  output_col = "mondo_id",
  to = "mondo",
  map_types = NULL,
  map_to = NULL,
  top_n = NULL,
  add_name = TRUE,
  add_definitions = TRUE,
  all.x = TRUE,
  allow.cartesian = FALSE,
  save_dir = cache_dir()
)

map_ontology_terms(
  ont,
  terms = NULL,
  to = c("name", "id"),
  keep_order = TRUE,
  invert = FALSE
)

map_upheno(
  pheno_map_method = c("upheno", "monarch"),
  gene_map_method = c("monarch"),
  subset_db1 = c("HP"),
  terms = NULL,
  fill_scores = NULL,
  show_plot = TRUE,
  force_new = FALSE,
  save_dir = cache_dir()
)

map_variants(
  gr,
  build = c("GRCh37", "GRCh38"),
  upstream = 2000L,
  downstream = 200L,
  keep_chr = paste0("chr", c(seq_len(22), "X", "Y")),
  ignore.strand = TRUE
)

Source

https://data.monarchinitiative.org/upheno2/current/qc/index

https://data.monarchinitiative.org/upheno2/current/upheno-release/all/index.html

Arguments

dat

data.table with genes.

columns

Names of columns to map colour palettes to.

as

A character string specifying the format to convert to.

what

What should get activated? Possible values are nodes or edges.

preferred_palettes

Preferred palettes to use for each column.

gene_col

Name of the gene column in dat.

as_graph

Return the object as a tbl_graph.

map_by_merge

Map orthologs by merging the node data such that the orthologous genes will appear as a new column (TRUE). Otherwise, the orthologs will be added as new nodes to the graph (FALSE).
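
The two modes of map_by_merge can be sketched in base R on a plain node table. This is only an illustration under stated assumptions: the tables, column names and IDs below are placeholders, the sketch ignores graph edges, and it does not reproduce the internals of map_genes_monarch().

```r
# Toy node table and toy ortholog mapping; all values are placeholders.
nodes <- data.frame(id = c("g1", "g2"), stringsAsFactors = FALSE)
orth <- data.frame(
  id = c("g1", "g2"),
  ortholog = c("h1", "h2"),
  stringsAsFactors = FALSE
)

# map_by_merge = TRUE (sketch): orthologs appear as a new column.
merged <- merge(nodes, orth, by = "id")

# map_by_merge = FALSE (sketch): orthologs are appended as additional nodes.
appended <- rbind(
  nodes,
  data.frame(id = orth$ortholog, stringsAsFactors = FALSE)
)

ncol(merged)    # columns after merging
nrow(appended)  # nodes after appending
```

Merging keeps the number of nodes fixed and widens the table, while appending keeps the columns fixed and grows the node set.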

all.x

logical; if TRUE, rows from x which have no matching row in y are included. These rows will have 'NA's in the columns that are usually filled with values from y. The default is FALSE so that only rows with data from both x and y are included in the output.
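
The effect of all.x can be seen with base R's merge(), which takes the same argument. A minimal sketch with made-up tables (the column names and the ortholog values are placeholders, not output of any function on this page):

```r
# Toy input table (x) and toy mapping table (y); all values are placeholders.
x <- data.frame(gene = c("A", "B", "C"), stringsAsFactors = FALSE)
y <- data.frame(
  gene = c("A", "B"),
  ortholog = c("a1", "b1"),
  stringsAsFactors = FALSE
)

# all.x = FALSE: only rows of x with a match in y are returned.
inner <- merge(x, y, by = "gene", all.x = FALSE)

# all.x = TRUE: unmatched rows of x are kept, with NA in the columns from y.
left <- merge(x, y, by = "gene", all.x = TRUE)

nrow(inner)
nrow(left)
left$ortholog[left$gene == "C"]
```

Note that the defaults differ between functions on this page: map_genes_monarch() uses all.x = FALSE, whereas map_mondo() uses all.x = TRUE.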

input_col

Column name of input IDs.

...

Arguments passed on to map_mondo

output_col

Column name of output IDs.

to

Character vector of database(s) to map IDs to. When not "mondo", can supply multiple alternative databases to map to (e.g. c("OMIM","Orphanet","DECIPHER")).

map_types

Mapping types to include.

map_to

Mapping outputs to include (from Mondo IDs to another database's IDs).

top_n

Top number of mappings to return per top_by grouping. Set to NULL to skip this step.

add_name

Logical; if TRUE, add a Mondo name column.

add_definitions

Logical; if TRUE, add a Mondo definition column.

allow.cartesian

See allow.cartesian in [.data.table.

save_dir

Directory to save cached data.

ont

An ontology of class ontology_DAG.

terms

A subset of HPO IDs to include in the final dataset and plots (e.g. c("HP:0001508","HP:0001507")).

keep_order

Return a named list of the same length and order as terms. If FALSE, return a named list of only the unique terms, sometimes in a different order.
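
The difference between the two keep_order settings can be illustrated with a plain named character vector standing in for the ontology. This is a sketch only: the dictionary, its IDs and its names are hypothetical, and the deduplicated branch is an assumption about what "only the unique terms" looks like.

```r
# Hypothetical ID -> name dictionary; IDs and names are placeholders.
dict <- c("X:1" = "term one", "X:2" = "term two")

# Input containing a repeated term.
terms <- c("X:2", "X:1", "X:2")

# keep_order = TRUE (sketch): one output per input, in the input order.
ordered <- dict[terms]

# keep_order = FALSE (sketch): one output per unique term.
uniq <- dict[unique(terms)]

length(ordered)
length(uniq)
```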

invert

Invert the keys/values of the dictionary, such that the keys become the values (and vice versa).
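
Inverting a dictionary of this kind amounts to swapping names and values. A minimal base R sketch, using a hypothetical dictionary with placeholder IDs and names:

```r
# Hypothetical ID -> name dictionary; placeholders only.
dict <- c("X:1" = "term one", "X:2" = "term two")

# Swap keys and values: names become values and values become names.
inverted <- stats::setNames(names(dict), unname(dict))

inverted[["term one"]]
```

If several keys share the same value, the inverted dictionary will contain duplicated names, so lookups by name return only the first match.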

pheno_map_method

Method to use for mapping phenotypes across ontologies.

  • "upheno": Use uPheno's phenotype-to-phenotype mappings. Contains fewer ontologies but with greater coverage of phenotypes.

  • "monarch": Use Monarch's phenotype-to-phenotype mappings. Contains more ontologies but with less coverage of phenotypes.

gene_map_method

Method to use for mapping genes across species.

  • "monarch": Use Monarch's gene-to-gene mappings.

subset_db1

Subset of ontologies to include in the plot.

fill_scores

Fill missing scores in the "equivalence_score" and "subclass_score" columns with this value. These columns represent the quality of mapping between two phenotypes on a scale from 0-1.
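
What filling the two score columns looks like can be sketched on a toy table. The score values and the fill value of 0 are placeholders chosen for illustration, and the loop is an assumption about the behaviour rather than the package's own code.

```r
# Toy mapping table; the score values are placeholders.
maps <- data.frame(
  equivalence_score = c(0.9, NA, 0.4),
  subclass_score = c(NA, 0.7, NA)
)

# Hypothetical value supplied as fill_scores.
fill_value <- 0

# Replace missing scores in both columns with the fill value.
score_cols <- c("equivalence_score", "subclass_score")
for (col in score_cols) {
  maps[[col]][is.na(maps[[col]])] <- fill_value
}

anyNA(maps)
```

With the default fill_scores = NULL, no value is supplied for missing scores.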

show_plot

Show the plot.

force_new

Force new data to be downloaded and processed.

gr

A GRanges object.

build

Genome build to use when mapping genomic coordinates.

upstream, downstream

Single integer values representing the number of base pairs upstream of the 5'-end and downstream of the 3'-end. Used in constructing PromoterVariants() and IntergenicVariants() objects only.

keep_chr

Which chromosomes to keep.
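
The default keep_chr value expands to the 24 standard human chromosome names with a "chr" prefix. The sketch below shows that filter in isolation on a hypothetical vector of sequence names; how map_variants() applies it internally is not documented here, so treat the subsetting step as an assumption.

```r
# Default value of keep_chr, as given in the usage section.
keep_chr <- paste0("chr", c(seq_len(22), "X", "Y"))

# Hypothetical sequence names from an input object.
seqs <- c("chr1", "chrX", "chrM", "chrUn_gl000220")

# Keep only names that appear in keep_chr.
kept <- seqs[seqs %in% keep_chr]

kept
```

In this sketch, names without the "chr" prefix (e.g. "1") would not match the default and would be dropped by a plain %in% comparison.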

ignore.strand

A logical indicating if strand should be ignored when performing overlaps.

Value

Mapped data.

Mapped dat

Character vector

A list containing the data and plot.

Functions

  • map_colors(): map_

  • map_genes_monarch(): Map Monarch genes

    Map Monarch gene IDs to HGNC gene symbols, within or across species.

  • map_medgen(): Map Medgen.

  • map_mondo(): Map to/from Mondo IDs

  • map_ontology_terms(): Map ontology terms to an alternative name.

    Harmonise a mixed vector of term names (e.g. "Focal motor seizure") and term IDs (e.g. c("HP:0000002","HP:0000003")).

  • map_upheno(): Map phenotypes across uPheno

    Map phenotypes across species within the Unified Phenotype Ontology (uPheno). First, gathers phenotype-phenotype mappings across ontologies. Next, gathers all phenotype-gene associations for each ontology, converts all genes to human HGNC orthologs, and computes the number of overlapping orthologs between all pairs of mapped phenotypes. Finally, plots the results as the proportion of intersecting genes between all pairs of phenotypes.

  • map_variants(): map_
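
The ortholog-overlap step described for map_upheno() can be sketched on toy data. Everything here is illustrative: the phenotype IDs and gene symbols are placeholders, and the denominator used for the proportion (the number of genes annotated to the first phenotype of each pair) is an assumption, since the description above does not specify it.

```r
# Toy phenotype -> ortholog sets; IDs and gene symbols are placeholders.
genes <- list(
  "P1:1" = c("G1", "G2", "G3", "G4"),
  "P2:1" = c("G2", "G3", "G5")
)

# One mapped phenotype pair across two ontologies.
pairs <- data.frame(id1 = "P1:1", id2 = "P2:1", stringsAsFactors = FALSE)

# Count the orthologs shared by each pair.
pairs$n_intersect <- unname(mapply(
  function(a, b) length(intersect(genes[[a]], genes[[b]])),
  pairs$id1, pairs$id2
))

# Assumed denominator: number of genes annotated to the first phenotype.
pairs$prop_intersect <- pairs$n_intersect / unname(lengths(genes[pairs$id1]))

pairs$prop_intersect
```

Other denominators (for example the size of the union of both gene sets) would give a different proportion for the same pair.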

Examples

colors <- map_colors(dat=mtcars, columns=c("cyl","gear"), preferred_palettes="viridis")
#> Using palette: viridis
#> Using palette: okabe
dat <- example_dat("gene")
dt2 <- map_genes_monarch(dat=dat, gene_col="gene")
#> Filtering with `queries`.
#> Files found: 1
#> Importing 1 Monarch files.
#> - 1/1: gene_homology.all
#> Unique species with orthologs: 25
#> Filtered 'subject_db' : 6,503,843 / 6,906,371 rows dropped.
#> Unique orthologs: 314,817
#> 6 / 6 rows remain after gene orthology mapping.
dat <- example_dat(rm_types="gene")
dat2 <- map_mondo(dat = dat, map_to="hpo")
#> Loading required namespace: echogithub
#> Preexisting file detected. Set force_overwrite=TRUE to override this.
#> Mapping id --> mondo_id
#>  All local files already up-to-date!
#> Importing cached file: /github/home/.cache/R/KGExplorer/mondo.owl
#> Adding term metadata.
#> IC_method: IC_offspring
#> Adding ancestor metadata.
#> Getting absolute ontology level for 31,550 IDs.
#> 2412 ancestors found at level 2
#> Translating all terms to names.
#> + Returning a vector of terms (same order as input).
#> Converted ontology to: adjacency 
#> Getting absolute ontology level for 31,550 IDs.
#> 4 / 21 (19.05%) mondo_id missing.
#> 4 / 21 (19.05%) mondo_name missing.
#> 14 / 21 (66.67%) mondo_def missing.
ont <- get_ontology("hp")
#>  All local files already up-to-date!
#> Importing cached file: /github/home/.cache/R/KGExplorer/hp-international.owl
#> Adding term metadata.
#> IC_method: IC_offspring
#> Adding ancestor metadata.
#> Getting absolute ontology level for 25,301 IDs.
#> 900 ancestors found at level 2
#> Translating all terms to names.
#> + Returning a vector of terms (same order as input).
#> Converted ontology to: adjacency 
#> Getting absolute ontology level for 25,301 IDs.
terms <- c("Focal motor seizure","HP:0000002","HP:0000003")
term_names <- map_ontology_terms(ont=ont, terms=terms)
#> Translating all terms to names.
#> + Returning a vector of terms (same order as input).
term_ids <- map_ontology_terms(ont=ont, terms=terms, to="id")
#> Translating all terms to HPO IDs.
#> + Returning a vector of terms (same order as input).
if (FALSE) {
res <- map_upheno()
}
if(interactive()){
gr <- GenomicRanges::GRanges("1:100-10000")
hits <- map_variants(gr)
}