Convert an HPO phenotype dataframe generated by make_phenos_dataframe to a GRangesList split by HPO ID. The resulting object contains the genes (and gene metadata) associated with each phenotype.

phenos_to_granges(
  phenos = NULL,
  phenotype_to_genes = load_phenotype_to_genes(),
  hpo = get_hpo(),
  keep_chr = c(seq(22), "X", "Y"),
  by = c("hpo_id", "disease_id"),
  gene_col = "intersection",
  split.field = "hpo_id",
  as_datatable = FALSE,
  allow.cartesian = FALSE,
  verbose = TRUE
)

Arguments

phenos

A data.table containing HPO IDs and other metadata.

phenotype_to_genes

Output of load_phenotype_to_genes mapping phenotypes to gene annotations.

hpo

Human Phenotype Ontology object, loaded from get_ontology.

keep_chr

Chromosomes to keep.
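
As a rough illustration of the default value, the sketch below shows what c(seq(22), "X", "Y") evaluates to and how such a vector can be used to filter rows. The genes table and its column names are made up for the example, and the unprefixed chromosome naming ("1" rather than "chr1") is an assumption; this is not the function's internal code.

```r
# Default value of keep_chr: mixing numbers and strings coerces everything
# to character, giving "1", "2", ..., "22", "X", "Y".
keep_chr <- c(seq(22), "X", "Y")

# Hypothetical gene table (toy values, not real annotations).
genes <- data.frame(
  gene_symbol = c("G1", "G2", "G3"),
  seqnames = c("1", "MT", "X")
)

# Keep only rows whose chromosome is listed in keep_chr;
# the mitochondrial entry ("MT") is dropped.
kept <- genes[genes$seqnames %in% keep_chr, ]
print(kept)
```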

by

A vector of shared column names in x and y (the two tables being merged) to merge on. This defaults to the shared key columns between the two tables. If y has no key columns, this defaults to the key of x.
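
The description of by follows data.table's merge. A minimal sketch of a merge on the two default columns is given below; the tables x and y and all of their values are toy data invented for the example, and the sketch does not claim to reproduce what phenos_to_granges does internally.

```r
library(data.table)

# Toy phenotype table (made-up IDs and names).
x <- data.table(
  hpo_id = c("HP:9999991", "HP:9999992"),
  disease_id = c("OMIM:1", "OMIM:2"),
  hpo_name = c("pheno A", "pheno B")
)

# Toy phenotype-to-gene table sharing the same two ID columns.
y <- data.table(
  hpo_id = c("HP:9999991", "HP:9999991", "HP:9999992"),
  disease_id = c("OMIM:1", "OMIM:1", "OMIM:2"),
  gene_symbol = c("G1", "G2", "G3")
)

# Merge on both shared columns. allow.cartesian = FALSE makes data.table
# stop if the join would produce more rows than nrow(x) + nrow(y).
merged <- merge(x, y, by = c("hpo_id", "disease_id"), allow.cartesian = FALSE)
print(merged)
```

Each row of x is matched to every row of y with the same hpo_id and disease_id, so a phenotype linked to several genes appears once per gene in the result.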

gene_col

Name of the gene column.

split.field

A character string of a recognized column name in df (the data frame being converted) that contains the grouping. This column defines how the rows of df are split and is typically a factor or character vector. When split.field is not provided the df will be split by the number of rows.
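
The wording of split.field matches GenomicRanges::makeGRangesListFromDataFrame. The sketch below shows how that function groups rows by a column; the data frame and its coordinates are toy values, and using this function here is an illustration of the splitting behaviour rather than a statement about the internals of phenos_to_granges.

```r
library(GenomicRanges)

# Toy gene coordinates annotated with a made-up HPO ID per row.
df <- data.frame(
  hpo_id = c("HP:9999991", "HP:9999991", "HP:9999992"),
  seqnames = c("1", "2", "X"),
  start = c(100L, 200L, 300L),
  end = c(150L, 250L, 350L),
  gene_symbol = c("G1", "G2", "G3")
)

# Split rows into one GRanges per hpo_id; keep.extra.columns = TRUE retains
# gene_symbol as a metadata column on each range.
grl <- makeGRangesListFromDataFrame(
  df,
  split.field = "hpo_id",
  keep.extra.columns = TRUE
)

length(grl)   # one list element per unique hpo_id
names(grl)    # the hpo_id values
lengths(grl)  # number of ranges (genes) per hpo_id
```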

as_datatable

Return as a data.table.

allow.cartesian

See allow.cartesian in [.data.table.

verbose

Print messages.

Value

A GRangesList, or a data.table if as_datatable = TRUE.

Examples

phenos <- make_phenos_dataframe(ancestor = "Neurodevelopmental delay")
#> Reading cached RDS file: phenotype_to_genes.txt
#> + Version: v2024-04-26
#> Extracting data for 23 descendents.
#> Computing gene counts.
#> Adding term definitions.
#> Adding level-2 ancestor to each HPO ID.
#> Adding ancestor metadata.
#> Ancestor metadata already present. Use force_new=TRUE to overwrite.
#> 23 associations remain after filtering.
#> Getting absolute ontology level for 18,536 IDs.
#> Computing ontology level / gene count ratio.
grl <- phenos_to_granges(phenos = phenos)
#> Converting phenos to GRangesList.
#> Reading cached RDS file: phenotype_to_genes.txt
#> + Version: v2024-04-26
#> Annotating phenos with Disease
#> Reading cached RDS file: phenotype.hpoa
#> + Version: v2024-04-26
#> Loading required namespace: ensembldb
#> Gathering metadata for 2469 unique genes.
#> Loading required namespace: EnsDb.Hsapiens.v75
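
Once the call above has finished, the result can be inspected with standard GRangesList accessors. The following is a sketch only: it assumes HPOExplorer and its annotation dependencies are installed and that the required data can be downloaded or read from cache, and it shows no output because the contents depend on the HPO release in use.

```r
library(HPOExplorer)

# Same calls as in the example above.
phenos <- make_phenos_dataframe(ancestor = "Neurodevelopmental delay")
grl <- phenos_to_granges(phenos = phenos)

length(grl)        # number of list elements (one per split.field value)
head(names(grl))   # names of the first few elements
grl[[1]]           # GRanges of genes for the first element

# Flatten to a single GRanges when the per-phenotype grouping is not needed.
gr <- unlist(grl)

# Request a data.table instead, via the documented as_datatable argument.
dt <- phenos_to_granges(phenos = phenos, as_datatable = TRUE)
```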