The Human Phenotype Ontology (HPO) is a controlled vocabulary of phenotypic abnormalities encountered in human disease. It currently contains over 18,000 hierarchically organised terms. Each term in the HPO describes a phenotypic abnormality, ranging from very broad phenotypes (e.g. “Abnormality of the nervous system”) down to extremely specific phenotypes (e.g. “Decreased CSF 5-hydroxyindolacetic acid concentration”).
The HPO is currently being used in thousands of exome and genome sequencing projects around the world to aid in the interpretation of human variation, in clinical practice to support differential diagnosis and to annotate patient information, and in research to understand the role of rare variants in human health and disease. The HPO was developed by the Monarch Initiative in collaboration with The Jackson Laboratory.
HPOExplorer
HPOExplorer is an R package with extensive functions for importing, annotating, filtering, and visualising the Human Phenotype Ontology (HPO) at the disease, phenotype, and gene levels. By pulling fresh data directly from official resources such as HPO, Monarch, and GenCC, it keeps versions tightly coordinated with the most up-to-date data available at any given time (with optional caching to improve speed). It can also efficiently reorganise gene annotations into sparse matrices for use in downstream statistical and machine learning analyses.
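To make the sparse-matrix idea concrete, here is a minimal sketch of turning long-format gene annotations into a genes-by-phenotypes sparse matrix. It is not HPOExplorer's own implementation and does not call any HPOExplorer function: it uses only the Matrix package that ships with R, and the data frame `annot`, its column names (`hpo_id`, `gene_symbol`), and all identifiers in it are hypothetical toy values chosen for illustration.

```r
library(Matrix)

# Hypothetical toy annotations in long format: one row per phenotype-gene pair.
# The IDs and symbols below are placeholders, not real HPO terms or genes.
annot <- data.frame(
  hpo_id = c("PHENO_1", "PHENO_1", "PHENO_2", "PHENO_3"),
  gene_symbol = c("GENE_A", "GENE_B", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Factor levels define the row (gene) and column (phenotype) order.
genes <- factor(annot$gene_symbol)
phenos <- factor(annot$hpo_id)

# Build a genes x phenotypes sparse matrix; a 1 marks an annotated pair
# and every other cell is stored implicitly as zero.
gene_by_pheno <- Matrix::sparseMatrix(
  i = as.integer(genes),
  j = as.integer(phenos),
  x = 1,
  dims = c(nlevels(genes), nlevels(phenos)),
  dimnames = list(levels(genes), levels(phenos))
)

print(gene_by_pheno)
```

Because only the non-zero entries are stored, this layout stays compact even when thousands of phenotypes are crossed with thousands of genes, which is why it suits downstream modelling.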
HPOExplorer was developed by the Neurogenomics Lab at Imperial College London, with valuable feedback from the HPO team. The package is still actively evolving. Community engagement is welcome, and suggestions can be submitted as an Issue or Pull Request.
Within R:
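# Install BiocManager only if it is missing (checked without attaching it)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")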
BiocManager::install("neurogenomics/HPOExplorer")
library(HPOExplorer)
A quick tutorial on how to get started with HPOExplorer.
If you use HPOExplorer, please cite:
Kitty B. Murphy, Robert Gordon-Smith, Jai Chapman, Momoko Otani, Brian M. Schilder, Nathan G. Skene (2023) Identification of cell type-specific gene targets underlying thousands of rare diseases and subtraits. medRxiv, https://doi.org/10.1101/2023.02.13.23285820
utils::sessionInfo()
## R version 4.4.0 (2024-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] rmarkdown_2.27
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.5 jsonlite_1.8.8 renv_1.0.7
## [4] dplyr_1.1.4 compiler_4.4.0 BiocManager_1.30.23
## [7] tidyselect_1.2.1 rvcheck_0.2.1 scales_1.3.0
## [10] yaml_2.3.8 fastmap_1.2.0 here_1.0.1
## [13] ggplot2_3.5.1 R6_2.5.1 generics_0.1.3
## [16] knitr_1.46 yulab.utils_0.1.4 tibble_3.2.1
## [19] desc_1.4.3 dlstats_0.1.7 rprojroot_2.0.4
## [22] munsell_0.5.1 pillar_1.9.0 RColorBrewer_1.1-3
## [25] rlang_1.1.3 utf8_1.2.4 cachem_1.1.0
## [28] badger_0.2.3 xfun_0.44 fs_1.6.4
## [31] memoise_2.0.1 cli_3.6.2 magrittr_2.0.3
## [34] rworkflows_1.0.1 digest_0.6.35 grid_4.4.0
## [37] lifecycle_1.0.4 vctrs_0.6.5 data.table_1.15.4
## [40] evaluate_0.23 glue_1.7.0 fansi_1.0.6
## [43] colorspace_2.1-0 tools_4.4.0 pkgconfig_2.0.3
## [46] htmltools_0.5.8.1