Introduction

KGExplorer is an R package designed for efficient exploration and manipulation of biomedical knowledge graphs and ontologies. Its core functionalities include:

  • Graph Manipulation and Visualization: Tools like plot_graph_3d() and plot_ggnetwork() enable intuitive visualization and analysis of ontological data.

  • Data Retrieval: Functions such as get_mondo_maps() and get_ontology() facilitate fetching data from prominent biomedical databases.

  • ID Mapping: Mapping utilities like map_genes_monarch() and map_mondo() allow seamless conversion between various identifier systems.

  • Graph Filtering: With functions like filter_graph() and filter_kg(), users can subset knowledge graphs down to the nodes and edges relevant to their question.

  • Graph Conversion: Utilities such as graph_to_dt() and graph_to_plotly() provide flexibility to transform knowledge graphs into different data formats or visualization-ready structures.

The package builds on established R libraries such as tidygraph and data.table, and on the plotting tools plotly and ggplot2. KGExplorer adopts a modular design, with dedicated functions for data retrieval, manipulation, visualization, and utility operations. Downloaded resources are cached, so large datasets do not need to be fetched again on every run.
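To make the graph-to-table idea concrete, here is a minimal sketch that uses tidygraph and data.table directly rather than KGExplorer's own helpers. The node identifiers, categories, and edges are invented for illustration, and the sketch only approximates the kind of conversion a function such as graph_to_dt() performs; the package's actual output columns may differ.

```r
library(tidygraph)
library(data.table)

# Toy nodes and edges, invented for illustration (not real Monarch records)
node_df <- data.frame(
  id = c("CL:1", "MONDO:1", "HP:1"),
  category = c("biolink:Cell", "biolink:Disease", "biolink:PhenotypicFeature")
)
edge_df <- data.frame(
  from = c("CL:1", "CL:1", "MONDO:1"),
  to = c("MONDO:1", "HP:1", "HP:1")
)

# Character from/to values are matched against the "id" column of the nodes
g_toy <- tbl_graph(nodes = node_df, edges = edge_df,
                   directed = TRUE, node_key = "id")

# Pull the node and edge tables out of the graph
node_dt <- as.data.table(tibble::as_tibble(activate(g_toy, "nodes")))
edge_dt <- as.data.table(tibble::as_tibble(activate(g_toy, "edges")))

# Edge endpoints are stored as row indices into the node table;
# translate them back to identifiers and categories
edge_dt[, `:=`(
  from_id = node_dt$id[from],
  to_id = node_dt$id[to],
  to_category = node_dt$category[to]
)]

# Count edges per target category
edge_dt[, .N, by = to_category]
```

Once the edges are in a data.table, the usual grouping and joining operations apply, which is often more convenient than working on the graph object for tabular summaries.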

Use cases

The following examples illustrate how KGExplorer can be used to extract meaningful insights from biomedical knowledge graphs. Each use case highlights key functionalities of the package and demonstrates practical applications in biomedical research.

To explore additional functionalities and find detailed documentation for all available functions, users can refer to the KGExplorer reference guide.

Extract disease/phenotype-cell type associations

The package provides functionality to extract associations between diseases/phenotypes and cell types. The example retrieves the Monarch knowledge graph, keeps only the edges that connect cell types to diseases or phenotypes, and plots the result as an interactive network.

# Get the Monarch knowledge graph
g <- get_monarch_kg()

# Keep only edges that run from cell types to diseases/phenotypes
g2 <- filter_kg(g, 
                to_categories = c("biolink:Disease",
                                  "biolink:PhenotypicFeature"),
                from_categories = "biolink:Cell")

# Plot the filtered graph using visNetwork
plot_graph_visnetwork(g2,
                      selectedBy = "id",
                      label_var = "name",
                      layout = "layout_nicely",
                      colour_var = "category")
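For readers who want to see what a category-based edge filter amounts to, the following is a rough sketch written with plain tidygraph verbs on a toy graph. It is not the implementation of filter_kg(); the node identifiers and edges are made up, and the variable names from_categories and to_categories are reused here only to mirror the arguments above.

```r
library(tidygraph)

# Toy graph, invented for illustration (not real Monarch records)
node_df <- data.frame(
  id = c("CL:1", "MONDO:1", "HP:1", "HGNC:1"),
  category = c("biolink:Cell", "biolink:Disease",
               "biolink:PhenotypicFeature", "biolink:Gene")
)
edge_df <- data.frame(
  from = c("CL:1", "CL:1", "HGNC:1", "MONDO:1"),
  to = c("MONDO:1", "HP:1", "MONDO:1", "HP:1")
)
g_toy <- tbl_graph(nodes = node_df, edges = edge_df,
                   directed = TRUE, node_key = "id")

from_categories <- "biolink:Cell"
to_categories <- c("biolink:Disease", "biolink:PhenotypicFeature")

g_sub <- g_toy %>%
  # Keep edges whose source and target nodes fall in the requested categories;
  # .N() exposes the node table while the edges are active
  activate("edges") %>%
  filter(.N()$category[from] %in% from_categories,
         .N()$category[to] %in% to_categories) %>%
  # Drop nodes left without any edge after the edge filter
  activate("nodes") %>%
  filter(!node_is_isolated())

g_sub
```

In this toy case the gene node and the disease-to-phenotype edge are removed, leaving only the two edges that start at the cell node.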

Assess known animal models of human phenotypes

The package includes tools to assess known animal models of human phenotypes, which can be crucial for translational research and understanding disease mechanisms.

# Map uPheno data
dat <- map_upheno_data()

# Plot the mapped uPheno data
upheno_plots <- plot_upheno(dat)

Link diseases to phenotypes, genes, and variants

Users can link diseases to phenotypes, genes, and variants to obtain a combined view of the genetic and phenotypic landscape. This can aid in identifying potential genetic markers and understanding the genetic basis of diseases.

# Link Monarch data to create a graph with variant-disease, variant-phenotype,
# and variant-gene associations
gm <- link_monarch(maps = list(
  c("variant", "disease"),
  c("variant", "phenotype"),
  c("variant", "gene")
))

# Join the linked graph with the Monarch graph `g` retrieved in the first
# use case
gm2 <- tidygraph::graph_join(gm, g)

# Filter the graph to include only specific categories of nodes
gm2 <- filter_graph(gm2,
                    node_filters = list(category=c("disease",
                                                   "phenotype",
                                                   "phenotypicfeature",
                                                   "gene",
                                                   "variant")))

# Further filter the graph to limit the size
gm3 <- filter_graph(g = gm2, size = 20000)

# Update vertex attributes for category and name
igraph::vertex_attr(gm3,"category") <- 
  tolower(gsub("biolink:","", igraph::vertex_attr(gm3,"category")))
igraph::vertex_attr(gm3,"name") <- igraph::vertex_attr(gm3,"id")

# Plot the final graph using visNetwork
plot_graph_visnetwork(gm3, 
                      selectedBy = "category",
                      label_var = "name",
                      layout = "layout_nicely",
                      colour_var = "category")
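Two steps in this workflow, joining graphs and normalizing the category labels, can be tried on toy data without downloading anything. The sketch below uses tidygraph and igraph directly; the identifiers are invented, and passing fixed = TRUE to gsub() is a small addition made here so that "biolink:" is treated as a literal string rather than a regular expression.

```r
library(tidygraph)

# Two toy graphs that share one node (MONDO:1); all records are invented
g_a <- tbl_graph(
  nodes = data.frame(id = c("HGNC:1", "MONDO:1"),
                     category = c("biolink:Gene", "biolink:Disease")),
  edges = data.frame(from = "HGNC:1", to = "MONDO:1"),
  node_key = "id"
)
g_b <- tbl_graph(
  nodes = data.frame(id = c("MONDO:1", "HP:1"),
                     category = c("biolink:Disease",
                                  "biolink:PhenotypicFeature")),
  edges = data.frame(from = "MONDO:1", to = "HP:1"),
  node_key = "id"
)

# graph_join() performs a full join on the node tables, so the shared node
# is merged and the edges of both graphs are kept
g_joined <- graph_join(g_a, g_b, by = c("id", "category"))

# Strip the "biolink:" prefix and lower-case the labels, as in the workflow
igraph::vertex_attr(g_joined, "category") <-
  tolower(gsub("biolink:", "",
               igraph::vertex_attr(g_joined, "category"),
               fixed = TRUE))

igraph::vertex_attr(g_joined, "category")
```

Normalizing the labels this way means that values such as "biolink:PhenotypicFeature" and "phenotypicfeature" end up identical, which keeps the colour legend in the plot from splitting one category into two.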

Session Info

utils::sessionInfo()
## R Under development (unstable) (2025-03-04 r87880)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] KGExplorer_0.99.06
## 
## loaded via a namespace (and not attached):
##   [1] grr_0.9.5                 httr2_1.1.1              
##   [3] rlang_1.1.5               magrittr_2.0.3           
##   [5] clue_0.3-66               GetoptLong_1.0.5         
##   [7] matrixStats_1.5.0         compiler_4.5.0           
##   [9] png_0.1-8                 systemfonts_1.2.1        
##  [11] vctrs_0.6.5               maps_3.4.2.1             
##  [13] gprofiler2_0.2.3          stringr_1.5.1            
##  [15] rvest_1.0.4               shape_1.4.6.1            
##  [17] pkgconfig_2.0.3           crayon_1.5.3             
##  [19] fastmap_1.2.0             backports_1.5.0          
##  [21] promises_1.3.2            rmarkdown_2.29           
##  [23] ragg_1.3.3                purrr_1.0.4              
##  [25] xfun_0.51                 cachem_1.1.0             
##  [27] pals_1.10                 aplot_0.2.5              
##  [29] jsonlite_1.9.1            later_1.4.1              
##  [31] BiocParallel_1.41.2       cluster_2.1.8            
##  [33] broom_1.0.7               parallel_4.5.0           
##  [35] R6_2.6.1                  rols_3.3.0               
##  [37] stringi_1.8.4             RColorBrewer_1.1-3       
##  [39] bslib_0.9.0               car_3.1-3                
##  [41] jquerylib_0.1.4           Rcpp_1.0.14              
##  [43] iterators_1.0.14          knitr_1.49               
##  [45] IRanges_2.41.3            httpuv_1.6.15            
##  [47] igraph_2.1.4              Matrix_1.7-2             
##  [49] tidyselect_1.2.1          dichromat_2.0-0.1        
##  [51] abind_1.4-8               yaml_2.3.10              
##  [53] doParallel_1.0.17         codetools_0.2-20         
##  [55] lattice_0.22-6            tibble_3.2.1             
##  [57] shiny_1.10.0              Biobase_2.67.0           
##  [59] treeio_1.31.0             evaluate_1.0.3           
##  [61] gridGraphics_0.5-1        desc_1.4.3               
##  [63] xml2_1.3.7                circlize_0.4.16          
##  [65] pillar_1.10.1             ggtree_3.15.0            
##  [67] ggpubr_0.6.0              carData_3.0-5            
##  [69] stats4_4.5.0              foreach_1.5.2            
##  [71] ggfun_0.1.8               plotly_4.10.4            
##  [73] generics_0.1.3            S4Vectors_0.45.4         
##  [75] ggplot2_3.5.1             munsell_0.5.1            
##  [77] scales_1.3.0              tidytree_0.4.6           
##  [79] xtable_1.8-4              glue_1.8.0               
##  [81] orthogene_1.13.0          mapproj_1.2.11           
##  [83] scatterplot3d_0.3-44      lazyeval_0.2.2           
##  [85] tools_4.5.0               data.table_1.17.0        
##  [87] ggsignif_0.6.4            babelgene_22.9           
##  [89] fs_1.6.5                  tidygraph_1.3.1          
##  [91] grid_4.5.0                tidyr_1.3.1              
##  [93] ape_5.8-1                 colorspace_2.1-1         
##  [95] nlme_3.1-167              patchwork_1.3.0          
##  [97] homologene_1.4.68.19.3.27 Formula_1.2-5            
##  [99] cli_3.6.4                 rappdirs_0.3.3           
## [101] Polychrome_1.5.1          textshaping_1.0.0        
## [103] viridisLite_0.4.2         ComplexHeatmap_2.23.0    
## [105] dplyr_1.1.4               gtable_0.3.6             
## [107] rstatix_0.7.2             yulab.utils_0.2.0        
## [109] sass_0.4.9                digest_0.6.37            
## [111] BiocGenerics_0.53.6       ggplotify_0.1.2          
## [113] rjson_0.2.23              htmlwidgets_1.6.4        
## [115] farver_2.1.2              htmltools_0.5.8.1        
## [117] pkgdown_2.1.1             simona_1.5.0             
## [119] lifecycle_1.0.4           httr_1.4.7               
## [121] mime_0.12                 GlobalOptions_0.1.2