Check GPT phenotype annotations using several metrics.

gpt_annot_codify(
  annot = gpt_annot_read(),
  remove_duplicates = TRUE,
  code_dict = c(never = 0, rarely = 1, often = 2, always = 3),
  weights_dict = list(death = 6, intellectual_disability = 5, impaired_mobility = 4,
    physical_malformations = 3, blindness = 4, sensory_impairments = 3,
    immunodeficiency = 3, cancer = 3, reduced_fertility = 1, congenital_onset = 1),
  reset_weights_dict = FALSE,
  filters = list()
)

Arguments

annot

GPT-generated phenotype annotations.

remove_duplicates

Ensure only 1 row per phenotype.

code_dict

Numerical encodings of annotation values.

weights_dict

Weights to be applied to each annotation metric.

reset_weights_dict

If TRUE, override weights_dict and set every weight to 1, so that all annotations are unweighted.

filters

A named list in which each element's name is a column in the data and its value is a vector of the values to keep in the final data.
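
To make the roles of code_dict, weights_dict and filters concrete, here is a minimal base R sketch on a toy table. It is an illustration under assumptions, not the function's actual implementation: the toy data frame, its column names (phenotype, death, blindness), and the choice to multiply each encoded value by its weight are all hypothetical.

```r
# Hypothetical toy annotations; the column names are assumptions for illustration.
annot <- data.frame(
  phenotype = c("A", "B", "C"),
  death     = c("never", "often", "always"),
  blindness = c("rarely", "never", "often"),
  stringsAsFactors = FALSE
)

code_dict    <- c(never = 0, rarely = 1, often = 2, always = 3)
weights_dict <- list(death = 6, blindness = 4)
metrics      <- names(weights_dict)

# Encode: look up each annotation string in code_dict.
coded <- annot
coded[metrics] <- lapply(annot[metrics], function(x) unname(code_dict[x]))

# Weight: multiply each encoded metric by its weight (assumed combination rule).
weighted <- coded
for (m in metrics) {
  weighted[[m]] <- coded[[m]] * weights_dict[[m]]
}

# Unweighted variant: every weight set to 1, as described for reset_weights_dict.
unit_weights <- lapply(weights_dict, function(w) 1)

# Filter: keep rows whose column values appear in the matching filters element.
# An empty list() leaves all rows in place.
filters <- list(phenotype = c("A", "C"))
keep <- rep(TRUE, nrow(weighted))
for (col in names(filters)) {
  keep <- keep & weighted[[col]] %in% filters[[col]]
}
filtered <- weighted[keep, ]
```

In this sketch, filters behaves as an inclusion list per column, matching the description above; whether the real function applies filters before or after encoding is not stated here.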

Value

Named list

Examples

res_coded <- gpt_annot_codify()
#> Translating ontology terms to ids.
#> Reading cached RDS file: phenotype_to_genes.txt
#> + Version: v2024-12-12
#> 383 phenotypes do not have matching HPO IDs.
#> Reading in GPT annotations for 16,753 phenotypes.