Check GPT phenotype annotations using several metrics.
gpt_annot_codify(
  annot = gpt_annot_read(),
  remove_duplicates = TRUE,
  code_dict = c(never = 0, rarely = 1, often = 2, always = 3),
  weights_dict = list(death = 6, intellectual_disability = 5,
    impaired_mobility = 4, physical_malformations = 3, blindness = 4,
    sensory_impairments = 3, immunodeficiency = 3, cancer = 3,
    reduced_fertility = 1, congenital_onset = 1),
  reset_weights_dict = FALSE,
  filters = list()
)
annot: GPT-generated phenotype annotations.
remove_duplicates: Ensure there is only 1 row per phenotype.
code_dict: Numerical encodings of annotation values.
weights_dict: Weights to be applied to each annotation metric.
reset_weights_dict: Override the weights_dict values and set all of them to 1, so that all annotations are unweighted.
filters: A named list in which each element name is the name of a column in the data, and the vector stored in that element gives the values to include in the final data.
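To make the roles of code_dict, weights_dict, reset_weights_dict and filters more concrete, here is a minimal base R sketch on a toy table. It is not the package's implementation: the toy data, the order of the steps, and the choice to multiply each coded value by its weight are assumptions made purely for illustration.

```r
# Illustrative sketch only: toy data with hypothetical phenotypes and values.
code_dict <- c(never = 0, rarely = 1, often = 2, always = 3)
weights_dict <- list(death = 6, intellectual_disability = 5)

annot <- data.frame(
  phenotype = c("Phenotype A", "Phenotype B", "Phenotype C"),
  death = c("often", "never", "always"),
  intellectual_disability = c("rarely", "always", "never"),
  stringsAsFactors = FALSE
)

# filters: keep only rows whose value in each named column is in the vector.
filters <- list(death = c("often", "always"))
keep <- rep(TRUE, nrow(annot))
for (col in names(filters)) {
  keep <- keep & annot[[col]] %in% filters[[col]]
}
annot <- annot[keep, , drop = FALSE]

# code_dict: translate each categorical value into its numeric code.
metrics <- names(weights_dict)
coded <- annot
for (m in metrics) {
  coded[[m]] <- unname(code_dict[annot[[m]]])
}

# weights_dict: scale each coded metric by its weight (assumed here to be a
# simple multiplication). Using a weight of 1 for every metric would mimic
# reset_weights_dict = TRUE.
weighted <- coded
for (m in metrics) {
  weighted[[m]] <- coded[[m]] * weights_dict[[m]]
}

print(weighted)
```

In this toy case the filter drops the phenotype whose death value is "never", and the remaining rows are converted to numbers before the weights are applied. The real function may combine or order these steps differently.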
Returns a named list.
res_coded <- gpt_annot_codify()
#> Translating ontology terms to ids.
#> Reading cached RDS file: phenotype_to_genes.txt
#> + Version: v2024-04-26
#> 256 phenotypes do not have matching HPO IDs.
#> Reading in GPT annotations for 16,879 phenotypes.