Adjust MAGMA Z-statistic from .genes.out files — adjust_zstat_in_genesOut

Used when you want to directly analyse the gene-level Z-scores for a given GWAS while correcting for known confounding variables such as:

NSNPS : Number of SNPs
NPARAM : Number of parameters used in the MAGMA gene-level model
GENELEN : Gene length
log*** : The log-transformed version of each of the above variables, computed with R's default log function (natural logarithm).
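As a rough illustration of what such a correction can look like, the sketch below regresses simulated Z-scores on the covariates listed above and removes the fitted covariate effects. The data are simulated, and both the use of an ordinary linear model (stats::lm) and the column name ADJ_ZSTAT are assumptions made for illustration, not a statement of how adjust_zstat_in_genesOut is implemented.

```r
set.seed(1)
n <- 200

# Simulated gene-level table with the covariates named above (toy values).
magma <- data.frame(
  NSNPS   = sample(5:500, n, replace = TRUE),
  NPARAM  = sample(1:50, n, replace = TRUE),
  GENELEN = sample(1000:200000, n, replace = TRUE)
)

# Z-scores that partly depend on the number of SNPs (an artificial confound).
magma$ZSTAT <- 0.3 * log(magma$NSNPS) + rnorm(n)

# Log-transformed covariates, using R's default (natural) log.
magma$logNSNPS   <- log(magma$NSNPS)
magma$logNPARAM  <- log(magma$NPARAM)
magma$logGENELEN <- log(magma$GENELEN)

# Assumption: model the confounders with an ordinary linear regression.
mod <- stats::lm(
  ZSTAT ~ NSNPS + logNSNPS + NPARAM + logNPARAM + GENELEN + logGENELEN,
  data = magma
)

# Subtract the fitted covariate effects but keep the intercept,
# so the adjusted score stays on the original Z-score scale.
covariate_effect <- stats::predict(mod, newdata = magma) -
  stats::coef(mod)[["(Intercept)"]]
magma$ADJ_ZSTAT <- magma$ZSTAT - covariate_effect
```

After this step the adjusted score is, by construction, uncorrelated with every covariate included in the model, which is the property one usually wants before ranking genes by Z-score.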

adjust_zstat_in_genesOut(
  magma_GenesOut_file,
  ctd = NULL,
  ctd_species = infer_ctd_species(ctd),
  prepare_ctd = TRUE,
  method = "bonferroni",
  verbose = TRUE,
  ...
)
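The default for method in the signature above is "bonferroni", which is one of the method names accepted by stats::p.adjust. Assuming, for illustration only, that the argument controls multiple-testing correction of gene-level P-values, its effect would resemble the sketch below. The P-values are made up.

```r
# Made-up gene-level P-values.
p <- c(0.001, 0.01, 0.02, 0.5)

# Bonferroni correction: multiply each P-value by the number of tests,
# capping the result at 1.
q <- stats::p.adjust(p, method = "bonferroni")
print(q)
```

Other names accepted by stats::p.adjust (for example "BH" or "holm") could be substituted in the same call if the assumption holds.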

Arguments

magma_GenesOut_file

A MAGMA .genes.out file generated by map_snps_to_genes.
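For orientation, a .genes.out file is a whitespace-delimited table with one row per gene. The sketch below parses a tiny inline example with base R. The three rows are made-up values, and the column set shown (GENE, CHR, START, STOP, NSNPS, NPARAM, N, ZSTAT, P) is assumed from typical MAGMA gene analysis output, so check it against your own file.

```r
# Made-up rows in the layout assumed for a MAGMA .genes.out file.
genes_out_lines <- c(
  "GENE    CHR  START   STOP    NSNPS  NPARAM  N      ZSTAT     P",
  "148398  1    824993  894689  12     3       50000  0.51234   0.30421",
  "26155   1    844583  904955  20     5       50000  1.20345   0.11442",
  "339451  1    860967  911099  15     4       50000  -0.33112  0.62973"
)

# read.table splits on any run of whitespace by default.
genes_out <- utils::read.table(
  text = genes_out_lines,
  header = TRUE,
  stringsAsFactors = FALSE
)

# Gene length can be derived from the start and stop coordinates.
genes_out$GENELEN <- abs(genes_out$STOP - genes_out$START)
print(genes_out[, c("GENE", "NSNPS", "NPARAM", "GENELEN", "ZSTAT")])
```

To read a real file, replace the text argument with the file path returned by map_snps_to_genes.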

ctd

Cell type data structure containing specificity_quantiles.

ctd_species

Species name relevant to the CellTypeDataset (ctd). See list_species for all available species. If ctd_species is not supplied, the ctd species is inferred automatically (the default is infer_ctd_species(ctd)).

prepare_ctd

Whether to run prepare_quantile_groups on the ctd first.

method

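Note: the default value "bonferroni" shown in the usage is a multiple-testing correction method (see stats::p.adjust) rather than one of the options below. The options documented here are R packages to use for gene mapping:

"gprofiler" : Slower but more species and genes.
"homologene" : Faster but fewer species and genes.
"babelgene" : Faster but fewer species and genes. Also gives consensus scores for each gene mapping based on several different data sources.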

verbose

Print messages.

...

Arguments passed on to EWCE::standardise_ctd

dataset

CellTypeData name.

input_species

Which species the gene names in exp come from. See list_species for all available species.

output_species

Which species' gene names to convert exp to. See list_species for all available species.

sctSpecies_origin

Species that the sct_data originally came from, regardless of its current gene format (e.g. it was previously converted from mouse to human gene orthologs). This is used for computing an appropriate background.

non121_strategy

How to handle genes that don't have 1:1 mappings between input_species:output_species. Options include:

"drop_both_species" or "dbs" or 1 :
Drop genes that have duplicate mappings in either the input_species or output_species (DEFAULT).
"drop_input_species" or "dis" or 2 :
Only drop genes that have duplicate mappings in the input_species.
"drop_output_species" or "dos" or 3 :
Only drop genes that have duplicate mappings in the output_species.
"keep_both_species" or "kbs" or 4 :
Keep all genes regardless of whether they have duplicate mappings in either species.
"keep_popular" or "kp" or 5 :
Return only the most "popular" interspecies ortholog mappings. This procedure tends to yield a greater number of returned genes but at the cost of many of them not being true biological 1:1 orthologs.
"sum","mean","median","min" or "max" :
When gene_df is a matrix and gene_output="rownames", these options will aggregate many-to-one gene mappings (input_species-to-output_species) after dropping any duplicate genes in the output_species.
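The first four strategies differ only in which duplicated mappings they discard. The base-R sketch below applies them to a made-up ortholog table to show the intent. The gene names and the input_gene and output_gene column names are hypothetical, and this is not the package's own implementation.

```r
# Toy ortholog map: one row per input gene -> output gene pairing (made-up).
orth <- data.frame(
  input_gene  = c("A1", "A2", "A3", "A3", "A4", "A5"),
  output_gene = c("B1", "B2", "B3", "B4", "B5", "B5"),
  stringsAsFactors = FALSE
)

# Flag every row whose gene appears more than once on a given side.
dup_in  <- orth$input_gene  %in% orth$input_gene[duplicated(orth$input_gene)]
dup_out <- orth$output_gene %in% orth$output_gene[duplicated(orth$output_gene)]

drop_both_species   <- orth[!(dup_in | dup_out), ]  # strict 1:1 mappings only
drop_input_species  <- orth[!dup_in, ]              # drops the one-to-many A3 rows
drop_output_species <- orth[!dup_out, ]             # drops the many-to-one B5 rows
keep_both_species   <- orth                         # keeps everything

# Number of mappings retained under each strategy.
sapply(
  list(dbs = drop_both_species, dis = drop_input_species,
       dos = drop_output_species, kbs = keep_both_species),
  nrow
)
```

In this toy table A3 maps to two output genes and B5 is reached from two input genes, so only A1 and A2 survive the default strategy.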

force_new_quantiles

By default, quantile computation is skipped if quantiles have already been computed. Set force_new_quantiles=TRUE to override this and generate new quantiles.

force_standardise

If ctd has already been standardised, whether to rerun standardisation anyway (Default: FALSE).

remove_unlabeled_clusters

Remove any samples that have numeric column names.
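A minimal sketch of this kind of filtering, assuming that "numeric column names" means names made up solely of digits. The matrix and its labels are made up, and the regular expression is an illustrative choice rather than the package's own rule.

```r
# Toy expression matrix; two clusters were never given a label (made-up data).
exp_mat <- matrix(
  1:12,
  nrow = 3,
  dimnames = list(
    c("Gene1", "Gene2", "Gene3"),
    c("Astrocytes", "12", "Microglia", "7")
  )
)

# Column names consisting only of digits are treated as unlabeled.
is_numeric_name <- grepl("^[0-9]+$", colnames(exp_mat))

# drop = FALSE keeps the result a matrix even if one column remains.
exp_labeled <- exp_mat[, !is_numeric_name, drop = FALSE]
colnames(exp_labeled)
```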

numberOfBins

Number of non-zero quantile bins.
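One way to read "non-zero quantile bins" is that genes with zero specificity are kept in a separate bin 0, and the remaining genes are split into numberOfBins equally populated groups. The sketch below illustrates that reading on made-up values. It is an assumption about the intent, not the code used by prepare_quantile_groups.

```r
set.seed(42)
numberOfBins <- 5

# Made-up specificity values for one cell type; ten genes are not expressed.
specificity <- c(rep(0, 10), runif(40))

# Start with every gene in bin 0, then assign the non-zero genes.
bins <- integer(length(specificity))
nonzero <- specificity > 0

# Quantile break points computed on the non-zero values only.
breaks <- stats::quantile(
  specificity[nonzero],
  probs = seq(0, 1, length.out = numberOfBins + 1)
)

# include.lowest = TRUE ensures the minimum value falls into bin 1.
bins[nonzero] <- as.integer(
  cut(specificity[nonzero], breaks = breaks, include.lowest = TRUE)
)
table(bins)
```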

keep_annot

Keep the column annotation data if provided.

keep_plots

Keep the dendrograms if provided.

as_sparse

Convert to sparse matrix.

as_DelayedArray

Convert to DelayedArray.

rename_columns

Remove replace_chars from column names.

make_columns_unique

Rename each column with the prefix dataset.species.celltype.
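A small sketch of the naming pattern, using hypothetical dataset, species and cell-type values. It only illustrates the dataset.species.celltype format and is not the package's implementation.

```r
# Hypothetical values for illustration.
dataset   <- "ExampleCTD"
species   <- "mouse"
celltypes <- c("Astrocytes", "Microglia")

# Build names of the form dataset.species.celltype.
new_names <- paste(dataset, species, celltypes, sep = ".")
print(new_names)
```

Prefixing in this way keeps columns distinguishable when several CellTypeDatasets that share cell-type labels are combined.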

Examples

myGenesOut <- MAGMA.Celltyping::import_magma_files(
    ids = c("ieu-a-298"),
    file_types = ".genes.out",
    return_dir = FALSE)
#> Using built-in example files: ieu-a-298.tsv.gz.35UP.10DOWN
#> Returning MAGMA gene.* file paths
ctd <- ewceData::ctd()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache

magmaGenesOut <- MAGMA.Celltyping::adjust_zstat_in_genesOut(
    ctd = ctd,
    magma_GenesOut_file = myGenesOut,
    ctd_species = "mouse"
)
#> Standardising CellTypeDataset
#> Found 5 matrix types across 2 CTD levels.
#> Processing level: 1
#> Processing level: 2
#> Importing genes.out file.
#> 4 genes without HGNC gene symbols were dropped.
#> 371 genes that are absent from the ctd were dropped.
#> Computing adjusted Z-statistic.