R/power_analysis.r
power_analysis.Rd
Run the complete power analysis pipeline by downsampling individuals and cells, performing differential expression analysis, and generating power plots.
power_analysis(
  SCE,
  range_downsampled_individuals = "placeholder",
  range_downsampled_cells = "placeholder",
  output_path = getwd(),
  sampleID = "donor_id",
  design = "placeholder",
  sexID = "sex",
  celltypeID = "cell_type",
  coef = "male",
  fdr = 0.05,
  nom_pval = 0.05,
  Nperms = 20,
  y = NULL,
  region = "single_region",
  control = NULL,
  pval_adjust_method = "BH",
  rmv_zero_count_genes = TRUE
)
SCE
A SingleCellExperiment object containing the input scRNA-seq data. You may also provide a path to an .R, .rds, or .qs file. If using a file, ensure the SCE object inside is named SCE.
range_downsampled_individuals
A numeric vector specifying the number of individuals to include at each downsampling level, in ascending order (e.g., c(10, 20, 30)). By default, 12 evenly spaced values are generated from 0 to the total number of samples, each rounded up to the nearest multiple of 5.
range_downsampled_cells
A numeric vector specifying the number of cells per individual to include at each downsampling level, in ascending order (e.g., c(20, 40, 60)). By default, 11 evenly spaced values are generated from 0 to the 90th percentile of per-individual cell counts, each rounded to the nearest multiple of 5.
output_path
Path to the directory where the DGE analysis outputs for the downsampled datasets and the power plots will be saved. Defaults to the current working directory.
sampleID
Name of the column in the SCE metadata that identifies biological replicates (e.g., patient ID). This column is used for grouping in the pseudobulk approach.
design
A model formula specifying covariates for differential expression analysis. It should be of class formula (e.g., ~ sex + pmi + disease). This formula is used to fit a generalized linear model.
sexID
Name of the column in the SCE metadata that encodes the sex of individuals. Default is "sex".
celltypeID
Name of the column in the SCE metadata containing cell type labels. This is used to identify cell-type-specific DEGs.
coef
Character string giving the level of the response variable (y) to test for in differential expression. For case-control studies, this is typically the case level (e.g., "AD"). Used for binary comparisons; not required for continuous outcomes.
fdr
Adjusted p-value (False Discovery Rate) threshold for selecting significantly differentially expressed genes (DEGs). Only genes with adjusted p-values below this value will be retained. Default is 0.05.
nom_pval
Nominal (unadjusted) p-value threshold for selecting DEGs. Used as an alternative to FDR when preferred. Only genes with p-values below this cutoff will be retained. Default is 0.05.
Nperms
Number of subsets (permutations) to generate at each downsampling level during power analysis. Each subset is analyzed independently to estimate variability. Default is 20.
y
Name of the column in the SCE metadata representing the response variable (e.g., "diagnosis", coding case versus control status). If not specified, defaults to the last variable in the design formula. Accepts both categorical (logistic regression) and continuous (linear regression) variables.
region
Optional column in the SCE metadata indicating the tissue or brain region. If present, differential expression is performed within each region separately. Defaults to "single_region" (i.e., no regional split).
control
Optional. Character string specifying the control level in the response variable (y) to compare against. Only required if y contains more than two levels. Ignored for binary or continuous outcomes.
pval_adjust_method
Method used to adjust p-values for multiple testing. Default is "BH" (Benjamini–Hochberg). See stats::p.adjust for available options.
rmv_zero_count_genes
Logical. Whether to remove genes with zero counts across all cells. Default is TRUE.
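The default downsampling grids described for range_downsampled_individuals and range_downsampled_cells can be approximated in base R. The following is a minimal sketch, not the package's own code: default_grid() is a hypothetical helper, the toy sample and cell counts are made up, and the exact rounding and de-duplication rules used internally may differ.

```r
# Hypothetical helper (not part of the poweranalysis API): evenly spaced
# values from 0 to `upper`, each snapped to a multiple of `step`.
default_grid <- function(upper, n_levels, round_fun = ceiling, step = 5) {
  raw <- seq(0, upper, length.out = n_levels)  # evenly spaced raw values
  round_fun(raw / step) * step                 # snap to multiples of `step`
}

# Toy inputs, invented for illustration only.
n_samples <- 48
cells_per_individual <- c(120, 340, 95, 410, 275, 180, 505, 60)

# Individuals: 12 levels, rounded up to the nearest multiple of 5.
individuals_grid <- default_grid(n_samples, n_levels = 12, round_fun = ceiling)

# Cells: 11 levels up to the 90th percentile, rounded to the nearest multiple of 5.
cells_grid <- default_grid(
  quantile(cells_per_individual, 0.9, names = FALSE),
  n_levels = 11,
  round_fun = round
)

print(individuals_grid)
print(cells_grid)
```

Under this sketch, rounding can yield repeated levels or a top level above the number of available samples, so passing explicit ascending vectors (e.g., c(10, 20, 30)) may be preferable when specific levels matter.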
Saves all plots and DGE analysis outputs to the appropriate directories under output_path.
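Which genes count as DEGs in the saved outputs depends on fdr, nom_pval, and pval_adjust_method. The sketch below shows how such thresholds are commonly applied using stats::p.adjust. The p-values are simulated and the filtering logic is an assumption for illustration, not the package's internal code.

```r
set.seed(1)

# Simulated p-values for 100 hypothetical genes: 5 strong signals, 95 nulls.
pvals <- c(runif(5, 0, 0.001), runif(95))
gene_names <- paste0("gene", seq_along(pvals))

# Multiple-testing adjustment, mirroring pval_adjust_method = "BH".
adj_pvals <- p.adjust(pvals, method = "BH")

# Thresholds matching the documented defaults.
fdr <- 0.05
nom_pval <- 0.05

# Genes passing the adjusted (FDR) threshold vs. the nominal threshold.
degs_fdr <- gene_names[adj_pvals < fdr]
degs_nominal <- gene_names[pvals < nom_pval]

print(length(degs_fdr))
print(length(degs_nominal))
```

Because BH-adjusted p-values are never smaller than the raw ones, the FDR-based set is a subset of the nominal set when both thresholds are equal.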
if (FALSE) { # \dontrun{
  # Too slow to run with check()
  # 1. Prepare SCE
  micro_tsai <- system.file("extdata", "Tsai_Micro.qs", package = "poweranalysis")
  SCE_tsai <- qs::qread(micro_tsai)
  # 2. Run Power Analysis
  PA_tsai <- poweranalysis::power_analysis(
    SCE_tsai,
    sampleID = "sample_id",
    celltypeID = "cluster_celltype",
    design = ~ sex,
    coef = "M",
    output_path = tempdir()
  )
  PA_tsai
} # }
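When y is left as NULL, the response is documented to default to the last variable in the design formula, and control names the reference level when y has more than two levels. The base R sketch below illustrates both ideas under stated assumptions: the toy diagnosis factor is hypothetical, and how the package resolves these arguments internally is not shown in this page.

```r
# A design formula like the one in the argument description.
design <- ~ sex + pmi + disease
stopifnot(inherits(design, "formula"))

# Variables in the formula, in order of appearance.
design_vars <- all.vars(design)

# The last variable is what y would default to, per the documentation.
y_default <- design_vars[length(design_vars)]
print(y_default)

# Toy multi-level response (hypothetical data): setting the reference level
# plays the role of the `control` argument.
diagnosis <- factor(c("AD", "control", "MCI", "AD", "control"))
diagnosis <- relevel(diagnosis, ref = "control")
print(levels(diagnosis))
```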