Runs entire power analysis pipeline

power_analysis(
  data,
  range_downsampled_individuals = "placeholder",
  range_downsampled_cells = "placeholder",
  output_path = getwd(),
  sampleID = "donor_id",
  design = "placeholder",
  sexID = "sex",
  celltypeID = "cell_type",
  coeff = "male",
  fdr = 0.05,
  nom_pval = 0.05,
  Nperms = 20,
  y = NULL,
  region = "single_region",
  control = NULL,
  pval_adjust_method = "BH",
  rmv_zero_count_genes = TRUE
)

Arguments

data

the input data (should be a SingleCellExperiment (SCE) object)

range_downsampled_individuals

vector or list of the numbers of individuals at which the data will be downsampled, in ascending order

range_downsampled_cells

vector or list of the numbers of cells at which the data will be downsampled, in ascending order

output_path

base path in which outputs will be stored

sampleID

the column name in the SCE object for the sample ID

design

the design formula, an object of class formula. This is the equation used to fit the generalised linear model, e.g. expression ~ sex + pmi + disease

sexID

the column name in the SCE object for sex

celltypeID

the column name in the SCE object for the cell type

coeff

the coefficient with respect to which the DE analysis is carried out

fdr

the cut-off False Discovery Rate below which to select DEGs

nom_pval

the cut-off nominal P-value below which to select DEGs (as an alternative to FDR)

Nperms

number of subsets created when downsampling at each level

y

the column name in the SCE object for the response variable, e.g. "diagnosis" (case or disease). Default is the last variable in the design formula. y can be discrete (logistic regression) or continuous (linear regression)
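
As a rough illustration of what "the last variable in the design formula" means, the sketch below pulls the right-hand-side variables out of the example formula using base R only. How power_analysis derives this default internally is not shown here, so treat the helper name `last_design_variable` as hypothetical.

```r
# Hypothetical helper (not part of the package): returns the last variable
# named on the right-hand side of a one- or two-sided formula.
last_design_variable <- function(design) {
  stopifnot(inherits(design, "formula"))
  rhs <- design[[length(design)]]   # last element of a formula is its RHS
  vars <- all.vars(rhs)             # variable names in order of appearance
  vars[length(vars)]
}

design <- expression ~ sex + pmi + disease
default_y <- last_design_variable(design)
default_y
# [1] "disease"
```

With this design, leaving y = NULL would correspond to using "disease" as the response variable.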

region

the column name in the SCE object for the study region. If there are multiple regions in the study (for example, two brain regions), pseudobulk values can be derived separately for each. Default is "single_region", which will not split by region.

control

character specifying the control level for the differential expression analysis, e.g. in a case/control/other study, use "control" to compare against the rows labelled "control" in the y column. NOTE: this only needs to be specified if there are more than two groups in y; leave it as the default for two groups or a continuous y. Default is NULL.

pval_adjust_method

the adjustment method for the p-values in the differential expression analysis. Default is Benjamini-Hochberg ("BH"). See stats::p.adjust for available options
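
To make the relationship between pval_adjust_method, fdr and nom_pval concrete, here is a minimal base R sketch using stats::p.adjust on made-up p-values. The gene names and p-values are invented for illustration and are not output from power_analysis.

```r
# Invented p-values for four hypothetical genes
pvals <- c(geneA = 0.001, geneB = 0.012, geneC = 0.04, geneD = 0.30)

# Benjamini-Hochberg adjustment, as selected by pval_adjust_method = "BH"
adj_pvals <- p.adjust(pvals, method = "BH")

fdr <- 0.05       # cut-off on adjusted p-values
nom_pval <- 0.05  # cut-off on unadjusted (nominal) p-values

degs_fdr <- names(adj_pvals)[adj_pvals < fdr]
degs_nominal <- names(pvals)[pvals < nom_pval]

degs_fdr
# [1] "geneA" "geneB"
degs_nominal
# [1] "geneA" "geneB" "geneC"
```

Here geneC passes the nominal cut-off but not the FDR cut-off, because its adjusted p-value (0.04 * 4 / 3, roughly 0.053) is above 0.05.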

rmv_zero_count_genes

whether genes with no count values in any cell should be removed. Default is TRUE

Value

Saves all plots and DGE analysis outputs in the appropriate directories.
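
A call might be assembled roughly as follows. This is a sketch under stated assumptions: the downsampling ranges, the design formula and the object name `sce` are illustrative choices, not values taken from the package. The call itself is guarded so that the snippet only runs the pipeline when power_analysis and an SCE object named `sce` are actually available in the session.

```r
# Illustrative argument values (assumptions, not package defaults apart from
# sampleID, sexID, celltypeID, coeff and Nperms, which match the usage above)
pa_args <- list(
  range_downsampled_individuals = c(10, 20, 30),
  range_downsampled_cells = c(50, 100, 200),
  output_path = tempdir(),
  sampleID = "donor_id",
  design = expression ~ sex,
  sexID = "sex",
  celltypeID = "cell_type",
  coeff = "male",
  Nperms = 20
)

# Both downsampling ranges are documented as being in ascending order
stopifnot(
  !is.unsorted(pa_args$range_downsampled_individuals),
  !is.unsorted(pa_args$range_downsampled_cells)
)

# Run the pipeline only if the function and the input object exist
if (exists("power_analysis") && exists("sce")) {
  do.call(power_analysis, c(list(data = sce), pa_args))
}
```

Building the arguments as a list first makes it easy to check the ascending-order requirement before starting what may be a long run.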