R/bulk_power_analysis.r
bulk_power_analysis.Rd
Runs the complete bulk RNA-seq-informed power analysis pipeline by performing downsampling-based DEG detection across multiple scRNA-seq datasets, comparing overlaps with bulk RNA-seq DEGs, and generating summary plots for evaluation.
bulk_power_analysis(
SCEs,
dataset_names,
celltype_correspondence,
output_path = getwd(),
celltypeIDs = "cell_type",
sampled = "individuals",
sampleIDs = "donor_id",
bulkDE = "placeholder",
bulk_cutoff = 0.9,
pvalue = 0.05,
Nperms = 20,
fontsize_axislabels = 12,
fontsize_axisticks = 9,
fontsize_title = 14,
fontsize_legendlabels = 9,
fontsize_legendtitle = 9,
plot_title = "placeholder"
)
SCEs
A list of SingleCellExperiment (SCE) objects, each representing a scRNA-seq dataset.
dataset_names
A vector of names corresponding to each dataset (as you would like them to appear in output plots).
celltype_correspondence
A named list that maps each standard cell type label to how that cell type appears in each dataset, in order of SCEs (e.g., list(Micro = c("Micro", NA), Astro = c(NA, "Astro"))). Use NA if the cell type is not present in a given dataset.
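As a minimal sketch (not taken from the package), a celltype_correspondence for two hypothetical datasets could be built as follows. The dataset names and the "Oligo" entry are illustrative assumptions; only the list(...) shape comes from the description above.

```r
# Hypothetical dataset names, in the same order as the SCEs list.
dataset_names <- c("dataset_A", "dataset_B")

# One entry per standard cell type label. Each entry has one element per
# dataset: that dataset's own label, or NA if the cell type is absent.
celltype_correspondence <- list(
  Micro = c("Micro", NA),                # present only in dataset_A
  Astro = c(NA, "Astro"),                # present only in dataset_B
  Oligo = c("Oligo", "Oligodendrocytes") # illustrative: labelled differently per dataset
)

# Sanity check: every entry must supply one label (or NA) per dataset.
stopifnot(all(lengths(celltype_correspondence) == length(dataset_names)))

# Which datasets contain each cell type?
present <- lapply(celltype_correspondence, function(x) dataset_names[!is.na(x)])
```

A check like the stopifnot() line catches entries whose length does not match the number of datasets before the analysis is run.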
output_path
Path to a clean directory (one that contains no subdirectories) where the DGE analysis outputs for the down-sampled datasets and the summary plots will be saved. Default is the current working directory (getwd()).
celltypeIDs
A character vector specifying the column name in each SCE that denotes cell type identity (in order of SCEs). Default is "cell_type".
sampled
Specifies the unit of down-sampling. Can be either "individuals" or "cells", depending on whether the analysis down-samples across individuals (samples) or across cells. Default is "individuals".
sampleIDs
A character vector specifying the column name in each SCE that represents sample or donor IDs (in order of SCEs). Default is "donor_id".
bulkDE
DGE analysis output for a bulk RNA-seq dataset (e.g., LFSR.tsv): rows (row names) should be genes, columns should be tissues, and entries should be significance levels.
bulk_cutoff
Proportion (0–1) of bulk tissues in which a gene must be differentially expressed to be considered (e.g., 0.9 selects DEGs found in ≥90% of tissues). Default is 0.9.
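The effect of bulk_cutoff can be illustrated with a toy matrix. This is a sketch of the selection rule as described above, not the package's implementation. The gene and tissue names, the values, and the per-tissue significance threshold of 0.05 are all assumptions made for illustration.

```r
# Toy bulk DE table: rows are genes, columns are tissues, entries are
# significance levels (smaller = more significant). All values are made up.
bulkDE <- matrix(
  c(0.01, 0.02, 0.03, 0.01,   # geneA: significant in 4 of 4 tissues
    0.01, 0.20, 0.03, 0.40,   # geneB: significant in 2 of 4 tissues
    0.04, 0.01, 0.02, 0.30),  # geneC: significant in 3 of 4 tissues
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("tissue1", "tissue2", "tissue3", "tissue4"))
)

sig_threshold <- 0.05  # assumed per-tissue significance threshold
bulk_cutoff <- 0.75    # lower than the 0.9 default so the toy example keeps two genes

# Proportion of tissues in which each gene is significant.
prop_sig <- rowMeans(bulkDE < sig_threshold)

# Genes that meet the cutoff.
bulk_DEGs <- names(prop_sig)[prop_sig >= bulk_cutoff]
```

With these toy values, geneA and geneC pass the cutoff. At the default of 0.9, only geneA would be retained.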
pvalue
P-value threshold for selecting DEGs in each individual dataset. Default is 0.05.
Nperms
Number of subsets (permutations) to generate at each downsampling level during power analysis. Each subset is analyzed independently to estimate variability. Default is 20.
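To make the role of Nperms concrete, the following sketch draws random donor subsets at several down-sampling levels. It illustrates the idea only: the donor IDs, the levels, and the use of sample() are assumptions, and the package may construct its subsets differently.

```r
set.seed(1)  # make the illustration reproducible

donors <- paste0("donor_", 1:10)  # hypothetical donor IDs
levels_n <- c(2, 5, 8)            # hypothetical numbers of donors to keep
Nperms <- 20                      # subsets per level (the documented default)

# For each level, draw Nperms random subsets of donors without replacement.
subsets <- lapply(levels_n, function(n) {
  replicate(Nperms, sample(donors, n), simplify = FALSE)
})
names(subsets) <- paste0("n_", levels_n)

# Each subset would then be analyzed independently; the spread of results
# across the Nperms subsets estimates the variability at that level.
n_subsets_per_level <- lengths(subsets)
```

Larger values of Nperms give a steadier estimate of variability at each level, at the cost of proportionally more DGE runs.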
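fontsize_axislabels
Font size for axis labels in the plot. Default is 12.
fontsize_axisticks
Font size for axis tick labels in the plot. Default is 9.
fontsize_title
Font size for the plot title. Default is 14.
fontsize_legendlabels
Font size for legend labels in the plot. Default is 9.
fontsize_legendtitle
Font size for the legend title in the plot. Default is 9.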
plot_title
Plot title.
Saves all plots in the appropriate directories under output_path.