Runs the correlation analysis pipeline by computing Spearman’s rank correlations of log₂ fold-changes for differentially expressed genes (DEGs) across and within multiple scRNA-seq datasets. Uses a user-specified reference dataset to define DEGs and compares effect sizes across studies, independently sampled subsets, and permuted controls.
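
The core comparison described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the data are simulated, and the column names `logFC` and `PValue` are assumptions chosen for illustration. DEGs are selected in a reference table at one p-value threshold, and the log₂ fold-changes of those genes are then compared with a second table using Spearman's rank correlation.

```r
set.seed(1)
n_genes <- 500
genes <- paste0("gene", seq_len(n_genes))

# Simulated DGE results (hypothetical data): both datasets share a common
# underlying effect per gene, plus independent noise.
true_effect <- rnorm(n_genes)
ref <- data.frame(
  gene   = genes,
  logFC  = true_effect + rnorm(n_genes, sd = 0.3),
  PValue = runif(n_genes)
)
other <- data.frame(
  gene  = genes,
  logFC = true_effect + rnorm(n_genes, sd = 0.3)
)

# Define DEGs in the reference dataset at a single p-value threshold.
degs <- ref$gene[ref$PValue < 0.05]

# Compare the effect sizes of those genes across the two datasets.
idx_ref   <- match(degs, ref$gene)
idx_other <- match(degs, other$gene)
rho <- cor(ref$logFC[idx_ref], other$logFC[idx_other], method = "spearman")
print(rho)
```

Repeating the last two steps for each value in pvals would give one correlation per threshold.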

correlation_analysis(
  main_dataset,
  SCEs,
  sampleIDs,
  celltypeIDs,
  celltype_correspondence,
  dataset_names,
  pvals = c(0.05, 0.025, 0.01, 0.001, 1e-04),
  alphaval = 0.25,
  N_randperms = 5,
  N_subsets = 5,
  sex_DEGs = FALSE,
  fontsize_yaxislabels = 12,
  fontsize_yaxisticks = 9,
  fontsize_title = 14,
  fontsize_legendlabels = 9,
  fontsize_legendtitle = 9,
  fontsize_facet_labels = 9,
  output_path = getwd()
)

Arguments

main_dataset

Name of the dataset from which significant DEGs are selected, given as a string that matches one of the entries in dataset_names.

SCEs

A list of SingleCellExperiment (SCE) objects, each representing a scRNA-seq dataset.

sampleIDs

A character vector specifying the column name in each SCE that represents sample or donor IDs (in order of SCEs).

celltypeIDs

A character vector specifying the column name in each SCE that denotes cell type identity (in order of SCEs).

celltype_correspondence

A named list that maps each standard cell type label to the label used for that cell type in each dataset, in order of SCEs (e.g., list(Micro=c("Micro",NA), Astro=c(NA,"Astro"))). Use NA if the cell type is not present in a given dataset.
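
As a hedged illustration of this mapping, the sketch below builds a correspondence for two hypothetical datasets and checks that every entry has one label (or NA) per dataset. The dataset names and the "Oligo" entry are made up for the example; only base R is used.

```r
# Hypothetical dataset names, in the same order as the SCE list would be.
dataset_names <- c("Study A", "Study B")

# One entry per standard cell type; each entry has one label per dataset.
celltype_correspondence <- list(
  Micro = c("Micro", NA),                 # absent from the second dataset
  Astro = c(NA, "Astro"),                 # absent from the first dataset
  Oligo = c("Oligodendrocytes", "Oligo")  # present in both, labelled differently
)

# Each entry should have exactly one label (or NA) per dataset.
stopifnot(all(lengths(celltype_correspondence) == length(dataset_names)))

# Logical matrix: rows are datasets, columns are standard cell types.
present <- sapply(celltype_correspondence, function(x) !is.na(x))
rownames(present) <- dataset_names

# Cell types that appear in every dataset.
shared <- names(celltype_correspondence)[colSums(present) == length(dataset_names)]
print(shared)
```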

dataset_names

A vector of names corresponding to each dataset (as you would like them to appear in output plots).

pvals

Numeric vector of p-value thresholds for selecting DEGs in each individual dataset. Default is c(0.05, 0.025, 0.01, 0.001, 0.0001).

alphaval

Transparency of the non-mean boxplots. The value of alpha ranges between 0 (completely transparent) and 1 (completely opaque).

N_randperms

Number of random permutations of the dataset from which significant DEGs are selected (main_dataset). Default is 5.

N_subsets

Number of pairs of random subsets drawn from the dataset from which significant DEGs are selected (main_dataset). Default is 5.
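
The exact sampling scheme is not described here, so the following is only a sketch of one plausible reading, under stated assumptions: each subset pair splits the donors into two disjoint halves, and each permutation shuffles the condition labels across donors. The donor IDs and the case/control labels are hypothetical.

```r
set.seed(42)

# Hypothetical donors and condition labels for the reference dataset.
donors    <- paste0("donor", 1:12)
condition <- rep(c("case", "control"), each = 6)

N_subsets   <- 5
N_randperms <- 5

# Assumption: a "pair of random subsets" is two disjoint halves of the donors.
half <- length(donors) %/% 2
subset_pairs <- lapply(seq_len(N_subsets), function(i) {
  shuffled <- sample(donors)
  list(a = shuffled[seq_len(half)],
       b = shuffled[(half + 1):length(donors)])
})

# Assumption: a "random permutation" shuffles condition labels across donors,
# which keeps the group sizes but breaks any real donor-condition link.
permuted_labels <- lapply(seq_len(N_randperms), function(i) sample(condition))
```

Correlations between the two halves of a pair indicate how reproducible effect sizes are within a single study, while permuted labels give a baseline in which no true signal is expected.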

sex_DEGs

If TRUE, only keep genes located on sex chromosomes. Queries the hsapiens gene Ensembl dataset. Default is FALSE.

fontsize_yaxislabels

Font size for y-axis labels in plot. Default is 12.

fontsize_yaxisticks

Font size for y-axis tick labels in plot. Default is 9.

fontsize_title

Font size for plot title. Default is 14.

fontsize_legendlabels

Font size for legend labels in plot. Default is 9.

fontsize_legendtitle

Font size for legend title in plot. Default is 9.

fontsize_facet_labels

Font size for facet labels in plot. Default is 9.

output_path

A directory path where outputs will be saved. All plots and DGE analysis outputs are saved in the appropriate directories under this path. Default is the current working directory (getwd()).

Examples
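
A hedged sketch of a call is given below. The objects sce_a and sce_b are hypothetical placeholders for SingleCellExperiment objects you have already loaded, and the column names and cell type labels are assumptions for illustration. The call only runs if those objects and the function are available in the session.

```r
# Hypothetical inputs: sce_a and sce_b stand in for SingleCellExperiment
# objects loaded elsewhere; they are not created here.
dataset_names <- c("Study A", "Study B")
sampleIDs     <- c("donor_id", "sample_id")  # assumed colData column names
celltypeIDs   <- c("cell_type", "cluster")   # assumed colData column names
celltype_correspondence <- list(
  Micro = c("Micro", "Microglia"),
  Astro = c("Astro", "Astrocytes")
)

# Basic consistency checks before launching the pipeline.
stopifnot(
  length(sampleIDs) == length(dataset_names),
  length(celltypeIDs) == length(dataset_names),
  all(lengths(celltype_correspondence) == length(dataset_names))
)

# Run only when the placeholder objects and the function actually exist.
if (exists("sce_a") && exists("sce_b") && exists("correlation_analysis")) {
  correlation_analysis(
    main_dataset            = "Study A",
    SCEs                    = list(sce_a, sce_b),
    sampleIDs               = sampleIDs,
    celltypeIDs             = celltypeIDs,
    celltype_correspondence = celltype_correspondence,
    dataset_names           = dataset_names,
    pvals                   = c(0.05, 0.01),
    N_randperms             = 5,
    N_subsets               = 5,
    output_path             = tempdir()
  )
}
```

Writing to tempdir() keeps the example from filling the working directory; point output_path at a permanent location for real analyses.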