Down-samples each dataset, by either individuals or cells, and runs differential expression (DE) analysis on each down-sampled subset. Results are saved in a data frame.

bulk_downsampling_DGEanalysis(
  SCEs,
  dataset_names,
  celltype_correspondence,
  sampled = "individuals",
  sampleIDs = "donor_id",
  celltypeIDs = "cell_type",
  output_path = getwd(),
  pvalue = 0.05,
  Nperms = 20
)

Arguments

SCEs

A list of SingleCellExperiment (SCE) objects, each representing a scRNA-seq dataset.

dataset_names

A vector of names corresponding to each dataset (as you would like them to appear in output plots).

celltype_correspondence

A named vector that maps a standard cell type label (e.g., "Endo", "Micro") to how that cell type appears in each dataset. Use NA if the cell type is not present in a given dataset.

sampled

Specifies the unit of down-sampling: either "individuals" (down-sample across donors/samples) or "cells" (down-sample across cells). Default is "individuals".

sampleIDs

A character vector specifying the column name in each SCE that represents sample or donor IDs (in order of SCEs).

celltypeIDs

A character vector specifying the column name in each SCE that denotes cell type identity (in order of SCEs).

output_path

A clean directory path where down-sampled outputs and plots will be saved (should contain no subdirectories).

pvalue

P-value threshold for defining differentially expressed genes (DEGs) in the bulk dataset. Default is 0.05.

Nperms

Number of permutations to perform for each down-sampling level. Default is 20.

Value

Saves the DGE analysis output in the appropriate directory, to be used by other bulk analysis functions.
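To illustrate how the sampled and Nperms arguments relate, here is a minimal base-R sketch of one down-sampling level. It is not the package's implementation: the helper downsample_cells() and the toy metadata are hypothetical, and the real function may select and store its subsets differently.

```r
# Toy per-cell metadata standing in for the colData of one SCE (hypothetical data).
set.seed(1)
cell_meta <- data.frame(
  donor_id  = rep(paste0("donor", 1:6), each = 50),
  cell_type = sample(c("Endo", "Micro"), 300, replace = TRUE)
)

# Hypothetical helper (not part of the package): returns the row indices of
# the cells kept in a single down-sampling draw.
downsample_cells <- function(meta, n, sampled = c("individuals", "cells"),
                             sampleIDs = "donor_id") {
  sampled <- match.arg(sampled)
  if (sampled == "individuals") {
    keep_ids <- sample(unique(meta[[sampleIDs]]), n)  # draw n donors
    which(meta[[sampleIDs]] %in% keep_ids)            # keep all of their cells
  } else {
    sample(seq_len(nrow(meta)), n)                    # draw n cells directly
  }
}

# Nperms independent draws at one down-sampling level (3 of the 6 donors).
Nperms <- 20
draws <- replicate(Nperms, downsample_cells(cell_meta, 3, "individuals"),
                   simplify = FALSE)
```

Each element of draws indexes the cells retained in one permutation. In this sketch, DE analysis would then be run once per element, so a larger Nperms gives a more stable estimate at each down-sampling level at the cost of more runs.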