Run standardized Seurat pipeline — seurat

Run a standard Seurat preprocessing pipeline on a Seurat object or a raw counts matrix. The following steps are performed in order:

NormalizeData: Data normalization
FindVariableFeatures: Variable feature selection
ScaleData: Data scaling
RunPCA: Principal component analysis (PCA)
RunUMAP: UMAP embedding
FindNeighbors: Nearest-neighbor graph construction (KNN and SNN)
FindClusters: Clustering
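
For readers who want to see what such a wrapper amounts to, the steps listed above can be sketched as explicit Seurat calls. This is a minimal sketch, not the package's actual implementation: the name `manual_pipeline` is hypothetical, and it assumes that each step runs with Seurat's defaults, that `dims` is forwarded to `RunUMAP` and `FindNeighbors`, and that `resolution` is forwarded to `FindClusters`.

```r
# Hypothetical re-implementation for illustration only; `manual_pipeline`
# is not part of the package, and the real wrapper may differ.
manual_pipeline <- function(obj, dims = seq_len(30), resolution = 0.8,
                            verbose = TRUE) {
  # Normalize, select variable features, and scale (Seurat defaults assumed).
  obj <- Seurat::NormalizeData(obj, verbose = verbose)
  obj <- Seurat::FindVariableFeatures(obj, verbose = verbose)
  obj <- Seurat::ScaleData(obj, verbose = verbose)

  # Dimensionality reduction: PCA first, then UMAP on the selected PCs.
  obj <- Seurat::RunPCA(obj, verbose = verbose)
  obj <- Seurat::RunUMAP(obj, dims = dims, verbose = verbose)

  # Graph construction and clustering on the same PCs.
  obj <- Seurat::FindNeighbors(obj, dims = dims, verbose = verbose)
  obj <- Seurat::FindClusters(obj, resolution = resolution, verbose = verbose)
  obj
}
```

Calling `manual_pipeline(pseudo_seurat)` with the Seurat package installed would run the same sequence of steps by hand. Note that `dims` must not exceed the number of principal components that `RunPCA` computes.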

seurat_pipeline(obj, dims = seq_len(30), resolution = 0.8, verbose = TRUE, ...)

Arguments

obj: A Seurat object or a counts matrix.
dims: Dimensions to use for UMAP and neighbors (default: seq_len(30)).
resolution: Clustering resolution (default: 0.8).
verbose: Whether to print progress messages (default: TRUE).
...: Additional arguments passed to Seurat functions.
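
Because `obj` may be either a Seurat object or a counts matrix, the input has to be coerced to a Seurat object at some point. One way this could be handled is sketched below. The helper name `as_seurat_input` is hypothetical, and the sketch assumes a matrix with features (genes) in rows and cells in columns; how the package itself performs this check is not documented here.

```r
# Hypothetical helper, not part of the package: accept either input type
# and always hand a Seurat object to the downstream steps.
as_seurat_input <- function(obj) {
  if (inherits(obj, "Seurat")) {
    # Already a Seurat object: return it unchanged.
    return(obj)
  }
  # Assumed layout: rows are features (genes), columns are cells.
  Seurat::CreateSeuratObject(counts = obj)
}
```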

Value

A processed Seurat object containing PCA and UMAP reductions, a nearest-neighbor graph, and cluster assignments.
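
The contents of the returned object can be checked with standard SeuratObject accessors. The following is a minimal sketch; the helper name `summarize_clusters` is hypothetical, and it assumes that cluster labels are stored in the `seurat_clusters` metadata column, which is where `FindClusters` writes them.

```r
# Hypothetical helper, not part of the package: collect a quick summary
# of what the pipeline stored in the object.
summarize_clusters <- function(obj) {
  list(
    # Names of stored dimensional reductions (e.g. PCA and UMAP).
    reductions = SeuratObject::Reductions(obj),
    # Names of stored neighbor graphs.
    graphs = SeuratObject::Graphs(obj),
    # Number of clusters found by FindClusters.
    n_clusters = nlevels(obj$seurat_clusters),
    # Number of cells assigned to each cluster.
    cells_per_cluster = table(SeuratObject::Idents(obj))
  )
}
```

Calling `summarize_clusters(obj)` on the result of `seurat_pipeline()` returns a plain list that can be printed or inspected with `str()`.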

Examples

data("pseudo_seurat")
# Re-run pipeline on existing object
obj <- seurat_pipeline(obj = pseudo_seurat)
#> Running NormalizeData...
#> Running FindVariableFeatures...
#> Running ScaleData...
#> Centering and scaling data matrix
#> Running PCA...
#> PC_ 1 
#> Positive:  FTMT, ACTG1, ALAS2, HSPA1L, NME1-NME2, POTEI, OTOP2, RAPSN, BEST2, DPEP2 
#> 	   HMGB2, NAPRT1, KRT8, PPIB, DSCC1, POU4F3, CCDC102A, GAPDHS, CHST6, AGXT2 
#> 	   LYZL2, MTMR8, ACTG2, ACPL2, BANF1, PPAPDC2, HTR1A, IFI30, CYBRD1, LHX8 
#> Negative:  FAIM2, CAMK2A, SCN1A, CAMK2B, FRRS1L, UNC80, PHYHIP, RASGRF2, CCK, GRIA2 
#> 	   STXBP5L, ARPP21, SLC12A5, DIRAS2, RYR2, SLC4A10, KCNT1, GRM5, CAMKV, KIAA1211L 
#> 	   GABRA4, GABRA1, SV2B, CX3CL1, AK5, PNMA2, JPH4, DGKG, GPR158, KCNC2 
#> PC_ 2 
#> Positive:  CAMKK1, DGKQ, NT5DC3, CA7, ABCG4, HTR1A, C5orf28, OTOP2, HYKK, DPEP2 
#> 	   CHST6, POTEI, SLC8A3, SLC38A11, ADRA2A, MPPED1, MTMR8, HTR7, CACNA1B, PPAPDC2 
#> 	   C2orf69, GRIK1, IFI30, STK32B, RASL10B, SLC24A4, FAXDC2, ADCY3, ACSS2, ANKRD29 
#> Negative:  RAN, HSP90AA1, H2AFZ, HNRNPAB, CCT5, NPM1, GNG5, DBI, HMGB2, ITM2B 
#> 	   ATP6V1G1, SERPINH1, CIRBP, CD63, NDUFA6, MDK, JUN, MYL12B, SPARC, NPC2 
#> 	   GLUL, ID3, EEF1A1, VIM, CLIC1, COX6B1, LDHA, DDAH2, ENO1, CNN3 
#> PC_ 3 
#> Positive:  ADGRL2, AC011288.2, RP11-420N3.3, RP11-191L9.4, NRXN3, PLPPR1, RP11-123O10.4, ZNF385D, AC114765.1, NWD2 
#> 	   RBFOX3, MIR137HG, MIR325HG, SGOL1-AS1, POU6F2, ANKRD18A, LY86-AS1, LINC01197, DGCR5, DPY19L1P1 
#> 	   MIR4300HG, AQP4-AS1, HPSE2, LINC00632, NLGN4X, AC067956.1, PWRN1, LINC00599, CABP1, LINC01158 
#> Negative:  KRTCAP2, APOE, C20orf24, PDIA6, PGLS, GNG11, S100A13, HIST1H2BI, ISCA2, GSTM5 
#> 	   LAPTM4A, CST3, TMEM176B, KLF4, PDLIM2, CAP1, S100A16, APRT, CYR61, FAIM 
#> 	   IFITM3, CDKN1A, KLF2, CLIC1, ARPC1B, IER2, S100A1, CMTM5, FXYD1, TCN2 
#> PC_ 4 
#> Positive:  RESP18, CTXN2, ATP6V1G2, GNG13, DISP2, C15orf59, CCDC85A, GNG3, SYNGR3, RGS8 
#> 	   VWA5B2, C1QL3, HPCA, TUBB3, CALB1, SNCB, HTR3A, ARHGDIG, L1CAM, NAP1L5 
#> 	   PCDH20, HMP19, DBNDD2, NPAS4, FABP3, CALY, FAM43B, CKMT1B, LOC728392, LTK 
#> Negative:  PTPN18, SLCO1A2, LINC00639, INPP5D, IFI44, LYN, DISC1, NEAT1, NRGN, CMYA5 
#> 	   IFI44L, GALNT15, PARP14, AC012593.1, AQP4-AS1, MSR1, MT2A, ISG15, SHROOM4, CABP1 
#> 	   UACA, KCNQ1OT1, PART1, CNDP1, FAM153B, DGCR5, SOX2-OT, LINC00844, ADGRG1, LINC00599 
#> PC_ 5 
#> Positive:  MEST, IGFBP2, CNN3, FBXL7, NNAT, TUBB2B, GPC3, VIM, NKAIN4, ID1 
#> 	   BMP7, CSRP2, NDN, DDAH2, GPX8, IGFBPL1, MARCKSL1, GSTM3, FBLN1, PARD3 
#> 	   MFAP4, PTN, FABP7, COPS6, CTNNA2, ZBTB20, BEX1, CD81, ENO1, NPAS3 
#> Negative:  C1QB, FCGR2A, MS4A6A, TYROBP, C1QC, AIF1, C1QA, CSF1R, CD86, MRC1 
#> 	   MS4A7, CTSS, CCL24, FCER1G, CD53, CD14, FCGR1A, PLEK, C3AR1, LYZ 
#> 	   FCGR2B, CX3CR1, CCL3L3, CCL2, CCR1, CD68, C5AR1, PF4, HPGDS, LY86 
#> Running UMAP...
#> Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
#> To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
#> This message will be shown once per session
#> 01:24:11 UMAP embedding parameters a = 0.9922 b = 1.112
#> 01:24:11 Read 801 rows and found 30 numeric columns
#> 01:24:11 Using Annoy for neighbor search, n_neighbors = 30
#> 01:24:11 Building Annoy index with metric = cosine, n_trees = 50
#> 0%   10   20   30   40   50   60   70   80   90   100%
#> [----|----|----|----|----|----|----|----|----|----|
#> **************************************************|
#> 01:24:11 Writing NN index file to temp file /tmp/RtmpnTEFXI/file11f861b2df07
#> 01:24:11 Searching Annoy index using 1 thread, search_k = 3000
#> 01:24:11 Annoy recall = 100%
#> 01:24:12 Commencing smooth kNN distance calibration using 1 thread
#>  with target n_neighbors = 30
#> 01:24:13 Found 2 connected components, 
#> falling back to 'spca' initialization with init_sdev = 1
#> 01:24:13 Using 'irlba' for PCA
#> 01:24:13 PCA: 2 components explained 52.16% variance
#> 01:24:13 Scaling init to sdev = 1
#> 01:24:13 Commencing optimization for 500 epochs, with 27816 positive edges
#> 01:24:13 Using rng type: pcg
#> 01:24:14 Optimization finished
#> Running FindNeighbors...
#> Computing nearest neighbor graph
#> Computing SNN
#> Running FindClusters...
#> Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
#> 
#> Number of nodes: 801
#> Number of edges: 19587
#> 
#> Running Louvain algorithm...
#> Maximum modularity in 10 random starts: 0.8727
#> Number of communities: 12
#> Elapsed time: 0 seconds