EpiCompare compares different epigenetic datasets for quality control and benchmarking purposes. The report is divided into three sections:
## [1] "File1: H3K27ac_CnT_ActiveMotif_SEACR"
## [1] "File2: H3K27ac_CnT_ActiveMotif_MACS2"
## [1] "File3: H3K27ac_CnT_Abcamab4729_SEACR"
## [1] "File4: H3K27ac_CnT_Abcamab4729_MACS2"
## [1] "File5: H3K27ac_CnT_KayaOkur_SEACR"
## [1] "File6: H3K27ac_CnT_KayaOkur_MACS2"
## [1] "File7: H3K27ac_CnR_Meers_SEACR"
## [1] "File8: H3K27ac_CnR_Meers_MACS2"
## [1] "File9: H3K27me3_CnR_Meers_SEACR"
## [1] "File10: H3K27me3_CnR_Meers_MACS2"
## [1] "File11: H3K27me3_ENCODE"
## [1] "File12: H3K27ac_ENCODE"
## [1] "File13: H3K27ac_TIP_Abcam.phase_1_05_jan_2022.S_1_R1"
## [1] "File14: H3K27ac_TIP_Abcam.phase_2_03_feb_2022.S_4_R1"
## [1] "File15: H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_2_R1"
## [1] "File16: H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_3_R1"
## [1] "File17: H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_4_R1"
## [1] "File18: H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_5_R1"
## [1] "File19: H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_6_R1"
EpiCompare(
  peakfiles = list(
    H3K27ac_CnT_ActiveMotif_SEACR,
    H3K27ac_CnT_ActiveMotif_MACS2,
    H3K27ac_CnT_Abcamab4729_SEACR,
    H3K27ac_CnT_Abcamab4729_MACS2,
    H3K27ac_CnT_KayaOkur_SEACR,
    H3K27ac_CnT_KayaOkur_MACS2,
    H3K27ac_CnR_Meers_SEACR,
    H3K27ac_CnR_Meers_MACS2,
    H3K27me3_CnR_Meers_SEACR,
    H3K27me3_CnR_Meers_MACS2,
    H3K27me3_ENCODE,
    H3K27ac_ENCODE,
    H3K27ac_TIP_Abcam.phase_1_05_jan_2022.S_1_R1,
    H3K27ac_TIP_Abcam.phase_2_03_feb_2022.S_4_R1,
    H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_2_R1,
    H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_3_R1,
    H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_4_R1,
    H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_5_R1,
    H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_6_R1
  ),
  blacklist = blacklist,
  picard_files = list(),
  reference = ENCODE_H3K27ac,
  stat_plot = TRUE,
  chrmHMM_plot = TRUE,
  chrmHMM_annotation = "K562",
  chipseeker_plot = TRUE,
  enrichment_plot = TRUE,
  interact = TRUE,
  save_output = FALSE,
  output_dir = "/Users/serachoi/Documents/EpiCompare"
)
Column Description:
PeakN before tidy: Total number of peaks, including blacklisted peaks and peaks in non-standard chromosomes.
Blacklisted peaks removed (%): Percentage of peaks in the sample that fall in blacklisted regions. The ENCODE blacklist covers regions of the hg19 genome that show anomalous and/or unstructured signal independent of cell line or experiment.
Non-standard peaks removed (%): Percentage of peaks located in non-standard and/or mitochondrial chromosomes, identified using BRGenomics::tidyChromosomes().
PeakN after tidy: Total number of peaks remaining after removing blacklisted peaks and peaks in non-standard chromosomes.
NB: All analyses in EpiCompare are conducted on tidied datasets (i.e. with blacklisted peaks and peaks in non-standard chromosomes removed).
| Sample | PeakN before tidy | Blacklisted peaks removed (%) | Non-standard peaks removed (%) | PeakN after tidy |
|---|---|---|---|---|
| H3K27ac_CnT_ActiveMotif_SEACR | 3211 | 17.800 | 4.390 | 2497 |
| H3K27ac_CnT_ActiveMotif_MACS2 | 2526 | 23.200 | 7.050 | 1762 |
| H3K27ac_CnT_Abcamab4729_SEACR | 13530 | 5.690 | 0.761 | 12657 |
| H3K27ac_CnT_Abcamab4729_MACS2 | 26077 | 6.800 | 0.874 | 24076 |
| H3K27ac_CnT_KayaOkur_SEACR | 6669 | 3.420 | 0.390 | 6415 |
| H3K27ac_CnT_KayaOkur_MACS2 | 13456 | 3.470 | 0.498 | 12922 |
| H3K27ac_CnR_Meers_SEACR | 21498 | 3.480 | 0.744 | 20589 |
| H3K27ac_CnR_Meers_MACS2 | 17627 | 3.140 | 0.567 | 16974 |
| H3K27me3_CnR_Meers_SEACR | 83685 | 2.440 | 0.213 | 81465 |
| H3K27me3_CnR_Meers_MACS2 | 103635 | 2.430 | 0.323 | 100777 |
| H3K27me3_ENCODE | 164472 | 0.389 | 0.000 | 163833 |
| H3K27ac_ENCODE | 51176 | 0.952 | 0.000 | 50689 |
| H3K27ac_TIP_Abcam.phase_1_05_jan_2022.S_1_R1 | 60064 | 2.670 | 0.526 | 58143 |
| H3K27ac_TIP_Abcam.phase_2_03_feb_2022.S_4_R1 | 71682 | 2.630 | 0.632 | 69346 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_2_R1 | 61249 | 2.940 | 0.410 | 59195 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_3_R1 | 73765 | 2.620 | 0.493 | 71472 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_4_R1 | 65526 | 2.630 | 0.491 | 63480 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_5_R1 | 64540 | 2.630 | 0.318 | 62638 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_6_R1 | 41891 | 2.400 | 0.465 | 40689 |
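The tidying summarised in the table above can be illustrated with a minimal base-R sketch. This is not EpiCompare's implementation (the report states that non-standard chromosomes are identified with BRGenomics::tidyChromosomes()); the peak coordinates, the blacklist region, and the helper `overlaps_blacklist()` are hypothetical, and expressing both percentages relative to the peak count before tidying is an assumption.

```r
# Toy peak table; all coordinates and the blacklist region are invented.
peaks <- data.frame(
  chr   = c("chr1", "chr1", "chr2", "chrM", "chrUn_gl000220"),
  start = c(100, 5000, 300, 10, 50),
  end   = c(200, 5200, 400, 90, 150)
)
blacklist <- data.frame(chr = "chr1", start = 4900, end = 5100)

# Hypothetical helper: TRUE for each peak overlapping any blacklisted region.
overlaps_blacklist <- function(peaks, blacklist) {
  vapply(seq_len(nrow(peaks)), function(i) {
    any(blacklist$chr == peaks$chr[i] &
          blacklist$start <= peaks$end[i] &
          blacklist$end >= peaks$start[i])
  }, logical(1))
}

# Standard chromosomes kept here: autosomes plus X and Y (chrM is dropped).
standard_chr <- paste0("chr", c(1:22, "X", "Y"))

is_blacklisted <- overlaps_blacklist(peaks, blacklist)
is_nonstandard <- !(peaks$chr %in% standard_chr)
tidy_peaks <- peaks[!is_blacklisted & !is_nonstandard, ]

# One row in the same layout as the summary table.
summary_row <- data.frame(
  peakn_before    = nrow(peaks),
  blacklisted_pct = 100 * mean(is_blacklisted),
  nonstandard_pct = 100 * mean(is_nonstandard),
  peakn_after     = nrow(tidy_peaks)
)
print(summary_row)
```

On real data, GRanges objects and overlap functions from GenomicRanges would normally replace the row-by-row loop, which is only practical for small examples.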
Fragment metrics are shown only if a Picard summary is provided. See the manual for help.
Column Description:
Distribution of peak widths in each sample.
Heatmap of percentage of overlapping peaks between samples. Hover over the heatmap for percentage values.
N.B. How to interpret the heatmap: each cell shows the percentage of [sample on the x-axis] peaks found in [sample on the y-axis] peaks.
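As a rough illustration of how such a pairwise overlap matrix can be read, the base-R sketch below computes, for two toy peak sets, the percentage of one set's peaks that overlap at least one peak in the other. The peak sets and the helper `pct_overlap()` are hypothetical, and "overlap" is assumed to mean sharing at least one base pair on the same chromosome; EpiCompare's own overlap criterion may differ.

```r
# Toy peak sets; all coordinates are invented for illustration.
peak_sets <- list(
  A = data.frame(chr = c("chr1", "chr1", "chr2"),
                 start = c(100, 1000, 50),
                 end = c(200, 1100, 150)),
  B = data.frame(chr = c("chr1", "chr2"),
                 start = c(150, 500),
                 end = c(250, 600))
)

# Hypothetical helper: percentage of `query` peaks that overlap
# at least one `subject` peak.
pct_overlap <- function(query, subject) {
  hit <- vapply(seq_len(nrow(query)), function(i) {
    any(subject$chr == query$chr[i] &
          subject$start <= query$end[i] &
          subject$end >= query$start[i])
  }, logical(1))
  100 * mean(hit)
}

# Columns correspond to x-axis samples, rows to y-axis samples:
# cell [y, x] = percentage of x's peaks found in y's peaks.
overlap_matrix <- sapply(peak_sets, function(x) {
  sapply(peak_sets, function(y) pct_overlap(query = x, subject = y))
})
print(round(overlap_matrix, 1))
```

The matrix is not symmetric because each cell is normalised by the number of peaks in the x-axis sample: here one of three A peaks is found in B, but one of two B peaks is found in A. The same asymmetry explains why the "sample peaks in reference" and "reference peaks in sample" tables in this report differ.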
The plot is shown only if a reference peak file is provided and
stat_plot = TRUE. Depending on the format of the reference
file, EpiCompare outputs different plots:
Keys:
Reference peakfile: ENCODE_H3K27ac
ChromHMM annotates and characterises peaks into different chromatin states. ChromHMM annotations used in EpiCompare were obtained from here.
ChromHMM annotation of individual samples.
Percentage of sample peaks found in reference peaks (Reference peakfile: ENCODE_H3K27ac)
| Sample | Percentage |
|---|---|
| H3K27ac_CnT_ActiveMotif_SEACR | 77.200 |
| H3K27ac_CnT_ActiveMotif_MACS2 | 63.200 |
| H3K27ac_CnT_Abcamab4729_SEACR | 89.700 |
| H3K27ac_CnT_Abcamab4729_MACS2 | 78.700 |
| H3K27ac_CnT_KayaOkur_SEACR | 92.500 |
| H3K27ac_CnT_KayaOkur_MACS2 | 83.300 |
| H3K27ac_CnR_Meers_SEACR | 80.600 |
| H3K27ac_CnR_Meers_MACS2 | 80.700 |
| H3K27me3_CnR_Meers_SEACR | 0.308 |
| H3K27me3_CnR_Meers_MACS2 | 0.331 |
| H3K27me3_ENCODE | 0.128 |
| H3K27ac_ENCODE | 100.000 |
| H3K27ac_TIP_Abcam.phase_1_05_jan_2022.S_1_R1 | 8.250 |
| H3K27ac_TIP_Abcam.phase_2_03_feb_2022.S_4_R1 | 8.440 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_2_R1 | 5.690 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_3_R1 | 9.730 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_4_R1 | 9.820 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_5_R1 | 3.620 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_6_R1 | 7.930 |
ChromHMM annotation of sample peaks found in reference peaks.
Percentage of reference peaks found in sample peaks (Reference peakfile: ENCODE_H3K27ac)
| Sample | Percentage |
|---|---|
| H3K27ac_CnT_ActiveMotif_SEACR | 4.650 |
| H3K27ac_CnT_ActiveMotif_MACS2 | 2.240 |
| H3K27ac_CnT_Abcamab4729_SEACR | 46.300 |
| H3K27ac_CnT_Abcamab4729_MACS2 | 44.400 |
| H3K27ac_CnT_KayaOkur_SEACR | 25.300 |
| H3K27ac_CnT_KayaOkur_MACS2 | 27.100 |
| H3K27ac_CnR_Meers_SEACR | 39.600 |
| H3K27ac_CnR_Meers_MACS2 | 53.900 |
| H3K27me3_CnR_Meers_SEACR | 0.582 |
| H3K27me3_CnR_Meers_MACS2 | 0.793 |
| H3K27me3_ENCODE | 0.422 |
| H3K27ac_ENCODE | 100.000 |
| H3K27ac_TIP_Abcam.phase_1_05_jan_2022.S_1_R1 | 12.700 |
| H3K27ac_TIP_Abcam.phase_2_03_feb_2022.S_4_R1 | 16.000 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_2_R1 | 10.900 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_3_R1 | 23.000 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_4_R1 | 19.600 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_5_R1 | 5.270 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_6_R1 | 6.200 |
ChromHMM annotation of reference peaks found in sample peaks.
Percentage of sample peaks not found in reference peaks (Reference peakfile: ENCODE_H3K27ac)
| Sample | Percentage |
|---|---|
| H3K27ac_CnT_ActiveMotif_SEACR | 22.80 |
| H3K27ac_CnT_ActiveMotif_MACS2 | 36.80 |
| H3K27ac_CnT_Abcamab4729_SEACR | 10.30 |
| H3K27ac_CnT_Abcamab4729_MACS2 | 21.30 |
| H3K27ac_CnT_KayaOkur_SEACR | 7.48 |
| H3K27ac_CnT_KayaOkur_MACS2 | 16.70 |
| H3K27ac_CnR_Meers_SEACR | 19.40 |
| H3K27ac_CnR_Meers_MACS2 | 19.30 |
| H3K27me3_CnR_Meers_SEACR | 99.70 |
| H3K27me3_CnR_Meers_MACS2 | 99.70 |
| H3K27me3_ENCODE | 99.90 |
| H3K27ac_ENCODE | 0.00 |
| H3K27ac_TIP_Abcam.phase_1_05_jan_2022.S_1_R1 | 91.70 |
| H3K27ac_TIP_Abcam.phase_2_03_feb_2022.S_4_R1 | 91.60 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_2_R1 | 94.30 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_3_R1 | 90.30 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_4_R1 | 90.20 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_5_R1 | 96.40 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_6_R1 | 92.10 |
ChromHMM annotation of sample peaks not found in reference peaks.
Percentage of reference peaks not found in sample peaks (Reference peakfile: ENCODE_H3K27ac)
| Sample | Percentage |
|---|---|
| H3K27ac_CnT_ActiveMotif_SEACR | 95.4 |
| H3K27ac_CnT_ActiveMotif_MACS2 | 97.8 |
| H3K27ac_CnT_Abcamab4729_SEACR | 53.7 |
| H3K27ac_CnT_Abcamab4729_MACS2 | 55.6 |
| H3K27ac_CnT_KayaOkur_SEACR | 74.7 |
| H3K27ac_CnT_KayaOkur_MACS2 | 72.9 |
| H3K27ac_CnR_Meers_SEACR | 60.4 |
| H3K27ac_CnR_Meers_MACS2 | 46.1 |
| H3K27me3_CnR_Meers_SEACR | 99.4 |
| H3K27me3_CnR_Meers_MACS2 | 99.2 |
| H3K27me3_ENCODE | 99.6 |
| H3K27ac_ENCODE | 0.0 |
| H3K27ac_TIP_Abcam.phase_1_05_jan_2022.S_1_R1 | 87.3 |
| H3K27ac_TIP_Abcam.phase_2_03_feb_2022.S_4_R1 | 84.0 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_2_R1 | 89.1 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_3_R1 | 77.0 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_4_R1 | 80.4 |
| H3K27me3_TIP_Diagenode.phase_2_28_jan_2022.S_5_R1 | 94.7 |
| H3K27ac_TIP_Diagenode.phase_2_28_jan_2022.S_6_R1 | 93.8 |
ChromHMM annotation of reference peaks not found in sample peaks.
EpiCompare uses the annotatePeak function in the ChIPseeker package to annotate each peak with its nearest gene and the genomic region in which the peak is located. Gene annotations are taken from the human genome hg19 annotations provided by Bioconductor.
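A minimal sketch of this kind of annotation is shown below. It assumes the ChIPseeker, GenomicRanges, IRanges and TxDb.Hsapiens.UCSC.hg19.knownGene packages are installed; the two peaks are invented, and the choice of TxDb object and the `tssRegion` window are assumptions rather than settings confirmed by this report.

```r
library(ChIPseeker)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

# Two toy peaks on hg19; the coordinates are invented for illustration.
toy_peaks <- GenomicRanges::GRanges(
  seqnames = c("chr1", "chr17"),
  ranges = IRanges::IRanges(
    start = c(1000000, 7570000),
    end   = c(1000500, 7570500)
  )
)

# Assign each peak to its nearest gene and a genomic feature
# (promoter, exon, intron, intergenic, ...).
anno <- annotatePeak(
  toy_peaks,
  TxDb = TxDb.Hsapiens.UCSC.hg19.knownGene,
  tssRegion = c(-3000, 3000),  # assumed promoter window around the TSS
  verbose = FALSE
)

anno_df <- as.data.frame(anno)
print(anno_df[, c("seqnames", "start", "end",
                  "annotation", "geneId", "distanceToTSS")])
```

The `annotation` column holds the genomic feature assigned to each peak, and `geneId` holds the Entrez ID of the nearest gene.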
EpiCompare performs KEGG pathway and GO enrichment analysis using clusterProfiler. The annotatePeak function in the ChIPseeker package is first used to assign peaks to their nearest genes; biological themes among those genes are then identified using ontologies (KEGG and GO). Gene annotations are taken from the human genome hg19 annotations provided by Bioconductor.
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 11.2
##
## Matrix products: default
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] EpiCompare_0.99.0 org.Hs.eg.db_3.14.0 AnnotationDbi_1.56.2 IRanges_2.28.0 S4Vectors_0.32.3 Biobase_2.54.0 BiocGenerics_0.40.0
## [8] testthat_3.1.2 devtools_2.4.3 usethis_2.1.5
##
## loaded via a namespace (and not attached):
## [1] rappdirs_0.3.3 rtracklayer_1.54.0 tidyr_1.2.0
## [4] ggplot2_3.3.5 bit64_4.0.5 knitr_1.37
## [7] DelayedArray_0.20.0 data.table_1.14.2 KEGGREST_1.34.0
## [10] RCurl_1.98-1.6 generics_0.1.2 GenomicFeatures_1.46.5
## [13] callr_3.7.0 RSQLite_2.2.10 shadowtext_0.1.1
## [16] bit_4.0.4 tzdb_0.2.0 enrichplot_1.14.2
## [19] webshot_0.5.2 xml2_1.3.3 SummarizedExperiment_1.24.0
## [22] assertthat_0.2.1 viridis_0.6.2 xfun_0.29
## [25] hms_1.1.1 jquerylib_0.1.4 evaluate_0.15
## [28] TSP_1.2-0 fansi_1.0.2 restfulr_0.0.13
## [31] progress_1.2.2 caTools_1.18.2 dendextend_1.15.2
## [34] dbplyr_2.1.1 igraph_1.2.11 DBI_1.1.2
## [37] geneplotter_1.72.0 htmlwidgets_1.5.4 purrr_0.3.4
## [40] ellipsis_0.3.2 crosstalk_1.2.0 dplyr_1.0.8
## [43] ggpubr_0.4.0.999 backports_1.4.1 annotate_1.72.0
## [46] gridBase_0.4-7 biomaRt_2.50.3 MatrixGenerics_1.6.0
## [49] vctrs_0.3.8 remotes_2.4.2 abind_1.4-5
## [52] cachem_1.0.6 withr_2.4.3 ggforce_0.3.3
## [55] BSgenome_1.62.0 genomation_1.26.0 vroom_1.5.7
## [58] GenomicAlignments_1.30.0 treeio_1.18.1 prettyunits_1.1.1
## [61] DOSE_3.20.1 ape_5.6-1 lazyeval_0.2.2
## [64] crayon_1.5.0 genefilter_1.76.0 pkgconfig_2.0.3
## [67] labeling_0.4.2 tweenr_1.0.2 GenomeInfoDb_1.30.1
## [70] nlme_3.1-155 pkgload_1.2.4 seriation_1.3.2
## [73] rlang_1.0.1 lifecycle_1.0.1 downloader_0.4
## [76] registry_0.5-1 filelock_1.0.2 BiocFileCache_2.2.1
## [79] seqPattern_1.26.0 rprojroot_2.0.2 polyclip_1.10-0
## [82] matrixStats_0.61.0 Matrix_1.4-0 aplot_0.1.2
## [85] carData_3.0-5 boot_1.3-28 processx_3.5.2
## [88] png_0.1-7 viridisLite_0.4.0 rjson_0.2.21
## [91] bitops_1.0-7 KernSmooth_2.23-20 Biostrings_2.62.0
## [94] blob_1.2.2 stringr_1.4.0 qvalue_2.26.0
## [97] readr_2.1.2 rstatix_0.7.0 gridGraphics_0.5-1
## [100] ggsignif_0.6.3 scales_1.1.1 memoise_2.0.1
## [103] magrittr_2.0.2 plyr_1.8.6 gplots_3.1.1
## [106] zlibbioc_1.40.0 compiler_4.1.2 scatterpie_0.1.7
## [109] TxDb.Hsapiens.UCSC.hg38.knownGene_3.14.0 BiocIO_1.4.0 RColorBrewer_1.1-2
## [112] plotrix_3.8-2 DESeq2_1.34.0 Rsamtools_2.10.0
## [115] cli_3.2.0 XVector_0.34.0 patchwork_1.1.1
## [118] ps_1.6.0 MASS_7.3-55 tidyselect_1.1.2
## [121] stringi_1.7.6 highr_0.9 yaml_2.3.5
## [124] GOSemSim_2.20.0 locfit_1.5-9.4 ggrepel_0.9.1
## [127] grid_4.1.2 sass_0.4.0 fastmatch_1.1-3
## [130] tools_4.1.2 parallel_4.1.2 rstudioapi_0.13
## [133] foreach_1.5.2 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 gridExtra_2.3
## [136] BRGenomics_1.6.0 farver_2.1.0 ggraph_2.0.5
## [139] digest_0.6.29 Rcpp_1.0.8 GenomicRanges_1.46.1
## [142] car_3.0-12 broom_0.7.12 httr_1.4.2
## [145] colorspace_2.0-3 brio_1.1.3 XML_3.99-0.9
## [148] fs_1.5.2 splines_4.1.2 yulab.utils_0.0.4
## [151] tidytree_0.3.8 graphlayouts_0.8.0 ggplotify_0.1.0
## [154] plotly_4.10.0 sessioninfo_1.2.2 xtable_1.8-4
## [157] jsonlite_1.8.0 ggtree_3.2.1 heatmaply_1.3.0
## [160] tidygraph_1.2.0 ggfun_0.0.5 R6_2.5.1
## [163] pillar_1.7.0 htmltools_0.5.2 glue_1.6.2
## [166] fastmap_1.1.0 clusterProfiler_4.2.2 BiocParallel_1.28.3
## [169] codetools_0.2-18 ChIPseeker_1.30.3 fgsea_1.20.0
## [172] pkgbuild_1.3.1 utf8_1.2.2 lattice_0.20-45
## [175] bslib_0.3.1 tibble_3.1.6 curl_4.3.2
## [178] gtools_3.9.2 GO.db_3.14.0 roxygen2_7.1.2.9000
## [181] survival_3.2-13 rmarkdown_2.11 desc_1.4.0
## [184] munsell_0.5.0 DO.db_2.9 GenomeInfoDbData_1.2.7
## [187] iterators_1.0.14 impute_1.68.0 reshape2_1.4.4
## [190] gtable_0.3.0