Skip to contents

Wrapper for `MotifPeeker::motif_enrichment` to get motif enrichment counts and percentages for all peaks and motifs, generating a data.frame suitable for plots. The data.frame contains values for all and segregated peaks.

Usage

get_df_enrichment(
  result,
  segregated_peaks,
  user_motifs,
  genome_build,
  reference_index = 1,
  out_dir = tempdir(),
  workers = 1,
  meme_path = NULL,
  verbose = FALSE
)

Arguments

result

A list with the following elements:

peaks

A list of peak files generated using read_peak_file.

alignments

A list of alignment files.

exp_type

A character vector of experiment types.

exp_labels

A character vector of experiment labels.

read_count

A numeric vector of read counts.

peak_count

A numeric vector of peak counts.

segregated_peaks

A list object generated using segregate_seqs.

user_motifs

A list with the following elements:

motifs

A list of motif files.

motif_labels

A character vector of motif labels.

genome_build

A character string with the abbreviated genome build name, or a BSGenome object. At the moment, only hg38 and hg19 are supported as abbreviated input.

reference_index

An integer specifying the index of the peak file to use as the reference dataset for comparison. Indexing starts from 1. (default = 1)

out_dir

A character vector of output directory.

workers

An integer specifying the number of threads to use for parallel processing. (default = 1)
IMPORTANT: For each worker, please ensure a minimum of 6GB of memory (RAM) is available as denovo_motif_discovery is memory-intensive.

meme_path

path to meme/bin/ (optional). Defaut: NULL, searches "MEME_PATH" environment variable or "meme_path" option for path to "meme/bin/".

verbose

A logical indicating whether to print verbose messages while running the function. (default = FALSE)

Value

A data.frame with the following columns:

exp_label

Experiment labels.

exp_type

Experiment types.

motif_indice

Motif indices.

group1

Segregated group- "all", "Common" or "Unique".

group2

"reference" or "comparison" group.

count_enriched

Number of peaks with motif.

count_nonenriched

Number of peaks without motif.

perc_enriched

Percentage of peaks with motif.

perc_nonenriched

Percentage of peaks without motif.

See also

Other generate data.frames: get_df_distances()

Examples

data("CTCF_ChIP_peaks", package = "MotifPeeker")
data("CTCF_TIP_peaks", package = "MotifPeeker")
data("motif_MA1102.3", package = "MotifPeeker")
data("motif_MA1930.2", package = "MotifPeeker")
input <- list(
    peaks = list(CTCF_ChIP_peaks, CTCF_TIP_peaks),
    exp_type = c("ChIP", "TIP"),
    exp_labels = c("CTCF_ChIP", "CTCF_TIP"),
    read_count = c(150, 200),
    peak_count = c(100, 120)
)
segregated_input <- segregate_seqs(input$peaks[[1]], input$peaks[[2]])
motifs <- list(
    motifs = list(motif_MA1930.2, motif_MA1102.3),
    motif_labels = list("MA1930.2", "MA1102.3")
)
reference_index <- 1

# \donttest{
    if (requireNamespace("BSgenome.Hsapiens.UCSC.hg38")) {
        genome_build <-
            BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
    
        enrichment_df <- get_df_enrichment(
            input, segregated_input, motifs, genome_build,
            reference_index = 1, workers = 1
        )
    }
# }