This helper function identifies differentially methylated positions (DMPs) from a BSseq object using the DSS package. It allows for flexible specification of sample groups, covariates, and chromosome filtering.

Usage

findDMPsBSSeq(
  bsseq,
  samplesheet,
  samplesheet_sep = "\t",
  sample_group_col = "Sample_Group",
  id_col = "Sample_ID",
  chr = "auto",
  case_group = NULL,
  covariates = NULL,
  output_file = NULL,
  njobs = 1L
)

Arguments

bsseq

A BSseq object or a file path to a saved BSseq object (RDS format).

samplesheet

A data frame, or a file path to a delimited text file (tab-delimited by default; see samplesheet_sep), containing sample metadata. Must include columns for sample IDs and group labels.

samplesheet_sep

The separator used in the samplesheet file if a file path is provided. Default is tab ("\t").

sample_group_col

The name of the column in the samplesheet that contains the group labels for comparison. Default is "Sample_Group".

id_col

The name of the column in the samplesheet that contains the sample IDs. Default is "Sample_ID".
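
As a rough illustration of how the samplesheet, samplesheet_sep, sample_group_col, and id_col arguments fit together, the sketch below accepts either a data frame or a file path and checks that the two required columns are present. The helper name load_samplesheet is hypothetical and not part of the package; the actual loading and validation logic may differ.

```r
# Hypothetical helper (not part of the package): accept a data frame or a
# path to a delimited file, then check that the required columns exist.
load_samplesheet <- function(samplesheet, sep = "\t",
                             id_col = "Sample_ID",
                             sample_group_col = "Sample_Group") {
  if (!is.data.frame(samplesheet)) {
    samplesheet <- read.table(samplesheet, header = TRUE, sep = sep,
                              stringsAsFactors = FALSE, check.names = FALSE)
  }
  missing_cols <- setdiff(c(id_col, sample_group_col), colnames(samplesheet))
  if (length(missing_cols) > 0) {
    stop("samplesheet is missing column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  samplesheet
}

# Toy comma-separated file written to a temporary location
path <- tempfile(fileext = ".csv")
writeLines(c("Sample_ID,Sample_Group", "S1,Control", "S2,Treated"), path)
ss <- load_samplesheet(path, sep = ",")
print(ss)
```

Failing early on a missing column gives a clearer error than one raised later during model fitting.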

chr

A character vector of chromosome names to include in the analysis, or "auto" to automatically include chr1-chr22, or "all" to include chr1-chr22 plus chrX and chrY. Default is "auto".
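
The two keywords can be pictured as shorthand for explicit chromosome vectors. The sketch below expands them as described above; the helper name resolve_chr is hypothetical and not part of the package, and the real function may resolve names differently (for example, for genomes without the "chr" prefix).

```r
# Hypothetical helper (not part of the package): expand the `chr` keywords
# described above into explicit chromosome names.
resolve_chr <- function(chr = "auto") {
  autosomes <- paste0("chr", 1:22)
  if (identical(chr, "auto")) {
    return(autosomes)
  }
  if (identical(chr, "all")) {
    return(c(autosomes, "chrX", "chrY"))
  }
  # Anything else is treated as an explicit vector of chromosome names
  as.character(chr)
}

length(resolve_chr("auto"))
tail(resolve_chr("all"), 2)
resolve_chr(c("chr1", "chrX"))
```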

case_group

The specific group label in the sample_group_col to treat as the "case" group for comparison. If NULL, the first unique group in sample_group_col will be used as the case group. Default is NULL.
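
The NULL default is worth a second look, because the case group then depends on row order. The sketch below assumes that "first unique group" means the first label in order of appearance in the samplesheet; this is an assumption about the implementation, and the toy data are invented for illustration.

```r
# Toy samplesheet (invented for illustration); the control samples come first
samplesheet <- data.frame(
  Sample_ID = c("S1", "S2", "S3", "S4"),
  Sample_Group = c("Control", "Treated", "Control", "Treated")
)

# Assumed default behaviour: fall back to the first label encountered
case_group <- NULL
if (is.null(case_group)) {
  case_group <- as.character(unique(samplesheet[["Sample_Group"]]))[1]
}
case_group
```

Under this assumption, "Control" would be treated as the case group here, so passing case_group explicitly is the safer choice whenever the direction of the comparison matters.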

covariates

A character vector of additional covariate column names from the samplesheet to include in the DSS model, or a comma-separated string of covariate names. Default is NULL (no additional covariates).
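
Both accepted forms can be reduced to the same character vector. The sketch below shows one way to do that, and how the result might be combined with the group column into a one-sided model formula. The helper name normalize_covariates is hypothetical, and whether findDMPsBSSeq builds its DSS model formula this way is an assumption.

```r
# Hypothetical helper (not part of the package): accept either a character
# vector or a comma-separated string and return a clean character vector.
normalize_covariates <- function(covariates = NULL) {
  if (is.null(covariates)) {
    return(character(0))
  }
  parts <- trimws(unlist(strsplit(covariates, ",", fixed = TRUE)))
  parts[nzchar(parts)]
}

normalize_covariates("Age, Sex")
normalize_covariates(c("Age", "Sex"))

# Assumed model layout: group column first, then the covariates
model_formula <- reformulate(c("Sample_Group", normalize_covariates("Age,Sex")))
model_formula
```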

output_file

An optional file path to save the DMP results as a tab-delimited text file. If the file name ends with ".gz", the output will be gzipped. Default is NULL (no file output).
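
The ".gz" rule can be reproduced with base R connections. The sketch below is a minimal illustration, not the package's own writer: the helper name write_dmps is hypothetical, and the toy columns do not reflect the real output column names.

```r
# Hypothetical helper (not part of the package): write a tab-delimited file,
# gzip-compressed when the file name ends in ".gz".
write_dmps <- function(dmps, output_file) {
  con <- if (grepl("\\.gz$", output_file)) {
    gzfile(output_file, "w")
  } else {
    file(output_file, "w")
  }
  on.exit(close(con))
  write.table(dmps, con, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(output_file)
}

# Toy results table (column names invented for illustration)
toy <- data.frame(chr = c("chr1", "chr1"), pos = c(100L, 250L))
out <- tempfile(fileext = ".tsv.gz")
write_dmps(toy, out)

# read.delim() detects the gzip compression when reading the file back
back <- read.delim(out)
print(back)
```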

njobs

The number of parallel jobs to use for chromosome-level analysis. Default is 1.

Value

A data frame of identified DMPs with columns for chromosome, position, site ID, p-value, q-value, delta beta, and DMP score.

Examples

if (requireNamespace("bsseqData", quietly = TRUE) &&
    requireNamespace("DSS", quietly = TRUE)) {
    # Load example BSseq data
    data(BS.cancer.ex, package = "bsseqData")
    BS.cancer.ex <- BS.cancer.ex[seq_len(1000), ]
    # Create a sample metadata data frame
    samplesheet <- data.frame(
      Sample_ID = colnames(BS.cancer.ex),
      Sample_Group = c(rep("Condition1", 3), rep("Condition2", 3)),
      Age = c(30, 32, 31, 28, 29, 27)
    )
    # Find DMPs with DSS
    # \donttest{
    dmps <- findDMPsBSSeq(
      bsseq = BS.cancer.ex,
      samplesheet = samplesheet,
      sample_group_col = "Sample_Group",
      id_col = "Sample_ID",
      case_group = "Condition2",
      covariates = "Age",
      output_file = NULL,
      njobs = 4
    )
    print(head(dmps))
    # }
}