
Annotates each DMR with the genes whose promoters or gene bodies overlap the DMR coordinates, using TxDb annotations.

Usage

annotateDMRsWithGenes(
  dmrs,
  genome = "hg38",
  promoter_upstream = 2000,
  promoter_downstream = 200,
  njobs = getOption("CMEnt.njobs", .defaultNJobs()),
  site_locs = NULL,
  site_delta_beta = NULL,
  aggfun = stats::median
)

Arguments

dmrs

Data frame or GRanges object containing DMR coordinates.

genome

Character. Genome version to use for gene annotation. (default: "hg38")

promoter_upstream

Integer. Number of base pairs upstream of TSS to define promoter region (default: 2000)

promoter_downstream

Integer. Number of base pairs downstream of TSS to define promoter region (default: 200)

njobs

Integer. Number of parallel jobs used to annotate promoter and gene-body overlaps (default: getOption("CMEnt.njobs", .defaultNJobs()))

site_locs

Optional data frame or GRanges with site coordinates used to compute feature-specific delta beta values.

site_delta_beta

Optional named numeric vector of per-site delta beta values.

aggfun

Function used to aggregate per-site delta beta values (default: stats::median)

Value

The input data frame or GRanges object with additional metadata columns:

  • in_promoter_of: Character. Symbols of genes whose promoters overlap the DMR, comma-separated when there is more than one

  • in_gene_body_of: Character. Symbols of genes whose gene bodies overlap the DMR, comma-separated when there is more than one

  • delta_beta_promoter: Aggregated delta beta of DMR sites overlapping promoters, or NA

  • delta_beta_gene_body: Aggregated delta beta of DMR sites overlapping gene bodies, or NA
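
The aggregation behind the delta_beta_* columns can be pictured with a small base R sketch. It is not CMEnt's implementation: the site names, positions, and the helper aggregate_sites() are hypothetical, and it only illustrates applying an aggregation function such as stats::median to the sites that fall inside one feature, with NA when no site overlaps.

```r
# Hypothetical per-site inputs on a single chromosome (illustration only).
site_pos <- c(cg1 = 1000100, cg2 = 1000450, cg3 = 1000900, cg4 = 1002000)
site_delta_beta <- c(cg1 = 0.10, cg2 = 0.30, cg3 = 0.20, cg4 = -0.50)

# One feature (for example a promoter) given as a closed interval.
feature_start <- 1000000
feature_end <- 1000500

# Keep the delta beta values of sites positioned inside the feature.
in_feature <- site_pos >= feature_start & site_pos <= feature_end
overlapping <- site_delta_beta[names(site_pos)[in_feature]]

# Hypothetical helper: aggregate the values, or return NA if there are none.
aggregate_sites <- function(x, aggfun = stats::median) {
  if (length(x) == 0) NA_real_ else aggfun(x)
}

delta_beta_feature <- aggregate_sites(overlapping)
delta_beta_empty <- aggregate_sites(numeric(0))
```

Passing a different function as aggfun (for example mean) changes only the final aggregation step in this sketch.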

Details

The function uses genome-appropriate TxDb packages. For hs1, CMEnt uses hg38 gene models and lifts them to hs1 before computing overlaps. Gene symbols are retrieved from the appropriate org.*.eg.db package. Multiple overlapping genes are concatenated with commas.
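
To make the promoter definition concrete, the following base R sketch derives strand-aware promoter windows from TSS positions and tests them against one DMR. It follows the convention used by GenomicRanges::promoters() (a window of upstream + downstream bases around the TSS); whether CMEnt computes its windows in exactly this way is an assumption, and promoter_window() and overlaps() are hypothetical helpers.

```r
# Hypothetical helper: promoter interval around a TSS, oriented by strand.
promoter_window <- function(tss, strand, upstream = 2000, downstream = 200) {
  plus <- strand == "+"
  data.frame(
    start = ifelse(plus, tss - upstream, tss - downstream + 1),
    end = ifelse(plus, tss + downstream - 1, tss + upstream)
  )
}

# Hypothetical helper: TRUE where two closed intervals share at least one base.
overlaps <- function(a_start, a_end, b_start, b_end) {
  a_start <= b_end & b_start <= a_end
}

# Two made-up genes on the same chromosome as the DMR.
tss <- c(1001500, 1005000)
strand <- c("+", "-")
win <- promoter_window(tss, strand)

# DMR spanning 1000000 to 1001000, as in the Examples section.
hits <- overlaps(1000000, 1001000, win$start, win$end)
```

Widening promoter_upstream enlarges the window on the side facing away from the gene body, so more DMRs near the TSS are reported as promoter overlaps.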

Examples

# Annotate DMRs with gene information
dmrs <- data.frame(
    chr = c("chr1", "chr2"),
    start = c(1000000, 2000000),
    end = c(1001000, 2001000)
)
dmrs_annotated <- annotateDMRsWithGenes(dmrs, genome = "hg38")

# Use custom promoter definition
dmrs_annotated <- annotateDMRsWithGenes(
    dmrs,
    genome = "hg38",
    promoter_upstream = 5000,
    promoter_downstream = 1000,
    njobs = 2
)
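
Because overlapping genes are returned as comma-separated strings, downstream analyses often need one row per DMR and gene pair. The sketch below shows one way to do that in base R. It runs on a mock data frame that only mimics the documented columns; the gene names are invented, and the use of NA for a DMR without overlapping genes is an assumption about the output.

```r
# Mock result mimicking the documented output columns (values are invented).
annotated <- data.frame(
  chr = c("chr1", "chr2"),
  start = c(1000000, 2000000),
  end = c(1001000, 2001000),
  in_promoter_of = c("GENE_A,GENE_B", NA),
  stringsAsFactors = FALSE
)

# Split each comma-separated entry into a character vector of symbols.
genes_per_dmr <- strsplit(annotated$in_promoter_of, ",", fixed = TRUE)

# Count the symbols per DMR, ignoring missing entries.
n_genes <- vapply(genes_per_dmr, function(g) sum(!is.na(g)), integer(1))

# Build a long table with one row per DMR and gene pair.
all_genes <- unlist(genes_per_dmr)
long <- data.frame(
  dmr = rep(seq_len(nrow(annotated)), n_genes),
  gene = all_genes[!is.na(all_genes)],
  stringsAsFactors = FALSE
)
```

The same steps apply to in_gene_body_of.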