An R6 class to handle methylation beta value files efficiently, with support for in-memory loading, tabix indexing, BSseq objects, and various file formats.
Public fields
beta
Path to a beta values file or a tabix file, or an in-memory beta matrix or BSseq object
genome
Reference genome
array
Array platform; ignored for mouse genomes or when sorted_locs is provided
beta_row_names_file
Path to row names file
sorted_locs
Sorted genomic locations
output_prefix
Prefix used for saving derived beta artifacts
njobs
Number of parallel jobs
beta_chunk_size
Chunk size for subsetting beta values
Methods
BetaHandler$new()
Create a new BetaHandler object
Usage
BetaHandler$new(
beta = NULL,
array = c("450K", "27K", "EPIC", "EPICv2"),
genome = "hg38",
beta_row_names_file = NULL,
sorted_locs = NULL,
chrom_col = "#chrom",
start_col = "start",
output_prefix = NULL,
njobs = 1
)
Arguments
beta
Path to a beta values file or a tabix file, or a beta matrix or BSseq object
array
Array platform type. Ignored if sorted_locs, a tabix file, or a BSseq object is provided.
genome
Reference genome version, e.g. hg38 or hs1. Only human and mouse genomes are supported. Ignored if sorted_locs, a tabix file, or a BSseq object is provided.
beta_row_names_file
Path to row names file. If NULL, row names will be read from the input beta.
sorted_locs
Data frame of sorted genomic locations. If given, the input data is assumed to be already sorted. If NULL, locations are retrieved automatically.
chrom_col
Chromosome column name in the tabix file
start_col
Start position column name in the tabix file
output_prefix
Prefix used for saving derived beta artifacts
njobs
Number of parallel jobs
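As an illustration of what a sorted locations data frame might look like, the base R sketch below builds one from 'chr:pos' site IDs. The column names chrom and start, the use of site IDs as row names, and the alphabetical chromosome ordering are all assumptions made for this example; the layout that BetaHandler actually expects for sorted_locs is not specified here.

```r
# Site IDs in 'chr:pos' form, deliberately out of order.
ids <- c("chr2:500", "chr1:2000", "chr10:150", "chr1:300")

# Split each ID into its chromosome and position parts.
parts <- strsplit(ids, ":", fixed = TRUE)
locs <- data.frame(
  chrom = vapply(parts, `[`, character(1), 1),              # assumed column name
  start = as.integer(vapply(parts, `[`, character(1), 2)),  # assumed column name
  row.names = ids,
  stringsAsFactors = FALSE
)

# Sort by chromosome, then by position. method = "radix" gives a
# locale-independent alphabetical order (chr1 < chr10 < chr2); a real
# pipeline may need natural chromosome order instead.
sorted_locs <- locs[order(locs$chrom, locs$start, method = "radix"), ]
```

A data frame like this could then be passed as sorted_locs, provided its columns match what the class expects.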
BetaHandler$isArrayBased()
Check if the beta data is array-based (i.e. does not have row names in 'chr:pos' format)
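The distinction drawn here can be sketched in base R. The helper looks_positional and its regular expression are hypothetical and only illustrate the idea of testing row names for a 'chr:pos' shape; they are not the check that BetaHandler performs internally.

```r
# Hypothetical helper, not part of BetaHandler: TRUE when every row name
# looks like a genomic coordinate such as "chr1:10468".
looks_positional <- function(row_names) {
  all(grepl("^[^:]+:[0-9]+$", row_names))
}

array_ids <- c("cg00000029", "cg00000108")    # array probe IDs
seq_ids   <- c("chr1:10468", "chrX:2700157")  # 'chr:pos' style IDs

# Data counts as array-based when its row names are not positional.
is_array_based     <- !looks_positional(array_ids)
is_array_based_seq <- !looks_positional(seq_ids)
```

With these inputs, the probe-style IDs are classified as array-based and the coordinate-style IDs are not.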
BetaHandler$subset()
Build a compact BetaHandler view for a subset of rows/columns
BetaHandler$getBeta()
Extract beta values for specific sites and samples
Arguments
row_names
Character vector of site IDs to extract. If numeric, treated as row indices.
col_names
Character vector of sample IDs to extract (default: NULL for all)
allow_missing
Logical. If TRUE, missing sites are ignored instead of raising an error (default: FALSE)
chr
Character vector of chromosome names to extract; cannot be used together with row_names (default: NULL for all)
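The row_names, col_names, and allow_missing semantics described above can be mimicked on a plain matrix. The function get_beta_sketch below is a hypothetical stand-in written for illustration, not the package's implementation, and it leaves out the chr argument.

```r
# Toy beta matrix: rows are sites, columns are samples (filled column-wise).
beta <- matrix(
  c(0.10, 0.20, 0.30, 0.40, 0.50, 0.60),
  nrow = 3,
  dimnames = list(c("cg01", "cg02", "cg03"), c("s1", "s2"))
)

# Hypothetical stand-in for the documented behaviour, not BetaHandler code.
get_beta_sketch <- function(beta, row_names, col_names = NULL,
                            allow_missing = FALSE) {
  if (is.character(row_names)) {
    missing_ids <- setdiff(row_names, rownames(beta))
    if (length(missing_ids) > 0) {
      if (!allow_missing) {
        stop("Sites not found: ", paste(missing_ids, collapse = ", "))
      }
      # Drop unknown sites but keep the requested order.
      row_names <- intersect(row_names, rownames(beta))
    }
  }
  # NULL col_names means all samples.
  if (is.null(col_names)) col_names <- colnames(beta)
  # Numeric row_names fall through and are used as row indices.
  beta[row_names, col_names, drop = FALSE]
}

found    <- get_beta_sketch(beta, c("cg03", "cg99"), "s2", allow_missing = TRUE)
by_index <- get_beta_sketch(beta, c(1, 2))
```

In this sketch, requesting the unknown site cg99 with allow_missing = FALSE stops with an error, while allow_missing = TRUE silently returns only the sites that exist.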
