When performing statistical analysis on any set of genomic ranges, it
is often important to compare focal sets to null sets that are carefully
matched for possible covariates that may influence the analysis. To
address this need, the nullranges package implements matchRanges(), an
efficient and convenient tool for selecting a covariate-matched set of
null hypothesis ranges from a pool of background ranges within the
Bioconductor framework.

In this vignette, we provide an overview of matchRanges() and its
associated functions. We start with a simulated example generated with
the utility function makeExampleMatchedDataSet(). We also provide an
overview of the class structure and a guide for choosing among the
supported matching methods. To see matchRanges() used in real biological
examples, visit the Case study I: CTCF occupancy and Case study II: CTCF
orientation vignettes.

For a description of the method, see Davis et al. (2023).

matchRanges references four sets of data: focal, pool, matched and
unmatched. The focal set contains the outcome of interest (Y=1) while
the pool set contains all other observations (Y=0). matchRanges
generates the matched set, which is a subset of the pool that is matched
for the provided covariates (i.e. covar) but does not contain the
outcome of interest (i.e. Y=0). Finally, the unmatched set contains the
remaining unselected elements from the pool. The diagram below depicts
the relationships between the four sets.
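
As a minimal sketch of how these four sets map onto code, the example below builds the simulated data set and retrieves each set with its accessor. It assumes that the example data carries a logical outcome column named feature1 and two covariate columns named feature2 and feature3; those column names are not stated in the text above and should be checked against the object returned by makeExampleMatchedDataSet(). The seed value is arbitrary.

```r
library(nullranges)

# Arbitrary seed so that repeated runs select the same matched set
set.seed(123)

# Simulated example data returned as a GRanges object
x <- makeExampleMatchedDataSet(type = "GRanges")

# Assumption: feature1 is a logical column marking the outcome of
# interest (Y = 1), and feature2 / feature3 are the covariates
mgr <- matchRanges(
  focal = x[x$feature1],         # ranges with the outcome (Y = 1)
  pool  = x[!x$feature1],        # all other ranges (Y = 0)
  covar = ~ feature2 + feature3  # covariates to match on
)

# Accessors for the four sets described above
length(focal(mgr))
length(pool(mgr))
length(matched(mgr))
length(unmatched(mgr))
```

The matched set has the same number of ranges as the focal set. Because this call leaves the matching method and replacement settings at their defaults, the sizes of the matched and unmatched sets need not add up to the size of the pool if pool ranges are selected more than once.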