% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/clusterStats.R
\name{clusterStats}
\alias{clusterStats}
\title{Compute LVR and meandiff statistics for beta values after batch
correction}
\usage{
clusterStats(pre_betas, post_betas, kClusters)
}
\arguments{
\item{pre_betas}{a matrix of methylation beta values \bold{prior to}
correction.}

\item{post_betas}{a matrix of methylation beta values \bold{after}
correction.}

\item{kClusters}{a \code{kClusters} S3 object, as returned by
\code{\link{kClusterMethylation}}.}
}
\value{
A \code{data.frame} containing log variance ratio (LVR) and mean beta
difference (meandiff) statistics for the clusters.
}
\description{
This function is part of a set of three functions to be run in
series. \code{\link{discoverClusteredMethylation}} takes a matrix of
methylation beta values (typically from the Illumina Infinium Methylation
Assay) and clusters the data across a range of ks specified by the user.

Then the data is reclustered across the best two candidate values
for k (determined by the rate of change in Bayesian information criterion),
and minimum cluster size and distance filters are applied. If both clusters
meet these filters, then the higher value of k is returned.
\code{discoverClusteredMethylation} should be run on uncorrected data,
ideally with slides that are prone to batch effect removed. This biases the
search towards clusters that are driven by biological factors such as
X-chromosome inactivation and allele-specific methylation.

The output of \code{discoverClusteredMethylation} is the input for the
\code{\link{kClusterMethylation}} function, which extracts cluster membership
and statistics on variance for a given matrix of beta values. It can be
useful to discover clusters on samples less prone to clustering due to batch
effect or cellular heterogeneity, and then recluster all the data for set
values of k via the \code{\link{kClusterMethylation}} function.

Finally, uncorrected and batch-corrected beta values can be compared using
\code{\link{clusterStats}}. This function generates a \code{data.frame}
containing the log variance ratio and mean beta differences of clusters after
correction.
}
\details{
Beta values should be of type \code{double} with samples in
columns and betas in rows. The betas need to be bounded between 0 and 1.
The matrix is typically exported from a \code{\link[minfi]{GenomicRatioSet}},
\code{\link[minfi]{GenomicMethylSet}} or \code{\link[minfi]{MethylSet}}
object via the \code{getBeta} S4 accessor method.
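
As a minimal sketch of these requirements, the checks below can be run on a
matrix before it is passed in. The toy matrix \code{betas} is hypothetical and
only illustrates the expected type and range; it is not part of the package.
Ignoring \code{NA} values in the range check is an assumption.

\preformatted{
# Hypothetical toy matrix: 3 rows of betas by 4 samples (columns)
betas <- matrix(c(0.10, 0.12, 0.85, 0.90,
                  0.50, 0.48, 0.52, 0.55,
                  0.02, 0.03, 0.97, 0.99),
                nrow = 3, byrow = TRUE)

# Stops with an error unless the matrix is of type double and every
# non-missing value lies between 0 and 1 (NA handling is an assumption)
stopifnot(is.matrix(betas),
          is.double(betas),
          all(betas >= 0 & betas <= 1, na.rm = TRUE))
}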
}
\examples{
library(HarmanData)
data(episcope)
bad_batches <- c(1, 5, 9, 17, 25)
is_bad_sample <- episcope$pd$array_num \%in\% bad_batches
myK <- discoverClusteredMethylation(episcope$original[, !is_bad_sample])
mykClust <- kClusterMethylation(episcope$original, row_ks = myK)
res <- clusterStats(pre_betas = episcope$original,
                    post_betas = episcope$harman,
                    kClusters = mykClust)
all.equal(episcope$ref_md$meandiffs_harman, res$meandiffs)
all.equal(episcope$ref_lvr$var_ratio_harman, res$log2_var_ratio)
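
# A minimal sketch for inspecting the result. Only the two columns already
# used above (meandiffs and log2_var_ratio) are assumed to exist in res.
head(res[, c("meandiffs", "log2_var_ratio")])

# Order rows by the absolute log2 variance ratio to see where the variance
# changed most after correction
head(res[order(abs(res$log2_var_ratio), decreasing = TRUE), ])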
}
\seealso{
\code{\link{kClusterMethylation}},
\code{\link{discoverClusteredMethylation}}
}
