% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/anota2seqResidOutlierTest.R
\name{anota2seqResidOutlierTest}
\alias{anota2seqResidOutlierTest}
\title{Test for normality of residuals}
\usage{
anota2seqResidOutlierTest(Anota2seqDataSet, confInt = 0.01,
  iter = 5, generateSingleGenePlots = FALSE, nGraphs = 200,
  generateSummaryPlot = TRUE, residFitPlot = TRUE, useProgBar = TRUE)
}
\arguments{
\item{Anota2seqDataSet}{An object of class Anota2seqDataSet that also 
contains the output of the anota2seqPerformQC function.}

\item{confInt}{Controls how many samples from the normal distribution
will be used to generate the envelope to which the residuals are
compared. Default is 0.01, which will generate 99 samples from the
normal distribution to compare to the actual residuals.}

\item{iter}{How many times should the analysis be performed? Default is 5
meaning that 5 sets of samples (each with the size controlled by
confInt) will be generated. Notice that the summary plotting is only
performed for the last set but the percentage of outliers for each
iteration can be found in the output object.}

\item{generateSingleGenePlots}{The analysis is performed per identifier and 
plots can be generated for each identifier. However, due to the high
number of identifiers, a large number of plots will typically be
generated. TRUE/FALSE with default FALSE.}

\item{nGraphs}{If generateSingleGenePlots is set to TRUE, nGraphs controls
for how many identifiers such single gene graphs will be generated.
Default is 200. Note that plots are generated for the first nGraphs
identifiers, in the order in which they appear in the input data.}

\item{generateSummaryPlot}{The function can generate a summary graph
that shows the envelopes generated by sampling from the normal
distribution compared to the obtained values for all genes. Default is
TRUE, thus the graph is generated but only from the last iteration.}

\item{residFitPlot}{Should a plot of the residuals against the fitted
values be generated? Default is TRUE, i.e. the plot is generated.}

\item{useProgBar}{Should the progress bar be shown? Default is TRUE, i.e.
the progress bar is shown.}
}
\value{
An Anota2seqDataSet. anota2seqResidOutlierTest saves its output
  data in the 'residOutlierTest' slot of the Anota2seqDataSet, see 
  \code{\link{anota2seqGetResidOutlierTest}} for a detailed description
  of this output. 
  
  anota2seqResidOutlierTest also generates a graphical
  output ("ANOTA2SEQ_residual_distribution_summary.pdf") showing the Q-Q
  plots from all genes as well as the envelopes from the sampled data.
  The obtained percentage of outliers is shown at each rank position and
  all combined. Optionally, when generateSingleGenePlots is set to TRUE, the
  function also generates individual plots (stored as 
  "ANOTA2SEQ_residual_distributions_single.pdf") for n genes (set by
  nGraphs). When residFitPlot is set to TRUE an output comparing the
  fitted values to the residuals is generated (stored as
  "ANOTA2SEQ_residuals_vs_fitted.jpeg").
}
\description{
One assumption when performing analysis of partial variance (APV) is that
the residuals from the
regressions are normally distributed. anota2seq assesses this by
comparing the Q-Q plots of the residuals to envelopes derived by sampling
from the normal distribution.
}
\details{
The anota2seqResidOutlierTest function assesses whether the residuals
from the per identifier linear regressions of translated mRNA level~total
mRNA level+treatment are normally distributed. anota2seq generates normal
Q-Q plots of the residuals. If the residuals are normally distributed,
the data quantiles will form a straight diagonal line from bottom left to
top right. Because there are typically relatively few data points,
anota2seq calculates "envelopes" based on a set of samplings from the
normal distribution using the same number of data points as for the true
data \cite{Venables,Ripley}. To enable a comparison, both the actual and
the sampled data are centered (mean=0) and scaled (sd=1). The data (both
true and sampled) are then sorted and the true sample is compared to the 
envelopes of the sampled data at each sort position. The result is
presented as a Q-Q plot of the true data where the envelopes of the
sampled data are indicated. If there are 99 samplings, we expect
1/100 of the values to be outside the envelopes obtained from the samplings.
Thus it is possible to assess if approximately the expected number of
outlier residuals are obtained. The result is presented as both a
graphical output and an output object.
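
As a rough illustration of the envelope comparison described above, the
following base R sketch simulates it for a single identifier. It is not the
anota2seq implementation: the residuals are random stand-ins, all variable
names are hypothetical, and taking the envelopes as the minimum and maximum
of the sampled values at each sort position is an assumption.

\preformatted{
# Illustrative sketch only; the real analysis is run per identifier by
# anota2seqResidOutlierTest.
set.seed(1)
n <- 12        # hypothetical number of residuals for one identifier
nSamp <- 99    # number of samplings, as described for confInt = 0.01

# Stand-in for the residuals of one regression: centered, scaled and sorted
resid <- rnorm(n)
obs <- sort(as.vector(scale(resid)))

# Each column is one centered, scaled and sorted sample from the normal
# distribution with the same number of data points as the true data
sim <- replicate(nSamp, sort(as.vector(scale(rnorm(n)))))

# Envelopes at each sort position (assumed here to be the min and max)
envLow <- apply(sim, 1, min)
envHigh <- apply(sim, 1, max)

# Flag residuals falling outside the envelopes and report their fraction
outlier <- obs < envLow | obs > envHigh
mean(outlier)
}

In this sketch, a fraction of flagged residuals well above the expected
level would suggest a departure from normality for that identifier.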
}
\examples{
\dontrun{
data(anota2seq_data)
# Initialize Anota2seqDataSet
Anota2seqDataSet <- anota2seqDataSetFromMatrix(
    dataP = anota2seq_data_P[1:100,],
    dataT = anota2seq_data_T[1:100,],
    phenoVec = anota2seq_pheno_vec,
    dataType = "RNAseq",
    normalize = TRUE)
# Run the anota2seqPerformQC function. This must be done prior to running
# the anota2seqResidOutlierTest function.
Anota2seqDataSet <- anota2seqPerformQC(Anota2seqDataSet)
# Run the anota2seqResidOutlierTest function
Anota2seqDataSet <- anota2seqResidOutlierTest(Anota2seqDataSet)
}
}
\references{
Venables, W.N. and Ripley, B.D., Modern Applied Statistics with
  S-PLUS, \emph{Springer} (1999).
}
\seealso{
\code{\link{anota2seqGetResidOutlierTest}}
}
