% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ScpModel-DifferentialAnalysis.R
\name{ScpModel-DifferentialAnalysis}
\alias{ScpModel-DifferentialAnalysis}
\alias{scpDifferentialAnalysis}
\alias{scpDifferentialAggregate}
\alias{scpVolcanoPlot}
\title{Differential abundance analysis for single-cell proteomics}
\usage{
scpDifferentialAnalysis(object, coefficients = NULL, contrasts = NULL, name)

scpDifferentialAggregate(differentialList, fcol, ...)

scpVolcanoPlot(
  differentialList,
  fdrLine = 0.05,
  top = 10,
  by = "padj",
  decreasing = FALSE,
  textBy = "feature",
  pointParams = list(),
  labelParams = list()
)
}
\arguments{
\item{object}{An object that inherits from the
\code{SummarizedExperiment} class. It must contain an estimated
\code{ScpModel} in its metadata.}

\item{coefficients}{A \code{character()} vector with coefficient names
to test. \code{coefficients} and \code{contrasts} cannot both be \code{NULL}.}

\item{contrasts}{A \code{list()} where each element is a contrast to
test. Each element must be a character vector with 3 strings: 1. The
name of a categorical variable to test; 2. The name of the
reference group; 3. The name of the second group to contrast
against the reference group. \code{coefficients} and \code{contrasts}
cannot both be \code{NULL}.}

\item{name}{A \code{character(1)} providing the name to use to retrieve
the model results. When retrieving a model and \code{name} is
missing, the name of the first model found in \code{object} is used.}

\item{differentialList}{A list of tables returned by
\code{scpDifferentialAnalysis()}.}

\item{fcol}{A \code{character(1)} indicating the column to use for
grouping features. Typically, this would be protein or gene
names for grouping proteins.}

\item{...}{Further arguments passed to
\code{\link[metapod:combineGroupedPValues]{metapod::combineGroupedPValues()}}.}

\item{fdrLine}{A \code{numeric(1)} indicating the FDR threshold bar to
show on the plot.}

\item{top}{A \code{numeric(1)} indicating how many features should be
labelled on the plot.}

\item{by}{A \code{character(1)} used to order the features. It
indicates which variable should be considered when sorting the
results. Can be one of: "Estimate", "SE", "Df", "tstatistic",
"pvalue", "padj" or any other annotation added by the user.}

\item{decreasing}{A \code{logical(1)} indicating whether the features
should be ordered decreasingly (\code{TRUE}) or increasingly
(\code{FALSE}, default) depending on the value provided by
\code{by}.}

\item{textBy}{A \code{character(1)} indicating the name of the column
to use to label points.}

\item{pointParams}{A \code{list} where each element is an argument that
is provided to \code{\link[ggplot2:geom_point]{ggplot2::geom_point()}}. This is useful to
change point size, transparency, or assign colour based on an
annotation (see \code{\link[ggplot2:aes]{ggplot2::aes()}}).}

\item{labelParams}{A \code{list} where each element is an argument that
is provided to \code{\link[ggrepel:geom_text_repel]{ggrepel::geom_label_repel()}}. This is useful
to change label size, transparency, or assign
colour based on an annotation (see \code{\link[ggplot2:aes]{ggplot2::aes()}}).}
}
\description{
Differential abundance analysis assesses the statistical
significance of the differences observed between groups of samples
of interest. Differential abundance analysis is part of the
\emph{scplainer} workflow.
}
\section{Running the differential abundance analysis}{


\code{scpDifferentialAnalysis()} performs statistical inference by
means of a t-test on the estimated parameters. There are 2 use
cases:
\enumerate{
\item \strong{Statistical inference for differences between 2 groups}

You can \strong{contrast} 2 groups of interest through the \code{contrasts}
argument. Multiple contrasts, that is multiple pairwise group
comparisons, can be performed. Therefore, \code{contrasts} must be
provided as a list where each element describes the comparison to
perform as a three-element character vector (see examples). The
first element is the name of the annotation variable that contains
the two groups to compare. This variable must be \strong{categorical}.
The second element is the name of the reference group. The third
element is the name of the other group to compare against the
reference.
\item \strong{Statistical inference for numerical variables}

Numerical variables can be tested by providing the \code{coefficients}
argument, that is the name of the numerical annotation variable.
}

The statistical tests in both use cases are conducted for each
feature independently. The p-values are adjusted using
\code{\link[IHW:ihw.default]{IHW::ihw()}}, where each test is weighted using the feature
intercept (that is the average feature intensity). The function
returns a list of \code{DataFrame}s with one table for each tested
contrast and/or coefficient. Each table provides the adjusted
p-values and the estimates. For contrasts, the estimates represent
the estimated log fold changes between the groups. For coefficients,
the estimates are the estimated slopes. Results are only provided
for features for which contrasts or coefficients are estimable,
that is, features for which there are sufficient observations for
inference.
}

\section{Differential abundance at the protein level}{


\code{scpDifferentialAggregate()} combines the differential abundance
analysis results for groups of features. This is useful, for
example, to return protein-level results when data is modelled at
the peptide level. The function heavily relies on the approaches
implemented in \code{\link[metapod:combineGroupedPValues]{metapod::combineGroupedPValues()}}. The p-values
are combined into a single value using one of the following
methods: Simes' method
(default), Fisher's method, Berger's method, Pearson's method,
minimum Holm's approach, Stouffer's Z-score method, and
Wilkinson's method. We refer to the \code{metapod} documentation for
more details on the assumptions underlying each approach. The
estimates are combined using the representative estimate, as
defined by \code{metapod}. Which estimate is representative depends on
the selected combination method. The function takes the list of
tables generated by \code{scpDifferentialAnalysis()} and returns a new
list of \code{DataFrame}s with aggregated results. Note that we cannot
meaningfully aggregate degrees of freedom. Those are hence removed
from the aggregated result tables.
}

\section{Volcano plots}{


\code{\link[=scpAnnotateResults]{scpAnnotateResults()}} adds annotations to the differential abundance
analysis results. The annotations are added to all elements of the
list of tables returned by the differential abundance analysis. See
the associated man page for more information.

\code{scpVolcanoPlot()} takes the list of tables generated by
\code{scpDifferentialAnalysis()} and returns a \code{ggplot2} scatter plot
for each table. The plots show the adjusted p-values with respect to
the estimate. A horizontal bar also highlights the significance
threshold (\code{fdrLine}, defaults to 5\%). By default, the 10 features
with the lowest adjusted p-values are labelled on the plot. You can
control which features are labelled using the \code{top}, \code{by} and
\code{decreasing} arguments. Finally, you can change the point and label
aesthetics through the \code{pointParams} and the \code{labelParams}
arguments, respectively.
}

\examples{
library("patchwork")
library("ggplot2")
data("leduc_minimal")
## Add n/p ratio information in rowData
rowData(leduc_minimal)$npRatio <- 
    scpModelFilterNPRatio(leduc_minimal, filtered = FALSE)
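
####---- Sketch: layout of the contrasts argument ----####

## A minimal base R sketch of the structure described for 'contrasts':
## a list where each element is c(variable, reference group, other
## group). checkContrasts() is a hypothetical helper written for this
## illustration only; it is not part of scp and does not reproduce the
## package's own argument checks.
checkContrasts <- function(x) {
    stopifnot(is.list(x))
    ok <- vapply(
        x, function(el) is.character(el) && length(el) == 3L, logical(1)
    )
    if (!all(ok))
        stop("each contrast must be a character vector of length 3")
    invisible(x)
}
## Two pairwise comparisons on the same variable; the second one swaps
## which group is used as the reference.
myContrasts <- list(
    c("SampleType", "Melanoma", "Monocyte"),
    c("SampleType", "Monocyte", "Melanoma")
)
checkContrasts(myContrasts)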

####---- Run differential abundance analysis ----####

(res <- scpDifferentialAnalysis(
    leduc_minimal, coefficients = "MedianIntensity",
    contrasts = list(c("SampleType", "Melanoma", "Monocyte"))
))
## IHW returns a message because the example data set contains only a
## few peptides; real data sets should not have that problem.
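
####---- Sketch: t-test on an estimated parameter ----####

## A minimal base R sketch of a t-test on estimated parameters, using
## made-up numbers. It assumes a two-sided test where the statistic is
## the estimate divided by its standard error; the exact computation
## performed by scpDifferentialAnalysis() may differ. The object names
## (est, se, df) are illustrative and only echo the "Estimate", "SE"
## and "Df" result columns.
est <- c(1.2, -0.3) # hypothetical estimates (log fold changes or slopes)
se <- c(0.4, 0.5)   # hypothetical standard errors
df <- c(10, 8)      # hypothetical residual degrees of freedom
tstat <- est / se
pval <- 2 * pt(abs(tstat), df = df, lower.tail = FALSE)
data.frame(Estimate = est, SE = se, Df = df, tstatistic = tstat, pvalue = pval)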

####---- Annotate results ----####

## Add peptide annotations available from the rowData
res <- scpAnnotateResults(
    res, rowData(leduc_minimal), 
    by = "feature", by2 = "Sequence"
)

####---- Plot results ----####

scpVolcanoPlot(res, textBy = "gene") |>
    wrap_plots(guides = "collect")

## Modify point and label aesthetics
scpVolcanoPlot(
    res, textBy = "gene", top = 20,
    pointParams = list(aes(colour = npRatio), alpha = 0.5),
    labelParams = list(size = 2, max.overlaps = 20)) |>
    wrap_plots(guides = "collect")
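
####---- Sketch: how top, by and decreasing select labels ----####

## A minimal base R sketch of the documented selection rule: features
## are ordered by the 'by' column (increasingly when decreasing = FALSE)
## and the first 'top' features are labelled. The table below is made
## up, and the code only mirrors the documented behaviour; it is not
## the implementation used by scpVolcanoPlot().
toyTable <- data.frame(
    feature = c("a", "b", "c", "d"),
    padj = c(0.2, 0.001, 0.04, 0.5),
    stringsAsFactors = FALSE
)
toyTop <- 2
toyOrder <- order(toyTable[["padj"]], decreasing = FALSE)
labelled <- toyTable$feature[head(toyOrder, toyTop)]
labelled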

####---- Aggregate results ----####

## Aggregate to protein-level results
byProteinDA <- scpDifferentialAggregate(
    res, fcol = "Leading.razor.protein.id"
)
scpVolcanoPlot(byProteinDA) |>
    wrap_plots(guides = "collect")
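
####---- Sketch: Simes' method for grouped p-values ----####

## A minimal base R sketch of Simes' method, the default combination
## method mentioned above: for n sorted p-values p(1) <= ... <= p(n),
## the combined p-value is min(n * p(i) / i). simesCombine() is a
## hypothetical helper for illustration; scpDifferentialAggregate()
## relies on metapod, whose implementation may differ in details such
## as the handling of missing values or weights.
simesCombine <- function(p) {
    p <- sort(p[!is.na(p)])
    n <- length(p)
    if (n == 0L) return(NA_real_)
    min(n * p / seq_len(n))
}
## Made-up peptide-level p-values and their protein of origin
toyPvalues <- c(0.01, 0.04, 0.03, 0.20, 0.50)
toyProtein <- c("P1", "P1", "P1", "P2", "P2")
combined <- tapply(toyPvalues, toyProtein, simesCombine)
combined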
}
\references{
Vanderaa, C. and Gatto, L. scplainer: using linear models to
understand mass spectrometry-based single-cell proteomics data.
bioRxiv 2023.12.14.571792; doi:
https://doi.org/10.1101/2023.12.14.571792.
}
\seealso{
These functions are part of the \emph{scplainer} workflow, which also
consists of \link{ScpModel-Workflow} to run a model on SCP data
upstream of analysis of variance, and
\link{ScpModel-VarianceAnalysis} and \link{ScpModel-ComponentAnalysis}
to explore the model results.

\code{\link[=scpAnnotateResults]{scpAnnotateResults()}} streamlines the annotation of the
differential abundance results.
}
\author{
Christophe Vanderaa, Laurent Gatto
}
