% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/AllGenerics.R, R/addDissimilarity.R
\name{addDissimilarity}
\alias{addDissimilarity}
\alias{getDissimilarity}
\alias{addDissimilarity,SummarizedExperiment-method}
\alias{getDissimilarity,SummarizedExperiment-method}
\alias{getDissimilarity,TreeSummarizedExperiment-method}
\alias{getDissimilarity,ANY-method}
\title{Calculate dissimilarities}
\usage{
addDissimilarity(x, method, ...)

getDissimilarity(x, method, ...)

\S4method{addDissimilarity}{SummarizedExperiment}(x, method = "bray", name = method, ...)

\S4method{getDissimilarity}{SummarizedExperiment}(
  x,
  method = "bray",
  assay.type = "counts",
  niter = NULL,
  transposed = FALSE,
  ...
)

\S4method{getDissimilarity}{TreeSummarizedExperiment}(
  x,
  method = "bray",
  assay.type = "counts",
  niter = NULL,
  transposed = FALSE,
  ...
)

\S4method{getDissimilarity}{ANY}(x, method = "bray", niter = NULL, ...)
}
\arguments{
\item{x}{\code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
or \code{matrix}.}

\item{method}{\code{Character scalar}. Specifies which dissimilarity to
calculate. (Default: \code{"bray"})}

\item{...}{Other arguments passed to \code{\link[vegan:avgdist]{avgdist}},
\code{\link[vegan:vegdist]{vegdist}}, or to mia internal functions:

\itemize{
\item \code{sample}: The sampling depth in rarefaction.
(Default: \code{min(rowSums2(x))})

\item \code{dis.fun}: \code{Character scalar}. Specifies the dissimilarity
function to be used.

\item \code{transf}: \code{Function}. Specifies the optional
transformation applied before calculating the dissimilarity matrix.

\item \code{tree.name}: (Unifrac)  \code{Character scalar}. Specifies the
name of the tree from \code{rowTree(x)} that is used in calculation.
Disabled when \code{tree} is specified. (Default: \code{"phylo"})

\item \code{tree}: (Unifrac) \code{phylo}. A phylogenetic tree used in
calculation. (Default: \code{NULL})

\item \code{weighted}: (Unifrac) \code{Logical scalar}. Whether to
calculate weighted Unifrac. Weighted Unifrac takes into account the
relative abundance of species/taxa shared between samples, whereas
unweighted Unifrac only considers presence/absence.
(Default: \code{FALSE})

\item \code{node.label}: (Unifrac) \code{Character vector}. Used only if
\code{x} is a matrix. Specifies links between rows/columns and tips of
\code{tree}. All the node labels must be present in \code{tree}. For the
links, you can provide a vector whose length equals the number of
rows/columns in \code{x}. Alternatively, you can provide a named vector
where \code{names} are the names in the abundance table and the values are
their corresponding nodes in the tree.

\item \code{chunkSize}: (JSD) \code{Integer scalar}. Defines the size of
the data sent to each worker. Only has an effect if \code{BPPARAM}
defines more than one worker. (Default: \code{nrow(x)})

\item \code{BPPARAM}: (JSD)
\code{\link[BiocParallel:BiocParallelParam-class]{BiocParallelParam}}.
Specifies whether the calculation should be parallelized.

\item \code{detection}: (Overlap) \code{Numeric scalar}.
Defines the detection threshold for presence/absence of features. A feature
whose abundance is under the threshold in either of the two samples is
discarded when evaluating overlap between them. (Default: \code{0})

\item \code{binary}: \code{Logical scalar}. Whether to perform
presence/absence transformation before dissimilarity calculation. For
Jaccard index the default is \code{TRUE}. For other dissimilarity metrics,
please see \code{\link[vegan:vegdist]{vegdist}}.
}}

\item{name}{\code{Character scalar}. The name to be used to store the result
in metadata of the output. (Default: \code{method})}

\item{assay.type}{\code{Character scalar}. Specifies the name of assay
used in calculation. (Default: \code{"counts"})}

\item{niter}{\code{Integer scalar}. Specifies the number of
rarefaction rounds. Rarefaction is not applied when \code{niter=NULL}
(see Details section). (Default: \code{NULL})}

\item{transposed}{\code{Logical scalar}. Specifies whether \code{x} is
transposed, with samples in rows. (Default: \code{FALSE})}
}
\value{
\code{getDissimilarity} returns a sample-by-sample dissimilarity matrix.

\code{addDissimilarity} returns \code{x} with the dissimilarity matrix
stored in its metadata.
}
\description{
These functions are designed to calculate dissimilarities on data stored
within a
\code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
object. For overlap, Unifrac, and Jensen-Shannon Divergence (JSD)
dissimilarities, the functions use mia internal functions, while for other
types of dissimilarities, they rely on \code{\link[vegan:vegdist]{vegdist}}
by default.
}
\details{
Overlap reflects similarity between sample pairs. When overlap is
calculated using relative abundances, higher values indicate higher
similarity: a value of 1 means that all feature abundances are equal
between the two samples, and 0 means that the samples have completely
different relative abundances.

Unifrac is calculated with \code{
\link[ecodive:unweighted_unifrac]{ecodive:unweighted_unifrac()}}
or \code{\link[ecodive:weighted_unifrac]{ecodive:weighted_unifrac()}}.

If rarefaction is enabled, \code{\link[vegan:avgdist]{vegan:avgdist()}} is
utilized.

Rarefaction can be used to control for uneven sequencing depths, although
it is a highly debated method. Some argue that it is the only option that
successfully controls the variation caused by uneven sampling depths.
The main argument against rarefaction is that it omits data.

Rarefaction works by sampling the counts randomly. This random sampling
is done \code{niter} times. In each sampling iteration, \code{sample}
counts are randomly drawn per sample (the sampling depth), and
dissimilarity is calculated for this subset. After the iterative process,
there are \code{niter} results, which are then averaged to get the final
result.

Refer to Schloss (2024) for more details on rarefaction.
}
\examples{
library(mia)
library(scater)

# load dataset
data(GlobalPatterns)
tse <- GlobalPatterns

### Overlap dissimilarity

tse <- addDissimilarity(tse, method = "overlap", detection = 0.25)
metadata(tse)[["overlap"]][1:6, 1:6]

### JSD dissimilarity

tse <- addDissimilarity(tse, method = "jsd")
metadata(tse)[["jsd"]][1:6, 1:6]

# Multi Dimensional Scaling applied to JSD dissimilarity matrix
tse <- addMDS(tse, method = "jsd", assay.type = "counts")
reducedDim(tse, "MDS") |> head()

### Unifrac dissimilarity

res <- getDissimilarity(tse, method = "unifrac", weighted = FALSE)
dim(as.matrix(res))

tse <- addDissimilarity(tse, method = "unifrac", weighted = TRUE)
metadata(tse)[["unifrac"]][1:6, 1:6]
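
### Jaccard dissimilarity

# A hedged sketch of the binary argument, assuming that method = "jaccard" is
# forwarded to vegan::vegdist() as the description states. By default, Jaccard
# is computed on presence/absence data (binary = TRUE); setting binary = FALSE
# requests the quantitative version instead.
res_pa <- getDissimilarity(tse, method = "jaccard")
res_quant <- getDissimilarity(tse, method = "jaccard", binary = FALSE)
as.matrix(res_pa)[1:3, 1:3]
as.matrix(res_quant)[1:3, 1:3]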

### Bray dissimilarity

# Bray is usually applied to relative abundances so we have to apply
# transformation first
tse <- transformAssay(tse, method = "relabundance")
res <- getDissimilarity(tse, method = "bray", assay.type = "relabundance")
as.matrix(res)[1:6, 1:6]
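
# A short sketch of the name argument: the result can be stored under a custom
# metadata name instead of the method name. The name "bray_relabundance" is
# only an illustrative choice.
tse <- addDissimilarity(
    tse, method = "bray", assay.type = "relabundance",
    name = "bray_relabundance")
names(metadata(tse))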

# If applying rarefaction, the input must be a count matrix and the
# transformation method must be specified in the function call
# (Note: increase niter)
rclr <- function(x){
    vegan::decostand(x, method="rclr")
}
res <- getDissimilarity(
    tse, method = "euclidean", transf = rclr, niter = 2L)
as.matrix(res)[1:6, 1:6]
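
# A minimal manual sketch of the averaging procedure described in Details.
# This is an illustration only, not the code path used by mia: it assumes
# that rarefying with vegan::rrarefy(), calculating Bray-Curtis with
# vegan::vegdist() and averaging the results approximates what
# vegan::avgdist() does. The object names (counts, depth, n_rounds, mats,
# avg) are hypothetical.
set.seed(1)
counts <- t(assay(tse, "counts"))  # vegan expects samples in rows
depth <- min(rowSums(counts))      # sampling depth shared by all samples
n_rounds <- 2L                     # use many more rounds in real analyses
mats <- lapply(seq_len(n_rounds), function(i){
    rarefied <- vegan::rrarefy(counts, sample = depth)
    as.matrix(vegan::vegdist(rarefied, method = "bray"))
})
# Average the per-round dissimilarity matrices element-wise
avg <- Reduce(`+`, mats) / n_rounds
avg[1:6, 1:6]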

}
\references{
For unifrac dissimilarity: \url{http://bmf.colorado.edu/unifrac/}

See also additional descriptions of Unifrac in the following articles:

Lozupone, Hamady and Knight, ``Unifrac - An Online Tool for Comparing
Microbial Community Diversity in a Phylogenetic Context.'', BMC
Bioinformatics 2006, 7:371

Lozupone, Hamady, Kelley and Knight, ``Quantitative and qualitative (beta)
diversity measures lead to different insights into factors that structure
microbial communities.'' Appl Environ Microbiol. 2007

Lozupone C, Knight R. ``Unifrac: a new phylogenetic method for comparing
microbial communities.'' Appl Environ Microbiol. 2005 71 (12):8228-35.

For JSD dissimilarity:

Fuglede B, Topsoe F. ``Jensen-Shannon Divergence and Hilbert space
embedding.'' University of Copenhagen, Department of Mathematics.
\url{http://www.math.ku.dk/~topsoe/ISIT2004JSD.pdf}

For rarefaction:
Schloss PD (2024) Rarefaction is currently the best approach to control for
uneven sequencing effort in amplicon sequence analyses. \emph{mSphere}
28;9(2):e0035423. doi: 10.1128/msphere.00354-23
}
\seealso{
\url{http://en.wikipedia.org/wiki/Jensen-Shannon_divergence}
}
