% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/aggregate-methods.R
\name{aggregate2}
\alias{aggregate2}
\title{Functional API for data.table aggregation which allows capture of associated
aggregate calls so they can be recomputed later.}
\usage{
aggregate2(
  x,
  by,
  ...,
  nthread = 1,
  progress = interactive(),
  BPPARAM = NULL,
  enlist = TRUE,
  moreArgs = list()
)
}
\arguments{
\item{x}{\code{data.table}}

\item{by}{\code{character} One or more valid column names in \code{x} used to
define the groups.}

\item{...}{\code{call} One or more aggregations to compute for each group
defined by \code{by} in \code{x}. If you name an aggregation call, that name
becomes the column name of its value in the resulting \code{data.table};
otherwise a default name is built from the function name and its first
argument, which is assumed to be the name of the column being aggregated over.}

\item{nthread}{\code{numeric(1)} Number of threads to use for split-apply-combine
parallelization. Uses \code{BiocParallel::bplapply} if \code{nthread > 1} or if you
pass in \code{BPPARAM}. Does not modify \code{data.table} threads, so be sure to use
\code{setDTthreads} for reasonable nested parallelism. See details for performance
considerations.}

\item{progress}{\code{logical(1)} Display a progress bar for parallelized
computations? Only works if \verb{bpprogressbar<-} is defined for the current
BiocParallel back-end.}

\item{BPPARAM}{\code{BiocParallelParam} object. Use to customize the
parallelization back-end of \code{bplapply}. Note that \code{nthread} overrides any
settings from \code{BPPARAM} as long as \verb{bpworkers<-} is defined for that class.}

\item{enlist}{\code{logical(1)} Default is \code{TRUE}. Set to \code{FALSE} to evaluate
the first call in \code{...} within \code{data.table} groups. See details for more
information.}

\item{moreArgs}{\code{list()} A named list where each item is an argument to one
of the calls in \code{...} which is not a column in the table being aggregated.
Use to further parameterize your calls. Please note that these are not added
to your aggregate calls unless you specify the names in the call.}
}
\value{
\code{data.table} of aggregation results.
}
\description{
Functional API for data.table aggregation which allows capture of associated
aggregate calls so they can be recomputed later.
}
\details{
\subsection{Use of Non-Standard Evaluation}{

Arguments in \code{...} are substituted and wrapped in a list, which is passed
through to the \code{j} argument of \verb{[.data.table} internally. The function
currently tries to build informative column names for unnamed arguments in
\code{...} by combining the name of each function being called with the name of
its first argument, which is assumed to be the column being aggregated over.
If an argument to \code{...} is named, that name will be the column name of its
value in the resulting \code{data.table}.
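
The mechanism described above can be sketched with plain \code{data.table}. This
is an illustration under stated assumptions, not the internal implementation:
the example table, the helper \code{default_name}, and the underscore separator
used for default names are all hypothetical.

\preformatted{
library(data.table)

# Hypothetical example table.
dt <- data.table(
  drug = c("a", "a", "b", "b"),
  viability = c(1, 3, 2, 6)
)

# Two captured aggregation calls: one named, one unnamed.
agg_calls <- list(avg = quote(mean(viability)), quote(max(viability)))

# Hypothetical naming rule: function name and first argument joined by "_".
default_name <- function(cl) {
  paste(deparse(cl[[1]]), deparse(cl[[2]]), sep = "_")
}

# Fill in default names only for the unnamed calls.
unnamed <- names(agg_calls) == ""
names(agg_calls)[unnamed] <- vapply(
  agg_calls[unnamed], default_name, character(1)
)

# Wrap the calls in a single list(...) call and evaluate it in j per group.
j_call <- as.call(c(as.name("list"), agg_calls))
res <- dt[, eval(j_call), by = "drug"]
}

Because the calls are stored as unevaluated language objects, the same
\code{j_call} could be evaluated again later against an updated table.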
}

\subsection{Enlisting}{

The primary use case for \code{enlist=FALSE} is to allow computation of dependent
aggregations, where the output from a previous aggregation is required in a
subsequent one. For this case, wrap your call in \verb{\{} and assign intermediate
results to variables, returning the final results as a list where each list
item will become a column in the final table with the corresponding name.
Name inference is disabled for this case, since it is assumed you will name
the returned list items appropriately.

A major advantage over multiple calls to \code{aggregate} is that
the overhead of parallelization is paid only once, even for complex multi-step
computations like fitting a model, capturing its parameters, and making
predictions using it. It also allows capturing arbitrarily complex calls
which can be recomputed later using the
\verb{update,TreatmentResponseExperiment-method}.
A potential disadvantage is increased RAM usage per
thread due to storing intermediate values in variables, as well as any
memory allocation overhead associated with doing so.
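
A dependent aggregation of the kind described above can be sketched with
plain \code{data.table}, assuming that evaluation within \code{data.table} groups
behaves like a braced \code{j} expression. The table, the column names, and the
choice of \code{lm} as the model are hypothetical and only illustrate the
pattern of assigning intermediates and returning a named list.

\preformatted{
library(data.table)

# Hypothetical dose-response table with two groups.
dt <- data.table(
  drug = rep(c("a", "b"), each = 4),
  dose = rep(c(1, 2, 3, 4), times = 2),
  viability = c(8, 6, 4, 2, 9, 8, 7, 6)
)

res <- dt[, {
  # Intermediate result: fit a linear model once per group.
  fit <- lm(viability ~ dose)
  # Dependent results: both reuse the fitted model.
  slope <- unname(coef(fit)["dose"])
  pred <- unname(predict(fit, data.frame(dose = 5)))
  # Each named list item becomes a column in the result.
  list(slope = slope, pred_dose5 = pred)
}, by = "drug"]
}

The model is fitted once per group and reused for both outputs, which is the
saving that a single braced call offers over separate aggregations.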
}
}
\seealso{
\verb{data.table::[.data.table}, \code{BiocParallel::bplapply}
}
