% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/genePheno.R
\name{genePheno}
\alias{genePheno}
\title{Identify Predictive Genes for a Phenotype}
\usage{
genePheno(
  seData,
  DEgenes,
  vectorGroups,
  vectorSampleID,
  iter = 100,
  numberOfFolds = 5,
  verbose = TRUE
)
}
\arguments{
\item{seData}{SummarizedExperiment object with the normalized expression 
data and the phenotypic data in colData.}

\item{DEgenes}{Vector containing the genes to be used, in the same format as 
the row names of \code{assay(seData)}. Usually this vector is the result of 
running \code{prefilterSAM()}.}

\item{vectorGroups}{Clinical variable or phenotypic variable tested. It must 
be provided as a numeric binary vector.}

\item{vectorSampleID}{Vector containing the sample names in the same order 
as in assay(seData).}

\item{iter}{Number of bootstrap iterations. Defaults to 100; reduce it if 
the function takes too long to execute.}

\item{numberOfFolds}{Number of folds used in the nested cross-validation. 
Defaults to 5.}

\item{verbose}{Logical. If \code{TRUE}, show a progress bar.}
}
\value{
A list containing the following elements:
\itemize{
 \item{\code{genes}: A list of genes ranked according to the degree of 
 association with the clinical or phenotypic variable tested.}
 \item{\code{listCoeff}: A list with the beta regression coefficients and 
 the AUC score for each bootstrap iteration.}
 \item{\code{stability}: Gene selection probability estimated by bootstrap 
 (the number of times the gene is selected over the \code{iter} iterations).}
 \item{\code{betasMedian}: Median of the beta coefficients over the B 
 bootstrap replicates.}
 \item{\code{betasMean}: Mean of the beta coefficients over the B 
 bootstrap replicates.}
 \item{\code{betasTable}: Table of genes ordered by decreasing value of the 
 stability coefficient. Contains several metrics: the stability index, 
 the mean and the median of the beta coefficients.}
 }
}
\description{
This function implements a robust algorithm to obtain a list of genes 
associated with a given clinical variable. It is based on the elastic net 
algorithm; the robustness and reproducibility of the selected subset of genes 
are improved using a bootstrap strategy combined with ensemble methods.
}
\details{
This function implements a robust version of the elastic net algorithm 
proposed by Tibshirani (Tibshirani et al., 2009). To avoid overfitting, the 
algorithm adds a penalty term that is a convex combination of the 
\out{L<sub>2</sub>} norm (ridge regression) and the \out{L<sub>1</sub>} norm 
(Lasso regression). When the alpha parameter is 1, the regularization term 
reduces to the Lasso penalty, which minimizes the number of non-zero 
coefficients. If a subset of features is highly correlated, Lasso tends to 
select only one of them, essentially at random. To avoid this extreme 
behavior, alpha is set to 0.75, which retains more relevant variables than 
Lasso and improves the prediction accuracy. This choice also helps to improve 
the stability and to reduce the variance of the feature selection process.

To improve the robustness and reproducibility of the discovered gene 
signature, a bootstrap strategy is implemented. The patients are resampled 
with replacement, giving rise to B replicates. For each replicate, a gene 
signature is obtained using double nested cross-validation to avoid 
overfitting. The final gene list is built as an ensemble of these lists, 
considering several metrics that evaluate the stability, the robustness, 
and the predictive power of each gene. See (Martinez-Romero et al., 2018) 
for more details.
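
As an illustration of the penalty term, and assuming the parameterization 
used by the glmnet package (an assumption; the exact form used internally is 
not stated in this documentation), the elastic net penalty can be written as
\deqn{P_{\alpha}(\beta) = \lambda \left[ \frac{1 - \alpha}{2} \Vert \beta \Vert_2^2 + \alpha \Vert \beta \Vert_1 \right]}{P(beta) = lambda * [ (1 - alpha)/2 * ||beta||_2^2 + alpha * ||beta||_1 ]}
where lambda is the overall penalty strength (a symbol introduced here for 
illustration). Under this form, alpha = 1 gives the Lasso penalty, alpha = 0 
gives the ridge penalty, and alpha = 0.75 weights the Lasso term more heavily 
while keeping a ridge component.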
}
\examples{
data(seBRCA)

# prefilterSAM ---
data(ex_prefilterSAM)

# genePheno ---
vectorSampleID <- rownames(SummarizedExperiment::colData(seBRCA))
vectorGroups <- SummarizedExperiment::colData(seBRCA)$ER.IHC |> as.numeric()

ex_genePheno <- genePheno(seBRCA, ex_prefilterSAM, vectorGroups, vectorSampleID,
                         iter = 25)

# NOTE: For consistent results with the vignettes and example data, use 
# default parameters (e.g., iter = 100).
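
# Inspecting the result: a minimal sketch that assumes only the element names
# documented in the Value section (genes, betasTable). The exact class of each
# element is not assumed, so only generic functions are used.
str(ex_genePheno, max.level = 1)

# First entries of the ranked gene list and of the summary table, which is
# documented as ordered by decreasing stability.
head(ex_genePheno$genes, 10)
head(ex_genePheno$betasTable, 10)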

}
\references{
\itemize{
  \item{\insertRef{martinezromero2018}{asuri}} 
  \item{\insertRef{BuenoFortes2023}{asuri}}
}
}
