% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/geneSurv.R
\name{geneSurv}
\alias{geneSurv}
\title{Kaplan-Meier Survival Analysis Based on Gene Expression or Risk Score}
\usage{
geneSurv(
  seData,
  time,
  status,
  geneName,
  boxplot = TRUE,
  iter = 100,
  type = c("exprs", "risk"),
  cut_time = 10,
  verbose = TRUE
)
}
\arguments{
\item{seData}{SummarizedExperiment object with the normalized expression 
data and the phenotypic data in colData. The phenotypic colData must contain 
the sample names in the first column and two further columns with the 
survival time and the status.}

\item{time}{SummarizedExperiment colData column name containing the survival 
time in years for each sample in numeric format.}

\item{status}{SummarizedExperiment colData column name containing the status 
(0 = censored, 1 = not censored) for each sample.}

\item{geneName}{A character string with the name of the gene being analyzed.}

\item{boxplot}{A logical value indicating whether to generate a boxplot of 
gene expression by survival group (default = TRUE).}

\item{iter}{The number of iterations (bootstrap resampling) for calculating 
optimal group cutoffs (default = 100).}

\item{type}{Defines whether the Kaplan-Meier curve groups are computed from 
gene expression (\code{"exprs"}, the default) or from risk (\code{"risk"}).}

\item{cut_time}{A numeric value specifying the cutoff time (in years) for 
survival analysis. All events beyond this time are treated as censored 
(default = 10 years).}

\item{verbose}{A logical value indicating whether to show a progress bar 
(default = TRUE).}
}
\value{
The output depends on the value of \code{type}:
\itemize{ 
 \item{For \code{type = "exprs"}, a Kaplan-Meier plot based on expression groups, a 
 differential expression boxplot and a plot with the membership probability 
 for each risk group.
 Additionally, an object with the following components:}
 \itemize{
   \item{\code{geneName}: A character string with the selected name of 
   the gene to analyze.}
   \item{\code{patientExpr}: The expression level of each patient for 
   the gene.}
   \item{\code{patientClass}: Vector of group classification according 
   to the gene expression level: 2 = high expression, 
   and 1 = low expression level.}
   \item{\code{patientClassProbality}: Vector of membership probabilities 
   for the classification.}
   \item{\code{wilcox.pvalue}: The p-value from the Wilcoxon test comparing 
   the two expression groups.}
   \item{\code{plot_values}: A list containing Kaplan-Meier fit results, 
   log-rank p-value, and hazard ratio.}
   }
 \item{For \code{type = "risk"}, a Kaplan-Meier plot based on risk groups. 
 Additionally, an object with the following components:}
 \itemize{
   \item{\code{geneName}: A character string with the selected name of the 
   gene to analyze.}
   \item{\code{patientExpr}: The expression level of each patient 
   for the gene.}
   \item{\code{risk_score_predicted}: A numeric vector of predicted relative 
   risk scores for each patient.}
   \item{\code{plot_values}: A list containing Kaplan-Meier fit results,
   log-rank p-value, and hazard ratio.}
   }
 }
}
\description{
This function analyzes the ability of a gene to act as a survival marker, 
based on a robust version of the Kaplan-Meier (K-M) curves. The robust K-M 
estimator is obtained by a bootstrap strategy.
}
\details{
This function improves the stability and robustness of the K-M estimator 
using a bootstrap strategy. Patients are resampled with replacement, giving 
rise to B replicates. The K-M estimator and its confidence intervals are 
obtained from these replicates. The patients are stratified into two risk 
groups by an expression threshold that optimizes the log-rank statistic, 
that is, the separability between the Kaplan-Meier curves of the two groups. 
This function implements a novel method to find the optimal threshold that 
avoids the problems of instability and unbalanced classes that affect 
other implementations. In addition, a membership probability for each risk 
group is estimated from the classification of each sample in the replicates. 
This membership probability allows patients around the gene expression 
threshold to be reclassified in a more robust way.
The function provides a robust estimation of the log-rank p-value and the 
hazard ratio, which together allow the ability of a given gene to mark 
survival to be evaluated.
}
\examples{
data(seBRCA)
time <- "time"
status <- "status"
geneName <- "ESR1"
# The survival time must be expressed in years.
# The sample names of the gene expression data must match the sample names
# of the time and status data.
set.seed(5)
outputKM <- geneSurv(seBRCA, time, status, geneName, type = "exprs")

# Generate the plots again
## Plots for type = "exprs"
plotBoxplot(outputKM)
plotProbClass(outputKM)
plotKM(outputKM)
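
# A minimal, hedged sketch of inspecting the returned object. It assumes the
# components listed in the Value section can be accessed by name with `$`;
# the guard skips the inspection if the object is not list-like.
if (is.list(outputKM)) {
  # Number of patients per group (1 = low expression, 2 = high expression)
  print(table(outputKM$patientClass))
  # Wilcoxon p-value comparing the two expression groups
  print(outputKM$wilcox.pvalue)
  # Names of the elements stored for the Kaplan-Meier plot
  print(names(outputKM$plot_values))
}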

# Alternatively, run the function with type = "risk"

geneName <- "BRCA1"
set.seed(5)
outputKM.risk <- geneSurv(seBRCA, time, status, geneName, type = "risk")

## Plots for type = "risk"
plotKM(outputKM.risk)

}
\references{
\itemize{
  \item{\insertRef{martinezromero2018}{asuri}} 
  \item{\insertRef{BuenoFortes2023}{asuri}}
}
}
