% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/enrichment.R
\name{clusterORA}
\alias{clusterORA}
\title{Calculate annotation enrichment for clusters in the graph}
\usage{
clusterORA(g, alg, name, vid = "name", alpha = 1, col = COLLAPSE)
}
\arguments{
\item{g}{graph to get annotation from}

\item{alg}{cluster algorithm and membership attribute name}

\item{name}{annotation attribute name}

\item{vid}{attribute to be used as a vertex ID}

\item{alpha}{probability threshold}

\item{col}{character separating list items within an attribute value;
the default is \code{;}}
}
\value{
A table with overrepresentation results.
Each row corresponds to a tested annotation in a particular cluster.
The columns are the following:
\itemize{
\item alg – name of the clustering algorithm;
\item cl – cluster ID;
\item FL – name of the enriched term;
\item N – number of vertices in the network;
\item Fn – number of vertices in the graph annotated by term \code{FL} (\eqn{F_n});
\item Cn – size of the cluster;
\item Mu – number of vertices in the cluster annotated by term \code{FL} (\eqn{\mu});
\item OR – odds ratio;
\item CIl – odds ratio 95\% confidence interval lower bound (\eqn{CI_l});
\item CIu – odds ratio 95\% confidence interval upper bound (\eqn{CI_u});
\item Fe – fold enrichment \eqn{F_e};
\item Fc – fold enrichment \eqn{F_c};
\item pval – an enrichment p-value from hypergeometric test;
\item padj – a BY-adjusted p-value;
\item palt – a depletion p-value from hypergeometric test;
\item paltadj – a BY-adjusted depletion p-value;
\item overlapGenes – vector with overlapping genes.
}
}
\description{
Calculate the cluster enrichment of a graph given a clustering algorithm
\code{alg} and a vertex annotation attribute \code{name}. The function
generates an enrichment table with one row for each tested annotation term
in each cluster, containing: the size of the cluster (\code{Cn}), the number
of annotated vertices in the graph \eqn{F_n} (\code{Fn}), the number of
annotated vertices in the cluster \eqn{\mu} (\code{Mu}), the odds ratio
(\code{OR}) and its 95\% confidence interval \eqn{[CI_l,CI_u]} (\code{CIl} and
\code{CIu}), and two fold enrichment values \eqn{F_e} (\code{Fe}) and
\eqn{F_c} (\code{Fc}). The table also provides the list of vertices from the
cluster that contribute to the annotation term (\code{overlapGenes}), the
p-values of enrichment (\code{pval}) and depletion (\code{palt}) from the
hypergeometric test, and the corresponding p-values adjusted with the
Benjamini and Yekutieli (BY) correction (\code{padj} and \code{paltadj}).
}
\details{
Given the enrichment results, we can calculate the log of the Odds Ratio
(\code{OR}) as:
\deqn{\ln(OR)=\ln(\frac{\mu(N-F_n+\mu-C_n)}{(C_n-\mu)(F_n-\mu)})}{ln(OR)=ln(Mu(N-Fn+Mu-Cn)/((Cn-Mu)(Fn-Mu)))}
and its upper and lower 95\% confidence interval bounds:
\deqn{CI(\ln(OR))=\ln(OR)\pm 1.96\sqrt{\frac{1}{\mu}+\frac{1}{C_n-\mu}+\frac{1}{F_n-\mu}+\frac{1}{N-F_n+\mu-C_n}}}{CI(ln(OR))=ln(OR) +/- 1.96(1/Mu+1/(Cn-Mu)+1/(Fn-Mu)+1/(N-Fn+Mu-Cn))^0.5}

Using the odds ratio allows us to distinguish functionally enriched
communities from functionally depleted ones.

Two types of fold enrichment values are calculated as follows:
\deqn{F_e=\frac{(\frac{\mu}{F_n})}{(\frac{C_n}{N})}}{F_e=(Mu/Fn)/(Cn/N)}
\deqn{F_c=\frac{(\frac{\mu}{C_n})}{(\frac{C_n}{N})}}{F_c=(Mu/Cn)/(Cn/N)}
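
As a worked illustration of the odds ratio, its confidence interval and
\eqn{F_e}, consider hypothetical counts (chosen for this illustration only,
not taken from any dataset): \eqn{N=1000}, \eqn{F_n=50}{Fn=50},
\eqn{C_n=100}{Cn=100} and \eqn{\mu=15}{Mu=15}, so that
\eqn{N-F_n+\mu-C_n=865}{N-Fn+Mu-Cn=865}. Then:
\deqn{OR=\frac{15\times 865}{85\times 35}\approx 4.36,\quad \ln(OR)\approx 1.47}{OR=(15*865)/(85*35)~4.36, ln(OR)~1.47}
\deqn{CI(\ln(OR))\approx 1.47\pm 1.96\sqrt{\frac{1}{15}+\frac{1}{85}+\frac{1}{35}+\frac{1}{865}}\approx 1.47\pm 0.64}{CI(ln(OR))~1.47 +/- 1.96(1/15+1/85+1/35+1/865)^0.5~1.47 +/- 0.64}
\deqn{F_e=\frac{15/50}{100/1000}=3}{F_e=(15/50)/(100/1000)=3}
Exponentiating the interval bounds gives approximately \eqn{[2.29, 8.31]} on
the odds ratio scale. Because the lower bound is above 1, such a cluster
would be read as enriched rather than depleted for the term.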
}
\examples{
options("show.error.messages"=TRUE)
file <- system.file("extdata", "PPI_Presynaptic.gml", package = "BioNAR")
g <- igraph::read_graph(file, format="gml")
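# List the vertices annotated with each term (not used further below)
anL <- getAnnotationVertexList(g, 'TopOntoOVGHDOID')
# Test each annotation term for overrepresentation in each louvain cluster
res <- clusterORA(g, alg='louvain', name='TopOntoOVGHDOID', vid='name')
# Pair the term ID attribute with the corresponding term name attribute
andf <- unique(data.frame(ID=igraph::vertex_attr(g, 'TopOntoOVGHDOID'),
                          Term=igraph::vertex_attr(g, 'TopOntoOVG')))
# Attach term names to the results and order the rows by cluster
rr <- merge(andf, res, by.y='FL', by.x='ID')
rr[order(rr$cl), ]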
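
# Illustrative sketch only, not BioNAR's internal code: how enrichment and
# depletion p-values from a hypergeometric test, and their BY adjustment,
# can be computed in base R. The counts below are hypothetical (invented for
# this illustration) and the names pEnrich and pDeplete are not part of the
# package; whether clusterORA computes its tails exactly this way is an
# assumption.
N <- 1000          # vertices in the network
Fn <- 50           # vertices annotated with the term
Cn <- 100          # cluster size
Mu <- c(15, 5, 1)  # annotated vertices in three hypothetical clusters
# Enrichment: probability of observing Mu or more annotated vertices
pEnrich <- stats::phyper(Mu - 1, Fn, N - Fn, Cn, lower.tail = FALSE)
# Depletion: probability of observing Mu or fewer annotated vertices
pDeplete <- stats::phyper(Mu, Fn, N - Fn, Cn)
data.frame(Mu = Mu,
           pval = pEnrich,
           padj = stats::p.adjust(pEnrich, method = "BY"),
           palt = pDeplete,
           paltadj = stats::p.adjust(pDeplete, method = "BY"))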
}
