% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ChipDataSet-methods.R
\name{constructCDS}
\alias{constructCDS}
\title{constructCDS}
\usage{
constructCDS(
  peaks,
  reads,
  region,
  TxDb,
  tssOf = c("gene", "transcript"),
  tss.region = c(-2000, 2000),
  reduce.peaks = FALSE,
  gapwidth = 1000,
  fragment.size,
  unique = TRUE,
  swap.strand = FALSE,
  param = NULL
)
}
\arguments{
\item{peaks}{A path to a tab-separated file with peaks. The file must have
at least 3 columns: chromosome, start (peak), end (peak). A 4th column with
the peak name (peak id) is optional.}

\item{reads}{A path to a BAM file with sequencing reads.}

\item{region}{\code{\link[GenomicRanges]{GRanges}}. Genomic region(s) to
extract reads from. If not supplied, all the reads from the BAM file are
extracted.}

\item{TxDb}{\code{\link[GenomicFeatures]{TxDb}} object.}

\item{tssOf}{\code{Character}. Extract Transcription Start Site (TSS)
regions from either "gene" or "transcript" annotations. Default: "gene".}

\item{tss.region}{A numeric vector of length two, which specifies the size of
the TSS region (in bp, relative to the TSS). Default: c(-2000, 2000), i.e.
-2 kb to +2 kb.}

\item{reduce.peaks}{\code{Logical}. Whether to merge neighboring peaks.
Default: FALSE.}

\item{gapwidth}{\code{Numeric}. Neighboring peaks separated by less than this
distance (in bp) are merged. Default: 1000.}

\item{fragment.size}{\code{Numeric}. Extend read length to the fragment size.}

\item{unique}{\code{Logical}. Whether to remove duplicated reads (based on
the genomic coordinates). Default: TRUE.}

\item{swap.strand}{\code{Logical}. Whether to reverse the strand of the read.
Default: FALSE.}

\item{param}{\code{\link[Rsamtools]{ScanBamParam}} object influencing what
fields and which records (reads) are imported from the BAM file.
Default: NULL.}
}
\value{
An object of class \code{\link{ChipDataSet}}.
}
\description{
The function constructs an object of class \code{\link{ChipDataSet}}, which
is a container for holding processed sequencing data and the results of
all downstream analyses. All the slots of the created object are filled
during the workflow by applying specific functions to the object directly.
}
\details{
The function \code{constructCDS} initializes a
    \code{\link{ChipDataSet}} object from the paths to the input
    files and information relevant to the ChIP-seq library preparation
    procedure. During object construction, the following steps are
    executed:
    \itemize{
        \item The peak information is converted into a
            \code{\link[GenomicRanges]{GRanges}} object.
        \item The genomic distribution of the peaks is evaluated (exonic,
            intronic, intergenic, TSSs).
        \item Each peak in the data set is functionally characterized:
            \itemize{
                \item \code{length} - the length of a peak (in base pairs).
                \item \code{fragments} - total number of fragments overlapping
                    a peak region.
                \item \code{density} - number of fragments per base pair of
                    the peak length.
                \item \code{pileup} - highest fragment pileup in each peak
                    region.
                \item \code{tssOverlap} - overlap (binary, yes/no) of the
                    peak with the annotated TSS region.
                    }
    The estimated features are used in the downstream analysis to predict
    which of the peaks are gene-associated.
    }

    As many peak-calling algorithms tend to split broader peaks into
    several narrower, closely spaced peaks, it is advised to merge these
    neighboring peaks (see \code{reduce.peaks} and \code{gapwidth}) to
    decrease the number of false positives and prevent unnecessary
    truncation of transcripts in the downstream analysis.
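
    The effect of merging can be sketched on toy data. The snippet below
    assumes that merging behaves like \code{GenomicRanges::reduce} with
    \code{min.gapwidth = gapwidth}; this is an assumption made for
    illustration, as this page does not state how the merging is
    implemented. The peak coordinates are made up.
    \preformatted{
library(GenomicRanges)

# Three toy peaks on chr1 (hypothetical coordinates).
# Gap between peak 1 and peak 2: 499 bp; between peak 2 and peak 3: 2299 bp.
peaks <- GRanges("chr1", IRanges(start = c(1, 600, 3000),
                                 end = c(100, 700, 3100)))

# Gaps shorter than min.gapwidth are closed, so the first two peaks are
# merged into one range, while the third peak is left as a separate range.
merged <- reduce(peaks, min.gapwidth = 1000)
merged
    }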
}
\examples{
### Load ChipDataSet object
data(cds)

### View a short summary of the object
cds
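
### A minimal sketch of constructing a ChipDataSet from input files (not run).
### Assumptions: the file paths, the genomic region and the fragment size
### below are hypothetical placeholders, and the
### TxDb.Hsapiens.UCSC.hg19.knownGene annotation package is used only for
### illustration; substitute values matching your own data and genome build.
\dontrun{
library(GenomicRanges)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

peak.file <- "path/to/peaks.txt"  # hypothetical: chromosome, start, end
bam.file <- "path/to/chip.bam"    # hypothetical: aligned ChIP-seq reads

cds.new <- constructCDS(
    peaks = peak.file,
    reads = bam.file,
    # hypothetical region; restricts which reads are extracted
    region = GRanges("chr15", IRanges(73000000, 76000000)),
    TxDb = TxDb.Hsapiens.UCSC.hg19.knownGene,
    tssOf = "transcript",
    tss.region = c(-2000, 2000),
    reduce.peaks = TRUE,   # merge neighboring peaks
    gapwidth = 1000,
    fragment.size = 200,   # hypothetical fragment length (bp)
    unique = TRUE,
    swap.strand = FALSE
)

### View a short summary of the newly created object
cds.new
}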

}
\seealso{
\code{\link{ChipDataSet}} \code{\link{predictTssOverlap}}
}
\author{
Armen R. Karapetyan
}
