annoLinker Vignette: Annotating genomic regions through chromatin interaction links
Annotating genomic regions through chromatin interaction networks. Genomic peaks are efficiently annotated by leveraging DNA interaction data to construct interaction graphs with igraph. Each peak overlapping a node within a connected component is annotated with all genes associated with that component, enabling robust propagation of functional associations across the chromatin interaction network.
annoLinker 0.99.7
Functional annotation of genomic regions is a key step in understanding regulatory mechanisms underlying gene expression and phenotype variation. Traditional annotation methods typically rely on the linear proximity of genomic features, such as linking peaks to the nearest gene, which can overlook the complex three-dimensional organization of the genome. Recent advances in chromatin conformation capture technologies (e.g., Hi-C, ChIA-PET, PLAC-seq, HiCAR) have revealed that regulatory elements often interact with distant genomic loci through chromatin loops, bringing enhancers, promoters, and other elements into physical contact.
annoLinker is designed to bridge this gap by annotating genomic regions
through chromatin interaction links. Rather than relying solely on
linear genomic distance, annoLinker integrates chromatin contact information
to establish biologically meaningful connections between regulatory regions
and their potential target genes. This approach complements traditional
annotation methods by enabling the functional assignment of distal elements
such as enhancers and silencers.
The package provides a streamlined workflow to link user-defined genomic regions
with genomic interactions (loops), annotate the associated genes, and visualize
the resulting interaction networks. By leveraging interaction data, annoLinker
enables users to explore functional relationships that are invisible in linear
genome coordinates and to generate more comprehensive biological interpretations
from genomic assays such as ATAC-seq, ChIP-seq, or Methyl-seq.
if (!require("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("annoLinker")
The input to annoLinker consists of three components: the genomic regions of
interest to be annotated, the annotation data, and the chromatin interaction
information. Both the peak list and annotation data should be provided as
GRanges objects, while genomic interactions can be supplied as either
GInteractions or Pairs objects.
By default, annoLinker annotates peaks that fall within the promoter region,
defined as 5 kb upstream to 5 kb downstream of the annotated features.
Users can optionally modify this definition to annotate regions within the
gene body or downstream segments instead.
The output is returned as an annoLinkerResult object, which can be easily
converted into a GRanges or data.frame for downstream analysis and
visualization.
library(annoLinker)
library(rtracklayer)
library(TxDb.Drerio.UCSC.danRer10.refGene)
library(org.Dr.eg.db)
txdb <- TxDb.Drerio.UCSC.danRer10.refGene
org <- org.Dr.eg.db
extPath <- system.file("extdata", package = "annoLinker")
## load peaks
peaks <- rtracklayer::import(file.path(extPath, "peaks.bed"))
## load interactions
interactions <- rtracklayer::import(file.path(extPath, "interaction.bedpe"))
## load annotation data
annoData <- genes(txdb)
anno <- annoLinker(peaks, annoData, interactions, verbose = TRUE)
class(anno)
## [1] "annoLinkerResult"
## attr(,"package")
## [1] "annoLinker"
head(anno, n = 2)
## GRanges object with 2 ranges and 7 metadata columns:
## seqnames ranges strand | feature_name feature_start
## <Rle> <IRanges> <Rle> | <character> <integer>
## [1] chr1 45062825-45063844 * | 100151507 44660095
## [2] chr1 45062825-45063844 * | 100038798 45052323
## feature_end feature_strand gene_id peak_bin
## <integer> <Rle> <character> <character>
## [1] 44674627 + 100151507 chr1:45060001-45070000
## [2] 45060215 - 100038798 chr1:45060001-45070000
## feature_bin
## <character>
## [1] chr1:44660001-44670000
## [2] chr1:45050001-45060000
## -------
## seqinfo: 2 sequences from an unspecified genome; no seqlengths
anno_peaks(anno)[c(1, 2)]
## GRanges object with 2 ranges and 7 metadata columns:
## seqnames ranges strand | feature_name feature_start
## <Rle> <IRanges> <Rle> | <character> <integer>
## [1] chr1 45062825-45063844 * | 100151507 44660095
## [2] chr1 45062825-45063844 * | 100038798 45052323
## feature_end feature_strand gene_id peak_bin
## <integer> <Rle> <character> <character>
## [1] 44674627 + 100151507 chr1:45060001-45070000
## [2] 45060215 - 100038798 chr1:45060001-45070000
## feature_bin
## <character>
## [1] chr1:44660001-44670000
## [2] chr1:45050001-45060000
## -------
## seqinfo: 2 sequences from an unspecified genome; no seqlengths
head(as(anno, "GRanges"), n = 2)
## GRanges object with 2 ranges and 7 metadata columns:
## seqnames ranges strand | feature_name feature_start
## <Rle> <IRanges> <Rle> | <character> <integer>
## [1] chr1 45062825-45063844 * | 100151507 44660095
## [2] chr1 45062825-45063844 * | 100038798 45052323
## feature_end feature_strand gene_id peak_bin
## <integer> <Rle> <character> <character>
## [1] 44674627 + 100151507 chr1:45060001-45070000
## [2] 45060215 - 100038798 chr1:45060001-45070000
## feature_bin
## <character>
## [1] chr1:44660001-44670000
## [2] chr1:45050001-45060000
## -------
## seqinfo: 2 sequences from an unspecified genome; no seqlengths
The plotEvidence() function allows users to visualize the annotation evidence
chain, illustrating how genomic regions are linked through chromatin
interactions. Users can plot the annotation evidence as either a
network graph or a genomic track integrated with gene annotation
information.
## plot the evidence for the first annotation
plotEvidence(anno,
event = 1,
output = "htmlWidget"
)
plotEvidence(anno,
event = 1,
output = "trackPlot",
txdb = TxDb.Drerio.UCSC.danRer10.refGene,
org = org.Dr.eg.db
)
## viewport[GRID.VP.83]
## R Under development (unstable) (2026-01-15 r89304)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] org.Dr.eg.db_3.22.0
## [2] TxDb.Drerio.UCSC.danRer10.refGene_3.4.6
## [3] GenomicFeatures_1.63.1
## [4] AnnotationDbi_1.73.0
## [5] Biobase_2.71.0
## [6] rtracklayer_1.71.3
## [7] GenomicRanges_1.63.1
## [8] Seqinfo_1.1.0
## [9] IRanges_2.45.0
## [10] S4Vectors_0.49.0
## [11] BiocGenerics_0.57.0
## [12] generics_0.1.4
## [13] annoLinker_0.99.7
## [14] BiocStyle_2.39.0
##
## loaded via a namespace (and not attached):
## [1] strawr_0.0.92 RColorBrewer_1.1-3
## [3] rstudioapi_0.18.0 jsonlite_2.0.0
## [5] magrittr_2.0.4 magick_2.9.0
## [7] farver_2.1.2 rmarkdown_2.30
## [9] BiocIO_1.21.0 vctrs_0.7.1
## [11] memoise_2.0.1 Rsamtools_2.27.0
## [13] RCurl_1.98-1.17 base64enc_0.1-3
## [15] tinytex_0.58 htmltools_0.5.9
## [17] S4Arrays_1.11.1 progress_1.2.3
## [19] curl_7.0.0 Rhdf5lib_1.33.0
## [21] rhdf5_2.55.12 SparseArray_1.11.10
## [23] Formula_1.2-5 sass_0.4.10
## [25] parallelly_1.46.1 bslib_0.10.0
## [27] htmlwidgets_1.6.4 Gviz_1.55.0
## [29] httr2_1.2.2 cachem_1.1.0
## [31] GenomicAlignments_1.47.0 igraph_2.2.1
## [33] lifecycle_1.0.5 pkgconfig_2.0.3
## [35] Matrix_1.7-4 R6_2.6.1
## [37] fastmap_1.2.0 MatrixGenerics_1.23.0
## [39] future_1.69.0 digest_0.6.39
## [41] colorspace_2.1-2 Hmisc_5.2-5
## [43] RSQLite_2.4.5 filelock_1.0.3
## [45] progressr_0.18.0 httr_1.4.7
## [47] abind_1.4-8 compiler_4.6.0
## [49] bit64_4.6.0-1 htmlTable_2.4.3
## [51] S7_0.2.1 backports_1.5.0
## [53] BiocParallel_1.45.0 DBI_1.2.3
## [55] biomaRt_2.67.1 rappdirs_0.3.4
## [57] DelayedArray_0.37.0 rjson_0.2.23
## [59] tools_4.6.0 foreign_0.8-90
## [61] otel_0.2.0 future.apply_1.20.1
## [63] nnet_7.3-20 glue_1.8.0
## [65] restfulr_0.0.16 InteractionSet_1.39.0
## [67] rhdf5filters_1.23.3 grid_4.6.0
## [69] checkmate_2.3.3 cluster_2.1.8.1
## [71] gtable_0.3.6 BSgenome_1.79.1
## [73] trackViewer_1.47.0 ensembldb_2.35.0
## [75] data.table_1.18.2.1 hms_1.1.4
## [77] XVector_0.51.0 pillar_1.11.1
## [79] stringr_1.6.0 dplyr_1.1.4
## [81] BiocFileCache_3.1.0 lattice_0.22-7
## [83] deldir_2.0-4 bit_4.6.0
## [85] biovizBase_1.59.0 tidyselect_1.2.1
## [87] Biostrings_2.79.4 knitr_1.51
## [89] gridExtra_2.3 bookdown_0.46
## [91] ProtGenerics_1.43.0 SummarizedExperiment_1.41.0
## [93] xfun_0.56 matrixStats_1.5.0
## [95] visNetwork_2.1.4 stringi_1.8.7
## [97] UCSC.utils_1.7.1 lazyeval_0.2.2
## [99] yaml_2.3.12 evaluate_1.0.5
## [101] codetools_0.2-20 cigarillo_1.1.0
## [103] interp_1.1-6 tibble_3.3.1
## [105] BiocManager_1.30.27 cli_3.6.5
## [107] rpart_4.1.24 jquerylib_0.1.4
## [109] dichromat_2.0-0.1 Rcpp_1.1.1
## [111] GenomeInfoDb_1.47.2 globals_0.18.0
## [113] grImport_0.9-7 dbplyr_2.5.1
## [115] png_0.1-8 XML_3.99-0.20
## [117] parallel_4.6.0 ggplot2_4.0.1
## [119] blob_1.3.0 prettyunits_1.2.0
## [121] jpeg_0.1-11 latticeExtra_0.6-31
## [123] AnnotationFilter_1.35.0 bitops_1.0-9
## [125] txdbmaker_1.7.3 listenv_0.10.0
## [127] VariantAnnotation_1.57.1 scales_1.4.0
## [129] crayon_1.5.3 rlang_1.1.7
## [131] KEGGREST_1.51.1