scDotPlot 1.1.0
Dot plots of single-cell RNA-seq data allow for an examination of the relationships between cell groupings (e.g. clusters) and marker gene expression. The scDotPlot package offers a unified approach to perform a hierarchical clustering analysis and add annotations to the columns and/or rows of a scRNA-seq dot plot. It works with SingleCellExperiment and Seurat objects as well as data frames. The scDotPlot()
function uses data from scater::plotDots()
or Seurat::DotPlot()
along with the aplot
package to add dendrograms from ggtree
and optional annotations.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("scDotPlot")
To install the development version directly from GitHub:
if (!requireNamespace("remotes", quietly = TRUE)) {
install.packages("remotes")
}
remotes::install_github("ben-laufer/scDotPlot")
First, we normalize the object and then, for the purpose of this example, subset to remove cells without cell-type labels.
library(scRNAseq)
library(scuttle)
sce <- ZeiselBrainData()
sce <- sce |>
logNormCounts() |>
subset(x = _, , level2class != "(none)")
The features argument accepts a character vector with the gene IDs. For this example, we quickly obtain the top markers of for each cell type and then add them to the rowData of the object.
library(scran)
library(purrr)
library(dplyr)
library(AnnotationDbi)
features <- sce |>
scoreMarkers(sce$level1class) |>
map(~ .x |>
as.data.frame() |>
arrange(desc(mean.AUC))|>
dplyr::slice(1:6) |>
rownames()) |>
unlist2()
rowData(sce)$Marker <- features[match(rownames(sce), features)] |>
names()
Finally, we create the plot. The group
arguments utilize the colData, while the features
arguments use the rowData. The paletteList
argument can be used to manually specify the colors for the annotations specified through groupAnno
and featureAnno
. The clustering of the columns shows that cell the cell sub-types cluster by cell-type, while the clustering of the rows shows that most of the markers clusters by their cell type.
library(scDotPlot)
library(ggsci)
sce |>
scDotPlot(features = features,
group = "level2class",
groupAnno = "level1class",
featureAnno = "Marker",
groupLegends = FALSE,
annoColors = list("level1class" = pal_d3()(7),
"Marker" = pal_d3()(7)),
annoWidth = 0.02)
Plotting by Z-score through scale = TRUE
improves the clustering result for the rows.
sce |>
scDotPlot(scale = TRUE,
features = features,
group = "level2class",
groupAnno = "level1class",
featureAnno = "Marker",
groupLegends = FALSE,
annoColors = list("level1class" = pal_d3()(7),
"Marker" = pal_d3()(7)),
annoWidth = 0.02)
After loading the example Seurat object, we find the top markers for each cluster and add them to the assay of interest.
library(Seurat)
library(SeuratObject)
library(tibble)
data("pbmc_small")
features <- pbmc_small |>
FindAllMarkers(only.pos = TRUE, verbose = FALSE) |>
group_by(cluster) |>
dplyr::slice(1:6) |>
dplyr::select(cluster, gene)
pbmc_small[[DefaultAssay(pbmc_small)]][[]] <- pbmc_small[[DefaultAssay(pbmc_small)]][[]] |>
rownames_to_column("gene") |>
left_join(features, by = "gene") |>
column_to_rownames("gene")
features <- features |>
deframe()
Plotting a Seurat object is similar to plotting a SingleCellExperiment object.
pbmc_small |>
scDotPlot(features = features,
group = "RNA_snn_res.1",
groupAnno = "RNA_snn_res.1",
featureAnno = "cluster",
annoColors = list("RNA_snn_res.1" = pal_d3()(7),
"cluster" = pal_d3()(7)),
groupLegends = FALSE,
annoWidth = 0.075)
Again, we see that plotting by Z-score improves the clustering result for the rows.
pbmc_small |>
scDotPlot(scale = TRUE,
features = features,
group = "RNA_snn_res.1",
groupAnno = "RNA_snn_res.1",
featureAnno = "cluster",
annoColors = list("RNA_snn_res.1" = pal_d3()(7),
"cluster" = pal_d3()(7)),
groupLegends = FALSE,
annoWidth = 0.075)
The Bioconductor support site is the preferred method to ask for help. Before posting, it’s recommended to check previous posts for the answer and look over the posting guide. For the post, it’s important to use the scDotPlot
tag and provide both a minimal reproducible example and session information.
This package was inspired by the single-cell example from aplot.
## R Under development (unstable) (2024-10-21 r87258)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] tibble_3.2.1 Seurat_5.1.0
## [3] SeuratObject_5.0.2 sp_2.1-4
## [5] ggsci_3.2.0 scDotPlot_1.1.0
## [7] AnnotationDbi_1.69.0 dplyr_1.1.4
## [9] purrr_1.0.2 scran_1.35.0
## [11] scuttle_1.17.0 scRNAseq_2.19.1
## [13] SingleCellExperiment_1.29.0 SummarizedExperiment_1.37.0
## [15] Biobase_2.67.0 GenomicRanges_1.59.0
## [17] GenomeInfoDb_1.43.0 IRanges_2.41.0
## [19] S4Vectors_0.45.0 BiocGenerics_0.53.0
## [21] MatrixGenerics_1.19.0 matrixStats_1.4.1
## [23] BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] fs_1.6.4 ProtGenerics_1.39.0 spatstat.sparse_3.1-0
## [4] bitops_1.0-9 httr_1.4.7 RColorBrewer_1.1-3
## [7] tools_4.5.0 sctransform_0.4.1 alabaster.base_1.7.0
## [10] utf8_1.2.4 R6_2.5.1 HDF5Array_1.35.0
## [13] uwot_0.2.2 lazyeval_0.2.2 rhdf5filters_1.19.0
## [16] withr_3.0.2 gridExtra_2.3 progressr_0.15.0
## [19] cli_3.6.3 spatstat.explore_3.3-3 fastDummies_1.7.4
## [22] labeling_0.4.3 alabaster.se_1.7.0 sass_0.4.9
## [25] spatstat.data_3.1-2 ggridges_0.5.6 pbapply_1.7-2
## [28] yulab.utils_0.1.7 Rsamtools_2.23.0 scater_1.35.0
## [31] parallelly_1.38.0 limma_3.63.0 RSQLite_2.3.7
## [34] gridGraphics_0.5-1 generics_0.1.3 BiocIO_1.17.0
## [37] spatstat.random_3.3-2 ica_1.0-3 Matrix_1.7-1
## [40] ggbeeswarm_0.7.2 fansi_1.0.6 abind_1.4-8
## [43] lifecycle_1.0.4 yaml_2.3.10 edgeR_4.5.0
## [46] rhdf5_2.51.0 SparseArray_1.7.0 BiocFileCache_2.15.0
## [49] Rtsne_0.17 grid_4.5.0 blob_1.2.4
## [52] promises_1.3.0 dqrng_0.4.1 ExperimentHub_2.15.0
## [55] crayon_1.5.3 miniUI_0.1.1.1 lattice_0.22-6
## [58] beachmat_2.23.0 cowplot_1.1.3 GenomicFeatures_1.59.0
## [61] KEGGREST_1.47.0 pillar_1.9.0 knitr_1.48
## [64] metapod_1.15.0 rjson_0.2.23 future.apply_1.11.3
## [67] codetools_0.2-20 leiden_0.4.3.1 glue_1.8.0
## [70] ggfun_0.1.7 spatstat.univar_3.0-1 data.table_1.16.2
## [73] treeio_1.31.0 vctrs_0.6.5 png_0.1-8
## [76] gypsum_1.3.0 spam_2.11-0 gtable_0.3.6
## [79] cachem_1.1.0 xfun_0.48 S4Arrays_1.7.0
## [82] mime_0.12 survival_3.7-0 statmod_1.5.0
## [85] bluster_1.17.0 fitdistrplus_1.2-1 ROCR_1.0-11
## [88] nlme_3.1-166 ggtree_3.15.0 bit64_4.5.2
## [91] alabaster.ranges_1.7.0 filelock_1.0.3 RcppAnnoy_0.0.22
## [94] bslib_0.8.0 irlba_2.3.5.1 vipor_0.4.7
## [97] KernSmooth_2.23-24 colorspace_2.1-1 DBI_1.2.3
## [100] tidyselect_1.2.1 bit_4.5.0 compiler_4.5.0
## [103] curl_5.2.3 httr2_1.0.5 BiocNeighbors_2.1.0
## [106] DelayedArray_0.33.0 plotly_4.10.4 bookdown_0.41
## [109] rtracklayer_1.67.0 scales_1.3.0 lmtest_0.9-40
## [112] rappdirs_0.3.3 goftest_1.2-3 stringr_1.5.1
## [115] digest_0.6.37 spatstat.utils_3.1-0 alabaster.matrix_1.7.0
## [118] rmarkdown_2.28 XVector_0.47.0 htmltools_0.5.8.1
## [121] pkgconfig_2.0.3 highr_0.11 dbplyr_2.5.0
## [124] fastmap_1.2.0 ensembldb_2.31.0 rlang_1.1.4
## [127] htmlwidgets_1.6.4 UCSC.utils_1.3.0 shiny_1.9.1
## [130] farver_2.1.2 jquerylib_0.1.4 zoo_1.8-12
## [133] jsonlite_1.8.9 BiocParallel_1.41.0 BiocSingular_1.23.0
## [136] RCurl_1.98-1.16 magrittr_2.0.3 ggplotify_0.1.2
## [139] GenomeInfoDbData_1.2.13 dotCall64_1.2 patchwork_1.3.0
## [142] Rhdf5lib_1.29.0 munsell_0.5.1 Rcpp_1.0.13
## [145] viridis_0.6.5 ape_5.8 reticulate_1.39.0
## [148] stringi_1.8.4 alabaster.schemas_1.7.0 zlibbioc_1.53.0
## [151] MASS_7.3-61 AnnotationHub_3.15.0 plyr_1.8.9
## [154] parallel_4.5.0 listenv_0.9.1 ggrepel_0.9.6
## [157] deldir_2.0-4 Biostrings_2.75.0 splines_4.5.0
## [160] tensor_1.5 locfit_1.5-9.10 igraph_2.1.1
## [163] spatstat.geom_3.3-3 RcppHNSW_0.6.0 reshape2_1.4.4
## [166] ScaledMatrix_1.15.0 BiocVersion_3.21.1 XML_3.99-0.17
## [169] evaluate_1.0.1 BiocManager_1.30.25 httpuv_1.6.15
## [172] RANN_2.6.2 tidyr_1.3.1 polyclip_1.10-7
## [175] future_1.34.0 scattermore_1.2 alabaster.sce_1.7.0
## [178] ggplot2_3.5.1 rsvd_1.0.5 xtable_1.8-4
## [181] restfulr_0.0.15 AnnotationFilter_1.31.0 tidytree_0.4.6
## [184] RSpectra_0.16-2 later_1.3.2 viridisLite_0.4.2
## [187] aplot_0.2.3 beeswarm_0.4.0 memoise_2.0.1
## [190] GenomicAlignments_1.43.0 cluster_2.1.6 globals_0.16.3