1 Introduction

The TransOmicsData package contains datasets spanning several biological contexts, including in vitro embryonic and tissue-specific development in mouse and human. The datasets cover multiple omics profiling technologies, including RNA-seq, mass spectrometry and ChIP-seq. The package was developed to provide convenient access to raw or pre-processed data for comparative trans-omics analysis.

2 The TransOmicsData package

2.1 Accessing the data

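The datasets in this package are hosted on ExperimentHub and can be retrieved with the ExperimentHub package. The code below refreshes the hub, then queries it for TransOmicsData records.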

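# Install ExperimentHub first if it is not already available:
# if (!requireNamespace("BiocManager", quietly = TRUE))
#    install.packages("BiocManager")
# BiocManager::install("ExperimentHub")

library(ExperimentHub)

# Refresh the hub metadata and print a summary of all available records
refreshHub(hubClass = "ExperimentHub")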
## ExperimentHub with 8364 records
## # snapshotDate(): 2025-04-11
## # $dataprovider: Eli and Edythe L. Broad Institute of Harvard and MIT, NCBI,...
## # $species: Homo sapiens, Mus musculus, Saccharomyces cerevisiae, Drosophila...
## # $rdataclass: SummarizedExperiment, data.frame, ExpressionSet, matrix, char...
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH1"]]' 
## 
##            title                                                              
##   EH1    | RNA-Sequencing and clinical data for 7706 tumor samples from The...
##   EH166  | ERR188297                                                          
##   EH167  | ERR188088                                                          
##   EH168  | ERR188204                                                          
##   EH169  | ERR188317                                                          
##   ...      ...                                                                
##   EH9639 | benchmark.set.3.rds                                                
##   EH9640 | regr.rds                                                           
##   EH9641 | regr.no.CV.rds                                                     
##   EH9642 | spe_Amancherla_2025                                                
##   EH9643 | spe_Vannan_2025
# Connect to the hub and search for records matching "TransOmicsData"
ehub <- ExperimentHub()
myfiles <- query(ehub, "TransOmicsData")
myfiles
## ExperimentHub with 12 records
## # snapshotDate(): 2025-04-11
## # $dataprovider: PRIDE, NCBI
## # $species: Mus musculus, Homo sapiens
## # $rdataclass: SummarizedExperiment
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH8536"]]' 
## 
##            title                                         
##   EH8536 | Chen organoid phosphoproteome                 
##   EH8537 | Chen organoid proteome                        
##   EH8538 | Chen organoid transcriptome                   
##   EH8539 | Xiao myogenesis differentation phosphoproteome
##   EH8540 | Xiao myogenesis differentiation proteome      
##   ...      ...                                           
##   EH8543 | Yang ESC epigenome                            
##   EH8544 | Yang ESC phosphoproteome                      
##   EH8545 | Yang ESC proteome                             
##   EH8546 | Yang ESC transcriptome                        
##   EH9515 | Chen organoid sctranscriptome
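
Each record in the query result can be downloaded by its hub ID, as the printed hint `object[["EH8536"]]` suggests. The following is a minimal sketch rather than part of the original vignette: the variable name `phospho` is illustrative, and the accessors come from the SummarizedExperiment package, which is assumed to be installed because the hub reports that class for every record.

```r
library(ExperimentHub)
library(SummarizedExperiment)  # assumed installed; provides the accessors below

# Connect to the hub and restrict it to the TransOmicsData records
ehub <- ExperimentHub()
myfiles <- query(ehub, "TransOmicsData")

# List the hub IDs with their titles and R classes
mcols(myfiles)[, c("title", "rdataclass")]

# Download one record by ID (cached locally after the first call);
# "EH8536" is the Chen organoid phosphoproteome in the listing above
phospho <- myfiles[["EH8536"]]

# Inspect the returned object
phospho
dim(phospho)          # features x samples
assayNames(phospho)   # names of the stored assay matrices
colData(phospho)      # sample-level annotation, if any is provided
```

The first download requires network access; later calls read from the local ExperimentHub cache.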

2.2 Package installation

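The package can be installed from Bioconductor with BiocManager:

# BiocManager::install("TransOmicsData")

Once the package is installed, listDatasets() returns summary metadata for every dataset it provides: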

library(TransOmicsData)
listDatasets()
## DataFrame with 3 rows and 5 columns
##             Title            Description                  Omics     Species
##       <character>            <character>            <character> <character>
## 1   chen-organoid neural organoid diff.. phosphoproteome, pro..       human
## 2 xiao-myogenesis C2C12 myogenesis dif.. phosphoproteome, pro..       mouse
## 3        yang-esc ESC to epiLC differe.. epigenome, phosphopr..       human
##                RDataPath
##              <character>
## 1 TransOmicsData/0.99...
## 2 TransOmicsData/0.99...
## 3 TransOmicsData/0.99...
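
The table returned by `listDatasets()` can be filtered like any other DataFrame. The sketch below is an illustration, not part of the original vignette. It relies only on the column names printed above (`Title`, `Omics`, `Species`) and assumes that the `Omics` column stores a comma-separated list, which is what the truncated output suggests.

```r
library(TransOmicsData)

# Summary metadata for every dataset in the package
datasets <- listDatasets()

# Keep only the mouse datasets, using the Species column
mouse <- datasets[datasets$Species == "mouse", ]
mouse$Title

# Split the Omics column into one character vector per dataset
# (assumes entries are separated by a comma and optional whitespace)
omics <- strsplit(datasets$Omics, ",\\s*")
names(omics) <- datasets$Title
omics
```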

2.3 Citing TransOmicsData

We hope that TransOmicsData will be useful for your research. Please use the following information to cite the package. Thank you!

## Citation info
citation("TransOmicsData")
## To cite TransOmicsData in publications use:
## 
##   Chen C, Xiao D, Yang P (2024). _TransOmicsData: a collection of
##   trans-omics data covering a wide range of biological systems._.
##   University of Sydney, Sydney, Australia.
##   doi:10.18129/B9.bioc.TransOmicsData
##   <https://doi.org/10.18129/B9.bioc.TransOmicsData>,
##   <https://github.com/PYangLab/TransOmicsData>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {TransOmicsData: a collection of trans-omics data covering a wide range of biological systems.},
##     author = {Carissa Chen and Di Xiao and Pengyi Yang},
##     organization = {University of Sydney},
##     address = {Sydney, Australia},
##     year = {2024},
##     url = {https://github.com/PYangLab/TransOmicsData},
##     doi = {10.18129/B9.bioc.TransOmicsData},
##   }
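
To reuse the entry in a reference manager, the BibTeX record can be written to a file with base R. This is a minimal sketch: the file name and the use of a temporary directory are arbitrary choices made for illustration.

```r
# Retrieve the citation object for the installed package
cit <- citation("TransOmicsData")

# Convert it to BibTeX and write it to a file
# (the path below is illustrative; any writable location works)
bib_file <- file.path(tempdir(), "TransOmicsData.bib")
writeLines(toBibtex(cit), bib_file)

# Show what was written
cat(readLines(bib_file), sep = "\n")
```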

Session info

## R version 4.5.0 beta (2025-04-02 r88102)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] TransOmicsData_1.5.0 ExperimentHub_2.17.0 AnnotationHub_3.17.0 BiocFileCache_2.17.0 dbplyr_2.5.0        
## [6] BiocGenerics_0.55.0  generics_0.1.3       BiocStyle_2.37.0    
## 
## loaded via a namespace (and not attached):
##  [1] rappdirs_0.3.3          sass_0.4.10             BiocVersion_3.22.0      RSQLite_2.3.9          
##  [5] digest_0.6.37           magrittr_2.0.3          evaluate_1.0.3          bookdown_0.43          
##  [9] fastmap_1.2.0           blob_1.2.4              jsonlite_2.0.0          AnnotationDbi_1.71.0   
## [13] GenomeInfoDb_1.45.0     DBI_1.2.3               BiocManager_1.30.25     httr_1.4.7             
## [17] purrr_1.0.4             UCSC.utils_1.5.0        Biostrings_2.77.0       jquerylib_0.1.4        
## [21] cli_3.6.4               crayon_1.5.3            rlang_1.1.6             XVector_0.49.0         
## [25] Biobase_2.69.0          bit64_4.6.0-1           withr_3.0.2             cachem_1.1.0           
## [29] yaml_2.3.10             tools_4.5.0             memoise_2.0.1           dplyr_1.1.4            
## [33] GenomeInfoDbData_1.2.14 filelock_1.0.3          curl_6.2.2              mime_0.13              
## [37] vctrs_0.6.5             R6_2.6.1                png_0.1-8               stats4_4.5.0           
## [41] lifecycle_1.0.4         KEGGREST_1.49.0         S4Vectors_0.47.0        IRanges_2.43.0         
## [45] bit_4.6.0               pkgconfig_2.0.3         pillar_1.10.2           bslib_0.9.0            
## [49] glue_1.8.0              xfun_0.52               tibble_3.2.1            tidyselect_1.2.1       
## [53] knitr_1.50              htmltools_0.5.8.1       rmarkdown_2.29          compiler_4.5.0