In general, I recommend against interpreting the fraction of variance explained by residuals. This fraction is driven by:
If you have additional variables that explain variation in measured gene expression, you should include them to avoid confounding with your variable of interest. But a particular residual fraction is not ‘good’ or ‘bad’, and it is not a good metric for determining whether more variables should be included.
See the GitHub page for up-to-date responses to users’ questions.
## R version 4.5.0 Patched (2025-04-21 r88169)
## Platform: aarch64-apple-darwin20
## Running under: macOS Ventura 13.7.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.37     R6_2.6.1          fastmap_1.2.0     xfun_0.52         cachem_1.1.0     
##  [6] knitr_1.50        htmltools_0.5.8.1 rmarkdown_2.29    lifecycle_1.0.4   cli_3.6.5        
## [11] sass_0.4.10       jquerylib_0.1.4   compiler_4.5.0    tools_4.5.0       evaluate_1.0.3   
## [16] bslib_0.9.0       yaml_2.3.10       rlang_1.1.6       jsonlite_2.0.0