1 Troubleshooting

A common issue when running spiec.easi is that StARS selects an empty network.

For example:

library(SpiecEasi)
data(amgut1.filt)

pargs <- list(seed=10010)
se3 <- spiec.easi(amgut1.filt, method='mb', lambda.min.ratio=5e-1, nlambda=10, pulsar.params=pargs)
getOptInd(se3)
# [1] 1
sum(getRefit(se3))/2
# [1] 139

As the warning indicates, the network stability could not be determined from the lambda path. Looking at the stability along the lambda path, se3$select$stars$summary, we can see that the maximum value of the StARS summary statistic never crosses the default threshold (0.05).

We can fix this by lowering lambda.min.ratio to explore denser networks:
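The optimal index of 1 follows from how a StARS-style criterion picks a point on the path. The base-R sketch below illustrates the rule as we understand it from implementations such as huge and pulsar: take the last index before the summary statistic first reaches the threshold. The helper name pick_stars_index and the summary values are hypothetical and are not output from SpiecEasi.

```r
# Simplified sketch of StARS index selection (hypothetical helper, not part of SpiecEasi).
# `summary_stat` is the instability along the lambda path, ordered from the
# sparsest network (largest lambda) to the densest (smallest lambda).
pick_stars_index <- function(summary_stat, thresh = 0.05) {
  crossed <- summary_stat >= thresh
  if (!any(crossed)) {
    # The threshold is never reached: fall back to the first (sparsest) network.
    return(1L)
  }
  # Otherwise take the last index before the first crossing, but never below 1.
  max(which(crossed)[1] - 1L, 1L)
}

# Made-up summary values, for illustration only.
too_sparse  <- c(0, 0.001, 0.004, 0.010)
wide_enough <- c(0, 0.004, 0.030, 0.080, 0.200)

pick_stars_index(too_sparse)   # threshold never crossed
pick_stars_index(wide_enough)  # first crossing is at index 4
```

When the whole path stays below the threshold, the rule has nothing to bracket and returns the first index, which is why widening the path is the usual remedy.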

se4 <- spiec.easi(amgut1.filt, method='mb', lambda.min.ratio=1e-1, nlambda=10, pulsar.params=pargs)

We have now fit a network, but since we have only a rough, discrete sampling of networks along the lambda path, we should check how far we are from the target stability threshold (0.05):

getStability(se4)
# [1] 0.0003237095
sum(getRefit(se4))/2
# [1] 158

To get closer to the mark, we should increase nlambda to sample the lambda path more finely, which gives a denser network:

se5 <- spiec.easi(amgut1.filt, method='mb', lambda.min.ratio=1e-1, nlambda=100, pulsar.params=pargs)
getStability(se5)
# [1] 0.0003237095
sum(getRefit(se5))/2
# [1] 210
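To see why both parameters matter, it can help to look at the grid itself. The sketch below assumes the path is log-spaced between the largest lambda and lambda.min.ratio times that value, which is a common convention; the helper name lambda_grid and the choice lambda_max = 1 are placeholders, not values taken from the fits above.

```r
# Hypothetical helper: a log-spaced lambda grid from lambda_max down to
# lambda_max * lambda_min_ratio (an assumed convention, not SpiecEasi code).
lambda_grid <- function(lambda_max, lambda_min_ratio, nlambda) {
  exp(seq(log(lambda_max), log(lambda_max * lambda_min_ratio),
          length.out = nlambda))
}

narrow <- lambda_grid(1, 5e-1, 10)   # like the se3 call
coarse <- lambda_grid(1, 1e-1, 10)   # like the se4 call
fine   <- lambda_grid(1, 1e-1, 100)  # like the se5 call

range(narrow)  # smallest lambda is half of lambda_max
range(coarse)  # smallest lambda is a tenth of lambda_max

# Ratio between neighbouring lambdas: closer to 1 means a finer grid.
coarse[2] / coarse[1]
fine[2] / fine[1]
```

Lowering lambda.min.ratio extends the grid toward denser networks, while raising nlambda shrinks the step between candidates, so the selected network can sit closer to the stability threshold.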

1.1 Common issues and solutions

1.1.1 1. Empty networks

Problem: After running spiec.easi, you get an empty network (no edges).

Solutions:

  • Lower lambda.min.ratio to explore denser networks
  • Increase nlambda for finer sampling of the lambda path
  • Check if your data has sufficient signal-to-noise ratio
  • Try different methods ('mb' vs 'glasso')

1.1.2 2. Very dense networks

Problem: The inferred network has too many edges.

Solutions:

  • Increase lambda.min.ratio to explore sparser networks
  • Adjust the StARS threshold in pulsar.params
  • Use cross-validation instead of StARS

1.1.3 3. Computational issues

Problem: The analysis takes too long or runs out of memory.

Solutions:

  • Use parallel processing with the ncores parameter (Unix-like systems only)
  • Use the B-StARS method for large datasets
  • Reduce rep.num in pulsar.params
  • Use batch mode for HPC clusters

1.1.4 4. Windows parallel processing issues

Problem: Error “‘mc.cores’ > 1 is not supported on Windows”

Solutions:

  • Use ncores=1 for serial processing on Windows
  • Use a snow cluster for parallel processing on Windows:

library(parallel)
# "PSOCK" is the socket cluster type built into the parallel package
cl <- makeCluster(4, type = "PSOCK")
pargs.windows <- list(rep.num=50, seed=10010, cluster=cl)
# `data` stands in for your own count matrix or phyloseq object
se.windows <- spiec.easi(data, method='mb', pulsar.params=pargs.windows)
stopCluster(cl)

  • Use batch mode, which works on all platforms
  • Consider using WSL (Windows Subsystem for Linux) for a Unix-like environment
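The serial fallback can be made automatic so that the same script runs on every platform. The helper below is a minimal sketch: the name platform_pulsar_params and the max_cores cap are hypothetical, and it only sets the rep.num, seed and ncores entries that appear elsewhere in this document.

```r
library(parallel)

# Hypothetical helper: choose a safe ncores value for pulsar.params.
# Windows gets serial processing; other platforms use up to `max_cores`.
platform_pulsar_params <- function(rep.num = 50, seed = 10010, max_cores = 4) {
  if (.Platform$OS.type == "windows") {
    ncores <- 1L
  } else {
    detected <- detectCores()
    if (is.na(detected)) detected <- 1L  # detectCores() can return NA
    ncores <- as.integer(max(1L, min(detected, max_cores)))
  }
  list(rep.num = rep.num, seed = seed, ncores = ncores)
}

pargs <- platform_pulsar_params()
str(pargs)
```

The resulting list can then be passed as pulsar.params=pargs in the spiec.easi call.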

1.1.5 5. Convergence issues

Problem: The algorithm doesn’t converge or gives warnings.

Solutions:

  • Check data preprocessing and normalization
  • Ensure data doesn't have constant columns
  • Try different starting values
  • Check for missing or infinite values
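Two of these checks, constant columns and missing or infinite values, are easy to automate before fitting. The following base-R sketch assumes a samples-by-taxa numeric matrix; the helper name check_input and the toy data are hypothetical.

```r
# Hypothetical pre-flight check for a samples-by-taxa count matrix.
check_input <- function(x) {
  x <- as.matrix(x)
  bad_value <- !is.finite(x)  # TRUE for NA, NaN, Inf and -Inf
  # Standard deviation of the finite entries in each column.
  col_sd <- apply(x, 2, function(v) stats::sd(v[is.finite(v)]))
  list(
    n_nonfinite   = sum(bad_value),
    constant_cols = which(is.na(col_sd) | col_sd == 0)
  )
}

# Toy data, for illustration only.
toy <- cbind(taxA = c(5, 0, 12, 3),
             taxB = c(7, 7, 7, 7),     # constant column
             taxC = c(1, NA, 4, Inf))  # missing and infinite values
check_input(toy)
```

Any non-zero n_nonfinite or non-empty constant_cols is worth resolving before calling spiec.easi.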

1.1.6 6. Memory issues

Problem: R runs out of memory during analysis.

Solutions:

  • Use sparse matrices where possible
  • Reduce dataset size by filtering rare taxa
  • Use batch processing for large datasets
  • Increase system memory if available
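Filtering rare taxa is often the cheapest of these options. The sketch below keeps taxa by prevalence; the helper name filter_rare_taxa, the 0.3 cutoff and the toy matrix are illustrative assumptions, and a suitable cutoff depends on the dataset.

```r
# Hypothetical helper: keep taxa (columns) that are present in at least
# `min_prev` fraction of samples (rows).
filter_rare_taxa <- function(counts, min_prev = 0.3) {
  counts <- as.matrix(counts)
  prevalence <- colMeans(counts > 0)  # fraction of samples with a non-zero count
  counts[, prevalence >= min_prev, drop = FALSE]
}

# Toy data, for illustration only.
toy <- cbind(tax1 = c(4, 0, 9, 2, 7),
             tax2 = c(0, 0, 0, 1, 0),  # present in 1 of 5 samples
             tax3 = c(3, 5, 0, 8, 1))
filtered <- filter_rare_taxa(toy)
colnames(filtered)
```

Because memory use and run time grow with the number of taxa, removing columns that are almost always zero can shrink the problem substantially.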

1.2 Platform-specific considerations

1.2.1 Windows users:

  • Default parallel processing (mc.cores > 1) is not supported
  • Use ncores=1 for serial processing
  • Use snow cluster for parallel processing
  • Consider batch mode for large datasets

1.2.2 Unix-like systems (Linux, macOS):

  • Full support for parallel processing with mc.cores
  • Can use ncores parameter directly
  • Both multicore and snow clusters available

1.3 Diagnostic functions

SpiecEasi provides several functions to help diagnose issues:

# Check stability along lambda path
getStability(se)

# Get optimal lambda index
getOptInd(se)

# Get summary statistics
se$select$stars$summary

# Check network density
sum(getRefit(se))/2

# Visualize stability curve
plot(se$select$stars$summary)

# Check platform information
.Platform$OS.type
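These accessors can be collected into a single report to compare fits side by side. The wrapper below is a sketch: summarize_fit is a hypothetical name, it assumes se is a fitted object returned by spiec.easi with StARS selection, and it assumes the default threshold of 0.05 unless told otherwise.

```r
# Hypothetical convenience wrapper around the SpiecEasi accessors listed above.
# `se` is assumed to be a fitted object returned by spiec.easi().
summarize_fit <- function(se, thresh = 0.05) {
  stability <- SpiecEasi::getStability(se)
  data.frame(
    opt_index        = SpiecEasi::getOptInd(se),
    stability        = stability,
    gap_to_threshold = thresh - stability,  # how far below the target we are
    n_edges          = sum(SpiecEasi::getRefit(se)) / 2
  )
}

# Usage, assuming fitted objects already exist in the session:
# rbind(summarize_fit(se4), summarize_fit(se5))
```

A large gap_to_threshold suggests the lambda path is sampled too coarsely or does not extend far enough toward dense networks.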

1.4 Parameter tuning guidelines

1.4.1 For small datasets (< 100 samples, < 50 taxa):

  • lambda.min.ratio = 1e-2
  • nlambda = 20-50
  • rep.num = 20-50

1.4.2 For medium datasets (100-1000 samples, 50-200 taxa):

  • lambda.min.ratio = 1e-3
  • nlambda = 50-100
  • rep.num = 50-100
  • Use parallel processing (Unix-like systems only)

1.4.3 For large datasets (> 1000 samples, > 200 taxa):

  • lambda.min.ratio = 1e-4
  • nlambda = 100+
  • rep.num = 100+
  • Use B-StARS method
  • Consider batch processing

1.4.4 Windows-specific recommendations:

  • Use ncores=1 for serial processing
  • Use snow cluster for parallel processing
  • Consider batch mode for large datasets
  • Use B-StARS method to reduce computational time
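The tiers above can be encoded as a small lookup that returns starting values. This is only a sketch: suggest_params is a hypothetical helper, it takes the lower bound wherever a range is given, and it assigns a dataset to the largest tier that either its sample count or its taxa count reaches, since the guidelines do not say how to treat mixed cases.

```r
# Hypothetical lookup of starting values, transcribed from the guidelines above.
# Lower bounds are used where a range is given; mixed cases go to the larger tier.
suggest_params <- function(n_samples, n_taxa) {
  if (n_samples > 1000 || n_taxa > 200) {
    list(lambda.min.ratio = 1e-4, nlambda = 100, rep.num = 100)
  } else if (n_samples >= 100 || n_taxa >= 50) {
    list(lambda.min.ratio = 1e-3, nlambda = 50, rep.num = 50)
  } else {
    list(lambda.min.ratio = 1e-2, nlambda = 20, rep.num = 20)
  }
}

# Example sizes are made up for illustration.
small  <- suggest_params(n_samples = 60, n_taxa = 40)
medium <- suggest_params(n_samples = 300, n_taxa = 120)
str(medium)
```

These values are starting points; the stability and edge-count diagnostics should still be checked after each fit.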

Session info:

sessionInfo()
# R Under development (unstable) (2025-11-04 r88984)
# Platform: aarch64-apple-darwin20
# Running under: macOS Ventura 13.7.8
# 
# Matrix products: default
# BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
# LAPACK: /Library/Frameworks/R.framework/Versions/4.6-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
# 
# locale:
# [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# 
# time zone: America/New_York
# tzcode source: internal
# 
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
# [1] phyloseq_1.55.0  igraph_2.2.1     Matrix_1.7-4     SpiecEasi_1.99.3
# [5] BiocStyle_2.39.0
# 
# loaded via a namespace (and not attached):
#  [1] gtable_0.3.6        shape_1.4.6.1       xfun_0.54          
#  [4] bslib_0.9.0         ggplot2_4.0.0       rhdf5_2.55.8       
#  [7] Biobase_2.71.0      lattice_0.22-7      rhdf5filters_1.23.0
# [10] vctrs_0.6.5         tools_4.6.0         generics_0.1.4     
# [13] biomformat_1.39.0   stats4_4.6.0        parallel_4.6.0     
# [16] tibble_3.3.0        cluster_2.1.8.1     pkgconfig_2.0.3    
# [19] huge_1.3.5          data.table_1.17.8   RColorBrewer_1.1-3 
# [22] S7_0.2.0            S4Vectors_0.49.0    lifecycle_1.0.4    
# [25] farver_2.1.2        compiler_4.6.0      stringr_1.6.0      
# [28] Biostrings_2.79.2   tinytex_0.57        Seqinfo_1.1.0      
# [31] codetools_0.2-20    permute_0.9-8       htmltools_0.5.8.1  
# [34] sass_0.4.10         yaml_2.3.10         glmnet_4.1-10      
# [37] pillar_1.11.1       crayon_1.5.3        jquerylib_0.1.4    
# [40] MASS_7.3-65         cachem_1.1.0        vegan_2.7-2        
# [43] magick_2.9.0        iterators_1.0.14    foreach_1.5.2      
# [46] nlme_3.1-168        tidyselect_1.2.1    digest_0.6.38      
# [49] stringi_1.8.7       dplyr_1.1.4         reshape2_1.4.5     
# [52] bookdown_0.45       labeling_0.4.3      splines_4.6.0      
# [55] ade4_1.7-23         fastmap_1.2.0       grid_4.6.0         
# [58] cli_3.6.5           magrittr_2.0.4      dichromat_2.0-0.1  
# [61] survival_3.8-3      ape_5.8-1           withr_3.0.2        
# [64] scales_1.4.0        rmarkdown_2.30      XVector_0.51.0     
# [67] multtest_2.67.0     pulsar_0.3.11       VGAM_1.1-13        
# [70] evaluate_1.0.5      knitr_1.50          IRanges_2.45.0     
# [73] mgcv_1.9-4          rlang_1.1.6         Rcpp_1.1.0         
# [76] glue_1.8.0          BiocManager_1.30.26 BiocGenerics_0.57.0
# [79] jsonlite_2.0.0      R6_2.6.1            Rhdf5lib_1.33.0    
# [82] plyr_1.8.9