# This data package comprises three datasets, namely:
# - 5k PBMCs of a healthy donor, 3' v3 chemistry
# - 10k PBMCs of a healthy donor, 3' v3 chemistry
# - 20k PBMCs of a healthy donor, 3' v3 chemistry

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #

# The raw FASTQs for all three datasets are publicly available from 10x Genomics (https://www.10xgenomics.com/datasets).
# The FASTQ archives were downloaded from the command line with the following wget commands:
# 5k PBMCs: wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/3.0.2/5k_pbmc_v3/5k_pbmc_v3_fastqs.tar
# 10k PBMCs: wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/6.1.0/10k_PBMC_3p_nextgem_Chromium_Controller/10k_PBMC_3p_nextgem_Chromium_Controller_fastqs.tar
# 20k PBMCs: wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/6.1.0/20k_PBMC_3p_HT_nextgem_Chromium_X/20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs.tar

# After download, the archives were extracted with the following commands:
# tar -xf 5k_pbmc_v3_fastqs.tar
# tar -xf 10k_PBMC_3p_nextgem_Chromium_Controller_fastqs.tar
# tar -xf 20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs.tar
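
# The download and extraction steps above can also be scripted. The following is a minimal sketch, not part of the original procedure:
# the helper names (fastq_url, fetch_all) and the DRY_RUN switch are assumptions introduced here for illustration.
# By default it only prints the commands it would run; set DRY_RUN=0 to actually fetch and unpack the archives.

```shell
#!/bin/sh
set -eu

# Base URL shared by the three archives (taken from the wget commands above).
BASE="https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp"

# One "<version directory>/<sample name>" entry per dataset.
SAMPLES="3.0.2/5k_pbmc_v3
6.1.0/10k_PBMC_3p_nextgem_Chromium_Controller
6.1.0/20k_PBMC_3p_HT_nextgem_Chromium_X"

# Print the archive URL for one entry of SAMPLES.
fastq_url() {
  # ${1##*/} strips everything up to the last slash, leaving the sample name.
  printf '%s/%s/%s_fastqs.tar\n' "$BASE" "$1" "${1##*/}"
}

# Download and unpack every archive (hypothetical helper).
fetch_all() {
  for entry in $SAMPLES; do
    url=$(fastq_url "$entry")
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run: show the commands instead of executing them.
      echo "wget -c $url && tar -xf ${url##*/}"
    else
      wget -c "$url"          # -c resumes a partially downloaded archive
      tar -xf "${url##*/}"    # the archive name is the last URL component
    fi
  done
}

fetch_all
```

# Keeping the sample list in one place avoids retyping the long sample names in both the wget and the tar commands.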

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #

# Once extracted, the FASTQs were processed with the scIGD Snakemake workflow (https://github.com/AGImkeller/scIGD).
# scIGD automates and streamlines the genotyping of immune genes, focusing on key targets such as HLAs and KIRs,
# and enables allele-specific quantification from scRNA-seq data using donor-specific references.
# First, the workflow performs allele typing on the selected immune genes.
# This step uses donor-specific references to improve accuracy and specificity, which mitigates potential quantification bias.
# The output for each gene consists of one or two alleles.
# Second, genes and typed alleles are quantified.
# The resulting output, generated by kallisto, is a count matrix (.mtx) holding the expression levels of genes and of the typed alleles.
# In addition, a barcodes file (.txt) and a features/genes file (.txt) are created.
# Lastly, the output includes a lookup table (.csv) that is used to create the additional data layers during object generation for analysis. These are saved in ~/inst/extdata.
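
# A quick consistency check on these outputs can catch truncated or mismatched files before analysis.
# The sketch below is an illustration only: the helper names and the file names in the usage line are hypothetical,
# and because the matrix orientation is not stated here, both cells x features and features x cells are accepted.
# It reads the size line of the Matrix Market file and compares it with the line counts of the barcodes and features files.

```shell
#!/bin/sh
set -eu

# Print "<rows> <cols> <nonzero>" from the size line of a Matrix Market file,
# i.e. the first line that does not start with '%'.
mtx_dims() {
  awk '!/^%/ { print $1, $2, $3; exit }' "$1"
}

# Print the number of lines in a file (also counts a last line without newline).
count_lines() {
  awk 'END { print NR }' "$1"
}

# Usage: check_outputs <matrix.mtx> <barcodes.txt> <features.txt>
# Succeeds if the matrix dimensions equal the two line counts in either order.
check_outputs() {
  dims=$(mtx_dims "$1")
  n_bc=$(count_lines "$2")
  n_ft=$(count_lines "$3")
  shape=${dims% *}    # "<rows> <cols>"
  nnz=${dims##* }     # "<nonzero>"
  if [ "$shape" = "$n_bc $n_ft" ] || [ "$shape" = "$n_ft $n_bc" ]; then
    echo "OK: dimensions $shape match, $nnz non-zero entries"
  else
    echo "Mismatch: matrix $shape, barcodes $n_bc, features $n_ft" >&2
    return 1
  fi
}

# Example call with hypothetical file names:
# check_outputs cells_x_genes.mtx cells_x_genes.barcodes.txt cells_x_genes.genes.txt
```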

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #

# The data in scaeData represent the resulting output of scIGD.

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #

# To obtain the data in scaeData, we ran scIGD on the FASTQs obtained from 10x Genomics.
# The run on the 5k PBMC dataset is shown here as an example.
# The following parameters were set in scIGD's config.yaml:
# wta: true
# multiplex: false
# threads_number: 12
# sequencing_technology: 10XV3
# genes_to_be_allele_typed: ["HLA-A","HLA-B","HLA-C","HLA-DPB1","HLA-DQB1","HLA-DQA1","HLA-DRB1"]
# raw_data_fastq_list: ["data/raw/5k_pbmc_v3_S1_L001.R1.fastq.gz", "data/raw/5k_pbmc_v3_S1_L001.R2.fastq.gz",
# "data/raw/5k_pbmc_v3_S1_L002.R1.fastq.gz", "data/raw/5k_pbmc_v3_S1_L002.R2.fastq.gz",
# "data/raw/5k_pbmc_v3_S1_L003.R1.fastq.gz", "data/raw/5k_pbmc_v3_S1_L003.R2.fastq.gz",
# "data/raw/5k_pbmc_v3_S1_L004.R1.fastq.gz", "data/raw/5k_pbmc_v3_S1_L004.R2.fastq.gz"]
# single_end_samples: false
# reference_genome_gtf: "data/meta/Homo_sapiens.GRCh38.111.gtf.gz"
# reference_genome_fasta: "data/meta/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz"

# In addition, two reference files were downloaded from Ensembl. They correspond to the last two parameters in config.yaml (reference_genome_gtf, reference_genome_fasta):
# https://ftp.ensembl.org/pub/release-111/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
# https://ftp.ensembl.org/pub/release-111/gtf/homo_sapiens/Homo_sapiens.GRCh38.111.gtf.gz
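
# Before launching the workflow, it can help to confirm that every input named in config.yaml is in place.
# The following sketch is an assumption-laden illustration, not part of scIGD: fastq_paths and missing_files are hypothetical helpers.
# fastq_paths rebuilds the eight entries of raw_data_fastq_list for the 5k PBMC run (lanes L001 to L004, reads R1 and R2),
# and missing_files prints every path it is given that does not exist.

```shell
#!/bin/sh
set -eu

# Print the eight FASTQ paths of the 5k PBMC run inside directory $1,
# following the file naming used in raw_data_fastq_list.
fastq_paths() {
  for lane in 1 2 3 4; do
    for mate in R1 R2; do
      printf '%s/5k_pbmc_v3_S1_L00%s.%s.fastq.gz\n' "$1" "$lane" "$mate"
    done
  done
}

# Read paths from stdin, one per line, and print those that are not regular files.
missing_files() {
  while IFS= read -r path; do
    [ -f "$path" ] || echo "$path"
  done
}

# Lists the FASTQs that are still missing; prints nothing when all are present.
fastq_paths data/raw | missing_files
```

# The two Ensembl reference files can be checked the same way by piping their paths into missing_files.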

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #

# With the configuration and input files in place, the scIGD Snakemake workflow was executed using the following command:
# snakemake --resources mem_gb=50 --cores 40 all

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #

# For a detailed, step-by-step guide on the workflow, please refer to our scIGD Snakemake documentation (https://github.com/AGImkeller/scIGD)

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #

# To analyze the results, please consult our SingleCellAlleleExperiment package (https://github.com/AGImkeller/SingleCellAlleleExperiment).
# This package provides a comprehensive multi-layer data structure, enabling the representation of immune genes at various levels, including alleles, genes, and functional aspects.

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #

# Important links:
# 10x Genomics: https://www.10xgenomics.com/datasets
# scIGD: https://github.com/AGImkeller/scIGD
# SingleCellAlleleExperiment: https://github.com/AGImkeller/SingleCellAlleleExperiment

# ------------------------------------------------------------------------------ #
# ------------------------------------------------------------------------------ #
