Mar 26, 2024

HiSeq 4000, NovaSeq multiplex sample issues

https://med.stanford.edu/gssc/hiseq4000issue.html

https://enseqlopedia.com/2016/12/index-mis-assignment-between-samples-on-hiseq-4000-and-x-ten/

 

If free barcoded adapters or index primers are present in a multiplexed pool, they can prime and extend library molecules in the same lane during the clustering step.  This index swapping mis-assigns reads, so demultiplexing can place reads from one sample in the FASTQ files of a different sample.  The HiSeq 2000/2500 and MiSeq are less affected because of their biochemistry and flow cell geometry.

 

The rate of mis-assignment varies significantly and depends on the following factors:

  • Amount of free adapter present in library
  • Storage conditions of library
  • Application or library prep workflow

 

Sample mis-assignment can potentially impact users depending on the experimental design and library prep workflow.  Illumina has been working on this issue internally and has developed a few suggested mitigation strategies to reduce index swaps, listed below:

During Library Construction:

  • Optimize your PCR or ligation step to avoid an excess of adapters or index primers.
  • For PCR, dilute the index primers to adjust the insert-to-adapter/primer ratio.
  • Perform extra clean ups after this step.
  • PAGE purification appears to be effective at removing leftover index primers.
  • Purification columns are also an option.
  • Do extra clean ups of each individual library before pooling.
  • Use single use aliquoted adapters and primers.
  • Freeze libraries individually and pool them only shortly before sequencing.

Pooling suggestions:

  • Use dual indexing strategies with unique barcodes on both ends. (Swapping would have to occur at both ends for read mis-assignment to occur)
  • Sequence or freeze library pools as soon as possible after they are created.
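
The dual-indexing suggestion above can be illustrated with a minimal Python sketch. It assumes you already have read counts per observed (i7, i5) index pair, for example tallied from the undetermined reads of a run; the function names, index sequences and counts are made up for illustration. A pair counts as "swapped" when both indexes belong to the pool but the combination was never assigned to a sample.

```python
from collections import Counter

def classify_index_pairs(pair_counts, expected_pairs):
    """Tally reads as expected, swapped, or unknown by their (i7, i5) pair."""
    valid_i7 = {i7 for i7, _ in expected_pairs}
    valid_i5 = {i5 for _, i5 in expected_pairs}
    totals = Counter()
    for (i7, i5), n in pair_counts.items():
        if (i7, i5) in expected_pairs:
            totals["expected"] += n   # pair assigned to a sample
        elif i7 in valid_i7 and i5 in valid_i5:
            totals["swapped"] += n    # both indexes known, combination is not
        else:
            totals["unknown"] += n    # e.g. sequencing errors in an index
    return totals

def swap_fraction(totals):
    """Fraction of reads with known indexes that carry a swapped pair."""
    known = totals["expected"] + totals["swapped"]
    return totals["swapped"] / known if known else 0.0

# Hypothetical pool of two samples with unique dual indexes.
expected = {("ATCACG", "TTAGGC"), ("CGATGT", "TGACCA")}
observed = {
    ("ATCACG", "TTAGGC"): 9800,
    ("CGATGT", "TGACCA"): 9900,
    ("ATCACG", "TGACCA"): 150,
    ("CGATGT", "TTAGGC"): 150,
    ("NNNNNN", "TTAGGC"): 40,
}
totals = classify_index_pairs(observed, expected)
print(totals["swapped"], swap_fraction(totals))
```

With unique dual indexes, the swapped reads can be detected and discarded. With single or combinatorial indexing, the same reads would land in another sample's FASTQ files unnoticed.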

Sequencing suggestions:

  • Use PhiX from third parties with unique indexing barcodes to determine swap frequency. (We have begun to introduce PhiX with unique barcodes from SeqMatic for HiSeq 4000 runs.)
  • For methods highly sensitive to mis-assignment use HiSeq 2000/2500 or MiSeq instruments.
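
As a rough illustration of how a uniquely barcoded PhiX spike-in gives a swap frequency, the sketch below assumes reads have been aligned to the PhiX genome and counted by the barcode they were demultiplexed under. The function name, barcode labels and counts are hypothetical, and the calculation assumes the sample libraries contain no PhiX sequence of their own.

```python
def phix_swap_rate(phix_reads_by_barcode, phix_barcode):
    """Fraction of PhiX-aligned reads that carry a non-PhiX barcode."""
    total = sum(phix_reads_by_barcode.values())
    if total == 0:
        raise ValueError("no PhiX-aligned reads to evaluate")
    correctly_assigned = phix_reads_by_barcode.get(phix_barcode, 0)
    return (total - correctly_assigned) / total

# Hypothetical counts of PhiX-aligned reads per demultiplexed barcode.
counts = {"PHIX_IDX": 99_500, "sample_A": 300, "sample_B": 200}
rate = phix_swap_rate(counts, "PHIX_IDX")
print(f"{rate:.3%}")
```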

 

MiSeq, HiSeq, NovaSeq read output tables

 https://med.stanford.edu/gssc/services/sequencing1.html

 

Illumina Sequencing Services

Instrument      | Run Time     | Maximum Output | Average Read Output | Maximum Read Length
MiSeq           | 4-56 hours   | 15 Gb          | 22-25 million       | 2 x 300 bp
MiSeq Micro     | 24 hours     | 1.2 Gb         | 4 million           | 2 x 150 bp
HiSeq 4000      | 2-4 days     | 1500 Gb        | 250-400 million     | 2 x 150 bp
iSeq 100        | 9-17.5 hours | 1.2 Gb         | 4 million           | 2 x 150 bp
NovaSeq 6000 SP | 13-38 hours  | 325-400 Gb     | 325-400 million     | 2 x 250 bp
NovaSeq 6000 S1 | 13-25 hours  | 400-500 Gb     | 750-800 million     | 2 x 150 bp
NovaSeq 6000 S2 | 16-36 hours  | 1000-1250 Gb   | 1,650-2,050 million | 2 x 150 bp
NovaSeq 6000 S4 | 36-44 hours  | 2400-3000 Gb   | 2,000-2,500 million | 2 x 150 bp
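
For planning, output in bases relates to read count and read length by simple arithmetic. The Python sketch below treats the read output as a count of read pairs, which is an assumption. The MiSeq figures fit it (25 million x 2 x 300 bp = 15 Gb), but other rows may be reported on a different basis, so treat the result as a rough estimate. The function names and the 2 million reads per sample are hypothetical.

```python
def yield_gb(read_pairs_millions, read_length_bp, paired=True):
    """Approximate output in gigabases for a given read count and length."""
    reads_per_cluster = 2 if paired else 1
    bases = read_pairs_millions * 1e6 * reads_per_cluster * read_length_bp
    return bases / 1e9

def max_samples(total_reads_millions, reads_per_sample_millions):
    """How many samples fit in a run if each needs a given read depth."""
    return int(total_reads_millions // reads_per_sample_millions)

# MiSeq, upper end of the read output range, 2 x 300 bp.
miseq_gb = yield_gb(25, 300)
# Lower end of the MiSeq range, assuming 2 million reads per sample.
miseq_samples = max_samples(22, 2)
print(miseq_gb, miseq_samples)
```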

UMI (Unique Molecular Identifier)

 https://dnatech.genomecenter.ucdavis.edu/faqs/what-are-umis-and-why-are-they-used-in-high-throughput-sequencing/

UMIs are also known as 'molecular barcodes'. They are used in:

  • Quantitative sequencing analysis, where they allow removal of PCR duplicates
  • Genomic variant detection
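
A minimal Python sketch of UMI-based PCR duplicate removal follows. It assumes reads are already aligned and that each read's UMI has been extracted. Reads sharing the same mapping position, strand and UMI are treated as copies of one original molecule, and only the highest-quality one is kept. The record layout and field names are hypothetical, and UMIs are matched exactly, whereas dedicated tools also tolerate sequencing errors within the UMI.

```python
def dedupe_by_umi(reads):
    """Keep one read per (chrom, pos, strand, umi), preferring higher quality."""
    best = {}
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"], read["umi"])
        if key not in best or read["qual"] > best[key]["qual"]:
            best[key] = read
    return list(best.values())

# Hypothetical aligned reads: the first two are PCR duplicates of each other.
reads = [
    {"name": "r1", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "ACGT", "qual": 30},
    {"name": "r2", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "ACGT", "qual": 37},
    {"name": "r3", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "GGCA", "qual": 35},
    {"name": "r4", "chrom": "chr2", "pos": 500, "strand": "-", "umi": "ACGT", "qual": 33},
]
unique_reads = dedupe_by_umi(reads)
print([r["name"] for r in unique_reads])
```

Without the UMI, r1, r2 and r3 would all look like duplicates because they map to the same position. The UMI shows that r3 came from a different original molecule, which is what makes the resulting counts quantitative.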