Jul 1, 2022

UMI : Unique Molecular Identifier, What and Why?

What are UMIs and why are they used in high-throughput sequencing?

https://dnatech.genomecenter.ucdavis.edu/faqs/what-are-umis-and-why-are-they-used-in-high-throughput-sequencing/

Software:
UMI-Tools: https://github.com/CGATOxford/UMI-tools
zUMIs: https://github.com/sdparekh/zUMIs
fastp: https://github.com/OpenGene/fastp  (transfer of UMIs into read IDs)
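The "transfer of UMIs into read IDs" step can be illustrated with a small pure-Python sketch. It assumes the UMI occupies the first `umi_length` bases of the read and is appended to the read name with an underscore; the function name `move_umi_to_read_id`, the separator, and the example header are hypothetical choices for illustration and are not taken from the fastp or UMI-tools documentation.

```python
def move_umi_to_read_id(header, sequence, quality, umi_length, separator="_"):
    """Move a leading UMI from the read sequence into the read name.

    Assumption (not from the source): the UMI is the first `umi_length`
    bases of the read, and it is appended to the first whitespace-delimited
    field of the FASTQ header so that it survives alignment.
    """
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    if not 0 < umi_length < len(sequence):
        raise ValueError("umi_length must be between 1 and len(sequence) - 1")

    umi = sequence[:umi_length]

    # Split the header into the read name and the optional comment field.
    name, _, comment = header.partition(" ")
    new_header = f"{name}{separator}{umi}"
    if comment:
        new_header += " " + comment

    # Trim the UMI bases (and their qualities) off the read itself.
    return new_header, sequence[umi_length:], quality[umi_length:]


if __name__ == "__main__":
    record = move_umi_to_read_id(
        "@read1 1:N:0:ACGT", "AACCGGTTACGTACGT", "I" * 16, umi_length=8
    )
    print(record)
```

Keeping the UMI in the read name means it is still available after alignment, when reads that share a mapping position can be grouped by UMI for deduplication.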

 

Fu, Y., Wu, PH., Beane, T. et al. Elimination of PCR duplicates in RNA-seq and small RNA-seq using unique molecular identifiers. BMC Genomics 19, 531 (2018).

 https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-018-4933-1

The number of unique UMI combinations grows with the number of random-nucleotide positions: n fully random positions give 4^n possible UMIs. The pool of UMI combinations must be large enough that the chance of two cDNA molecules with identical sequences in the starting pool being tagged with the same UMI is negligibly small.
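A rough way to see how UMI length affects this: with n fully random positions there are 4^n possible UMIs, and the chance that at least two of m identical molecules receive the same UMI follows the birthday-problem formula. The sketch below assumes every UMI is drawn uniformly and independently, which real UMI pools only approximate; the function names are hypothetical.

```python
import math


def umi_space(n_positions: int) -> int:
    """Number of distinct UMIs for n fully random A/C/G/T positions."""
    return 4 ** n_positions


def collision_probability(n_molecules: int, n_positions: int) -> float:
    """Probability that at least two of n_molecules identical molecules
    share a UMI, assuming uniform, independent UMI assignment."""
    space = umi_space(n_positions)
    if n_molecules > space:
        return 1.0  # more molecules than UMIs: a collision is certain
    # P(all distinct) = product over i of (1 - i / space), summed in log space.
    log_p_distinct = sum(math.log1p(-i / space) for i in range(n_molecules))
    return -math.expm1(log_p_distinct)


if __name__ == "__main__":
    for n in (6, 8, 10, 12):
        p = collision_probability(100, n)
        print(f"{n} positions: {umi_space(n)} UMIs, "
              f"P(collision among 100 identical molecules) = {p:.4g}")
```

Under these assumptions, each added random position multiplies the UMI space by four, so the collision probability for a fixed number of identical molecules drops quickly as the UMI gets longer.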

 

 

cyvcf2 : cython + htslib built for fast parsing of Variant Call Format (VCF)

https://github.com/brentp/cyvcf2

 

cyvcf2 is a Cython wrapper around htslib built for fast parsing of Variant Call Format (VCF) files.