Hi everyone,

I’ve recently started analyzing single-cell RNA-seq data (with FASTQ files as a starting point), and so far I have only used the 10x Genomics datasets available on their website.

Now I’m interested in using data generated by other protocols, specifically SMART-seq, because it is the most widely used full-length protocol (the two main paradigms being tag-based, like 10x, and full-length). However, I’m having trouble understanding the raw data, so I figured it would be worth discussing the differences between FASTQ files from 10x and from SMART-seq.
Both methods are sequenced on Illumina sequencers. With 10x, the number of FASTQ files varies depending on the sequencer model, but it’s always one set of files for the whole sample. What about SMART-seq? Is that the protocol where there’s one set of files for each cell?
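
To make the question concrete, here is a minimal sketch of how I currently think about "sets of files". It assumes the files follow the usual Illumina/bcl2fastq naming pattern `<sample>_S<n>_L<lane>_<read>_001.fastq.gz`; the file names below are made up for illustration, and I don't know whether real SMART-seq data is named this way.

```python
import re
from collections import defaultdict

# Assumed naming convention: <sample>_S<n>_L<lane>_<R1|R2|I1|I2>_001.fastq(.gz)
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_(?P<read>[RI][12])_001\.fastq(?:\.gz)?$"
)

def group_fastqs(filenames):
    """Group FASTQ file names by sample prefix; returns {sample: [(lane, read), ...]}."""
    groups = defaultdict(list)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        groups[match["sample"]].append((match["lane"], match["read"]))
    return {sample: sorted(files) for sample, files in groups.items()}

# Hypothetical 10x-style run: one sample spread over two lanes.
tenx_like = [
    "pbmc_S1_L001_R1_001.fastq.gz", "pbmc_S1_L001_R2_001.fastq.gz",
    "pbmc_S1_L002_R1_001.fastq.gz", "pbmc_S1_L002_R2_001.fastq.gz",
]
# Hypothetical SMART-seq-style run: one pair of files per cell (my guess).
smart_like = [
    "cellA_S1_L001_R1_001.fastq.gz", "cellA_S1_L001_R2_001.fastq.gz",
    "cellB_S2_L001_R1_001.fastq.gz", "cellB_S2_L001_R2_001.fastq.gz",
]

print(group_fastqs(tenx_like))   # a single group for the whole sample
print(group_fastqs(smart_like))  # one group per cell
```

If my guess is right, the 10x case gives one group containing all cells (which are separated later by cell barcode), while the SMART-seq case gives one group per cell.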

To further complicate matters, I understand that full-length protocols such as SMART-seq2, unlike tag-based protocols, do not support UMIs, whereas SMART-seq3 does use UMIs. I also had the idea (I read it in some paper) that when you are sequencing full-length transcripts, having UMIs doesn’t really change anything. So how does the analysis change between SMART-seq2 and SMART-seq3 to account for this?
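
For reference, this is my current mental model of the difference between counting reads and counting UMIs, as a rough sketch only. The `(cell, gene, umi)` records are a made-up representation of already aligned reads, not the output of any real tool, and the sketch ignores sequencing errors in the UMIs.

```python
from collections import Counter

def count_reads(records):
    """Read-based counting: every aligned read adds 1 to its (cell, gene)."""
    return Counter((cell, gene) for cell, gene, _umi in records)

def count_umis(records):
    """UMI-based counting: reads sharing cell, gene and UMI are counted once."""
    unique_molecules = {(cell, gene, umi) for cell, gene, umi in records}
    return Counter((cell, gene) for cell, gene, _umi in unique_molecules)

# Hypothetical aligned reads: (cell, gene, UMI sequence).
records = [
    ("cell1", "GAPDH", "AAAC"),
    ("cell1", "GAPDH", "AAAC"),  # PCR duplicate of the read above
    ("cell1", "GAPDH", "AAAC"),  # another duplicate
    ("cell1", "GAPDH", "GGTT"),
    ("cell1", "ACTB", "CCGA"),
]

print(count_reads(records))  # GAPDH: 4, ACTB: 1
print(count_umis(records))   # GAPDH: 2, ACTB: 1
```

As I understand it, without UMIs only the first kind of count is possible, which is why I’m unsure how the two protocols should be handled differently downstream.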

