High-throughput single nucleus total RNA sequencing of formalin-fixed paraffin-embedded tissues by snRandom-seq

Overview of the droplet-based snRandom-seq method for FFPE tissues

The main workflow of snRandom-seq is shown in Fig. 1. For single-nucleus isolation from FFPE tissues, the areas of interest in a banked FFPE tissue block were first selected and placed into tubes. Deparaffinization and rehydration were carried out with standard xylene and alcohol washes. Afterward, nuclei were dissociated and permeabilized. For comprehensive and high-throughput single-nucleus total RNA-seq, we combined a random-primer-based chemistry that captures full-length total RNAs with an easy-to-operate droplet-based platform that tags single nuclei. Bare single-strand DNAs were blocked in situ by multiple annealing and extension of blocking primers. Total RNAs were converted into cDNAs in situ by multiple annealing of random primers and oligo(dT) primers during reverse transcription. To decrease the doublet rate, we incorporated a pre-indexing strategy into the reverse transcription step, following the published scifi-RNA-seq18. The nuclei were split into different tubes for reverse transcription with pre-indexed random primers, then pooled for the subsequent reactions. Poly(dA) tails were added to the 3′ hydroxyl termini of the cDNAs in situ by terminal transferase (TdT). We also established a microfluidic platform for high-throughput single-nucleus barcoding based on our previous work16,17. During the barcoding reaction in droplets, the poly(dT) primers were released from the beads by enzymatic cutting19 and, simultaneously, the cDNAs were released from the nucleus by RNA degradation. The poly(dT) primers then bound to the poly(dA) tails of the cDNAs and were extended, adding a droplet-specific barcode to the cDNAs. After barcoding, we broke the droplets, amplified the barcoded cDNAs, and prepared the next-generation sequencing (NGS) library for paired-end sequencing.

Fig. 1: snRandom-seq for FFPE tissues overview.

The workflow of snRandom-seq for FFPE tissues includes FFPE sample selection, paraffin dissolution, single-nucleus isolation and permeabilization, single-strand DNA blocking, reverse transcription, dA tailing, droplet barcoding, primer release and extension, droplet breaking and PCR amplification, and sequencing. Red dashed circle in the FFPE tissue block: the areas of interest. Blue dashed box: the three in situ reactions (single-strand DNA blocking, reverse transcription, and dA tailing). AAA: dA tail at the 3′ end of the cDNA. TTT: poly(dT) in the poly(dT) barcoded primers. Gray arrows: the direction of extension.

Validation of snRandom-seq using the human-mouse mixture sample

snRandom-seq uses random primers to capture total RNAs in single nuclei (Fig. 1), which differs from the current poly(A)-based and probe-based single-cell RNA-seq methods. Therefore, we performed a standard mixed-species experiment with cultured human (293T) and mouse (3T3) cell lines to assess the fidelity of snRandom-seq. Freshly harvested 293T and 3T3 cells were lysed to release nuclei, which were mixed and fixed. The fixed nuclei were used for snRandom-seq (Fig. 1). Before microfluidic encapsulation, the nuclei were imaged to confirm single-nucleus morphology and counted (Supplementary Fig. 1a). A high-throughput microfluidic platform was established for single cell/nucleus barcoding in snRandom-seq (Fig. 2a, Supplementary Fig. 1b). For barcode bead synthesis, the hydrogel bead generation device and the cell encapsulation device were designed and fabricated as previously described20 (Supplementary Fig. 2a). Hydrogel beads of 40 μm diameter were precisely produced (Supplementary Fig. 2b). Three rounds of split-and-pool-based ligation were performed on these hydrogel beads for DNA barcode synthesis (Supplementary Fig. 2c, Supplementary Table 2). The high reaction efficiency of each ligation step was reflected by the sharp peak in the electropherogram of the released barcode primers (Supplementary Fig. 2d). Nuclei, barcode beads, and reagent mix were co-compartmentalized in water-in-oil emulsions using the microfluidic platform (Fig. 2a), and each individual nucleus was encapsulated in a droplet with a barcode bead (Fig. 2b).

Fig. 2: Validation and benchmark of snRandom-seq using a human-mouse mixture sample.

a Microfluidic encapsulation device for barcoding of nuclei. b Image of an encapsulated droplet containing one bead, one nucleus, and reagent mix. c Electropherogram of the 293T (human) and 3T3 (mouse) nuclei mixture cDNA library from the Qsep100™ DNA Fragment Analyzer. Lower (20 bp) and upper (1 kb) markers were shown. d Barcode plot for identification of the barcodes that represent true nuclei (red line). Barcodes of the 293T-3T3 mixed nuclei were ordered from the largest to smallest gene counts. e Species-mixing scatter plot showing the single-nuclei capture efficiency and doublet rate of snRandom-seq. f Species specificity of UMIs in the 293T-3T3 mixture. Identified 293T nuclei: n = 1157, identified 3T3 nuclei: n = 1086. Median of species specificity of UMIs in 293T was 0.992. Median of species specificity of UMIs in 3T3 was 0.986. g Percentages of reads mapped to introns and exons. Violin plots and box plots showed the number of genes (h) and UMIs (i) detected in each 293T and 3T3 nucleus. Filtered 293T nuclei: n = 1085, filtered 3T3 nuclei: n = 1066. j Saturation analysis of three methods. snRandom-seq used 293T and 3T3 nuclei; 10X Chromium Single Cell 3′ Solution V3 used 293T and 3T3 cells; VASA-seq used 293T cells. k Read coverage along the gene body by the three methods. snRandom-seq used 293T nuclei; 10X Chromium Single Cell 3′ Solution V3 used 293T cells; VASA-seq used 293T cells. Data in (f, h, i) were presented as median values. Data in the box plots in (f, h, i) corresponded to the first quartile (lower hinge), third quartile (upper hinge), and median (center). The upper whisker extended from the hinge to the maximum value no further than 1.5 * IQR (interquartile range) from the hinge. The lower whisker extended from the hinge to the minimum value at most 1.5 * IQR from the hinge. IQR is the distance between the first and third quartiles. Source data are provided as a Source Data file.

After barcoding and amplification, the fragment sizes of the human-mouse mixture cDNA library peaked between 300 and 800 bp (Fig. 2c), so the library required no further fragmentation and was directly suitable for NGS. After data processing, we identified 2250 high-quality unique nucleus barcodes from the steep slope in the barcode-gene rank plot (Fig. 2d), which indicates a clear separation of true nuclei from background noise. The nuclei capture rate was 42.2% and the percentage of reads mapped to the true nuclei was 76%. We calculated the ratio of reads mapped to the human and mouse genomes in every single nucleus and found that pre-indexed primers markedly decreased the doublet rate (from 2.9% to 0.3%) (Fig. 2e, Supplementary Fig. 1c). The doublet rate of snRandom-seq is substantially lower than that of other droplet-based sc/snRNA-seq methods (sNucDrop-seq: ~2.6%, VASA-drop: 3.1%). Consistently, very high species specificity of UMIs (99%) was observed (Fig. 2f), suggesting that snRandom-seq produced high-fidelity single-nucleus libraries. We also calculated the percentages of reads mapped to exons and introns in the identified human and mouse nuclei, and about three times as many reads mapped to introns as to exons (Fig. 2g). Additionally, many long non-coding RNAs (lncRNAs) and short non-coding RNAs, including small nucleolar RNA (snoRNA), small nuclear RNA (snRNA), and microRNA (miRNA), were detected (Supplementary Fig. 1d). These results suggested that snRandom-seq captured full-length transcripts comprehensively.
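The species-mixing readout described above can be illustrated with a minimal sketch. It assumes that each barcode has already been reduced to a pair of UMI counts (human, mouse). The function names and the 0.9 purity cutoff are hypothetical choices for illustration only; the text does not state the threshold used for calling mixed-species barcodes.

```python
from statistics import median


def classify_nucleus(human_umis, mouse_umis, min_purity=0.9):
    """Label one barcode as 'human', 'mouse', or 'mixed'.

    min_purity is a hypothetical cutoff, not a value taken from the paper.
    """
    total = human_umis + mouse_umis
    if total == 0:
        raise ValueError("barcode has no UMIs")
    if human_umis / total >= min_purity:
        return "human"
    if mouse_umis / total >= min_purity:
        return "mouse"
    return "mixed"


def species_specificity(human_umis, mouse_umis):
    """Fraction of a barcode's UMIs that come from its dominant species."""
    total = human_umis + mouse_umis
    if total == 0:
        raise ValueError("barcode has no UMIs")
    return max(human_umis, mouse_umis) / total


def summarize_mixture(counts, min_purity=0.9):
    """Summarize a list of (human_umis, mouse_umis) pairs, one per barcode.

    Returns the per-barcode labels, the fraction of mixed-species barcodes,
    and the median species specificity of the single-species barcodes.
    """
    if not counts:
        raise ValueError("no barcodes given")
    labels = [classify_nucleus(h, m, min_purity) for h, m in counts]
    mixed_rate = labels.count("mixed") / len(labels)
    purities = [
        species_specificity(h, m)
        for (h, m), label in zip(counts, labels)
        if label != "mixed"
    ]
    median_purity = median(purities) if purities else None
    return labels, mixed_rate, median_purity


# Toy example with made-up counts: three clean nuclei and one mixed barcode.
toy_counts = [(990, 10), (5, 995), (500, 500), (980, 20)]
labels, mixed_rate, median_purity = summarize_mixture(toy_counts)
print(labels, mixed_rate, median_purity)
```

In this sketch the mixed-barcode fraction only counts human-mouse collisions; same-species doublets are invisible to a species-mixing experiment and would need to be estimated separately.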

Gene and UMI count distributions showed that snRandom-seq captured a median of 4141 genes and 11,594 UMIs per 293T nucleus at an average depth of ~29k reads per nucleus, and 3427 genes and 9795 UMIs per 3T3 nucleus at ~25k reads per nucleus (Fig. 2h, i). The results indicated that snRandom-seq is more sensitive than two other reported droplet-based high-throughput snRNA-seq methods (DroNc-seq21: average 3295 genes and 4643 UMIs with 160k reads per nucleus for 5636 3T3 nuclei; sNucDrop-seq22: average 2665 genes and 5195 UMIs with 23k reads per nucleus for 1984 3T3 nuclei) (Supplementary Fig. 1e). Saturation analysis showed that the number of genes detected by snRandom-seq had not yet reached saturation at 60k uniquely aligned reads per 3T3 or 293T nucleus (Fig. 2j). We also compared our snRNA-seq data with the widely used high-throughput 10X Chromium Single Cell 3′ Solution V323 and the recently reported high-throughput VASA-drop10 for scRNA-seq. At a low sequencing depth (<10k), the sensitivity of snRandom-seq in 3T3 and 293T nuclei was comparable with that of 10X Chromium Single Cell 3′ Solution V3 in 3T3 and 293T cells, as well as VASA-drop in 293T cells (Fig. 2j). Unlike the poly(A)-based 10X Chromium Single Cell 3′ Solution V3, which showed an obvious 3′-end bias, both snRandom-seq and VASA-drop displayed no pronounced 3′- or 5′-end bias across the gene body (Fig. 2k). As expected, snRandom-seq retained a slight bias toward the 3′-end due to the additional oligo(dT) primer in reverse transcription (Fig. 2k).
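A saturation analysis of this kind can be sketched as nested downsampling of the reads of one nucleus. The sketch below assumes each uniquely aligned read has already been assigned to a gene; the function name, the input format, and the use of nested prefixes of a single shuffle are illustrative assumptions, not the pipeline used in the study.

```python
import random


def saturation_curve(read_genes, depths, seed=0):
    """Count genes detected at increasing read depths for one nucleus.

    read_genes: one gene identifier per uniquely aligned read.
    depths: read depths to evaluate; depths beyond the available reads
            are capped at the total number of reads.
    The reads are shuffled once and nested prefixes are taken, so the
    resulting curve can never decrease as depth grows.
    """
    rng = random.Random(seed)
    shuffled = list(read_genes)
    rng.shuffle(shuffled)
    curve = []
    for depth in sorted(depths):
        n_reads = min(depth, len(shuffled))
        genes_detected = len(set(shuffled[:n_reads]))
        curve.append((depth, genes_detected))
    return curve


# Toy nucleus with made-up reads: four genes of very different abundance.
toy_reads = ["A"] * 50 + ["B"] * 30 + ["C"] * 15 + ["D"] * 5
for depth, n_genes in saturation_curve(toy_reads, [10, 50, 100]):
    print(depth, n_genes)
```

A curve that is still rising at the deepest point, as reported above for 60k reads per nucleus, indicates that additional sequencing would detect further genes.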

Performance of snRandom-seq in the FFPE tissues

For FFPE tissues, digestion with Proteinase K yielded cleaner single nuclei than digestion with collagenase (Supplementary Fig. 3a). With an optimized procedure (Fig. 1), single intact nuclei were efficiently isolated from multiple FFPE mouse tissues and a 2-year-old archived clinical FFPE sample of human liver cancer (Fig. 3a, Supplementary Fig. 4a), and the nuclei morphology and size distribution were comparable between FFPE and fresh samples (Supplementary Fig. 4b).

Fig. 3: Comparison of snRandom-seq with two other FFPE snRNA-seq methods.

a Image of DAPI-stained single nuclei before droplet barcoding. Scale bar, 50 μm. b Percentage of reads mapped to different genomic regions under different conditions. c Electropherogram of the FFPE mouse kidney cDNA library from the Qsep100™ DNA Fragment Analyzer. Lower marker: 20 bp; upper marker: 1 kb. d Overview of the FFPE/fresh comparison and technical replication experiment. The Pearson’s correlation coefficient (R) of the normalized gene expressions between FFPE/fresh samples (e) and technical replication samples (FFPE1, FFPE2) (f). Each dot represents the average expression level of a gene. The red line indicates the linear regression line. p value (p) was computed from a two-sided permutation test. g Counts of different RNA biotypes detected in the FFPE sample. h Gene detection comparison of mouse tissues (heart, kidney, testis, and liver) and human liver using snRandom-seq with mouse brain by snFFPE-seq15 and breast by snPATHO-seq14. Kidney nuclei: n = 5795, liver nuclei: n = 4287, heart nuclei: n = 6732, testis nuclei: n = 3774, brain nuclei: n = 7031, breast nuclei: n = 5721. Data were presented as median values. Data in the box plot corresponded to the first quartile (lower hinge), third quartile (upper hinge), and median (center). The upper whisker extended from the hinge to the maximum value no further than 1.5 * IQR from the hinge. The lower whisker extended from the hinge to the minimum value at most 1.5 * IQR from the hinge. i Saturation analysis of snRandom-seq based on the FFPE mouse tissues. j Read distribution along the gene body by three different snRNA-seq methods (snRandom-seq, snFFPE-seq15, and 10X Chromium Fixed RNA Profiling). k Histogram showing the gene body coverage percentages of the datasets generated by the three methods. l Representative raw reads aligned to the human gene C1S in snRandom-seq and 10X Chromium Fixed RNA Profiling. Source data are provided as a Source Data file.

In our pilot FFPE snRNA sequencing experiment, few uniquely aligned reads mapped to exons, and many reads mapped to intergenic regions because of genomic DNA contamination (Fig. 3b). Considering that the DNA double helix in FFPE tissues is liable to be disrupted by chemical modification, a single-strand DNA blocking step was added to the initial procedure of snRandom-seq (Fig. 1, box). The bare single-strand DNAs in the isolated FFPE single nuclei were blocked in situ by multiple annealing and extension of blocking primers on single-stranded genomic DNA. After DNA blocking, the percentage of reads mapped to intergenic regions was dramatically reduced (Fig. 3b). The mapping region distribution was comparable among the DNA-blocked FFPE sample, the fresh sample, and snFFPE-seq (10X Chromium Single Cell 3′ Solution V3), further supporting the high quality of the snRandom-seq data (Fig. 3b). By integrating the above procedures, high-quality cDNA libraries were generated by snRandom-seq from multiple FFPE tissues (Fig. 3c, Supplementary Fig. 4c, d). The fragment sizes of the cDNA libraries from FFPE and fresh samples both peaked between 300 and 800 bp (Fig. 3c, Supplementary Fig. 4e).

To determine whether snRandom-seq can recover as much information from FFPE tissues as from fresh samples, we collected both FFPE and fresh samples from the same mouse tissues and compared their RNA profiles using snRandom-seq (Fig. 3d). The RNA quality of the FFPE and fresh samples was first evaluated by the RNA fragment distribution and DV200. As expected, the RNA quality of the FFPE sample was poorer than that of the fresh sample (Supplementary Fig. 5a), suggesting that the RNA in the FFPE sample was degraded. The merged genome browser tracks of the snRandom-seq results showed that the read coverage areas of the FFPE and fresh samples were similar (Supplementary Fig. 6a–g). Consistently, the total RNA profiles of the FFPE and fresh samples by snRandom-seq displayed a good correlation (Pearson R: ~0.9, p < 2.2e-16; Fig. 3e, Supplementary Fig. 7a, b). Meanwhile, to assess the repeatability of our method, the same FFPE sample was sequenced independently with snRandom-seq (Fig. 3d), and a high correlation (Pearson R ~ 0.92, p < 2.2e-16) of gene expression profiles between these two batches was also seen (Fig. 3f). These results showed that snRandom-seq performed well in both fresh and FFPE samples.
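The FFPE/fresh and replicate comparisons rest on a Pearson correlation of per-gene average expression, with a two-sided permutation test for the p value. The following is a minimal sketch under stated assumptions: the inputs are two equal-length lists of normalized per-gene averages, and the function names, the default of 999 permutations, and the add-one p value estimator are illustrative choices rather than the procedure used in the study.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Two-sided permutation p value for the Pearson correlation.

    Gene labels of y are shuffled n_perm times; the p value is the share
    of shuffles whose |r| is at least the observed |r|, with an add-one
    correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy example with made-up expression values for eight genes.
ffpe = [0.1, 1.2, 2.3, 3.1, 4.8, 5.5, 6.9, 8.0]
fresh = [0.3, 1.0, 2.6, 2.9, 5.1, 5.2, 7.2, 7.7]
print(pearson_r(ffpe, fresh), permutation_p_value(ffpe, fresh))
```

With this estimator the smallest attainable p value is 1 / (n_perm + 1), so reporting very small p values requires either many more permutations or an analytical approximation.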

We next compared our FFPE results with other reported FFPE snRNA-seq results. After data processing, thousands of true nuclei in these FFPE tissues were successfully identified from the snRandom-seq data (Fig. 3h, Supplementary Fig. 8a). snRandom-seq identified a broad spectrum of RNA biotypes in the FFPE sample (Fig. 3g), with about eight times as many lncRNAs as snFFPE-Seq, and snoRNAs, scaRNAs, and miRNAs were detected only by snRandom-seq (Supplementary Fig. 8b). The medians of genes detected per nucleus in the unsaturated snRandom-seq datasets were all over 3000, markedly higher than those of the two other reported high-throughput snRNA-seq methods for FFPE samples (snFFPE-Seq with 10X Chromium Single Cell 3′ Solution V3: 276 genes/nucleus; snPATHO-Seq: 1850 genes/nucleus) (Fig. 3h); the same held for the medians of UMIs (Supplementary Fig. 8c). Our data had not yet reached saturation even at ~300k mapped reads per nucleus, with ~10,000 genes detected (Fig. 3i).

We further compared the RNA coverage of snRandom-seq with that of the two other FFPE snRNA-seq methods. In the plot of average read distribution along the gene body, snFFPE-Seq, which uses oligo(dT) primers, showed a distinct 3′-end bias, and 10X Chromium Fixed RNA Profiling, which uses the same probe-based technology as snPATHO-seq, showed a mild 5′-end bias (Fig. 3j). In contrast, a homogeneous distribution across the gene body was observed in the snRandom-seq data for the FFPE tissue (Fig. 3j), suggesting that random primers bound evenly along transcripts and that the extra oligo(dT) primers in snRandom-seq were ineffective for FFPE samples. At the level of single nuclei, snRandom-seq showed much higher RNA coverage than snFFPE-seq or 10X Chromium Fixed RNA Profiling (Fig. 3k). At the level of single genes, the read distribution along three selected genes (C1S, EMG1, KLRG1) illustrated the critical difference between the probe-based technology and the random-primer-based strategy (Fig. 3l, Supplementary Fig. 8d). Reads mapped by 10X Chromium Fixed RNA Profiling were limited to the probe-target regions (<100 bp). In contrast, reads mapped by snRandom-seq were evenly distributed across both exonic and intronic regions. These results suggested that snRandom-seq for FFPE tissues can capture a significant amount of high-quality RNA and extract much more transcriptomic information than the state-of-the-art platforms.
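The two coverage views discussed above, read distribution along the gene body and the fraction of a gene that is covered, can be sketched for a single gene as follows. The sketch assumes positions are 0-based and measured from the 5′ end of the gene, and that reads are given as half-open intervals; the function names and the ten-bin default are hypothetical and do not describe the tool used to produce Fig. 3j, k.

```python
def gene_body_profile(read_positions, gene_length, n_bins=10):
    """Share of reads falling into each of n_bins equal slices, 5' to 3'.

    read_positions: 0-based read midpoints measured from the 5' end.
    Reads outside the gene are ignored.
    """
    if gene_length <= 0 or n_bins <= 0:
        raise ValueError("gene_length and n_bins must be positive")
    counts = [0] * n_bins
    for pos in read_positions:
        if 0 <= pos < gene_length:
            counts[pos * n_bins // gene_length] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


def covered_fraction(read_intervals, gene_length):
    """Fraction of gene positions covered by at least one read.

    read_intervals: half-open (start, end) pairs from the 5' end.
    A set of positions is used for clarity; memory grows with gene length.
    """
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    covered = set()
    for start, end in read_intervals:
        covered.update(range(max(0, start), min(end, gene_length)))
    return len(covered) / gene_length


# Toy 1 kb gene with made-up reads: even priming versus 3'-restricted priming.
even_reads = list(range(0, 1000, 10))
three_prime_reads = list(range(900, 1000, 5))
print(gene_body_profile(even_reads, 1000))
print(gene_body_profile(three_prime_reads, 1000))
print(covered_fraction([(0, 100), (50, 150)], 1000))
```

A flat profile corresponds to the homogeneous distribution described for random priming, whereas a profile concentrated in the last bins corresponds to a 3′-end bias.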

snRandom-seq revealed cell heterogeneity in FFPE mouse tissues

We next compared the cell types identified in FFPE and fresh samples by snRandom-seq. Unsupervised clustering of the filtered high-quality single-nucleus kidney profiles described above revealed over ten distinct clusters. All clusters could be further annotated based on classical known cell-type markers24,25 (Fig. 4a, b, Supplementary Fig. 9a). The expression of classical known cell-type marker genes22, such as Nphs1 for podocytes, Pecam1 for endothelial cells, and Pdgfrb for mesangial-like cells, mapped reliably onto the corresponding clusters (Fig. 4b). The mammalian renal tubule in the kidney contains at least 16 distinct epithelial cell types26. Here we identified most of the recommended terms for renal tubule epithelial cell types in FFPE mouse kidney samples by snRandom-seq, including proximal convoluted tubule, proximal straight tubule, distal nephron, distal convoluted tubule, loop of Henle, collecting duct principal cells, podocytes, proximal tubular cells, collecting duct intercalated cells, and collecting duct cells (Fig. 4a). Besides the known top markers of cell types, such as Slc14a2 for collecting duct cells, we also discovered several potential markers for these cell types (Fig. 4c). By merging the snRandom-seq data of the FFPE sample, the fresh sample, and the other batch of FFPE sample, we obtained a robust cell clustering by t-SNE (t-distributed stochastic neighbor embedding) (Supplementary Fig. 9b). Most cell types were identified in all three snRandom-seq datasets (Fig. 4d, Supplementary Fig. 9c). As expected, there were some differences in the proportions of cell types between the FFPE and fresh samples (such as PTC), which might be caused by sampling error and by the different nuclei extraction methods used for FFPE and fresh samples.

Fig. 4: Cell heterogeneity revealed in FFPE mouse tissues by snRandom-seq.

a t-SNE analysis of nuclei isolated from the FFPE mouse kidney sample by snRandom-seq based on their gene expressions and colored by identified cell types. The fourteen cell types identified were colored and shown below. b Expression of three selected cell-type markers in single nuclei in the t-SNE maps of FFPE mouse kidney. Gene expression levels are indicated by shades of red. c Dot plot of the average expressions of the top two markers in each of the 14 cell types. d Proportions of annotated cell types in the FFPE1, FFPE2, and fresh samples by snRandom-seq. e t-SNE and RNA velocity analysis of snRNAs from FFPE mouse testis by snRandom-seq. Velocity is shown as black arrows; cell types are shown in separate colors. The black arrows indicate the RNA maturation trajectory. f Percentages of spliced and unspliced transcripts in different cell types. g Cell cycle analysis of FFPE mouse testis by snRandom-seq. Points in the t-SNE were colored by identified cell cycle phases (G1, G2M, or S). Red dashed circle: two subpopulations of late spermatocytes at the G2M phase with active transcriptional activity. Source data are provided as a Source Data file.

We further added more FFPE mouse tissues to demonstrate the biological utility of snRandom-seq data. In total, we sequenced and analyzed 19,258 single nuclei from four FFPE mouse tissues (heart, kidney, testis, and liver) using snRandom-seq and identified a total of 25 cell types, such as hepatocytes, germ cells, fibroblasts, and cardiomyocytes (Supplementary Fig. 10a, b). An underrepresentation of immune cells could be seen, which is consistent with previous findings about cell type composition in single-nucleus RNA-seq libraries27.

The large proportion of intronic sequences detected in FFPE samples (Fig. 3b) suggested that snRandom-seq data would be well suited for RNA velocity analysis, which distinguishes newly transcribed (unspliced) RNAs from mature (spliced) RNAs28. We therefore applied snRandom-seq to FFPE mouse testis, where spermatogenesis is an excellent model for studying cell dynamics. Consistent with other studies on fresh testis by scRNA-seq29,30, t-SNE arranged germ cells at transitional stages (mainly early and late spermatocytes) in continuous succession. In contrast, undifferentiated spermatogonia and mature spermatids formed separate clusters (Fig. 4e). The velocities computed from the detected nascent transcripts were visualized on the t-SNE plot, revealing distinct velocity vector directions in different cell types, especially in the cells located at the left of the early and late spermatocytes (Fig. 4e, f). Combined with cell cycle state analysis based on gene expression, the RNA velocity revealed an obvious cell maturation trajectory in two subpopulations of late spermatocytes at the G2M phase with active transcriptional activity (Fig. 4g).
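The spliced/unspliced distinction that RNA velocity relies on can be illustrated with a simplified read classifier for a single gene. This is only a sketch under stated assumptions: each read is given as a list of aligned blocks (half-open intervals, so a junction-spanning read has two blocks), exons are non-overlapping, and any read with an aligned base in an intron is called unspliced. The function names and this rule are hypothetical simplifications; dedicated velocity tools apply more detailed rules.

```python
def classify_read(blocks, exons):
    """Return 'spliced' if every aligned base lies in an exon, else 'unspliced'.

    blocks: half-open (start, end) aligned segments of one read.
    exons: non-overlapping half-open (start, end) exon intervals of one gene.
    """
    for block_start, block_end in blocks:
        exonic_bases = sum(
            max(0, min(block_end, exon_end) - max(block_start, exon_start))
            for exon_start, exon_end in exons
        )
        if exonic_bases < block_end - block_start:
            return "unspliced"
    return "spliced"


def unspliced_fraction(reads, exons):
    """Fraction of reads classified as unspliced for one gene."""
    if not reads:
        raise ValueError("no reads given")
    labels = [classify_read(blocks, exons) for blocks in reads]
    return labels.count("unspliced") / len(labels)


# Toy gene with two exons and one intron (coordinates are made up).
toy_exons = [(0, 100), (200, 300)]
toy_reads = [
    [(10, 60)],               # fully inside exon 1
    [(80, 100), (200, 230)],  # spans the exon-exon junction
    [(90, 140)],              # runs from exon 1 into the intron
    [(120, 170)],             # fully inside the intron
]
print(unspliced_fraction(toy_reads, toy_exons))
```

Because random priming also captures intronic sequence, a large share of reads falls into the unspliced class, which is the signal the velocity analysis above builds on.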

snRandom-seq discovered a proliferative subpopulation in the FFPE clinical human specimen

Finally, we applied snRandom-seq to an approximately two-year-old clinical FFPE specimen of the human macrotrabecular-massive (MTM) hepatocellular carcinoma (HCC) subtype (Fig. 5a). We selected a tumorous area of interest on the paraffin block according to the histopathological examination (Fig. 5b) and performed snRandom-seq. snRandom-seq identified 5914 true nuclei and detected a median of 3220 genes and a median of 8182 UMIs per nucleus in this clinical FFPE specimen (Supplementary Fig. 11a, Fig. 5c). As sequencing depth increased, snRandom-seq detected about 8000 genes at saturation (Supplementary Fig. 11b). A broad spectrum of RNA biotypes, including lncRNAs, snRNAs, miscRNAs, miRNAs, and snoRNAs, was detected in the sample (Supplementary Fig. 11c). Unsupervised clustering of the human liver single nuclei revealed several distinct clusters. The main cell types of human liver could be identified in the specimen based on known cell-type markers31, including hepatocytes (APOA1), Kupffer cells (CD163), T cells (CD3E), fibroblasts (PDGFB), and plasma cells (FCRL5) (Fig. 5d, Supplementary Fig. 11d). Notably, a subcluster of hepatocytes (hepatocyte-2) was separated from the main hepatocyte population, with high expression of the proliferative marker MKI67 and two other markers (ASPM and TOP2A) that were reported to be related to HCC progression32,33 (Fig. 5e). Meanwhile, cell cycle analysis of these snRNAs revealed that most cells in the hepatocyte-2 cluster were in phase G2M (Fig. 5f), suggesting that the hepatocyte-2 cluster might be a group of dividing tumor cells. After further investigating the cell communication among the clusters (Fig. 5g), we found that hepatocyte-1 and hepatocyte-2 displayed different outgoing and incoming signaling patterns (Fig. 5h). Hepatocyte-2 mainly received signals from plasma cells through the BMP signaling pathway (Supplementary Fig. 12a), which is reported to be correlated with tumor progression in HCC34,35. Ligand–receptor pair analysis found that plasma cells preferentially sent signals to hepatocyte-2 by BMP6-(ACVR1 + ACVR2A) and that the communication between plasma cells and hepatocyte-2 involved specific ligand-receptor pairs, including BMP6-(BMPR1B + BMPR2), BMP6-(BMPR1B + ACVR2B), BMP6-(BMPR1B + ACVR2A), BMP6-(BMPR1A + ACVR2A), and BMP6-(ACVR1 + ACVR2A) (Fig. 5i). The gene expression data also showed that BMPR1B and ACVR2A were specifically expressed in hepatocyte-2 (Supplementary Fig. 12b). Taken together, snRandom-seq discovered a proliferative and activated subpopulation of hepatocytes in a clinical FFPE specimen, which provides a valuable clue for future study.

Fig. 5: snRandom-seq discovered a proliferative subpopulation in the clinical FFPE human specimen.

a Experimental overview of the clinical FFPE sample of the human macrotrabecular-massive hepatocellular carcinoma (MTM-HCC) subtype for snRandom-seq. b Histological appearance of MTM-HCC at low magnification (left, scale bar, 5 mm) and high magnification (right, scale bar, 200 μm). Red dashed circle: tumorous area. Blue dashed circle and line: sampling area. White dashed circle and line: magnified area. c Violin plots and box plots showing the number of genes and UMIs detected in the FFPE clinical human sample by snRandom-seq. MTM-HCC nuclei: n = 5914. Data were presented as median values. Data in the box plot corresponded to the first quartile (lower hinge), third quartile (upper hinge), and median (center). The upper whisker extended from the hinge to the maximum value no further than 1.5 * IQR from the hinge. The lower whisker extended from the hinge to the minimum value at most 1.5 * IQR from the hinge. d t-SNE map of nuclei isolated from the FFPE sample based on their gene expressions. Six cell types, including two hepatocellular subtypes, were annotated and shown. e Top three markers of each of the six cell types. f Percentages of the nuclei in phase G1, G2M, or S in the six cell types. g The total number of interactions among different cell populations. Circle sizes represented the number of cells in each cell group and edge width represented the communication probability. h Heatmap showing the outgoing signaling patterns (left) and incoming signaling patterns (right) of each cell cluster. i Bubble diagrams showing the communication probability and statistical significance of receptor-ligand pairs in the BMP signaling network. Dot color represented communication probabilities and dot size represented computed p values. Empty spaces mean that the communication probability was zero. p values were computed from a one-sided permutation test. Source data are provided as a Source Data file.

snRandom-seq was also performed on an FFPE specimen of human normal HCC subtype (Supplementary Fig. 13a). Based on the snRandom-seq data, sufficient gene and UMI counts were detected, and the main cell clusters of the liver were identified (Supplementary Fig. 13b, c). Previous studies have indicated that lncRNAs exhibit tissue-specific expression36,37, which is often overlooked in routine single-cell RNA-seq analysis because of their low expression. We found that hepatocyte clusters of the normal HCC subtype showed notable expression of lncRNAs, including LINC02476 and LINC01151 in hepatocyte-2; LINC00540, LINC02307, and LINC02109 in hepatocyte-3; and LINC02384 in hepatocyte-4 (Supplementary Fig. 13d). It has been reported that LINC02476 promotes the malignant phenotype of HCC by sponging miR-497 and increasing HMGA2 expression38, and that LINC00540 influences human HCC progression and metastasis via the NKD2-dependent Wnt/β-catenin pathway39. These results suggested that hepatocyte-2 (expressing LINC02476) and hepatocyte-3 (expressing LINC00540) of the normal HCC subtype might exhibit different pathogenesis. Taken together, snRandom-seq, with its advantage of full-length transcript coverage, shows promise for lncRNA analysis in cancer biology.

We further applied snRandom-seq to a matched pair of initial and relapsed FFPE clinical specimens from the same colorectal cancer liver metastasis (CRLM) patient. snRandom-seq detected medians of ~1000 genes and ~2000 UMIs in both the initial and relapsed FFPE specimens (Supplementary Fig. 14a). The cells from the initial and relapsed FFPE specimens were integrated, and the major cell types (hepatocytes, cancer cells, T cells, fibroblasts, myeloid cells, endothelial cells, stellate cells, macrophages, cholangiocytes, B/plasma cells) were identified in both samples (Supplementary Fig. 14b, c). We observed that the proportion of T cells was higher in the relapsed FFPE sample (Supplementary Fig. 14d), suggesting a more active antitumor immune response in the relapsed sample. Consistently, the proportions of the dominant cancer clusters (cancer cells-1, −2, and −3) were decreased in the relapsed sample (Supplementary Fig. 14d). However, the proportion of cancer cells-4 was increased in the relapsed sample (Supplementary Fig. 14d). We further found that genes encoding a regulator of lipid composition (SCD) and lipid-binding proteins (APOA2, APOC3, and APOA1) were highly expressed in the cancer cells-4 cluster in the relapsed sample (Supplementary Fig. 14e), suggesting enhanced lipid metabolism in this cancer cell subcluster of the relapsed CRLM.
