3.1. Native RNA unveils additional DEGs and DETs compared to cDNA.
Sequencing produced 2 x 580 290 571 short reads and 9 238 584 long reads. Across the eight sequencing libraries prepared for each technology, this corresponds to a mean of 72 536 321 and 1 154 823 raw reads per library, respectively. After trimming of the raw short reads, 2 x 514 565 651 sequences passed quality control (Supporting Information S1: Table 1).
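As a quick check of the figures above, the per-library means and the fraction of short reads retained after trimming can be recomputed from the reported totals. The following is a minimal sketch, assuming that "2 x" denotes paired-end reads counted as pairs; the variable names are illustrative and are not taken from the original analysis pipeline.

```python
# Totals as reported in the text.
SHORT_RAW = 580_290_571      # short-read pairs before trimming (assumed paired-end)
SHORT_TRIMMED = 514_565_651  # short-read pairs after trimming
LONG_RAW = 9_238_584         # long (direct RNA) reads
N_LIBRARIES = 8              # libraries per technology

# Mean raw reads per library, rounded to the nearest whole read.
mean_short = round(SHORT_RAW / N_LIBRARIES)
mean_long = round(LONG_RAW / N_LIBRARIES)

# Fraction of short-read pairs that survived trimming.
retained = SHORT_TRIMMED / SHORT_RAW

print(f"mean short reads per library: {mean_short:,}")
print(f"mean long reads per library:  {mean_long:,}")
print(f"short reads retained after trimming: {retained:.1%}")
```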
Using direct RNA sequencing, genes were characterized according to coding potential, yielding 12 051 expressed active regions, of which 8 043 were classified as protein coding, 1 326 as long non-coding RNAs, and 2 677 as other RNAs. Differential expression (DE) analysis identified 76 significant genes between the land and water forms of Riccia. Of these, 45 genes were classified as DEGs, of which 33 were downregulated (land-specific) and 12 were upregulated (water-specific). The log2 fold change (log2FC) for DEGs ranged from -7.02 to 3.54. Deep transcriptome analysis revealed 9 DELs (8 downregulated and 1 upregulated) under the land-water environmental change. The log2FC values for DELs ranged from -6.98 to 1.76. Additionally, the differential analysis revealed 18 land-specific (lowest log2FC = -6.13) and four water-specific (highest log2FC = 1.89) expression changes for other RNAs (Supporting Information S1: Table 2). Co-expression analysis revealed 8 trans-interactions between DEGs and DELs, 25 trans-interactions between DEGs and other RNAs, and 4 trans-interactions between DELs and other RNAs. All interactions were positively correlated based on the Pearson coefficient (Supporting Information S1: Table 3). The expression profiles of all DEGs, DELs, and other RNAs are presented in a volcano plot (Figure 1D), an MA-plot (Figure 1E), and a heatmap enriched with trans-interactions (Figure 1C). All 76 significant genes were checked against the Illumina RNA-seq results (Supporting Information S1: Table 4). The correlation between the expression changes obtained by Illumina and Nanopore for these genes was high, with a coefficient of 0.72 (Figure 1B). Interestingly, one DEG, evm.TU.utg2036_2952540_3002010__.5 (annotated as a Chlorophyll A-B binding family protein), was expressed in the Nanopore direct RNA data only in plants grown under terrestrial conditions, but showed no transcription in any group sequenced with Illumina technology.
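The sign convention and the category counts reported above can be summarized in a short sketch. The counts are those given in the text; the `direction` helper and the dictionary layout are illustrative assumptions, not part of the original workflow.

```python
# Significant features as reported in the text, stored as
# (downregulated / land-specific, upregulated / water-specific).
reported = {
    "DEG": (33, 12),       # protein-coding genes
    "DEL": (8, 1),         # long non-coding RNAs
    "other RNA": (18, 4),  # remaining RNA classes
}

def direction(log2fc: float) -> str:
    """Map the sign of log2FC to the direction used in this section."""
    if log2fc < 0:
        return "land-specific"
    if log2fc > 0:
        return "water-specific"
    return "unchanged"

# Per-class totals and the overall number of significant genes.
per_class = {name: down + up for name, (down, up) in reported.items()}
total = sum(per_class.values())

print(per_class)
print("total significant genes:", total)
print(direction(-7.02), "/", direction(3.54))
```

The per-class totals add up to the 76 significant genes stated in the text, and the two extreme DEG log2FC values fall on the land-specific and water-specific sides, respectively.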
The log2FC of six significant genes (gene IDs: CL.12695, CL.21377, CL.25655, CL.29541, CL.31779, CL.32326) from direct RNA sequencing did not overlap with the expression signature of these genes from Illumina sequencing. Certain modifications, such as m6A, m5C, pseudouridine, and hm5U, have been shown to increase error rates and reduce fidelity during reverse transcription into cDNA. This is likely due to interference with proper Watson-Crick base pairing, causing misincorporation of nucleotides. RNA modifications can also cause premature termination or stalling of the reverse transcriptase upstream of the modification site, leading to truncated cDNA products with reduced sequence coverage. Additionally, some modifications, such as pseudouridine, may induce deletions or mutations in the synthesized cDNA sequence under certain conditions, further reducing accuracy. Gene Ontology analysis revealed 192 significant terms, including cytoplasm (GO:0005737; 19 genes), response to stimulus (GO:0050896; 16), plastid (GO:0009536; 14), chloroplast (GO:0009507; 12), response to stress (GO:0006950; 10), and response to abiotic stimulus (GO:0009628; 8) (Supporting Information S1: Table 5 and Figure 1A).
Direct RNA sequencing also provided information on the expression of specific transcripts. The transcript-level analysis gave results similar to the gene-level analysis, while the differences in the response of Riccia fluitans to environmental change were significant and resolved in more detail. The transcript-level analysis revealed expression of 17 064 mRNAs in both the land and aquatic forms of Riccia fluitans. Of these, 61 transcripts were classified as significant: 46 had higher expression under land conditions and 15 had higher expression under aquatic conditions. The log2FC values ranged from -7.8 to 5.39. Among the DETs, 38 were identified as protein coding, while 7 and 16 were classified as DELs and other RNAs, respectively (Supporting Information S1: Table 6). The distributions of DETs, DELs, and other RNAs are presented in an MA-plot (Figure 2C and Supporting Information S2: Figure 1) and a circular plot with a heatmap (Figure 2D). The direct RNA expression values for DETs, DELs, and other RNAs were correlated with the Illumina sequencing data, giving a Pearson coefficient of 0.6 (Figure 2B). Among DELs, two transcripts of unknown function (CL.16392,CL.16402; CL.16392.1 and evm.model.group3.1783) were expressed solely in Riccia fluitans grown under land conditions, while Illumina sequencing failed to detect expression of either transcript. Interestingly, our results revealed eight transcripts with opposite expression trends between Nanopore and Illumina sequencing. The most divergent expression profile was observed for one transcript (CL.12695; evm.model.group2.1430), with a log2FC of -3.95 in Illumina and 2.32 in Nanopore (Figure 2E and Supporting Information S1: Tables 6 and 7).
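The cross-platform comparison described above can be sketched as follows: a Pearson coefficient over paired log2FC values, plus a sign test that flags transcripts with opposite trends. This is an illustration under stated assumptions, not the implementation used in the study. Only the CL.12695 pair is taken from the text; the `toy_*` entries are hypothetical values added so that the example runs.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)

def opposite_trend(log2fc_a: float, log2fc_b: float) -> bool:
    """True when the two platforms report changes in opposite directions."""
    return log2fc_a * log2fc_b < 0

# Transcript -> (Illumina log2FC, Nanopore log2FC).
pairs = {
    "CL.12695": (-3.95, 2.32),  # values reported in the text
    "toy_A": (-2.0, -1.5),      # hypothetical
    "toy_B": (1.0, 0.8),        # hypothetical
    "toy_C": (-4.0, -3.0),      # hypothetical
}

illumina = [v[0] for v in pairs.values()]
nanopore = [v[1] for v in pairs.values()]

r = pearson(illumina, nanopore)
discordant = [name for name, (a, b) in pairs.items() if opposite_trend(a, b)]

print(f"Pearson r on the example values: {r:.2f}")
print("opposite trends:", discordant)
```

Because the example mixes one reported pair with hypothetical values, the printed coefficient has no bearing on the 0.6 reported in Figure 2B; only the flagging of CL.12695 as discordant reflects the text.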
The transcripts were annotated to 201 GO terms (FDR < 0.05), such as response to stimulus (GO:0050896), cytoplasm (GO:0005737), response to stress (GO:0006950), cellular response to stimulus (GO:0051716), and plastid (GO:0009536) (Figure 2A and Supporting Information S1: Table 8). Compared to cDNA, native RNA revealed an additional 27 statistically significant genes (Figure 1D and 1E; Supporting Information S1: Tables 2 and 4; Supporting Information S2: Figures 2 and 3) and 28 statistically significant transcripts (Figure 2C; Supporting Information S1: Tables 6 and 7; Supporting Information S2: Figures 1, 4, and 5) in the gene-level and transcript-level differential analyses, respectively.
In the case of the evm.model.group2.1430 transcript, the applied methods did not reveal any m6A events that could affect reverse transcription; other types of modification were not identified because no appropriately trained model was available. While RNA modifications can directly affect reverse transcription fidelity and coverage, gene expression analyses are still generally comparable between direct RNA sequencing and cDNA sequencing approaches. However, properly accounting for modifications is important for accurate transcriptome characterization.
Fasciclin-like domains are found in a subclass of arabinogalactan proteins (AGPs) known as FASCICLIN-LIKE ARABINOGALACTAN PROTEINS (FLAs) in plants. These domains are essential for FLA function and are associated with cell adhesion. Fasciclin domains are typically 110 to 150 amino acids long and contain two highly conserved regions, H1 and H2, of approximately 10 amino acids each. FLAs are widely distributed in plant tissues and play roles in plant growth, development, and stress response. In Arabidopsis, they have been found to affect secondary cell wall development, stem biomechanics, and cell wall architecture. They are also involved in responses to stress and are thought to be involved in cell adhesion. However, their function and structure in non-seed plants are poorly explored.