3.1. Native RNA unveils additional DEGs and DETs compared to
cDNA.
Sequencing produced 2 x 580 290 571 short reads and 9 238 584 long
reads. Across the eight libraries sequenced with each technology, this
corresponds to a mean of 72 536 321 and 1 154 823 raw reads per
library, respectively. After trimming, 2 x 514 565 651 short reads
passed quality control (Supporting Information S1: Table 1).
Using direct RNA sequencing, 12 051 expressed active regions were
characterized according to coding potential: 8 043 were classified as
protein coding, 1 326 as long non-coding RNAs, and 2 677 as other RNAs.
DE analysis identified 76 significant genes between the land and water
forms of Riccia. Of these, 45 genes were designated as DEGs, of which
33 were downregulated (land-specific) and 12 were upregulated
(water-specific). The log2 fold change (log2FC) of the DEGs ranged from
-7.02 to 3.54. Deep transcriptome analysis revealed 9 DELs (8 down- and 1
upregulated) under the land-water environmental change. The log2FC
values for DELs ranged from -6.98 to 1.76. Additionally, the
differential analysis revealed 18 land-specific (lowest log2FC = -6.13)
and four water-specific (highest log2FC = 1.89) differentially
expressed other RNAs (Supporting Information S1: Table
2). Co-expression analysis revealed 8 trans-interactions between DEGs
and DELs, 25 trans-interactions between DEGs and other RNAs, and 4
trans-interactions between DELs and other RNAs. All interactions were
positively correlated based on the Pearson coefficient (Supporting
Information S1: Table 3). The expression profiles of all DEGs, DELs,
and other RNAs are presented in a volcano plot (Figure 1D), an MA-plot
(Figure 1E), and a heatmap supplemented with trans-interactions
(Figure 1C). All 76 significant genes were checked against the Illumina RNA-seq
results (Supporting Information S1: Table 4). The correlation between
the expression changes obtained by Illumina and Nanopore for these
genes was high, with a coefficient of 0.72 (Figure 1B). Interestingly,
one DEG, evm.TU.utg2036_2952540_3002010__.5 (annotated as a Chlorophyll
A-B binding family protein), was detected by Nanopore direct RNA
sequencing only in plants grown under terrestrial conditions, but
showed no expression in any group sequenced with Illumina technology.
For six significant genes (gene IDs: CL.12695, CL.21377, CL.25655,
CL.29541, CL.31779, and CL.32326), the log2FC from direct RNA
sequencing was not consistent with that from Illumina sequencing. Such
discrepancies may partly arise during cDNA synthesis: certain RNA
modifications, such as m6A, m5C,
pseudouridine, and hm5U have been shown to increase error rates and
reduce fidelity during reverse transcription into cDNA. This is likely
due to interference with proper Watson-Crick base pairing, which causes
nucleotide misincorporation. RNA modifications can also cause premature
termination or stalling of the reverse transcriptase upstream of the
modification site, leading to truncated cDNA products with reduced
sequence coverage. Additionally, some modifications, such as
pseudouridine, may induce deletions or mutations in the synthesized
cDNA sequence under certain conditions, further
reducing accuracy. Gene Ontology analysis identified 192 significant GO
terms, including cytoplasm (GO:0005737; 19 genes), response to stimulus
(GO:0050896; 16), plastid (GO:0009536; 14), chloroplast (GO:0009507;
12), response to stress (GO:0006950; 10), and response to abiotic
stimulus (GO:0009628; 8) (Supporting Information S1: Table 5 and
Figure 1A).
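The read counts and gene categories reported above can be cross-checked
with simple arithmetic. The following Python sketch is illustrative
only and is not part of the analysis pipeline: the variable and
function names are hypothetical, the numbers are copied from the text,
and it is assumed that the "2 x" notation denotes paired reads, so that
the per-library mean refers to one mate of each pair.

```python
# Illustrative consistency check of counts reported in the text.
# All names are hypothetical; the numbers are copied from the section above.

N_LIBRARIES = 8  # libraries per sequencing technology


def mean_per_library(total_reads: int, n_libraries: int = N_LIBRARIES) -> int:
    """Return the mean number of reads per library, truncated to an integer."""
    return total_reads // n_libraries


short_total = 580_290_571  # short reads (assumed: one mate of the 2 x pairs)
long_total = 9_238_584     # Nanopore direct RNA reads

short_mean = mean_per_library(short_total)
long_mean = mean_per_library(long_total)

# Significant genes by category: (downregulated / land, upregulated / water)
significant = {
    "DEG": (33, 12),
    "DEL": (8, 1),
    "other RNA": (18, 4),
}
total_significant = sum(down + up for down, up in significant.values())

# Fraction of short reads retained after trimming
retained = 514_565_651 / short_total

print(short_mean, long_mean, total_significant, f"{retained:.1%}")
```

Under these assumptions the sketch reproduces the reported per-library
means (72 536 321 and 1 154 823) and the total of 76 significant genes.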
Direct RNA sequencing also provided information on the expression of
individual transcripts. The transcript-level analysis gave results
similar to the gene-level analysis, while the differences in the
response of Riccia fluitans to environmental change were significant
and resolved in greater detail.
The transcript-level analysis revealed expression of 17 064 mRNAs in
the land and aquatic forms of Riccia fluitans. Of these, 61 transcripts
were classified as significant: 46 had higher expression under land
conditions and 15 had higher expression under aquatic conditions. Their
log2FC values ranged from -7.8 to 5.39.
Among the significant transcripts, 38 were identified as protein
coding, while 7 and 16 were classified as DELs and other RNAs,
respectively (Supporting Information S1: Table 6).
The distributions of DETs, DELs, and other RNAs are presented in an
MA-plot (Figure 2C and Supporting Information S2: Figure 1) and a
circular plot with a heatmap (Figure 2D). The direct RNA expression
values for DETs, DELs, and other RNAs were correlated with the Illumina
sequencing data, giving a Pearson coefficient of 0.6 (Figure 2B).
Among DELs, two transcripts of unknown function
(CL.16392, CL.16402; CL.16392.1 and evm.model.group3.1783) were
expressed solely in Riccia fluitans grown under land conditions, while
Illumina sequencing failed to detect expression of either transcript.
Interestingly, eight transcripts showed opposite expression trends
between Nanopore and Illumina sequencing. The most divergent profile
was observed for one transcript (CL.12695; evm.model.group2.1430),
whose log2FC was -3.95 with Illumina and 2.32 with Nanopore
(Figure 2E and Supporting Information S1: Tables 6 and 7). The
transcripts were annotated to 201 GO terms (FDR < 0.05), such as
response to stimulus (GO:0050896), cytoplasm (GO:0005737), response to
stress (GO:0006950), cellular response to stimulus (GO:0051716), and
plastid (GO:0009536) (Figure 2A and Supporting Information S1: Table
8). Compared with cDNA, native RNA revealed 27 additional statistically
significant genes (Figure 1D and 1E; Supporting Information S1: Tables
2 and 4; Supporting Information S2: Figures 2 and 3) and 28 additional
statistically significant transcripts (Figure 2C; Supporting
Information S1: Tables 6 and 7; Supporting Information S2: Figures 1, 4
and 5) in the gene- and transcript-level differential analyses,
respectively.
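The cross-platform comparison described above (correlating expression
changes and identifying transcripts with opposite expression trends)
can be sketched as follows. This is a minimal illustration and not the
pipeline used in the study: it assumes that log2FC values are the
quantity compared between platforms, the function names are
hypothetical, and all log2FC values are made-up placeholders except the
pair reported in the text for CL.12695 (-3.95 with Illumina, 2.32 with
Nanopore).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def opposite_trend(log2fc):
    """Return IDs whose log2FC has opposite signs on the two platforms."""
    return [tid for tid, (illumina, nanopore) in log2fc.items()
            if illumina * nanopore < 0]


# ID -> (Illumina log2FC, Nanopore log2FC)
# Only the CL.12695 pair comes from the text; the rest are placeholders.
log2fc = {
    "CL.12695": (-3.95, 2.32),
    "hypothetical_1": (-5.10, -6.20),
    "hypothetical_2": (1.40, 2.00),
    "hypothetical_3": (-2.30, -1.10),
}

illumina = [v[0] for v in log2fc.values()]
nanopore = [v[1] for v in log2fc.values()]
r = pearson(illumina, nanopore)
discordant = opposite_trend(log2fc)
print(f"Pearson r = {r:.2f}; discordant: {discordant}")
```

With these placeholder values only CL.12695 is flagged as discordant. A
sign-based criterion is the simplest possible choice; a real analysis
would likely also consider the statistical significance of each
estimate.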
In the case of the evm.model.group2.1430 transcript, the applied
methods did not reveal any m6A events that could affect reverse
transcription; other types of modification could not be identified
because no suitably trained model was available. While RNA
modifications can directly affect reverse transcription fidelity and
coverage, gene expression analyses are still generally comparable
between direct RNA sequencing and cDNA sequencing approaches. However,
properly accounting for modifications is important for accurate
transcriptome characterization. Fasciclin-like domains are
found in a subclass of arabinogalactan proteins (AGPs) known as
FASCICLIN-LIKE ARABINOGALACTAN PROTEINS (FLAs) in plants. These domains
are essential for FLA function and are associated with cell adhesion.
Fasciclin domains are typically 110 to 150 amino acids long
and contain two highly conserved regions, H1 and H2, of approximately 10
amino acids each. FLAs are widely distributed in plant tissues and play
roles in plant growth, development, and stress response. In
Arabidopsis, they have been found to affect secondary cell wall
development, stem biomechanics, and cell wall architecture. They are
also involved in responses to stress and are thought to participate in
cell adhesion. However, their function and structure in non-seed plants
remain poorly explored.