Nematode Alternative Splicing
Alternative splicing (AS) of mRNA is a vital mechanism for enhancing genomic complexity in eukaryotes. Spliced isoforms of the same gene can have diverse molecular and biological functions and are often differentially expressed across tissues, life cycle stages, and environmental conditions. AS therefore has important implications for the study of parasites with complex life cycles. Transcriptomes from many parasitic nematode species have been sequenced as a customary first step in exploring their genetic repertoire. However, most of these transcriptome assemblies were generated using protocols that were not designed to account for AS, so the underlying data should be revisited with cDNA-specific assembly software, using parameters optimized for high-confidence reconstruction of transcript isoforms.
cDNA from the model worm Caenorhabditis elegans was sequenced using Roche/454 technology and assembled with Newbler software, invoking the cDNA option. Various combinations of parameters were tested, and the resulting assemblies were compared to known C. elegans genes and transcript isoforms to assess accuracy and coverage. Novel transcript isoforms were validated using Illumina RNA-seq data.
Through careful adjustment of program parameters, we were able to increase the percentage of isotigs (the transcript-level sequences reported by Newbler) that matched known C. elegans transcript isoforms, decrease mis-assembly rates (i.e., cis- and trans-chimeras), and improve the coverage of the gene set. Our optimized, cDNA-specific assembly protocol was used to update de novo transcriptome assemblies from nine species of parasitic nematodes, including several human and animal pathogens of medical, veterinary, and agricultural importance. These assemblies indicated AS rates in the range of 20-30%, typically with 2-3 transcripts per AS locus, depending on the species. In most cases, the Roche/454 data explored in this study are the only sequences available from the species in question; however, the recently published genome of the human hookworm Necator americanus provided an additional opportunity to validate our results. Transcript isoforms from the nine species were translated and searched for similarity to known proteins and functional domains. Some 21 InterPro domains, including several involved in nucleotide and chromatin binding, were statistically correlated with AS loci.
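As an illustration of how the two summary statistics above (AS rate and transcripts per AS locus) can be derived from an assembly, the following minimal Python sketch counts isotigs per locus. It assumes, for illustration only, that each isotig has already been assigned to a locus identifier (for example a Newbler isogroup); the function name summarize_as, the input mapping, and the example identifiers are hypothetical and are not taken from the study's own pipeline.

```python
from collections import defaultdict


def summarize_as(isotig_to_locus):
    """Summarize alternative splicing from an isotig -> locus mapping.

    Returns a tuple (as_rate, mean_isoforms_per_as_locus), where
    as_rate is the fraction of loci with more than one isotig and the
    mean is taken over those multi-isotig loci only.
    The input mapping is a hypothetical format chosen for this sketch.
    """
    # Collect the distinct isotigs observed at each locus.
    loci = defaultdict(set)
    for isotig, locus in isotig_to_locus.items():
        loci[locus].add(isotig)
    if not loci:
        raise ValueError("no loci in input")

    # A locus is counted as alternatively spliced if it has >1 isotig.
    as_sizes = [len(members) for members in loci.values() if len(members) > 1]
    as_rate = len(as_sizes) / len(loci)
    mean_isoforms = sum(as_sizes) / len(as_sizes) if as_sizes else 0.0
    return as_rate, mean_isoforms


# Toy input (invented identifiers, not real data): five loci, two with
# multiple isotigs.
example = {
    "isotig00001": "isogroup00001",
    "isotig00002": "isogroup00001",
    "isotig00003": "isogroup00002",
    "isotig00004": "isogroup00003",
    "isotig00005": "isogroup00004",
    "isotig00006": "isogroup00004",
    "isotig00007": "isogroup00004",
    "isotig00008": "isogroup00005",
}

if __name__ == "__main__":
    rate, mean_isoforms = summarize_as(example)
    print(f"AS rate: {rate:.0%}; isoforms per AS locus: {mean_isoforms:.1f}")
```

In this toy input, two of the five loci carry more than one isotig. Counting every multi-isotig locus as alternatively spliced is a simplification: in practice, mis-assemblies and allelic variants would need to be filtered out before such counts are interpreted as AS.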
Our optimized assembly parameters have facilitated the first broad survey of AS among parasitic nematodes. The nine transcriptome assemblies, as well as protein translations and basic annotations, are available from Nematode.net as a resource for the research community. These should be useful for studies of specific genes and gene families of interest as well as for curating genome assemblies as they become available.