Abstract

Over 19 billion reads from an Illumina NovaSeq sequencing run of Saccharum sp., derived from 48 multiplexed libraries (PE 2×100 bp), were analyzed using FastQC. We then performed rRNA removal and QC filtering, which excluded reads associated with five microorganisms known to be present in the sample. The libraries represent four distinct sugarcane cultivars: one pathogen-tolerant cultivar and three susceptible cultivars. Contigs were subsequently assembled de novo using Trinity. In this study, we propose using AlphaFold to elucidate protein function within a subgroup of differentially expressed peptides drawn from the full set of assembled contigs. Specifically, this subgroup includes contigs shared across all four cultivars and contigs found in all three susceptible cultivars but not in the tolerant one. Contigs within these subsets were further filtered to retain those that align to the Saccharum sp. genome, show coding potential, are differentially expressed, and are classified as unassigned within the COG framework by the eggNOG program. Using the Multiple Sequence Alignment (MSA) from the AlphaFold output, a subgroup of 400 contigs was screened on quality metrics such as sequence identity and e-value, and function was inferred by comparison with known protein structures. As a result, we were able to assign a function to a subgroup of contigs and found that all high-scoring matches belong to Poaceae species but have not yet been described in Saccharum sp.
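
The subset selection and hit screening described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `Hit` record, the contig identifiers, and the identity and e-value thresholds are all hypothetical, chosen only to show the set logic (contigs shared by all four cultivars, plus contigs shared by the three susceptible cultivars and absent from the tolerant one) and a simple quality filter on alignment hits.

```python
from dataclasses import dataclass


def select_contigs(tolerant, susceptible):
    """Split contigs into two groups.

    tolerant: contig IDs seen in the tolerant cultivar.
    susceptible: list of contig-ID collections, one per susceptible cultivar.
    Returns (shared_by_all, susceptible_only).
    """
    # Contigs present in every susceptible cultivar.
    shared_susceptible = set.intersection(*(set(s) for s in susceptible))
    tolerant = set(tolerant)
    shared_by_all = shared_susceptible & tolerant
    susceptible_only = shared_susceptible - tolerant
    return shared_by_all, susceptible_only


@dataclass(frozen=True)
class Hit:
    contig: str
    subject: str
    identity: float  # percent sequence identity
    evalue: float


def filter_hits(hits, min_identity=50.0, max_evalue=1e-5):
    """Keep hits passing both thresholds (threshold values are assumptions)."""
    return [h for h in hits
            if h.identity >= min_identity and h.evalue <= max_evalue]


if __name__ == "__main__":
    # Toy contig IDs, not real data.
    tolerant = {"c1", "c2", "c5"}
    susceptible = [{"c1", "c2", "c3"}, {"c1", "c3", "c4"}, {"c1", "c3", "c5"}]
    shared, unique = select_contigs(tolerant, susceptible)
    print(sorted(shared), sorted(unique))

    hits = [
        Hit("c1", "subject_A", 82.0, 1e-40),
        Hit("c3", "subject_B", 35.0, 1e-3),
    ]
    print([h.contig for h in filter_hits(hits)])
```

In practice the thresholds would need to be chosen and justified for the data at hand; the defaults here are placeholders.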

 


Institutions
  • 1 Genomics and Transposable Elements Laboratory (GaTE Lab), Departamento de Botânica. Universidade de São Paulo (USP) São Paulo-SP
  • 2 Departamento de Genética ESALQ - Universidade de São Paulo (USP) Piracicaba-SP/Brazil.
  • 3 Departamento de Fitopatologia ESALQ - Universidade de São Paulo (USP) Piracicaba-SP/Brazil.
  • 4 Centro de Cana - Instituto Agronômico (IAC) - Ribeirão Preto-SP/Brazil.
Track
  • 1. Protein Dynamics and Function
Keywords
Protein structures
AlphaFold
unknown function
Plant-microbe interaction
protein functional annotation