4 LIPME - Laboratoire des Interactions Plantes Microbes Environnement (Chemin de Borde-Rouge - BP 27 31326 CASTANET TOLOSAN CEDEX - Name and acronym changed on 1 January 2021 - Former name: Laboratoire des interactions plantes micro-organismes - Former acronym: LIPM - France)
Abstract: The availability of next-generation sequences of transcripts from prokaryotic organisms offers the opportunity to design a new generation of automated genome annotation tools not yet available for prokaryotes. In this work, we designed EuGene-P, the first integrative prokaryotic gene finder tool that combines a variety of high-throughput data, including oriented RNA-Seq data, directly into the prediction process. This enables the automated prediction of coding sequences (CDSs), untranslated regions, transcription start sites (TSSs) and non-coding RNA (ncRNA, sense and antisense) genes. EuGene-P was used to comprehensively and accurately annotate the genome of the nitrogen-fixing bacterium Sinorhizobium meliloti strain 2011, leading to the prediction of 6308 CDSs as well as 1876 ncRNAs. Among the ncRNAs, 1280 appeared as antisense to a CDS, which supports recent findings that antisense transcription activity is widespread in bacteria. Moreover, 4077 TSSs upstream of protein-coding or non-coding genes were precisely mapped, providing valuable data for the study of promoter regions. By looking for RpoE2-binding sites upstream of annotated TSSs, we were able to extend the S. meliloti RpoE2 regulon approximately 3-fold. Altogether, these observations demonstrate the power of EuGene-P to produce a reliable and high-resolution automatic annotation of prokaryotic genomes.
https://hal-cnrs.archives-ouvertes.fr/hal-03082947 Contributor: Fernanda de Carvalho-Niebel Submitted on: Thursday, January 14, 2021 - 2:19:15 PM Last modification on: Wednesday, March 23, 2022 - 3:47:13 AM Long-term archiving on: Thursday, April 15, 2021 - 6:56:36 PM