
Global Identification of Transcribed Regions of the C. elegans Genome

Robert Waterston (PI), University of Washington
Mark Gerstein, Yale University
Philip Green, University of Washington
Michael MacCoss, University of Washington
David Miller, Vanderbilt University
Valerie Reinke, Yale University
Frank Slack, Yale University

Despite the complexity of the challenge, the goal of this proposal is to approach a complete description of the transcribed genome, with each transcribed region classified where possible by function, for example as likely acting as an RNA or as encoding a protein. Where we cannot definitively classify a sequence, we expect to provide estimates of the likelihood of its function. We will combine all existing data for protein-coding genes with gene models from GeneFinder, Twinscan and the current WormBase to define a set of confirmed exon-intron junctions, and assemble these into complete coding genes with their alternative splice forms. This thorough reannotation of the genome will by itself move beyond the current WormBase gene set, both by incorporating additional available data, including the emerging genome sequences of related nematodes, and by examining all the datasets across the entire genome.
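As a rough illustration of how junctions from several gene-model sources might be pooled and sorted into confirmed and unconfirmed sets, here is a minimal Python sketch. It is not the project's pipeline: the function names, the `"cDNA"` evidence source, the coordinates, and the rule that a junction counts as confirmed when at least one experimental source supports it are all assumptions made for the example.

```python
from collections import defaultdict

def introns(exons):
    """Return the introns implied by a list of (start, end) exons.

    Coordinates are assumed to be 1-based and inclusive.
    """
    ordered = sorted(exons)
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(ordered, ordered[1:])]

def junction_support(models_by_source):
    """Map each (chrom, strand, intron_start, intron_end) to its supporting sources."""
    support = defaultdict(set)
    for source, models in models_by_source.items():
        for chrom, strand, exons in models:
            for start, end in introns(exons):
                support[(chrom, strand, start, end)].add(source)
    return support

def classify(support, evidence_sources):
    """Split junctions by whether any experimental evidence source supports them."""
    confirmed, unconfirmed = set(), set()
    for junction, sources in support.items():
        if sources & evidence_sources:
            confirmed.add(junction)
        else:
            unconfirmed.add(junction)
    return confirmed, unconfirmed

# Hypothetical gene models: (chromosome, strand, list of exons).
models = {
    "GeneFinder": [("I", "+", [(100, 200), (301, 400)])],
    "Twinscan":   [("I", "+", [(100, 200), (301, 400), (501, 600)])],
    "cDNA":       [("I", "+", [(100, 200), (301, 400)])],  # assumed evidence track
}

support = junction_support(models)
confirmed, unconfirmed = classify(support, evidence_sources={"cDNA"})
```

In this toy input, the junction predicted only by one gene finder ends up in `unconfirmed`, which is the kind of prediction that targeted RT-PCR and RACE would then test.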

We will in turn test predicted but unconfirmed exons, including UTRs and their intron junctions, by specific RT-PCR and RACE followed by sequence analysis of the products.

We will seek new evidence for transcribed regions by exploiting the genome tiling arrays for C. elegans that have recently become commercially available. To maximize the chances of discovering stage- and cell-specific transcripts, we will use carefully staged populations, mutants lacking or overproducing certain cell types, purified embryonic cell populations, and mRNA populations purified from specific post-embryonic cell types.

These data will in turn be added to existing datasets, along with any other data emerging from the community, to suggest additional gene models for both protein-coding and non-coding transcripts. Again, these results will guide further RT-PCR and RACE experiments. We expect these data to be particularly valuable for suggesting previously undefined 5' and 3' UTRs.

Finally, we will use mass spectrometry to look for evidence of translation of small open reading frames from either single-exon genes or spliced transcripts. The overall result should be a usefully comprehensive set of transcripts, cataloged into protein-coding and non-coding RNAs, that will aid in the identification of other functional elements in the worm genome and provide a resource for broader explorations of gene function in this incomparable worm.
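To make the idea of a small open reading frame concrete, the following Python sketch enumerates ATG-to-stop ORFs in all six frames of a transcript sequence. It is only an illustration under stated assumptions: the function name `small_orfs`, the length thresholds, and the choice to report half-open, 0-based coordinates on the forward strand are hypothetical, and the proposal does not specify how candidate ORFs would be selected before mass spectrometry.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def small_orfs(seq, min_codons=10, max_codons=100):
    """Yield (strand, start, end, orf_seq) for each small ORF in six frames.

    An ORF runs from the first ATG in a frame to the next in-frame stop codon.
    The codon count includes the start codon and excludes the stop codon.
    start and end are 0-based, half-open, forward-strand coordinates.
    The thresholds are illustrative assumptions, not values from the proposal.
    """
    seq = seq.upper()
    n = len(seq)
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    codons = (i - start) // 3
                    if min_codons <= codons <= max_codons:
                        end = i + 3
                        if strand == "+":
                            yield strand, start, end, s[start:end]
                        else:
                            # Convert reverse-strand offsets to forward coordinates.
                            yield strand, n - end, n - start, s[start:end]
                    start = None

# Toy sequence: a four-codon ORF flanked by two bases on each side.
example = "CC" + "ATG" + "GCT" * 3 + "TAA" + "GG"
orfs = list(small_orfs(example, min_codons=2))
```

Predicted ORFs of this kind could then be translated in silico and compared against observed peptides, though the matching step is outside the scope of this sketch.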

Contact Information

  1. PI contact: Bob Waterston (waterston@gs.washington.edu)
  2. Informatics coordination: LaDeana Hillier (lhillier@watson.wustl.edu)
  3. Microarrays: David Miller (david.miller@vanderbilt.edu), Valerie Reinke (valerie.reinke@yale.edu)
  4. Mass Spec.: Michael MacCoss (maccoss@gs.washington.edu)