Researchers from Argonne National Laboratory, The University of Chicago and San Diego State University have developed an open source system for the automated processing of metagenomic sequence data.
The new metagenomics-Rast server is designed for use with data generated by 454 Sequencing systems.
The study of microbial communities from environmental samples, known as metagenomics, has been advanced by high-throughput, low-cost sequencing platforms such as 454's Genome Sequencer FLX System.
With the growing use of sequencing approaches for metagenomic analysis, the major research challenges have shifted from generating to analysing data.
The open source system was developed to resolve these analysis bottlenecks.
The resource, designed for data files from the 454 Sequencing platform, produces phylogenetic and functional summaries of metagenomes by comparing their sequences against protein and nucleotide databases.
The metagenomics-Rast server is based on the Seed framework for comparative genomics.
Researchers can upload data sets in the file format generated by the GS FLX instrument, either as raw reads or as assembled contigs.
Rob Edwards, the project lead, said: 'We built this analysis tool with specific consideration for 454 Sequencing data sets.
'Only the long sequence reads from the GS FLX System ensure the specificity needed to compare data against DNA or protein databases for functional annotation, making it the platform of choice for metagenomic analysis.'
Metagenomes uploaded to the high-throughput pipeline are compared against a variety of known sequence databases, including rRNA and mitochondrial databases, and are screened for protein-encoding genes.
The tool provides the data types needed for phylogenetic comparisons, functional annotations, binning of sequences, phylogenomic profiling and metabolic reconstructions.
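As a rough illustration of how database comparisons can be turned into phylogenetic and functional summaries, the following Python sketch keeps the best database hit for each read and counts the taxa and functions attached to those hits. It is a toy example and not the server's own code: the hit list, the annotation table, the `summarise` function and the e-value cut-off are all hypothetical.

```python
from collections import Counter

# Hypothetical similarity-search hits: (read_id, database_subject, e_value).
hits = [
    ("read1", "protA", 1e-30),
    ("read1", "protB", 1e-5),
    ("read2", "protC", 1e-12),
    ("read3", "protB", 1e-8),
    ("read4", "protA", 1e-2),  # weak hit, removed by the cut-off below
]

# Hypothetical annotation table: subject -> (taxon, functional category).
annotations = {
    "protA": ("Proteobacteria", "Carbohydrate metabolism"),
    "protB": ("Firmicutes", "Amino acid metabolism"),
    "protC": ("Proteobacteria", "Respiration"),
}


def summarise(hits, annotations, max_evalue=1e-5):
    """Keep each read's best hit (lowest e-value) and count taxa and functions."""
    best = {}
    for read_id, subject, evalue in hits:
        if evalue > max_evalue:
            continue  # discard hits weaker than the (assumed) cut-off
        if read_id not in best or evalue < best[read_id][1]:
            best[read_id] = (subject, evalue)

    taxa = Counter()
    functions = Counter()
    for subject, _ in best.values():
        taxon, function = annotations[subject]
        taxa[taxon] += 1
        functions[function] += 1
    return taxa, functions


taxa, functions = summarise(hits, annotations)
print(dict(taxa))
print(dict(functions))
```

Reads with no hit below the cut-off simply drop out of both counts, which is one simple way such a summary can avoid being skewed by weak matches.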
The study, titled 'The Metagenomics Rast Server - a Public Resource for the Automatic Phylogenetic and Functional Analysis of Metagenomes', appears in the journal BMC Bioinformatics.
The 454 Life Sciences group develops and commercialises the 454 Sequencing system for ultra-high-throughput DNA sequencing.
Specific applications include de novo sequencing and re-sequencing of genomes, metagenomics, RNA analysis and targeted sequencing of DNA regions of interest.
The 454 Sequencing system supports simple, unbiased sample preparation and produces long, accurate sequence reads, including paired-end reads.
The 454 Sequencing system has enabled hundreds of peer-reviewed studies in diverse research fields, including cancer and infectious disease research, drug discovery, marine biology, anthropology and paleontology.