The bioinformaticians of the Australian Centre for Ecogenomics actively design and develop software for the analysis of ecogenomic datasets, software which furthers the research aims of the centre.

Please read further for examples of software currently used.

RefineM

RefineM is a set of tools for improving population genomes. It provides methods designed to improve the completeness of a genome along with methods for identifying and removing contamination. RefineM comprises only part of a full genome QC pipeline and should be used in conjunction with existing QC tools such as CheckM.

Please see https://github.com/dparks1134/RefineM.

CheckM

CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. It provides robust estimates of genome completeness and contamination by using collocated sets of genes that are ubiquitous and single-copy within a phylogenetic lineage. Genome quality can also be examined using plots depicting key genomic characteristics (e.g., GC content, coding density), which highlight sequences outside the expected distributions of a typical genome. CheckM also provides tools for identifying genome bins that are likely candidates for merging based on marker set compatibility, similarity in genomic characteristics, and proximity within a reference genome tree.
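
The idea of estimating completeness and contamination from collocated single-copy marker sets can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not CheckM's code or API: the function name `estimate_quality`, its arguments, and the marker IDs are all hypothetical, and lineage selection and marker identification are assumed to have been done already.

```python
from collections import Counter


def estimate_quality(marker_sets, marker_hits):
    """Return (completeness, contamination) as percentages.

    marker_sets: list of sets of marker gene IDs; each set is one group of
                 collocated single-copy markers (hypothetical input format).
    marker_hits: iterable of marker IDs found in the genome, one entry per copy.
    """
    if not marker_sets:
        raise ValueError("at least one marker set is required")

    counts = Counter(marker_hits)
    completeness = 0.0
    contamination = 0.0
    for markers in marker_sets:
        # Fraction of this set's markers seen at least once.
        present = sum(1 for m in markers if counts[m] >= 1)
        # Copies beyond the first suggest sequence from another genome.
        extra = sum(counts[m] - 1 for m in markers if counts[m] > 1)
        completeness += present / len(markers)
        contamination += extra / len(markers)

    n = len(marker_sets)
    return 100.0 * completeness / n, 100.0 * contamination / n


if __name__ == "__main__":
    sets = [{"A", "B"}, {"C"}, {"D", "E", "F"}]
    hits = ["A", "B", "C", "C", "D"]
    print(estimate_quality(sets, hits))
```

Averaging per set rather than per gene means that a missing stretch of neighbouring markers counts once instead of several times, which is the motivation for grouping collocated genes.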

Please see https://ecogenomics.github.io/CheckM/ and http://genome.cshlp.org/content/25/7/1043.full

GraftM

GraftM is a meta-omic tool that identifies and classifies marker genes in short read datasets (metagenomes and metatranscriptomes), as well as in assembled contigs, whole genomes and protein sequences. GraftM outputs a taxonomic/functional summary table, a Krona plot, and various other run statistics. Both unaligned and aligned "hit" sequences are provided. GraftM is designed for speed and accuracy: it is able to find marker genes in 200 Mb of assembled metagenome in under 20 seconds, and compares favourably with similar tools in accuracy benchmarking.

Please see http://geronimp.github.io/graftM

GroopM


GroopM is a metagenomic binning toolset. It leverages spatio-temporal dynamics (differential coverage) to accurately (and almost automatically) extract population genomes from multi-sample metagenomic datasets.
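
The intuition behind differential coverage is that contigs from the same population genome should rise and fall in coverage together across samples. The Python sketch below illustrates only that intuition, using a plain Pearson correlation between coverage profiles. It is not GroopM's algorithm, and the function name and example values are hypothetical.

```python
from math import sqrt


def coverage_correlation(profile_a, profile_b):
    """Pearson correlation between two per-sample coverage profiles.

    Each profile is a list with one coverage value per sample, in the same
    sample order. Returns 0.0 if either profile has no variation.
    """
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")

    n = len(profile_a)
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    sd_a = sqrt(sum((a - mean_a) ** 2 for a in profile_a))
    sd_b = sqrt(sum((b - mean_b) ** 2 for b in profile_b))
    if sd_a == 0 or sd_b == 0:
        return 0.0
    return cov / (sd_a * sd_b)


if __name__ == "__main__":
    contig_1 = [10.0, 20.0, 30.0]  # coverage in samples 1..3
    contig_2 = [5.0, 10.0, 15.0]   # same trend: plausibly the same genome
    contig_3 = [30.0, 20.0, 10.0]  # opposite trend: plausibly a different genome
    print(coverage_correlation(contig_1, contig_2))
    print(coverage_correlation(contig_1, contig_3))
```

Contigs whose profiles correlate strongly are candidates for the same bin. A single sample gives no such signal, which is why multi-sample datasets are needed.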

Please see http://ecogenomics.github.io/GroopM and https://peerj.com/articles/603

BamM

BamM is a C library, wrapped in Python, to efficiently generate and parse BAM files, specifically for the analysis of metagenomic data. For instance, it implements several methods to assess contig-wise read coverage, and provides a convenient interface for mapping multiple sequencing libraries against an assembly.
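
As an illustration of what contig-wise read coverage means, the Python sketch below computes the simplest such measure, mean depth per contig, from a list of alignment intervals. It does not use or reproduce BamM's interface: the function name and the `(contig, start, end)` tuple format are assumptions made for this example, and a real workflow would read the intervals from a BAM file.

```python
from collections import defaultdict


def mean_coverage(contig_lengths, alignments):
    """Return mean read depth per contig.

    contig_lengths: dict mapping contig name to length in bases.
    alignments: iterable of (contig, start, end) tuples with 0-based,
                end-exclusive coordinates (hypothetical input format).
    """
    aligned_bases = defaultdict(int)
    for contig, start, end in alignments:
        aligned_bases[contig] += end - start

    # Total aligned bases divided by contig length gives the mean depth.
    return {
        contig: aligned_bases[contig] / length
        for contig, length in contig_lengths.items()
    }


if __name__ == "__main__":
    lengths = {"contig_1": 100, "contig_2": 50, "contig_3": 10}
    reads = [("contig_1", 0, 50), ("contig_1", 25, 75), ("contig_2", 0, 50)]
    print(mean_coverage(lengths, reads))
```

Running the same calculation once per sequencing library yields one coverage value per contig per library.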

Please see http://ecogenomics.github.io/BamM/

FinishM

Metagenome and isolate assemblers generate contigs from reads, but still leave valuable information on the table. FinishM exploits this information to improve/finish a draft genome without any further laboratory-based work.

In even a moderately successful assembly, the resultant contigs constitute the vast majority of the genome being sequenced, but this fact is ignored by assemblers. Unlike a traditional assembler, FinishM does not attempt to directly extend contigs, but instead focuses on connecting already assembled contigs.

Please see https://github.com/wwood/finishm

SingleM

SingleM is a tool to find the abundances of discrete operational taxonomic units (OTUs) directly from shotgun metagenome data, without heavy reliance on reference sequence databases. It is able to differentiate closely related species even if those species are from lineages new to science.

Please see https://github.com/wwood/singlem

OrfM

A simple and not slow open reading frame (ORF) caller. No bells or whistles like frameshift detection, just a straightforward goal of returning a FASTA file of open reading frames over a certain length from a FASTA/Q file of nucleotide sequences.
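
The task described above can be sketched in a few lines of Python. This is a naive six-frame scan written for illustration, not OrfM's implementation. It assumes an ORF is any stretch of codons uninterrupted by a stop codon (no start codon required) and uses the standard stop codons. The function names and the `min_length` default are hypothetical choices for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_length=96):
    """Yield (strand, frame, start, orf) for stop-free stretches.

    Only stretches of at least min_length nucleotides are reported.
    start is the 0-based offset on the scanned strand.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = frame
            pos = frame
            while pos + 3 <= len(s):
                if s[pos:pos + 3] in STOP_CODONS:
                    if pos - start >= min_length:
                        yield strand, frame, start, s[start:pos]
                    start = pos + 3  # next ORF begins after the stop codon
                pos += 3
            # Report a stop-free stretch that runs to the end of the sequence.
            if pos - start >= min_length:
                yield strand, frame, start, s[start:pos]


if __name__ == "__main__":
    for strand, frame, start, orf in find_orfs("ATGAAATAGCCCGGG", min_length=6):
        print(f">orf_{strand}_{frame}_{start}")
        print(orf)
```

Writing each result as a header line followed by the sequence, as in the usage example, gives FASTA-style output.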

Please see https://github.com/wwood/OrfM and http://bioinformatics.oxfordjournals.org/content/early/2016/06/02/bioinf...
 

Last updated 1 September 2017
Last reviewed 13 July 2016

Address

Australian Centre for Ecogenomics
Level 5, Molecular Biosciences Bldg
University of Queensland
ST LUCIA QLD 4072
Brisbane, Australia

Copyright

© 2010-2017 Australian Centre for Ecogenomics