Fizzy: Feature subset selection for metagenomics

Gregory Ditzler, J. Calvin Morrison, Yemin Lan, Gail L. Rosen

Research output: Contribution to journal, Article, peer-review

21 Scopus citations

Abstract

Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- & β-diversity. Feature subset selection - a sub-field of machine learning - can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to identify the protein family abundances that best discriminate between age groups in the human gut microbiome.

Results: We have developed a new Python command line tool for microbial ecologists that implements information-theoretic subset selection methods for biological data formats and is compatible with the widely adopted BIOM format. We demonstrate the software tool's capabilities on publicly available datasets.

Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
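
To make the idea of information-theoretic feature selection concrete, the following is a minimal sketch of one simple scoring rule, mutual information maximization, which ranks each feature by its empirical mutual information with the phenotype label. It is not Fizzy's code or interface: the function names `mutual_information` and `mim_select`, the toy OTU table, and the assumption that abundances have already been discretized are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, of two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    # sum over observed pairs of p(x, y) * log2(p(x, y) / (p(x) * p(y)))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mim_select(table, labels, k):
    """Score each feature by I(feature; label) and return the names of the top k."""
    scores = {name: mutual_information(col, labels) for name, col in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: discretized OTU abundances (0 = low, 1 = high), six samples
otu_table = {
    "OTU_1": [0, 0, 0, 1, 1, 1],  # tracks the phenotype exactly
    "OTU_2": [0, 1, 0, 1, 0, 1],  # only weakly related to the phenotype
    "OTU_3": [1, 1, 1, 1, 1, 1],  # constant, so it carries no information
}
phenotype = ["A", "A", "A", "B", "B", "B"]

selected = mim_select(otu_table, phenotype, k=2)
print(selected)
```

In this toy table, `OTU_1` receives the highest score and the constant `OTU_3` scores zero. Scoring features independently like this ignores redundancy between selected features, which is the limitation that joint information-theoretic criteria are designed to address.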

Original language: English (US)
Article number: 358
Journal: BMC Bioinformatics
Volume: 16
Issue number: 1
State: Published - Nov 4 2015

Keywords

  • Comparative metagenomics
  • Feature subset selection
  • Open-source software

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
