Gene families as soft cliques with backbones: Amborella contrasted with other flowering plants

Chunfang Zheng, Alexey Kononenko, Jim Leebens-Mack, Eric H Lyons, David Sankoff

Research output: Contribution to journal › Article

Abstract

Background: Chaining is a major problem in constructing gene families. Results: We define a new kind of cluster on graphs with strong and weak edges: soft cliques with backbones (SCWiB). This differs from other definitions in how it controls the "chaining effect", by ensuring clusters satisfy a tolerant edge density criterion that takes into account cluster size. We implement algorithms for decomposing a graph of similarities into SCWiBs. We compare examples of output from SCWiB and the Markov Cluster Algorithm (MCL), and also compare some curated Arabidopsis thaliana gene families with the results of automatic clustering. We apply our method to 44 published angiosperm genomes with annotation, and discover that Amborella trichopoda is distinct from all the others in having substantially and systematically smaller proportions of moderate- and large-size gene families. Conclusions: We offer several possible evolutionary explanations for this result.
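
The chaining effect mentioned in the abstract can be illustrated with a small sketch. This is not the SCWiB algorithm, whose exact definition is not given on this page; it only shows, under assumed toy data, how clustering by connected components merges a chain of pairwise-similar genes into one family, and how an edge density check exposes the problem. The gene names, the helper functions, and the MIN_DENSITY threshold are all hypothetical.

```python
from itertools import combinations


def connected_components(nodes, edges):
    """Cluster by reachability: any path of similarity edges merges genes."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, comp = [n], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def edge_density(cluster, edges):
    """Fraction of all possible pairs in the cluster that share an edge."""
    present = {frozenset(e) for e in edges}
    pairs = list(combinations(sorted(cluster), 2))
    if not pairs:
        return 1.0
    return sum(frozenset(p) in present for p in pairs) / len(pairs)


# Hypothetical toy data: five genes linked only to their neighbours in a chain.
genes = ["g1", "g2", "g3", "g4", "g5"]
chain = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5")]

clusters = connected_components(genes, chain)
density = edge_density(clusters[0], chain)

# Assumed threshold for illustration only; it is not taken from the paper.
MIN_DENSITY = 0.6
is_dense_enough = density >= MIN_DENSITY
```

Here all five genes land in a single cluster even though only 4 of the 10 possible pairs are similar, so a density criterion of this kind would reject the cluster. According to the abstract, the SCWiB criterion is tolerant and takes cluster size into account, which a single fixed threshold like the one above does not do.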

Original language: English (US)
Article number: S8
Journal: BMC Genomics
Volume: 15
Issue number: 6
State: Published - Oct 17 2014

Keywords

  • Amborella trichopoda
  • Angiosperms
  • Clustering
  • Gene families
  • S-plex

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
