Discovery of binding motif pairs from protein complex structural data and protein interaction sequence data.

H. Li, J. Li, S. H. Tan, S. K. Ng

Research output: Contribution to journal, Article

9 Scopus citations


Unravelling the underlying mechanisms of protein interactions requires knowledge of the interactions' binding sites. In this paper, we use a novel concept, binding motif pairs, to describe binding sites. A binding motif pair consists of two motifs, one derived from each of the two interacting protein sequences. Discovery is a directed approach that combines two data sources: 3-D structures of protein complexes and sequences of interacting proteins. We first extract maximal contact segment pairs from the structural data of the protein complexes. We then use these segment pairs as templates to sub-group the interacting protein sequence dataset, and apply iterative refinement to derive significant binding motif pairs. This combined approach handles large protein interaction datasets efficiently. From a dataset of 78,390 protein interactions, we discovered 896 significant binding motif pairs. These include many novel motif pairs as well as motifs that agree well with experimentally validated patterns in the literature.
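
As a rough illustration of the first step named in the abstract, the sketch below enumerates maximal contact segment pairs from a toy set of residue contacts. It rests on an assumed reading of the term that the abstract does not spell out: a segment pair is taken to be valid when every residue in each segment contacts at least one residue in the other, and maximal when no larger valid pair contains it. The function names, the brute-force enumeration, and the toy contact set are all hypothetical and are not taken from the paper.

```python
from itertools import combinations_with_replacement


def is_contact_segment_pair(contacts, seg_a, seg_b):
    """Check the assumed validity rule for one pair of inclusive index ranges.

    contacts is a set of (i, j) tuples: residue i of chain A touches
    residue j of chain B. How contacts are derived from 3-D coordinates
    (e.g. a distance cutoff) is outside this sketch.
    """
    a_range = range(seg_a[0], seg_a[1] + 1)
    b_range = range(seg_b[0], seg_b[1] + 1)
    a_ok = all(any((i, j) in contacts for j in b_range) for i in a_range)
    b_ok = all(any((i, j) in contacts for i in a_range) for j in b_range)
    return a_ok and b_ok


def contains(outer, inner):
    """True if both segments of inner lie within the segments of outer."""
    (oa, ob), (ia, ib) = outer, inner
    return (oa[0] <= ia[0] and ia[1] <= oa[1]
            and ob[0] <= ib[0] and ib[1] <= ob[1])


def maximal_contact_segment_pairs(contacts, len_a, len_b):
    """Brute-force enumeration; only suitable for short toy chains."""
    segs_a = list(combinations_with_replacement(range(len_a), 2))
    segs_b = list(combinations_with_replacement(range(len_b), 2))
    valid = [(a, b) for a in segs_a for b in segs_b
             if is_contact_segment_pair(contacts, a, b)]
    # Keep only pairs that no other valid pair strictly contains.
    return [p for p in valid
            if not any(q != p and contains(q, p) for q in valid)]


# Hypothetical toy data: chain A has 4 residues, chain B has 3.
toy_contacts = {(0, 0), (1, 1), (3, 2)}
pairs = maximal_contact_segment_pairs(toy_contacts, len_a=4, len_b=3)
```

In this toy case the residues 0-1 of chain A pair with residues 0-1 of chain B, and residue 3 of chain A pairs with residue 2 of chain B. Each resulting index range could then be mapped back to its amino-acid substring to serve as a template for sub-grouping interacting sequences; the refinement step itself is not sketched here.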

Original language: English (US)
Pages (from-to): 312-323
Number of pages: 12
Journal: Pacific Symposium on Biocomputing
State: Published - 2004
Externally published: Yes


ASJC Scopus subject areas

  • Medicine(all)
