Discovery of binding motif pairs from protein complex structural data and protein interaction sequence data.

Haiquan Li, J. Li, S. H. Tan, S. K. Ng

Research output: Chapter in Book/Report/Conference proceeding › Chapter

9 Citations (Scopus)

Abstract

Unravelling the underlying mechanisms of protein interactions requires knowledge about the interactions' binding sites. In this paper, we use a novel concept, binding motif pairs, to describe binding sites. A binding motif pair consists of two motifs each derived from one side of the binding protein sequences. The discovery is a directed approach that uses a combination of two data sources: 3-D structures of protein complexes and sequences of interacting proteins. We first extract maximal contact segment pairs from the protein complexes' structural data. We then use these segment pairs as templates to sub-group the interacting protein sequence dataset, and conduct an iterative refinement to derive significant binding motif pairs. This combination approach is efficient in handling large datasets of protein interactions. From a dataset of 78,390 protein interactions, we have discovered 896 significant binding motif pairs. The discovered motif pairs include many novel motif pairs as well as motifs that agree well with experimentally validated patterns in the literature.
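
The first step in the abstract, extracting contact segment pairs from a complex's structural data, can be illustrated with a minimal sketch. This record does not give the paper's formal definition of a maximal contact segment pair, so the code below uses a simplified stand-in that is purely an assumption: two residues are "in contact" when their representative coordinates lie within a distance cutoff, a segment is a maximal run of consecutive contact residues on one chain, and two segments are paired when at least one contact links them. The function names, the 5.0 cutoff, and the one-coordinate-per-residue input format are all hypothetical.

```python
from itertools import groupby
from math import dist


def contacts(chain_a, chain_b, cutoff=5.0):
    """Return the set of (i, j) residue index pairs within `cutoff` of each other.

    Each chain is a list of (x, y, z) tuples, one per residue (assumed format).
    """
    return {
        (i, j)
        for i, a in enumerate(chain_a)
        for j, b in enumerate(chain_b)
        if dist(a, b) <= cutoff
    }


def runs(indices):
    """Split indices into maximal runs of consecutive integers as (start, end), inclusive."""
    out = []
    # Consecutive integers share the same value of (index - position in sorted order).
    for _, grp in groupby(enumerate(sorted(indices)), key=lambda t: t[1] - t[0]):
        block = [idx for _, idx in grp]
        out.append((block[0], block[-1]))
    return out


def contact_segment_pairs(chain_a, chain_b, cutoff=5.0):
    """Pair up contact segments of the two chains (simplified, assumed definition)."""
    pairs = contacts(chain_a, chain_b, cutoff)
    runs_a = runs({i for i, _ in pairs})
    runs_b = runs({j for _, j in pairs})
    result = []
    for sa in runs_a:
        for sb in runs_b:
            # Keep the segment pair only if some contact falls inside both segments.
            if any(sa[0] <= i <= sa[1] and sb[0] <= j <= sb[1] for i, j in pairs):
                result.append((sa, sb))
    return result
```

In the approach the abstract describes, segment pairs like these would then serve as templates for sub-grouping the interacting sequence dataset before iterative refinement; that later stage is not sketched here because the record gives no detail on it.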

Original language: English (US)
Title of host publication: Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
Pages: 312-323
Number of pages: 12
State: Published - 2004
Externally published: Yes

Fingerprint

Amino Acid Motifs
Proteins
Binding Sites
Information Storage and Retrieval
Carrier Proteins
Datasets

Cite this

Li, H., Li, J., Tan, S. H., & Ng, S. K. (2004). Discovery of binding motif pairs from protein complex structural data and protein interaction sequence data. In Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing (pp. 312-323)

@inbook{c15ab8eddcf047f09a7f9c6877a582cf,
title = "Discovery of binding motif pairs from protein complex structural data and protein interaction sequence data.",
abstract = "Unravelling the underlying mechanisms of protein interactions requires knowledge about the interactions' binding sites. In this paper, we use a novel concept, binding motif pairs, to describe binding sites. A binding motif pair consists of two motifs each derived from one side of the binding protein sequences. The discovery is a directed approach that uses a combination of two data sources: 3-D structures of protein complexes and sequences of interacting proteins. We first extract maximal contact segment pairs from the protein complexes' structural data. We then use these segment pairs as templates to sub-group the interacting protein sequence dataset, and conduct an iterative refinement to derive significant binding motif pairs. This combination approach is efficient in handling large datasets of protein interactions. From a dataset of 78,390 protein interactions, we have discovered 896 significant binding motif pairs. The discovered motif pairs include many novel motif pairs as well as motifs that agree well with experimentally validated patterns in the literature.",
author = "Li, Haiquan and Li, J. and Tan, {S. H.} and Ng, {S. K.}",
year = "2004",
language = "English (US)",
pages = "312--323",
booktitle = "Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing",

}

TY - CHAP

T1 - Discovery of binding motif pairs from protein complex structural data and protein interaction sequence data.

AU - Li, Haiquan

AU - Li, J.

AU - Tan, S. H.

AU - Ng, S. K.

PY - 2004

Y1 - 2004

N2 - Unravelling the underlying mechanisms of protein interactions requires knowledge about the interactions' binding sites. In this paper, we use a novel concept, binding motif pairs, to describe binding sites. A binding motif pair consists of two motifs each derived from one side of the binding protein sequences. The discovery is a directed approach that uses a combination of two data sources: 3-D structures of protein complexes and sequences of interacting proteins. We first extract maximal contact segment pairs from the protein complexes' structural data. We then use these segment pairs as templates to sub-group the interacting protein sequence dataset, and conduct an iterative refinement to derive significant binding motif pairs. This combination approach is efficient in handling large datasets of protein interactions. From a dataset of 78,390 protein interactions, we have discovered 896 significant binding motif pairs. The discovered motif pairs include many novel motif pairs as well as motifs that agree well with experimentally validated patterns in the literature.

UR - http://www.scopus.com/inward/record.url?scp=2442704465&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=2442704465&partnerID=8YFLogxK

M3 - Chapter

SP - 312

EP - 323

BT - Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing

ER -