TransportTP: A two-phase classification approach for membrane transporter prediction and characterization

Haiquan Li, Vagner A. Benedito, Michael K. Udvardi, Patrick X. Zhao

Research output: Contribution to journal › Article

35 Citations (Scopus)

Abstract

Background: Membrane transporters play crucial roles in living cells. Experimental characterization of transporters is costly and time-consuming. Current computational methods for transporter characterization still require extensive curation efforts, especially for eukaryotic organisms. We developed a novel genome-scale transporter prediction and characterization system called TransportTP that combined homology-based and machine learning methods in a two-phase classification approach. First, traditional homology methods were employed to predict novel transporters based on sequence similarity to known classified proteins in the Transporter Classification Database (TCDB). Second, machine learning methods were used to integrate a variety of features to refine the initial predictions. A set of rules based on transporter features was developed by machine learning using well-curated proteomes as guides.

Results: In a cross-validation using the yeast proteome for training and the proteomes of ten other organisms for testing, TransportTP achieved an equivalent recall and precision of 81.8%, based on TransportDB, a manually annotated transporter database. In an independent test using the Arabidopsis proteome for training and four recently sequenced plant proteomes for testing, it achieved a recall of 74.6% and a precision of 73.4%, according to our manual curation.

Conclusions: TransportTP is the most effective tool for eukaryotic transporter characterization up to date.

Original language: English (US)
Article number: 418
Journal: BMC Bioinformatics
Volume: 10
DOI: 10.1186/1471-2105-10-418
State: Published - Dec 14 2009
Externally published: Yes

Fingerprint

Membrane Transport Proteins
Proteome
Membrane
Proteins
Membranes
Machine Learning
Prediction
Learning systems
Homology
Arabidopsis
Testing
Databases
Cross-validation
Yeast
Computational Methods
Genome
Integrate
Protein
Predict

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Structural Biology
  • Applied Mathematics

Cite this

TransportTP: A two-phase classification approach for membrane transporter prediction and characterization. / Li, Haiquan; Benedito, Vagner A.; Udvardi, Michael K.; Zhao, Patrick X.

In: BMC Bioinformatics, Vol. 10, 418, 14.12.2009.

@article{fe762c34791f43dcb09ef470e739bff4,
title = "TransportTP: A two-phase classification approach for membrane transporter prediction and characterization",
abstract = "Background: Membrane transporters play crucial roles in living cells. Experimental characterization of transporters is costly and time-consuming. Current computational methods for transporter characterization still require extensive curation efforts, especially for eukaryotic organisms. We developed a novel genome-scale transporter prediction and characterization system called TransportTP that combined homology-based and machine learning methods in a two-phase classification approach. First, traditional homology methods were employed to predict novel transporters based on sequence similarity to known classified proteins in the Transporter Classification Database (TCDB). Second, machine learning methods were used to integrate a variety of features to refine the initial predictions. A set of rules based on transporter features was developed by machine learning using well-curated proteomes as guides. Results: In a cross-validation using the yeast proteome for training and the proteomes of ten other organisms for testing, TransportTP achieved an equivalent recall and precision of 81.8{\%}, based on TransportDB, a manually annotated transporter database. In an independent test using the Arabidopsis proteome for training and four recently sequenced plant proteomes for testing, it achieved a recall of 74.6{\%} and a precision of 73.4{\%}, according to our manual curation. Conclusions: TransportTP is the most effective tool for eukaryotic transporter characterization up to date.",
author = "Li, Haiquan and Benedito, {Vagner A.} and Udvardi, {Michael K.} and Zhao, {Patrick X.}",
year = "2009",
month = "12",
day = "14",
doi = "10.1186/1471-2105-10-418",
language = "English (US)",
volume = "10",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
}

TY - JOUR

T1 - TransportTP

T2 - A two-phase classification approach for membrane transporter prediction and characterization

AU - Li, Haiquan

AU - Benedito, Vagner A.

AU - Udvardi, Michael K.

AU - Zhao, Patrick X.

PY - 2009/12/14

Y1 - 2009/12/14

N2 - Background: Membrane transporters play crucial roles in living cells. Experimental characterization of transporters is costly and time-consuming. Current computational methods for transporter characterization still require extensive curation efforts, especially for eukaryotic organisms. We developed a novel genome-scale transporter prediction and characterization system called TransportTP that combined homology-based and machine learning methods in a two-phase classification approach. First, traditional homology methods were employed to predict novel transporters based on sequence similarity to known classified proteins in the Transporter Classification Database (TCDB). Second, machine learning methods were used to integrate a variety of features to refine the initial predictions. A set of rules based on transporter features was developed by machine learning using well-curated proteomes as guides. Results: In a cross-validation using the yeast proteome for training and the proteomes of ten other organisms for testing, TransportTP achieved an equivalent recall and precision of 81.8%, based on TransportDB, a manually annotated transporter database. In an independent test using the Arabidopsis proteome for training and four recently sequenced plant proteomes for testing, it achieved a recall of 74.6% and a precision of 73.4%, according to our manual curation. Conclusions: TransportTP is the most effective tool for eukaryotic transporter characterization up to date.

AB - Background: Membrane transporters play crucial roles in living cells. Experimental characterization of transporters is costly and time-consuming. Current computational methods for transporter characterization still require extensive curation efforts, especially for eukaryotic organisms. We developed a novel genome-scale transporter prediction and characterization system called TransportTP that combined homology-based and machine learning methods in a two-phase classification approach. First, traditional homology methods were employed to predict novel transporters based on sequence similarity to known classified proteins in the Transporter Classification Database (TCDB). Second, machine learning methods were used to integrate a variety of features to refine the initial predictions. A set of rules based on transporter features was developed by machine learning using well-curated proteomes as guides. Results: In a cross-validation using the yeast proteome for training and the proteomes of ten other organisms for testing, TransportTP achieved an equivalent recall and precision of 81.8%, based on TransportDB, a manually annotated transporter database. In an independent test using the Arabidopsis proteome for training and four recently sequenced plant proteomes for testing, it achieved a recall of 74.6% and a precision of 73.4%, according to our manual curation. Conclusions: TransportTP is the most effective tool for eukaryotic transporter characterization up to date.

UR - http://www.scopus.com/inward/record.url?scp=75149162948&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=75149162948&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-10-418

DO - 10.1186/1471-2105-10-418

M3 - Article

C2 - 20003433

AN - SCOPUS:75149162948

VL - 10

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 418

ER -