A probabilistic model for approximate identity matching

G. Alan Wang, Hsinchun Chen, Homa Atabakhsh

Research output: Contribution to conference (Paper)

Abstract

Identity management is critical to governmental practices ranging from providing citizen services to enforcing homeland security. Searching for a specific identity is difficult because unintentional errors and intentional deception can produce multiple representations of the same identity. We propose a probabilistic Naïve Bayes model that improves on the effectiveness of existing identity matching techniques. Experiments show that the proposed model performs significantly better than both an exact-match technique and an approximate-match record comparison algorithm. In addition, the model greatly reduces the effort of manually labeling training instances by employing a semi-supervised learning approach. This training method outperforms both fully supervised and unsupervised learning. With a training dataset in which only 10% of the instances are labeled, the model achieves performance comparable to that of fully supervised learning.
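
The abstract does not specify the model's features or training procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each pair of identity records has been reduced to a tuple of binary indicators (1 if a field such as name or date of birth agrees approximately, 0 otherwise), uses a Bernoulli Naïve Bayes classifier over those indicators, and uses expectation-maximization (EM) as one common way to learn from a mix of labeled and unlabeled pairs. The function names, the features, and the toy data are all hypothetical.

```python
import math


def fit(pairs, weights, n_features, alpha=1.0):
    """Estimate P(match) and P(field agrees | class) from weighted pairs.

    weights[i] is the probability that pairs[i] is a match
    (1.0 or 0.0 for labeled pairs, a soft value for unlabeled ones).
    alpha is a Laplace smoothing constant (an assumption of this sketch).
    """
    w_match = sum(weights)
    w_non = len(pairs) - w_match
    prior = (w_match + alpha) / (len(pairs) + 2 * alpha)
    theta = {1: [], 0: []}
    for j in range(n_features):
        agree_match = sum(w * x[j] for x, w in zip(pairs, weights))
        agree_non = sum((1 - w) * x[j] for x, w in zip(pairs, weights))
        theta[1].append((agree_match + alpha) / (w_match + 2 * alpha))
        theta[0].append((agree_non + alpha) / (w_non + 2 * alpha))
    return prior, theta


def posterior(x, prior, theta):
    """Return P(match | x) under the Naive Bayes independence assumption."""
    log_match = math.log(prior)
    log_non = math.log(1 - prior)
    for j, xj in enumerate(x):
        p1, p0 = theta[1][j], theta[0][j]
        log_match += math.log(p1 if xj else 1 - p1)
        log_non += math.log(p0 if xj else 1 - p0)
    return 1 / (1 + math.exp(log_non - log_match))


def train_semi_supervised(labeled, unlabeled, n_iter=20):
    """EM: start from the labeled pairs, then refine with soft-labeled ones."""
    xs = [x for x, _ in labeled]
    ys = [float(y) for _, y in labeled]
    n_features = len(xs[0])
    prior, theta = fit(xs, ys, n_features)
    for _ in range(n_iter):
        # E-step: soft labels for the unlabeled pairs.
        soft = [posterior(x, prior, theta) for x in unlabeled]
        # M-step: re-estimate from labeled (hard) and unlabeled (soft) pairs.
        prior, theta = fit(xs + list(unlabeled), ys + soft, n_features)
    return prior, theta


# Toy data: each tuple holds four hypothetical field-agreement indicators.
labeled = [((1, 1, 1, 1), 1), ((0, 0, 0, 0), 0),
           ((1, 1, 0, 1), 1), ((0, 1, 0, 0), 0)]
unlabeled = [(1, 1, 1, 0), (0, 0, 1, 0), (1, 0, 1, 1),
             (0, 0, 0, 1), (1, 1, 1, 1), (0, 0, 0, 0)]

prior, theta = train_semi_supervised(labeled, unlabeled)
print(posterior((1, 1, 1, 1), prior, theta))  # probability of a match
print(posterior((0, 0, 0, 0), prior, theta))  # probability of a match
```

In this sketch the labeled pairs keep their hard labels in every iteration, which anchors the two classes, while the unlabeled pairs contribute in proportion to their current match probability.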

Original language: English (US)
Pages: 462-463
Number of pages: 2
DOI: https://doi.org/10.1145/1146598.1146750
State: Published - Dec 1 2006
Event: 7th Annual International Conference on Digital Government Research, Dg.o 2006 - San Diego, CA, United States
Duration: May 21 2006 to May 24 2006

Keywords

  • Identity matching
  • Naïve Bayes model
  • Semi-supervised learning

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Wang, G. A., Chen, H., & Atabakhsh, H. (2006). A probabilistic model for approximate identity matching (pp. 462-463). Paper presented at 7th Annual International Conference on Digital Government Research, Dg.o 2006, San Diego, CA, United States. https://doi.org/10.1145/1146598.1146750