The Arizona IDMatcher: A probabilistic identity matching system

G. Alan Wang, Siddharth Kaza, Shailesh Joshi, Kris Chang, Homa Atabakhsh, Hsinchun Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Various law enforcement and intelligence tasks require managing identity information effectively and efficiently. However, quality problems in identity information make this task non-trivial. Various heuristic-based systems have been developed to tackle the identity matching problem, but deploying them may require special expertise in system configuration and customization to reach optimal performance. In this paper, we propose an alternative system called the Arizona IDMatcher. The system relies on a machine learning algorithm to automatically generate a decision model for identity matching, and therefore requires minimal human configuration effort. Experiments show that the Arizona IDMatcher is very efficient in detecting matching identity records. Compared to IBM Identity Resolution (a commercial, heuristic-based system), the Arizona IDMatcher achieves better recall and overall F-measures in identifying matching identities in two large-scale real-world datasets.
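
The abstract compares systems by recall and F-measure. As a minimal sketch of how such scores are commonly computed for pairwise record matching, the snippet below scores a set of predicted matching pairs against a gold standard. The function name, the record IDs, and the choice of the balanced F-measure (F1) are assumptions made for illustration; the abstract does not state which F-measure variant or evaluation code the authors used.

```python
def precision_recall_f1(predicted_pairs, true_pairs):
    """Score predicted matching pairs against a gold standard of true pairs."""
    # Treat each pair as unordered so (a, b) and (b, a) count as the same match.
    predicted = {frozenset(p) for p in predicted_pairs}
    truth = {frozenset(p) for p in true_pairs}
    true_positives = len(predicted & truth)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical record IDs, made up for illustration only.
gold = [("r1", "r2"), ("r3", "r4"), ("r5", "r6"), ("r7", "r8")]
predicted = [("r2", "r1"), ("r3", "r4"), ("r5", "r9")]
p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this toy data, two of the three predicted pairs are correct and two of the four true pairs are missed. A system can therefore improve recall by finding more true matches even when its precision stays the same.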

Original language: English (US)
Title of host publication: ISI 2007: 2007 IEEE Intelligence and Security Informatics
Pages: 229-235
Number of pages: 7
State: Published - 2007
Event: ISI 2007: 2007 IEEE Intelligence and Security Informatics - New Brunswick, NJ, United States
Duration: May 23 2007 - May 24 2007

Other

Other: ISI 2007: 2007 IEEE Intelligence and Security Informatics
Country: United States
City: New Brunswick, NJ
Period: 5/23/07 - 5/24/07

Keywords

  • Adaptive detection
  • Fuzzy search
  • Identity matching
  • Identity resolution
  • Naïve Bayes

ASJC Scopus subject areas

  • Computer Science (all)
  • Control and Systems Engineering

Cite this

Wang, G. A., Kaza, S., Joshi, S., Chang, K., Atabakhsh, H., & Chen, H. (2007). The Arizona IDMatcher: A probabilistic identity matching system. In ISI 2007: 2007 IEEE Intelligence and Security Informatics (pp. 229-235). [4258703]

The Arizona IDMatcher: A probabilistic identity matching system. / Wang, G. Alan; Kaza, Siddharth; Joshi, Shailesh; Chang, Kris; Atabakhsh, Homa; Chen, Hsinchun.

ISI 2007: 2007 IEEE Intelligence and Security Informatics. 2007. p. 229-235 4258703.

Wang, GA, Kaza, S, Joshi, S, Chang, K, Atabakhsh, H & Chen, H 2007, The Arizona IDMatcher: A probabilistic identity matching system. in ISI 2007: 2007 IEEE Intelligence and Security Informatics., 4258703, pp. 229-235, ISI 2007: 2007 IEEE Intelligence and Security Informatics, New Brunswick, NJ, United States, 5/23/07.
Wang GA, Kaza S, Joshi S, Chang K, Atabakhsh H, Chen H. The Arizona IDMatcher: A probabilistic identity matching system. In ISI 2007: 2007 IEEE Intelligence and Security Informatics. 2007. p. 229-235. 4258703
Wang, G. Alan; Kaza, Siddharth; Joshi, Shailesh; Chang, Kris; Atabakhsh, Homa; Chen, Hsinchun. / The Arizona IDMatcher: A probabilistic identity matching system. ISI 2007: 2007 IEEE Intelligence and Security Informatics. 2007. pp. 229-235
@inproceedings{7447dbf224cc4a699b47bcea24053bc3,
title = "The Arizona IDMatcher: A probabilistic identity matching system",
abstract = "Various law enforcement and intelligence tasks require managing identity information in an effective and efficient way. However, the quality issues of identity information make this task non-trivial. Various heuristic based systems have been developed to tackle the identity matching problem. However, deploying such systems may require special expertise in system configuration and customization for optimal system performance. In this paper, we propose an alternative system called the Arizona IDMatcher. The system relies on a machine learning algorithm to automatically generate a decision model for identity matching. Such a system requires minimal human configuration effort. Experiments show that the Arizona IDMatcher is very efficient in detecting matching identity records. Compared to IBM Identity Resolution (a commercial, heuristic-based system), the Arizona IDMatcher achieves better recall and overall F-measures in identifying matching identities in two large-scale real-world datasets.",
keywords = "Adaptive detection, Fuzzy search, Identity matching, Identity resolution, Na{\"i}ve Bayes",
author = "Wang, {G. Alan} and Siddharth Kaza and Shailesh Joshi and Kris Chang and Homa Atabakhsh and Hsinchun Chen",
year = "2007",
language = "English (US)",
isbn = "1424413303",
pages = "229--235",
booktitle = "ISI 2007: 2007 IEEE Intelligence and Security Informatics",
}

TY - GEN
T1 - The Arizona IDMatcher
T2 - A probabilistic identity matching system
AU - Wang, G. Alan
AU - Kaza, Siddharth
AU - Joshi, Shailesh
AU - Chang, Kris
AU - Atabakhsh, Homa
AU - Chen, Hsinchun
PY - 2007
Y1 - 2007
AB - Various law enforcement and intelligence tasks require managing identity information in an effective and efficient way. However, the quality issues of identity information make this task non-trivial. Various heuristic based systems have been developed to tackle the identity matching problem. However, deploying such systems may require special expertise in system configuration and customization for optimal system performance. In this paper, we propose an alternative system called the Arizona IDMatcher. The system relies on a machine learning algorithm to automatically generate a decision model for identity matching. Such a system requires minimal human configuration effort. Experiments show that the Arizona IDMatcher is very efficient in detecting matching identity records. Compared to IBM Identity Resolution (a commercial, heuristic-based system), the Arizona IDMatcher achieves better recall and overall F-measures in identifying matching identities in two large-scale real-world datasets.
KW - Adaptive detection
KW - Fuzzy search
KW - Identity matching
KW - Identity resolution
KW - Naïve Bayes
UR - http://www.scopus.com/inward/record.url?scp=34748907260&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34748907260&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:34748907260
SN - 1424413303
SN - 9781424413300
SP - 229
EP - 235
BT - ISI 2007: 2007 IEEE Intelligence and Security Informatics
ER -