Writeprint-based identification is becoming increasingly popular in criminal investigations because cybercrime incidents are on the rise and fingerprints are unavailable in cybercrime. A writeprint is composed of multiple features, such as vocabulary richness, sentence length, use of function words, paragraph layout, and keywords. These features can represent an author's writing style, which is usually consistent across his or her writings, and they form the basis of authorship analysis. A genetic algorithm (GA)-based feature selection model for identifying writeprint features can generate different combinations of features in search of the highest fitness value. The selected key writeprint features, which correspond to high classification accuracy, can effectively represent an author's distinct writing style and can assist in identifying the authorship of online messages.
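The idea of GA-based feature selection can be illustrated with a minimal sketch. It is not the configuration used in the article: the nearest-centroid classifier, the tournament selection, the one-point crossover, the mutation rate, and the synthetic two-author data are all assumptions chosen for brevity, and the names `ga_select`, `nearest_centroid_accuracy`, and `make_toy_data` are hypothetical. Each candidate solution is a bit mask over the features, and its fitness is the classification accuracy obtained with only the selected features.

```python
import random


def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k, (row, label) in enumerate(zip(X, y)):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += row[j]
            counts[label] = counts.get(label, 0) + 1
        # Predict the author whose centroid is closest (squared Euclidean distance).
        best = min(
            sums,
            key=lambda lab: sum(
                (X[i][j] - sums[lab][c] / counts[lab]) ** 2
                for c, j in enumerate(cols)
            ),
        )
        correct += best == y[i]
    return correct / len(X)


def ga_select(X, y, generations=30, pop_size=20, mutation_rate=0.05, seed=0):
    """Search for a feature mask with high fitness (assumed operators, see lead-in)."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}

    def fitness(mask):
        if mask not in cache:
            cache[mask] = nearest_centroid_accuracy(X, y, mask)
        return cache[mask]

    # Start from the full feature set plus random subsets.
    pop = [(1,) * n]
    pop += [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size - 1)]
    for _ in range(generations):
        next_pop = sorted(pop, key=fitness, reverse=True)[:2]  # elitism
        while len(next_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability mutation_rate.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return best, fitness(best)


def make_toy_data(seed=1):
    """Synthetic data: two informative features followed by four noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for author, shift in (("A", 0.0), ("B", 3.0)):
        for _ in range(15):
            informative = [rng.gauss(shift, 1.0), rng.gauss(-shift, 1.0)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
            X.append(informative + noise)
            y.append(author)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, accuracy = ga_select(X, y)
    print("selected feature mask:", mask, "accuracy:", accuracy)
```

Because the full feature set is placed in the initial population and the two fittest masks are carried over each generation, the returned subset in this sketch never scores below the all-features baseline on the same data. In practice, the columns would hold measured writeprint features rather than synthetic values.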
|Original language||English (US)|
|Number of pages||7|
|Journal||Communications of the ACM|
|State||Published - Jun 26 2006|
ASJC Scopus subject areas
- Computer Science(all)