Quality Assessment of Peer-Produced Content in Knowledge Repositories using Development and Coordination Activities

Srikar Velichety, Sudha Ram, Jesse C Bockstedt

Research output: Contribution to journal › Article

Abstract

We develop a method to assess the quality of peer-produced content in knowledge repositories using their development and coordination histories. We also develop a process to identify relevant features for quality assessment models, and algorithms for processing datasets in large-scale knowledge repositories. Models using these features on English-language Wikipedia articles outperform existing methods for quality assessment. We achieve an overall accuracy of 81 percent, which is a 7 percent improvement over existing models. In addition, our features improve the precision and recall of each class by up to 9 percent and 17 percent, respectively. Finally, our models are robust under ten-fold cross-validation and across the techniques used for classification. Overall, our research provides a comprehensive design science framework for both identifying and efficiently extracting features related to development and coordination activities and assessing quality using these features. We also provide details of a potential implementation of a quality assessment system for knowledge repositories.

Original language: English (US)
Pages (from-to): 478-512
Number of pages: 35
Journal: Journal of Management Information Systems
Volume: 36
Issue number: 2
DOI: 10.1080/07421222.2019.1598692
State: Published - Apr 3 2019

Fingerprint

Peers
Quality assessment
Repository
Processing
Wikipedia
Cross-validation
Design science

Keywords

  • big data analytics
  • design science
  • knowledge repositories
  • peer-produced content
  • predictive analytics
  • Wikipedia

ASJC Scopus subject areas

  • Management Information Systems
  • Computer Science Applications
  • Management Science and Operations Research
  • Information Systems and Management

Cite this

@article{169582e9b47243478629bd90eea3d381,
title = "Quality Assessment of Peer-Produced Content in Knowledge Repositories using Development and Coordination Activities",
abstract = "We develop a method to assess the quality of peer-produced content in knowledge repositories using their development and coordination histories. We also develop a process to identify relevant features for quality assessment models and algorithms for processing datasets in large-scale knowledge repositories. Models using these features, on English language Wikipedia articles, outperform existing methods for quality assessment. We achieve an overall accuracy of 81 percent which is a 7 percent improvement over existing models. In addition, our features improve the precision and recall of each class up to 9 percent and 17 percent respectively. Finally, our models are robust to ten-fold cross validation and techniques used for classification. Overall, our research provides a comprehensive design science framework for both identifying and efficiently extracting features related to development and coordination activities and assessing quality using these features. We also provide details of potential implementation of a quality assessment system for knowledge repositories.",
keywords = "big data analytics, design science, knowledge repositories, peer-produced content, predictive analytics, Wikipedia",
author = "Velichety, Srikar and Ram, Sudha and Bockstedt, {Jesse C}",
year = "2019",
month = "4",
day = "3",
doi = "10.1080/07421222.2019.1598692",
language = "English (US)",
volume = "36",
pages = "478--512",
journal = "Journal of Management Information Systems",
issn = "0742-1222",
publisher = "M.E. Sharpe Inc.",
number = "2",
}

TY - JOUR
T1 - Quality Assessment of Peer-Produced Content in Knowledge Repositories using Development and Coordination Activities
AU - Velichety, Srikar
AU - Ram, Sudha
AU - Bockstedt, Jesse C
PY - 2019/4/3
Y1 - 2019/4/3
N2 - We develop a method to assess the quality of peer-produced content in knowledge repositories using their development and coordination histories. We also develop a process to identify relevant features for quality assessment models and algorithms for processing datasets in large-scale knowledge repositories. Models using these features, on English language Wikipedia articles, outperform existing methods for quality assessment. We achieve an overall accuracy of 81 percent which is a 7 percent improvement over existing models. In addition, our features improve the precision and recall of each class up to 9 percent and 17 percent respectively. Finally, our models are robust to ten-fold cross validation and techniques used for classification. Overall, our research provides a comprehensive design science framework for both identifying and efficiently extracting features related to development and coordination activities and assessing quality using these features. We also provide details of potential implementation of a quality assessment system for knowledge repositories.
AB - We develop a method to assess the quality of peer-produced content in knowledge repositories using their development and coordination histories. We also develop a process to identify relevant features for quality assessment models and algorithms for processing datasets in large-scale knowledge repositories. Models using these features, on English language Wikipedia articles, outperform existing methods for quality assessment. We achieve an overall accuracy of 81 percent which is a 7 percent improvement over existing models. In addition, our features improve the precision and recall of each class up to 9 percent and 17 percent respectively. Finally, our models are robust to ten-fold cross validation and techniques used for classification. Overall, our research provides a comprehensive design science framework for both identifying and efficiently extracting features related to development and coordination activities and assessing quality using these features. We also provide details of potential implementation of a quality assessment system for knowledge repositories.
KW - big data analytics
KW - design science
KW - knowledge repositories
KW - peer-produced content
KW - predictive analytics
KW - Wikipedia
UR - http://www.scopus.com/inward/record.url?scp=85067342869&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067342869&partnerID=8YFLogxK
U2 - 10.1080/07421222.2019.1598692
DO - 10.1080/07421222.2019.1598692
M3 - Article
VL - 36
SP - 478
EP - 512
JO - Journal of Management Information Systems
JF - Journal of Management Information Systems
SN - 0742-1222
IS - 2
ER -