Public sharing of medical advice using social media: An analysis of Twitter

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Introduction: Social media tools, such as Facebook®, Twitter™, blogs and online communities, are increasingly used for networking and for distributing information in medicine and public health. Participation in these media has increased sharply over the past decade. Six years ago, Twitter did not exist, yet now an estimated 15% of the world population subscribes to it. This has created a large-scale, complex, unindexed, publicly available data source.

Goal: We sought to understand the richness and novelty of health-related tweets by analyzing the characteristics of health information-focused tweets using automated and manual analysis.

Research methods: Using the Twitter Search application programming interface (API), we retrieved two sets of English-language tweets with keywords related to asthma (#asthma and asthma). Using natural language processing, we categorized tweets by assumed source (retweeted by a person, sent by an organization, or originated by an individual) and by content (containing medication, symptoms, triggers, a combination, or none of these). For tweet source, we assumed that tweets retweeted to a person (i.e., @username) were sent by an individual; that those not retweeted but containing a URL were sent by an organization; and that the remaining tweets were original content tweeted by an individual. For content categorization, we used lexicons containing terms for asthma medication, symptoms, and five types of asthma triggers (activities, air pollutants, allergens, environmental and irritants). In addition, we conducted content analysis using a combined text mining and manual approach. Applying association rule mining to the tweets, we generated an overview of the most frequent combinations of terms, presented as if-then rules. The manual, in-depth analysis evaluated a random sample of 200 tweets for originality, content, credibility and relevance.

Costs: The costs associated with this project consisted of the time needed to process tweets. While over 500 million tweets are generated daily, the cost of this information distribution is shared among millions of Twitter subscribers.

Results: The analysis showed that the majority of tweets contain URLs and many are retweeted. The proportion of tweets containing personal, new content is small. The majority of tweets are sent by organizations, both commercial and noncommercial, and their content consists of broad facts and statements. Both medication and environmental triggers are common topics.

Conclusion: The high diversity in topics and terminology, combined with the small proportion of personal tweets, should be taken into account when using Twitter as a resource for tracking and discovering health behaviors or problems in the population. The large proportion of tweets referring to external information may make Twitter a very useful tool for accessing grey literature, with the tweets serving as descriptors. Further research is needed to create comprehensive vocabularies and methods to efficiently label tweets.
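
The categorization steps described under Research methods can be illustrated with a minimal sketch. It is not the authors' implementation. The lexicon terms, the sample tweets, the function names, and the thresholds are all hypothetical, since the abstract does not list them. The sketch also assumes that an @username mention marks a retweet, and it restricts rule mining to pairs of lexicon terms for brevity.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-lexicons; the study's actual term lists are not given.
LEXICONS = {
    "medication": {"albuterol", "inhaler", "steroid"},
    "symptoms": {"wheezing", "cough", "breathless"},
    "triggers": {"pollen", "smoke", "dust", "exercise"},
}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z]+")


def classify_source(tweet):
    """Apply the three source heuristics in the order the abstract gives them."""
    if MENTION_RE.search(tweet):  # assumption: @username marks a retweet
        return "retweeted by a person"
    if URL_RE.search(tweet):
        return "sent by an organization"
    return "originated by an individual"


def tokens(tweet):
    """Lowercase word set with URLs and @mentions stripped."""
    text = MENTION_RE.sub(" ", URL_RE.sub(" ", tweet))
    return set(TOKEN_RE.findall(text.lower()))


def classify_content(tweet):
    """Return one lexicon name, 'combination', or 'none'."""
    words = tokens(tweet)
    hits = sorted(name for name, terms in LEXICONS.items() if words & terms)
    if not hits:
        return "none"
    return hits[0] if len(hits) == 1 else "combination"


def pair_rules(tweets, min_support=2, min_confidence=0.6):
    """Mine if-then rules (lhs, rhs, support, confidence) over lexicon-term pairs."""
    vocab = set().union(*LEXICONS.values())
    term_counts, pair_counts = Counter(), Counter()
    for tweet in tweets:
        present = sorted(tokens(tweet) & vocab)
        term_counts.update(present)
        pair_counts.update(combinations(present, 2))
    rules = []
    for (a, b), support in pair_counts.items():
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / term_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Invented sample tweets, for illustration only.
SAMPLE = [
    "RT @lungfan: pollen is high today, keep your inhaler close",
    "New asthma guidance on pollen and inhaler use http://example.org/a",
    "my cough is back #asthma",
    "smoke from the fires is making my asthma worse",
]

if __name__ == "__main__":
    for t in SAMPLE:
        print(classify_source(t), "|", classify_content(t))
    for lhs, rhs, support, confidence in pair_rules(SAMPLE):
        print(f"if {lhs} then {rhs} (support={support}, confidence={confidence:.2f})")
```

On the invented sample, the first tweet is treated as a retweet, the second as an organizational tweet because it carries a URL without a mention, and the last two as original individual content. A full association rule miner such as Apriori would also consider larger term sets and terms outside the lexicons.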

Original language: English (US)
Title of host publication: 17th International Conference on Grey Literature: A New Wave of Textual and Non-Textual Grey Literature, GL 2016 - Proceedings
Publisher: TextRelease
Pages: 83-93
Number of pages: 11
ISBN (Electronic): 9789077484272
State: Published - 2016
Event: 17th International Conference on Grey Literature: A New Wave of Textual and Non-Textual Grey Literature, GL 2016 - Amsterdam, Netherlands
Duration: Dec 1 2015 - Dec 2 2015

Other

Other: 17th International Conference on Grey Literature: A New Wave of Textual and Non-Textual Grey Literature, GL 2016
Country: Netherlands
City: Amsterdam
Period: 12/1/15 - 12/2/15

Fingerprint

twitter
social media
Health
Allergens
Costs
medication
Blogs
Association rules
Public health
Terminology
Application programming interfaces (API)
Medicine
Labels
Websites
gray literature
organization
human being
world population
internet community

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Library and Information Sciences

Cite this

Leroy, G. A., Harber, P. I., & Revere, D. (2016). Public sharing of medical advice using social media: An analysis of Twitter. In 17th International Conference on Grey Literature: A New Wave of Textual and Non-Textual Grey Literature, GL 2016 - Proceedings (pp. 83-93). TextRelease.

@inproceedings{5a2c9fa4568e48faaa621d828510dec0,
title = "Public sharing of medical advice using social media: An analysis of Twitter",
author = "Leroy, {Gondy Augusta} and Harber, {Philip I} and Debra Revere",
year = "2016",
language = "English (US)",
pages = "83--93",
booktitle = "17th International Conference on Grey Literature: A New Wave of Textual and Non-Textual Grey Literature, GL 2016 - Proceedings",
publisher = "TextRelease",

}

TY - GEN

T1 - Public sharing of medical advice using social media

T2 - An analysis of Twitter

AU - Leroy, Gondy Augusta

AU - Harber, Philip I

AU - Revere, Debra

PY - 2016

Y1 - 2016

UR - http://www.scopus.com/inward/record.url?scp=85012934440&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012934440&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85012934440

SP - 83

EP - 93

BT - 17th International Conference on Grey Literature: A New Wave of Textual and Non-Textual Grey Literature, GL 2016 - Proceedings

PB - TextRelease

ER -