Analyzing the language of food on social media

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

We investigate the predictive power behind the language of food on social media. We collect a corpus of over three million food-related posts from Twitter and demonstrate that many latent population characteristics can be directly predicted from this data: overweight rate, diabetes rate, political leaning, and home geographical location of authors. For all tasks, our language-based models significantly outperform the majority-class baselines. Performance is further improved with more complex natural language processing, such as topic modeling. We analyze which textual features have greatest predictive power for these datasets, providing insight into the connections between the language of food, geographic locale, and community characteristics. Lastly, we design and implement an online system for real-time query and visualization of the dataset. Visualization tools, such as geo-referenced heatmaps and temporal histograms, allow us to discover more complex, global patterns mirrored in the language of food.
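
The abstract compares its language-based models against majority-class baselines. As a minimal sketch of what such a baseline computes, the snippet below always predicts the most frequent training label and reports the resulting accuracy. The function name, the label names, and the toy label lists are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter


def majority_class_baseline(train_labels, test_labels):
    """Predict the most common training label for every test item.

    Returns (majority_label, accuracy_on_test_labels).
    """
    if not train_labels or not test_labels:
        raise ValueError("both label lists must be non-empty")
    # The baseline ignores all input features and only looks at label counts.
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, correct / len(test_labels)


# Hypothetical toy labels (e.g. whether a region is above or below a median rate).
train = ["above_median", "above_median", "below_median", "above_median"]
test = ["above_median", "below_median", "below_median", "above_median"]

label, accuracy = majority_class_baseline(train, test)
print(label, accuracy)
```

Any model that uses the text of the posts has to beat this number to show that the language carries predictive signal beyond the label distribution alone.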

Original language: English (US)
Title of host publication: Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 778-783
Number of pages: 6
ISBN (Print): 9781479956654
DOIs: https://doi.org/10.1109/BigData.2014.7004305
State: Published - Jan 7 2015
Event: 2nd IEEE International Conference on Big Data, IEEE Big Data 2014 - Washington, United States
Duration: Oct 27 2014 → Oct 30 2014

Other

Other: 2nd IEEE International Conference on Big Data, IEEE Big Data 2014
Country: United States
City: Washington
Period: 10/27/14 → 10/30/14

Fingerprint

Visualization
Online systems
Medical problems
Processing

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems

Cite this

Fried, D., Surdeanu, M., Kobourov, S. G., Hingle, M. D., & Bell, D. (2015). Analyzing the language of food on social media. In Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014 (pp. 778-783). [7004305] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2014.7004305

@inproceedings{da255df7f27541b3a67fccca54290956,
  title = "Analyzing the language of food on social media",
  abstract = "We investigate the predictive power behind the language of food on social media. We collect a corpus of over three million food-related posts from Twitter and demonstrate that many latent population characteristics can be directly predicted from this data: overweight rate, diabetes rate, political leaning, and home geographical location of authors. For all tasks, our language-based models significantly outperform the majority-class baselines. Performance is further improved with more complex natural language processing, such as topic modeling. We analyze which textual features have greatest predictive power for these datasets, providing insight into the connections between the language of food, geographic locale, and community characteristics. Lastly, we design and implement an online system for real-time query and visualization of the dataset. Visualization tools, such as geo-referenced heatmaps and temporal histograms, allow us to discover more complex, global patterns mirrored in the language of food.",
  author = "Fried, Daniel and Surdeanu, Mihai and Kobourov, {Stephen G} and Hingle, {Melanie D} and Bell, Dane",
  year = "2015",
  month = "1",
  day = "7",
  doi = "10.1109/BigData.2014.7004305",
  language = "English (US)",
  isbn = "9781479956654",
  pages = "778--783",
  booktitle = "Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014",
  publisher = "Institute of Electrical and Electronics Engineers Inc."
}

TY  - GEN
T1  - Analyzing the language of food on social media
AU  - Fried, Daniel
AU  - Surdeanu, Mihai
AU  - Kobourov, Stephen G
AU  - Hingle, Melanie D
AU  - Bell, Dane
PY  - 2015/1/7
Y1  - 2015/1/7
AB  - We investigate the predictive power behind the language of food on social media. We collect a corpus of over three million food-related posts from Twitter and demonstrate that many latent population characteristics can be directly predicted from this data: overweight rate, diabetes rate, political leaning, and home geographical location of authors. For all tasks, our language-based models significantly outperform the majority-class baselines. Performance is further improved with more complex natural language processing, such as topic modeling. We analyze which textual features have greatest predictive power for these datasets, providing insight into the connections between the language of food, geographic locale, and community characteristics. Lastly, we design and implement an online system for real-time query and visualization of the dataset. Visualization tools, such as geo-referenced heatmaps and temporal histograms, allow us to discover more complex, global patterns mirrored in the language of food.
UR  - http://www.scopus.com/inward/record.url?scp=84921776765&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84921776765&partnerID=8YFLogxK
U2  - 10.1109/BigData.2014.7004305
DO  - 10.1109/BigData.2014.7004305
M3  - Conference contribution
AN  - SCOPUS:84921776765
SN  - 9781479956654
SP  - 778
EP  - 783
BT  - Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  -