Technology

How can machine learning algorithms find drunk Twitter users?

University of Rochester researchers developed a system for identifying and tracking Twitter users' drinking habits based on their tweets.

Kacper Pempel/Reuters/File

People holding mobile phones are silhouetted against a backdrop projected with the Twitter logo. Twitter was used as a starting point for a project tracking alcohol use through social media.

By Ben Thompson Staff
@BenThompson_CSM

March 17, 2016, 3:35 p.m. ET

Using Twitter to follow trends is nothing new; the social media platform is known for actively tracking popular topics and highlighting them on its website. But a new algorithm may be able to detect a different type of pattern among its users: drinking habits.

Twitter keeps track of what its users post, when they post, and where they post from, and with that data a team of University of Rochester researchers was able to develop a method for evaluating how and where Twitter users drink alcohol.

“Analysis of Twitter has become a widespread approach for geo-spatial studies of human behavior, such as alcohol consumption and exercise, and human latent states, such as sickness and depression,” the researchers wrote in a summary of their study.

“However, nearly all prior work … does not attempt to distinguish mere mentions of activities or states from self-reports of activity. Moreover, no attempt has been made to distinguish reports about future or past activities and in-the-moment reports that provide finer details when geo-tagged tweets are used to map specific locations of activities,” they added, highlighting what they hoped to address through their investigation.

To track regional drinking habits through Twitter, the team needed a way to identify relevant tweets. The Rochester researchers settled on three questions to determine whether a tweet came from a user who was drinking: Does the tweet mention alcoholic beverages, using words such as “drunk,” “beer,” or “alcohol”? Is the tweet about the tweeter consuming such beverages? And is it likely the tweet was sent while the tweeter was drinking?
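The three questions can be read as a cascade in which each "yes" narrows the pool of tweets. The Python sketch below is only an illustration of that ordering, not the researchers' code: the keyword list contains just the three words quoted above, and `toy_self_report` and `toy_in_the_moment` are hypothetical stand-ins for judgments that the study obtained from people and, later, from trained classifiers.

```python
import re

# Only the three example words quoted in the article; a real list would be longer.
ALCOHOL_KEYWORDS = {"drunk", "beer", "alcohol"}


def mentions_alcohol(text):
    """Question 1: does the tweet contain an alcohol-related keyword?"""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & ALCOHOL_KEYWORDS)


def toy_self_report(text):
    """Hypothetical stand-in for question 2: look for a first-person word."""
    return bool(re.search(r"\b(i|my|we)\b", text.lower()))


def toy_in_the_moment(text):
    """Hypothetical stand-in for question 3: look for a present-time word."""
    return bool(re.search(r"\b(now|currently)\b", text.lower()))


def label_tweet(text, is_self_report, is_in_the_moment):
    """Ask the three questions in order, stopping at the first 'no'."""
    if not mentions_alcohol(text):
        return "not alcohol-related"
    if not is_self_report(text):
        return "mentions alcohol"
    if not is_in_the_moment(text):
        return "self-report, not in the moment"
    return "drinking now"


for tweet in ["Beer prices are rising",
              "I was so drunk last week",
              "I'm having a beer right now"]:
    print(tweet, "->", label_tweet(tweet, toy_self_report, toy_in_the_moment))
```

Passing the second and third checks in as functions keeps the cascade separate from whatever answers them, whether a human rater or a classifier.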

The study relied on workers on Amazon's Mechanical Turk – an online marketplace where “requesters” post tasks to be completed by human “turkers” – to answer those questions for a sample of tweets. Using the humans' answers, the team trained a support vector machine, a type of machine learning classifier, to follow the same line of inquiry and find relevant tweets automatically.
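For readers curious what training a support vector machine on human-labeled tweets involves, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the six training tweets are invented, the features are plain word counts, and the settings `lam`, `lr`, and `epochs` are arbitrary. The article does not describe the study's actual features or training procedure.

```python
def featurize(text, vocab):
    """Bag-of-words counts over a fixed vocabulary; unknown words are ignored."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM trained by subgradient descent on the regularized hinge loss.

    Labels must be +1 or -1. Returns the weight vector and the bias.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:
                # Inside the margin: shrink the weights and move toward the example.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Outside the margin: apply only the regularization shrinkage.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(text, vocab, w, b):
    """Return +1 (self-report of drinking) or -1 (mere mention)."""
    xi = featurize(text, vocab)
    score = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if score >= 0 else -1


# Invented examples standing in for tweets labeled by human raters.
LABELED = [
    ("i am drinking beer", 1),
    ("beer prices went up", -1),
    ("i am drunk", 1),
    ("new beer ad on tv", -1),
    ("we are drinking wine", 1),
    ("drunk driving law passed", -1),
]

vocab = sorted({word for text, _ in LABELED for word in text.lower().split()})
X = [featurize(text, vocab) for text, _ in LABELED]
y = [label for _, label in LABELED]
w, b = train_linear_svm(X, y)

print(predict("i am drinking wine", vocab, w, b))
print(predict("beer prices on tv", vocab, w, b))
```

The point of the toy data is that "beer" and "drunk" appear in both classes, so the classifier has to lean on the surrounding words, which is the distinction between a mention and a self-report that the researchers were after.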

Using that initial process, along with further machine learning algorithms to predict tweeters’ locations, the team compiled an analysis of Twitter users’ alcohol consumption habits. All tweets in the study came from the New York City metropolitan area, and the results compare drinking in the city with drinking in the suburbs, and drinking at home with drinking away from home.

The Rochester team found that most drinkers, in the city and the suburbs alike, stay relatively close to home when imbibing, with suburban drinkers more likely to stray farther away. The researchers also found a positive correlation between the density of “alcohol outlets,” such as liquor stores and bars, and the number of tweets sent about drinking. While the paper notes that “correlation does not necessarily imply causation,” it cites several previous studies that arrived at similar conclusions regarding alcohol availability and drinking.
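Measuring how far from home someone is drinking comes down to the distance between two coordinates. The article does not say how the researchers computed it, so the Python sketch below simply assumes the standard great-circle (haversine) formula. The `home_radius_km` cutoff is a hypothetical value chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drinking_context(home, tweet_location, home_radius_km=0.1):
    """Label a geotagged tweet as sent at home or away, and report the distance.

    home and tweet_location are (latitude, longitude) pairs in degrees.
    home_radius_km is an assumed cutoff, not a figure from the study.
    """
    distance = haversine_km(home[0], home[1], tweet_location[0], tweet_location[1])
    label = "at home" if distance <= home_radius_km else "away from home"
    return label, distance
```

Given a predicted home location and a geotagged drinking tweet, `drinking_context` returns both the label and the distance, so the distances can be compared between city and suburban users.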

The final results painted an interesting picture of New York’s drinking habits, and also suggested that similar algorithms and research methods could be used to “help to create a tool for improving a community’s health, given social networks can become a resource to spread positive health behaviour,” the researchers wrote. They did, however, note one significant bias in the study: the relatively high share of young and minority users on Twitter. But they said that studies in all fields face similar problems, that the results could be weighted to account for the bias, and that their conclusions were fairly successful in capturing the New York drinking scene, with high potential for future complementary Twitter-based studies.

“Our results demonstrate that tweets can provide powerful and fine-grained cues of activities going on in cities,” the team said.
