The University of Michigan is partnering with Facebook parent company Meta to build a social media data archive. The project intends to make the aggregated data useful to social scientists studying the impact of social media on society.
University of Michigan professor Libby Hemphill studies how people organize social change, so she knows the issues that go hand in hand with social data. “Some researchers studying social media have to do everything themselves, from getting approval to accessing data to writing programs to storing and analyzing data on their own,” Hemphill said.
There are additional challenges. “They might need to know how to work with data in different formats. Or, they may need to pay for a service geared toward market research, which carries its own limitations on how they can download and use data,” Hemphill said.
Hemphill has been working to establish a social media archive at the Inter-university Consortium for Political and Social Research at U-M’s Institute for Social Research for the last few years. Her work creating the archive was made possible by a 2021 Propelling Original Data Science grant from the Michigan Institute for Data Science, called Ensuring FAIRness in Social Media Archives. Meta and the University of Michigan will partner to support that archive, called the Social Media Archive or SOMAR.
“From the emotional well-being of local youth to the outcomes of global political processes, social media play a critical but poorly understood role,” said Institute for Social Research Director Kathleen Cagney. “At ISR and ICPSR, it is our imperative to shed light on these processes.”
SOMAR will give researchers access to data with broad value to society. Meta is contributing $1.3 million so that the archive can continue to support research for years to come. The data is expected to help ensure election integrity, analyze how advertising influences consumers, and examine how social media influences societal change. An archive like this also raises privacy concerns. The goal is to make the data available to academic researchers, not to sell it to consumer brands or think tanks.
ICPSR has a reputation for protecting confidentiality and privacy through safeguards on how data is secured and distributed. ICPSR Director Margaret Levenstein says this attention to ethics in data usage is “irreplaceable” when it comes to handling the data of millions of social media users.
Datasets in the archive include:
- “#MeToo Tweet IDs, October 15-28, 2017 (ICPSR 37447),” a collection of tweet IDs pertaining to the first two weeks of the #MeToo hashtag campaign in October 2017.
- “Appealing to the Base or to the Moveable Middle? Incumbents’ Partisan Messaging Before the 2016 U.S. Congressional Elections,” which contains weekly measures of partisanship for verified official U.S. Congress Twitter accounts for September-November 2016.
- “What Social Media Platforms Miss About White Supremacist Speech,” which includes 274,668 posts scraped from Stormfront and 509,982 comments collected from the Reddit API.
SOMAR will create services to support analysis of this new kind of data from social media users. Students and researchers will analyze information on subjects that relate to societal change and more. SOMAR aims to remove data access hurdles and offer training and outreach to help researchers learn how to use social media data to create valuable insights. “A resource like SOMAR will lower persistent barriers to data access for researchers and is desperately needed,” Levenstein said. “The future of our society depends on it.”