There’s no single solution for making all social media algorithms easier to analyze and understand, but dismantling the black boxes that surround this software is a good place to start. Poking a few holes in those containers and sharing the contents with independent analysts could improve accountability as well. Researchers, tech experts and legal scholars discussed how to start this process during The Social Media Summit at MIT on Thursday.
MIT’s Initiative on the Digital Economy hosted conversations that ranged from the war in Ukraine and disinformation to transparency in algorithms and responsible AI.
Facebook whistleblower Frances Haugen opened the free online event in conversation with Sinan Aral, director of the MIT IDE, about accountability and transparency in social media. Haugen is an electrical and computer engineer and a former Facebook product manager. She shared internal Facebook research with the press, Congress and regulators in mid-2021. Haugen describes her current occupation as “civic integrity” on LinkedIn, and she outlined several changes that regulators and industry leaders need to make regarding the influence of algorithms.
Duty of care: Expectation of safety on social media
Haugen left Meta almost a year ago and is now developing the idea of the “duty of care,” which means defining a reasonable expectation of safety on social media platforms.
This includes answering the question: How do you keep people under 13 off these systems?
“Because no one gets to see behind the curtain, they don’t know what questions to ask,” she said. “So what is an acceptable and reasonable level of rigor for keeping kids off these platforms and what data would we need them to publish to understand whether they are meeting the duty of care?”
She used Facebook’s Widely Viewed Content update as an example of a deceptive presentation of data. The report includes content from the U.S. only. Meta has invested most of its safety and content moderation budget in this market, according to Haugen. She contends that a top 20 list that reflected content from countries where the risk of genocide is high would be a more accurate reflection of popular content on Facebook.
“If we saw that list of content, we would say this is unbearable,” she said.
She also emphasized that Facebook is the only connection to the internet for many people in the world and there is no alternative to the social media site that has been linked to genocide. One way to reduce the impact of misinformation and hate speech on Facebook is to change how ads are priced. Haugen said ads are priced based on quality, with the premise that “high quality ads” are cheaper than low quality ads.
“Facebook defines quality as the ability to get a reaction—a like, a comment or a share,” she said. “Facebook knows that the shortest path to a click is anger and so angry ads end up being five to ten times cheaper than other ads.”
Haugen said a fair compromise would be to have flat ad rates and “remove the subsidy for extremism from the system.”
Expanding access to data from social media platforms
One of Haugen’s recommendations is to mandate the release of auditable data about algorithms. This would give independent researchers the ability to analyze this data and understand information networks, among other things.
Sharing this data also would increase transparency, which is key to improving accountability of social media platforms, Haugen said.
In the “Algorithmic Transparency” session, researchers explained the importance of wider access to this data. Dean Eckles, a professor at the MIT Sloan School of Management and a research lead at IDE, moderated the conversation with Daphne Keller, director of platform regulation at Stanford University, and Kartik Hosanagar, director of AI for Business at Wharton.
Hosanagar discussed research from Twitter and Meta about the influence of algorithms but also pointed out the limitations of those studies.
“All these studies at the platforms go through internal approvals so we don’t know about the ones that are not approved internally to come out,” he said. “Making the data accessible is important.”
Transparency is important as well, but the term needs to be understood in the context of a specific audience, such as software developers, researchers or end users. Hosanagar said algorithmic transparency could mean anything from revealing the source code to sharing data to explaining the outcome.
Legislators often think in terms of improved transparency for end users, but Hosanagar said that doesn’t seem to increase trust among those users.
Hosanagar said social media platforms have too much control over how these algorithms are understood and that exposing that information to outside researchers is critical.
“Right now transparency is mostly for the data scientists themselves within the organization to better understand what their systems are doing,” he said.
Track what content gets removed
One way to understand what content gets promoted and moderated is to look at requests to take down information from the various platforms. Keller said the best resource for this is Harvard’s Project Lumen, a collection of online content removal requests based on the U.S. Digital Millennium Copyright Act as well as trademark, patent, locally-regulated content and private information removal claims. Keller said a wealth of research has come out of this data, which comes from companies including Google, Twitter, Wikipedia, WordPress and Reddit.
“You can see who asked and why and what the content was as well as spot errors or patterns of bias,” she said.
There is no single source of takedown request data for YouTube or Facebook, however, that would make it easy for researchers to see what content was removed from those platforms.
“People outside the platforms can do good if they have this access but we have to navigate these significant barriers and these competing values,” she said.
Keller said that the Digital Services Act the European Union approved in January 2021 will improve public reporting about algorithms and researcher access to data.
“We are going to get greatly changed transparency in Europe and that will affect access to information around the world,” she said.
In a post about the act, the Electronic Frontier Foundation said that EU legislators got it right on several elements of the act, including strengthening users’ right to online anonymity and private communication and establishing that users should have the right to use and pay for services anonymously wherever reasonable. The EFF is concerned that the act’s enforcement powers are too broad.
Keller thinks that it would be better for regulators, rather than legislators, to set transparency rules.
“Regulators are slow but legislators are even slower,” she said. “They will lock in transparency models that are asking for the wrong thing.”
Hosanagar said regulators are always going to be way behind the tech industry because social media platforms change so rapidly.
“Regulations alone are not going to solve this; we might need greater participation from the companies in terms of not just going by the letter of the law,” he said. “This is going to be a hard one over the next several years and decades.”
Also, regulations that work for Facebook and Instagram would not address concerns with TikTok and ShareChat, a popular social media app in India, as Eckles pointed out. Systems built on a decentralized architecture would be another challenge.
“What if the next social media channel is on the blockchain?” Hosanagar said. “That changes the entire discussion and takes it to another dimension that makes all of the current conversation irrelevant.”
Social science training for engineers
The panel also discussed education for both consumers and engineers as a way to improve transparency. One way to get more people to ask “should we build it?” is to add a social science course or two to engineering degrees. This could help algorithm architects think about tech systems in different ways and understand their societal impacts.
“Engineers think in terms of the accuracy of news feed recommendation algorithms or what portion of the 10 recommended stories is relevant,” Hosanagar said. “None of this accounts for questions like does this fragment society or how does it affect personal privacy.”
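The accuracy measure Hosanagar describes, the portion of 10 recommended stories that is relevant, is commonly called precision at k. The following is a minimal sketch of that calculation, not any platform's actual metric. The story IDs and the set of relevant stories are made-up values for illustration.

```python
def precision_at_k(recommended, relevant, k=10):
    """Return the fraction of the top-k recommended items that are relevant."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(top_k)


# Hypothetical feed of 10 recommended stories (assumption, not real data).
recommended = [f"story_{i}" for i in range(1, 11)]
# Hypothetical set of stories the user actually found relevant.
relevant = {"story_1", "story_3", "story_4", "story_8"}

score = precision_at_k(recommended, relevant)
print(f"precision@10 = {score:.1f}")
```

A score like this says nothing about the questions Hosanagar raises, such as whether the feed fragments society or affects personal privacy, which is the gap he is pointing to.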
Keller pointed out that many engineers describe their work in publicly available ways, but social scientists and lawyers don’t always use those sources of information.
Hosanagar suggested that tech companies take an open source approach to algorithmic transparency, in the same way organizations share advice about how to manage a data center or a cloud deployment.
“Companies like Facebook and Twitter have been grappling with these issues for a while and they’ve made a lot of progress people can learn from,” he said.
Keller used the example of Google’s Search quality evaluator guidelines as an “engineer-to-engineer” discussion that other professionals could find educational.
“I live in the world of social scientists and lawyers and they don’t read those kinds of things,” she said. “There is a level of existing transparency that is not being taken advantage of.”
Pick your own algorithm
Keller’s idea for improving transparency is to allow users to select their own content moderator via middleware or “magic APIs.” Publishers, content providers or advocacy groups could create a filter or algorithm that end users could choose to manage content.
“If we want there to be less of a chokehold on discourse by today’s giant platforms, one response is to introduce competition at the layer of content moderation and ranking algorithms,” she said.
Users could select a certain group’s moderation rules and then adjust the settings to their own preferences.
“That way there is no one algorithm that is so consequential,” she said.
In this scenario, social media platforms would still host the content and manage copyright infringement and requests to remove content.
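One way to picture the middleware idea is a platform that hosts posts and passes them through whichever moderation provider the user has chosen. The sketch below is an assumption-heavy illustration, not a description of any real platform or API. The `Post` fields, the provider names, the `max_spam` setting and the spam scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Post:
    author: str
    text: str
    spam_score: float  # hypothetical score between 0 (clean) and 1 (spam)


# A provider takes hosted posts plus the user's settings and returns the
# posts that user should see, in display order.
Provider = Callable[[List[Post], Dict[str, float]], List[Post]]


def low_spam_provider(posts: List[Post], settings: Dict[str, float]) -> List[Post]:
    """Hypothetical third-party filter that hides posts above a spam limit."""
    limit = settings.get("max_spam", 0.5)
    return [p for p in posts if p.spam_score <= limit]


def unfiltered_provider(posts: List[Post], settings: Dict[str, float]) -> List[Post]:
    """Hypothetical provider that shows everything the platform hosts."""
    return list(posts)


# Registry of providers a user could choose from (names are assumptions).
PROVIDERS: Dict[str, Provider] = {
    "low_spam": low_spam_provider,
    "unfiltered": unfiltered_provider,
}


def build_feed(posts: List[Post], provider_name: str,
               settings: Optional[Dict[str, float]] = None) -> List[Post]:
    """The platform still hosts the posts; the chosen provider decides visibility."""
    return PROVIDERS[provider_name](posts, settings or {})


hosted_posts = [
    Post("ana", "Local election results", 0.1),
    Post("bot42", "Win a free phone", 0.9),
    Post("li", "Photos from the summit", 0.4),
]

default_feed = build_feed(hosted_posts, "low_spam")
strict_feed = build_feed(hosted_posts, "low_spam", {"max_spam": 0.2})
print([p.author for p in default_feed], [p.author for p in strict_feed])
```

Keeping hosting and filtering as separate pieces mirrors the split Keller describes: the platform holds the content and handles removal requests, while the user picks a provider and tunes its settings.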
This approach could solve some legal problems and foster user autonomy, according to Keller, but it also presents a new set of privacy issues.
“There’s also the serious question about how revenue flows to these providers,” she said. “There’s definitely logistical stuff to do there but it’s logistical and not a fundamental First Amendment problem that we run into with a lot of other proposals.”
Keller suggested that users want content gatekeepers to keep out bullies and racists and to lower spam levels.
“Once you have a centralized entity doing the gatekeeping to serve user demands, that can be regulated to serve government demands,” she said.