4 Ways AI Safety Efforts Could Learn from Experiences with Social Media
AI Safety efforts need to focus on solutions sooner rather than later. Greater online privacy, control, experimental transparency, and use of trust signals will improve the safe use of AI systems.
Tristan Harris has referred to social media as the “first contact” with AI, one that led to “information overload, doom-scrolling, the sexualization of kids, shortened attention spans, polarization” and many other harms. The media is full of earnest concern about the dangers of AI systems: Sam Altman’s recent testimony to Congress included the notable suggestion that “regulatory intervention by governments will be critical to mitigate the risks of increasingly powerful models”, and a letter urging “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4” was signed by more than 20,000 people, including many AI researchers and technologists. Yet it isn’t clear what solutions would exist in six months that would make it safe to resume AI development. And in a world of open-source, international AI development, is it even possible to enforce a pause on anything? Instead, we may want to start focusing on the solutions we need to address nearer-term risks.
An image from the Center for Humane Technology’s AI Dilemma video
What can today’s AI Safety efforts learn from previous work to improve social media’s impact on society? AI technology may be newly prominent in public discussion, but it has been integrated into products we’ve been using for years. A poorly kept secret amongst those working on AI is that there is no actual categorical definition of the term: AI effectively refers to machine learning that is particularly complex, and a recurring joke is that AI is just machine learning plus hype. Systems built on the same neural-network-based technologies that underlie novel large language models have already been deployed in a wide variety of applications, including social media recommendation systems. While many of the new uses of AI are different, a variety of concerns about the widespread adoption of generative AI map to issues we have already faced with existing technological systems, most notably social media. We can therefore learn from this “first contact” a few principles that can help with AI safety efforts, especially as they relate to online social behavior.
To be clear, I am not claiming that these ideas will solve AI safety. They are a few solutions that have proven useful during this “first contact” with AI systems, and some of them will inevitably be useful for similar upcoming challenges. AI Safety concerns can be grouped into those that are theoretical and those we are facing today. Theoretical concerns include the idea that we may create super-intelligent machines that enslave us, or that execute our desires too literally and thereby destroy humanity. There is no experience to learn from in these cases, since broadly super-intelligent machines do not exist today. Current concerns about AI systems include data ownership, scaled personalized misinformation, algorithmic bias, and bots taking over public discourse. These problems are not entirely new: we have dealt with smaller-scale versions of them, and the solutions we have used in the past ought to inform current AI Safety efforts. In many cases, we have had to deal with a volume of intelligent (human) agents that overwhelms manual efforts, and the same tools used to defend against thousands of hired trolls might be effective against an infinite number of bots; in both cases, you need a solution that is robust to volume.
Here are four specific ways we can make our online world safer in light of increasingly ubiquitous AI:
Create smaller, more private online spaces - One concern is that AI systems will enable automated, personalized harassment and manipulation, as deep-fake technologies allow bad actors to impersonate people you trust. This presumes that strangers will be able to contact you leveraging models trained on the likeness of your friends and relatives. An increasing number of people are questioning the wisdom of large, unstructured social spaces where anyone can both research and contact targets. Many who study social media imagine smaller, locally controlled spaces for social interaction, which would make it naturally harder to research and contact those with whom you have nothing in common. To combat mass harassment, platforms have added more accessible privacy options for particularly vulnerable groups, and there is no reason such features shouldn’t be universally accessible or even defaults. An added benefit of private spaces is that they give users more explicit control over whether their data is used to train the general AI systems that companies are developing.
Let users control interactions with strangers - In our current research, we have found that a significant number of people censor their own online behavior out of fear of how others will react. Rather than liberating us all, a completely open system often has the opposite effect: only those willing to tolerate toxicity remain. Both Twitter and Facebook have responded by creating functionality that lets users control who can respond to the content they post. If users explicitly decide on the handful of others they want to tag and interact with, it doesn’t really matter how many AI bots attempt to flood that conversation. Few people use this functionality today, but smart defaults could make interaction with strangers the exception we have to choose rather than the norm (a sketch of such a default follows this list). Rather than trying to identify AI-powered bots, which may prove impossible, we can instead allow people to choose to interact with people they know are human through more robust means (e.g. having met them in real life).
Elevate trust-based content - Social media companies have fought a losing battle with misinformation: billions of dollars have been spent to fact-check important information about health and politics, yet online misinformation about election outcomes and vaccines has proliferated. Synthetic media is already proving difficult to identify, so rather than attempting the possibly impossible task of labeling all synthetic media, efforts are underway to publicly identify content that we know we can trust as an alternative. Systems (both technical and social) could eventually learn to amplify and propagate only known trusted content, leveraging provenance signals that are harder to manipulate. Similar efforts have shown promise on social media, where platforms have introduced reputation-based signals that reduce the distribution of misinformation. Brigading is an engagement-based attack on discussion spaces that mirrors what some expect may happen if people begin to leverage AI systems to control narratives. The volume of these attacks already cannot be handled with content moderation, and while it is not a solved problem, some solutions have worked better than others. For example, trying to remove inauthentic comments does not scale, but since users only read the top 5-10 comments anyway, identifying high-quality, high-reputation comments has been a far more scalable way to address brigading (a sketch of this kind of ranking follows this list).
Increase algorithmic alignment with experimental result transparency - Some critics of the current model of AI development have advocated for increased transparency and the accountability it could enable. Algorithms, especially “AI”-based algorithms, are poorly understood, such that the only way companies make progress is by learning from experimentation. All platforms run numerous experiments to make product decisions, and access to those experimental results is essential for understanding where algorithms may be misaligned with individual well-being and societal outcomes. Most significant social media progress on reforming algorithms away from misaligned incentives, like optimizing for comments, shares, and anger reactions, has come from understanding experimental results. Large language model designers have likely learned similar best practices from experimentation about what incentivizes and disincentivizes negative outcomes. If society wants to meaningfully participate in product design decisions, it needs access to the evidence base that platforms themselves use to understand their products. Experimental result visibility could also include mandatory examination of metrics relating to vulnerable groups, to help us better understand algorithmic bias when seemingly unrelated optimization functions have unfortunate consequences (a sketch of such a subgroup breakdown follows this list).
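To make the second idea more concrete, here is a minimal sketch of what a stranger-safe reply default could look like. It is illustrative only: the class names, audience options, and the choice of default are my assumptions, not any platform’s actual implementation.

```python
# Hypothetical sketch of a "smart default" reply policy: interaction with
# strangers is opt-in rather than the norm. All names are illustrative.
from dataclasses import dataclass, field
from enum import Enum


class ReplyAudience(Enum):
    EVERYONE = "everyone"    # today's common default
    FOLLOWED = "followed"    # people the author follows
    MENTIONED = "mentioned"  # only accounts the author tagged


@dataclass
class Post:
    author: str
    follows: set[str]                                 # accounts the author follows
    mentioned: set[str] = field(default_factory=set)  # accounts tagged in the post
    audience: ReplyAudience = ReplyAudience.FOLLOWED  # stranger-safe default


def can_reply(post: Post, replier: str) -> bool:
    """Return True if `replier` is allowed to respond to `post`."""
    if replier == post.author:
        return True
    if post.audience is ReplyAudience.EVERYONE:
        return True
    if post.audience is ReplyAudience.FOLLOWED:
        return replier in post.follows
    return replier in post.mentioned


# A flood of unknown bot accounts is simply ineligible to join the thread.
post = Post(author="alice", follows={"bob", "carol"}, mentioned={"bob"})
print(can_reply(post, "bob"))        # True
print(can_reply(post, "bot_12345"))  # False
```

The point of the design is that it never needs to decide whether an account is a bot; it only needs to know whether the author has any relationship with it.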
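Similarly, the reputation-based approach to brigading in the third idea can be sketched as a ranking function that weights provenance and reputation over raw engagement. The specific weights and signals below are illustrative assumptions, not any platform’s formula.

```python
# A minimal sketch of reputation-weighted comment ranking, as an alternative
# to trying to remove every inauthentic comment. Weights and signals are
# illustrative assumptions only.
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    author_reputation: float  # e.g. account history of trusted contributions, 0-1
    quality_score: float      # e.g. an estimate of substance/civility, 0-1
    engagement: int           # raw likes/replies, cheap for a brigade to inflate


def rank_comments(comments: list[Comment], top_n: int = 5) -> list[Comment]:
    """Surface the handful of comments most readers will actually see.

    Reputation and quality dominate the score; raw engagement, which a
    brigade can manufacture cheaply, contributes very little.
    """
    def score(c: Comment) -> float:
        return (0.6 * c.author_reputation
                + 0.35 * c.quality_score
                + 0.05 * min(c.engagement, 100) / 100)

    return sorted(comments, key=score, reverse=True)[:top_n]
```

Because only the top few comments matter in practice, this kind of ranking scales with the number of readers rather than with the number of attacking accounts.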
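Finally, for the fourth idea, this is roughly the shape of experiment summary that transparency requirements might mandate: the treatment-versus-control effect on a well-being metric, broken out by subgroup so that harms concentrated in vulnerable groups remain visible even when the overall average looks fine. The metric, subgroups, and numbers here are entirely hypothetical.

```python
# Hypothetical experiment readout with a mandatory subgroup breakdown.
import pandas as pd

results = pd.DataFrame({
    "group":     ["control", "treatment"] * 3,
    "subgroup":  ["teens", "teens", "adults", "adults", "new_users", "new_users"],
    "users":     [10_000, 10_000, 50_000, 50_000, 5_000, 5_000],
    "wellbeing": [0.62, 0.55, 0.64, 0.63, 0.60, 0.52],  # avg self-reported score
})

# Overall lift can look benign while a specific subgroup is harmed.
summary = results.pivot_table(index="subgroup", columns="group", values="wellbeing")
summary["lift"] = summary["treatment"] - summary["control"]
print(summary.sort_values("lift"))
```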
While future AI challenges will certainly differ from those we have faced with social media, we don’t know exactly how they will differ. Some will inevitably argue that a super-intelligent AI could work around these ideas, for example by fooling people into giving it a real-world reputation. However, none of these ideas have been tried to their fullest extent, so it behooves us to at least start driving solutions rather than hope that AI development slows, which seems unlikely. We also can’t be certain that AI systems won’t hit some limit in terms of intelligence; no growth curve continues indefinitely. The pace of innovation will not necessarily scale linearly forever, and AI challenges could end up looking a lot like hiring a large number of low-wage workers to hack systems. If so, systems that are robust against a volume of average actors may prove more useful than those convinced of AI’s continual growth might expect.
As someone who has repeatedly argued that design matters more than content moderation, I want to note that none of the above suggestions relate to policy-based content moderation. Our friends at the Integrity Institute recently interviewed Dave Willner and Todor Markov of OpenAI’s Trust and Safety team, and while their content moderation efforts are clearly important, I was heartened to hear that a separate team works on broader issues of value alignment. Ideally, we should all learn from those efforts how to design safer AI systems at the collective, societal level, rather than hoping individual companies make responsible design decisions independently.
People should indeed worry about the challenges of upcoming AI systems. But given the open-source, international nature of this race, we need to be sober and concrete about how we are going to meet these challenges and focus on discussing, testing, and evaluating solutions. We need to do that now, given the increasing usage of these systems and the uncertainty about efforts to slow their development. I’m hopeful that the above proposals can be generative for others as we all begin to wrestle with AI’s future, and I look forward to continuing to collaborate with our community on what it means to design safer AI systems.
—-
Below are a few announcements from our Psychology of Technology Institute community:
This article, “Fighting Misinformation or Fighting for Information,” led by Alberto Acerbi and published in the Harvard Kennedy School Misinformation Review, recently caught our attention. It uses simulations to show that “interventions aimed at reducing acceptance or spread of such news are bound to have very small effects on the overall quality of the information environment, especially compared to interventions aimed at increasing trust in reliable news sources”.
Both PTI member Kiran Garimella’s work on fear speech and algorithmic design ideas from our working paper on the algorithmic management of polarization and violence were featured in this recent New York Times article by Julia Angwin, which argues that the proliferation of fear speech is an under-appreciated online concern.
This recent Wall Street Journal article about Elon Musk’s proposal to optimize for unregretted time spent quoted our substack article on the topic, where we leverage the extensive research on regret for acts of commission vs. omission.
Ethan Zuckerman’s Initiative for Public Digital Infrastructure recently released this white paper on their vision for a public internet that: “1. Consists of many different platforms with a wide variety of scales and purposes; 2. Users can navigate with a loyal client that aggregates, cross-posts, and curates; 3. Is all supported by cross-cutting services rooted in interoperable data”. We are hopeful that such an internet would support companies that compete on safer, value aligned design, rather than on time spent.
Psychology of Technology Institute Co-Founder Nate Fast was recently featured on AirTalk, discussing challenges and benefits of using AI, as well as on the How Do We Fix It? podcast, discussing whether AI represents a looming disaster or a great leap forward.
Our collaborator Helena Puig Larrauri leads BuildUp and is organizing a conference in Nairobi, Kenya, from December 1-3 focused on “how technology and the arts influence identities relevant to peace and conflict”. Attendees will include technologists as well as a global community of peace builders. Technology systems will almost certainly provide more societal value if they include input from a more global, diverse set of stakeholders, so we hope readers of this newsletter will consider attending in order to engage with more such perspectives.