How User Experience Metrics Complement "Content that Requires Enforcement"
Twitter has argued that more than 99% of its content is “healthy”, but users’ reported experiences tell a more complex story.
Recently, our nationally representative survey of user experiences with social media platforms has been cited by several press outlets to suggest that content on Twitter is more likely to be perceived as “bad for the world” and that advertisers should therefore be wary. Twitter has responded forcefully to these stories; below is an excerpt from a longer tweet stating that “more than 99% of content users and advertisers see on Twitter is healthy” and that “only a small amount of content requires enforcement”.
Our survey was one of several metrics cited, alongside analyses of the rising use of slurs, increased reports of harassment, and a growing reported desire to take a break from the platform. No metric is perfect, and our reaction to the tweet from Twitter’s CEO was similar to Daphne Keller’s response below: the slipperiness of the definitions is the key issue. What exactly is “healthy”?
One guess at the definition of “healthy” (we would welcome clarification) is content that does not violate company policies, which would make it effectively the inverse of content that requires enforcement. One reason companies pay attention to user experience (e.g., Facebook has run numerous surveys on “content that is bad for the world”) is that only a relatively small proportion of content that users perceive to be harmful, to themselves or to the world, is actually policy-violating. We would argue that there is a big difference between content that does not violate company policy and content that is healthy. To illustrate, below are sample descriptions from Twitter users of content they encountered and perceived to be “bad for the world”:
“Any tweet where voting for X or supporting Y policy is conflated with doing Z. The most popular tweets paint the world in black/white or good/evil instead of understanding the nuance of every situation.”
“Posts that glorify violence or normalize assaulting people based on what they say or believe or wear”
“Knowing immigrants that are going through the naturalization process and seeing the hateful things that people post on Twitter, convinced me to not use it any longer. There is so much negative and inaccurate information on that site.”
“The whole Bud Light/Gay Pride situation. Who cares? It is a can of beer. Nothing changes because of what is on it. You are not going to suddenly become gay because you drank a beer with a rainbow on it. Plus, these things are nothing new, but now with all of these politicians that use it to try and win elections, even if they do not care but have to pretend to care to keep a certain demographic, big deals are made about nothing.”
“Many instances of transphobia”
“Negative comments/trolling about body image and appearance “
“Divisive comments following the shooting in Nashville”
“I saw a video of a man who was an anti-protestor at a trans rights rally. He was surrounded, shouted at, and got into a physical altercation with someone there. This type of engagement isn't good for anyone”
“Accounts showing videos of people harassing employees of stores because they view them as promoting the “grooming of children”
“People attacking other people verbally”
“NCAA women's basketball national championship game. People taking things out of context to mislead people and stoke division “
“Misogyny”
“constant arguing between people “
“I see some videos showing crimes and also misleading information. Some naive people might believe whatever they say in the platform and act based on that. This could lead to violence and many other problems.”
“tweets that demean others “
“Content that could be described as mean or angry”
“The amount of pornography and the fact that is on Twitter without warning or age verification I would consider bad for people but especially for our youth, bad for the world.”
Most of this is content that would likely not require enforcement and would therefore be classified as “healthy” under Twitter’s definition. Moreover, we would argue that we would never want companies to enforce against most such content. People should be allowed to be mean, to be angry, and to argue passionately with others, even though there are some lines, as often expressed in policy, that should not be crossed.
However, there is plenty of research to suggest that people are not entirely wrong when they believe such content can be bad for the world. For example, over a four-week period, 23% of Twitter users report seeing content they believe could increase hate, fear, and/or anger between groups of people, and 17% report seeing content they believe could increase the risk of violence; both numbers are much higher than those for other social media platforms. A number of studies have linked depictions of crimes committed by racial groups to negative attitudes toward those groups, and it is easy to game engagement-based algorithms by cherry-picking such “news” and making it seem more representative than it is. Basic psychological processes, such as being influenced by people in one’s social circle or disliking those with whom one competes, are consistently activated by such experiences, and out-group animosity spreads as a result.
One area where we should be sympathetic to Twitter is when it gets blamed for every negative piece of content that a user tweets. Negative interactions are part of the human experience, so the goal should not be to drive such experiences to zero. We do not want companies becoming even more enforcement-oriented, as they will inevitably make consequential mistakes. But society (and advertisers) can reasonably ask why such experiences are much more common on Twitter than on other platforms. Are there norms on Twitter about what is acceptable that lead to more negative experiences? Are there aspects of Twitter’s algorithm and design that encourage people to generate content that is perceived to be bad for the world in order to gain attention? A recent study comparing the algorithmic feed with the chronological one suggests that the answer is yes.
Human social behavior is complex, and not every behavior can be reduced to good/bad, healthy/unhealthy, or violating/non-violating. It is our view that comparative user experience metrics can help companies understand more holistically how the experiences on their platforms compare to those on other platforms. This is why we launched the Neely Social Media Index, which allows a nationally representative sample of U.S. adults to report their experiences (positive and negative) across different platforms over time. We hope our findings will encourage companies to ask their own users what experiences they have had that they feel are bad for their well-being and/or for the world, and to consider how that information might inform their product design.
As Facebook has shown previously, it is relatively easy for platforms or stakeholders to ask their users these questions. Of course, companies may decide to keep the findings private when they do not fit a desired narrative, which means they may not feel pressure to act on what they find. By publicizing our findings each month and sharing the data openly with researchers, we hope to improve the quality of the public conversation about this admittedly difficult and messy topic. That public conversation should have the added benefit of incentivizing companies to explore design changes that improve user experiences on their platforms. It can bring systematic data to what users already do naturally: comparing their positive and negative experiences as platforms change. Society clearly has a stake in such matters, with advertisers, politicians, and consumers hopefully making decisions accordingly and collectively pushing for platforms that facilitate safety, health, and well-being and that, at the very least, don’t amplify content that is bad for the world.
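To make concrete what a comparative user experience metric can look like, here is a minimal, hypothetical sketch in Python. It computes, for each platform, the share of surveyed users who report a given negative experience, with a simple 95% Wilson confidence interval. The data, platform names, and the `wilson_ci` helper are illustrative assumptions only; they are not Neely Social Media Index data or methodology, and a real analysis would also apply survey weights to reflect a nationally representative sample.

```python
# Hypothetical sketch of a comparative user experience metric:
# the share of each platform's surveyed users who report a given
# negative experience, with a 95% Wilson confidence interval.
# The responses below are made up for illustration only.
import math
from collections import defaultdict

def wilson_ci(hits, n, z=1.96):
    """Return (proportion, lower bound, upper bound) of a 95% Wilson interval."""
    if n == 0:
        return 0.0, 0.0, 0.0
    p = hits / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, center - margin, center + margin

# Each tuple: (platform, respondent reported seeing the experience in the past 4 weeks)
responses = [
    ("Twitter", True), ("Twitter", False), ("Twitter", True),
    ("Facebook", False), ("Facebook", True), ("Facebook", False),
    ("TikTok", False), ("TikTok", False), ("TikTok", True),
]

counts = defaultdict(lambda: [0, 0])  # platform -> [reports, respondents]
for platform, reported in responses:
    counts[platform][0] += int(reported)
    counts[platform][1] += 1

for platform, (hits, n) in sorted(counts.items()):
    p, low, high = wilson_ci(hits, n)
    print(f"{platform}: {p:.0%} reported the experience (95% CI {low:.0%} to {high:.0%}, n={n})")
```

On real survey data, the same loop would run over weighted responses for each experience type, and the resulting per-platform rates could then be tracked month over month and compared across platforms.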