Can Algorithms Suffer?
DeepMind recently announced a historic advance toward solving the so-called “protein folding problem,” a longstanding and consequential challenge in computational biology. Their AlphaFold program, an AI system made up of multiple deep neural networks, achieved unprecedented predictive accuracy in the annual CASP competition, vastly outstripping the methods deployed by other teams. There are many questions one might ask about AlphaFold. How does it work? Does it understand the problem it’s solving? Does accuracy in prediction really constitute a “solution” to the protein folding problem? But there’s one question you’re unlikely to hear about it -- a question that’s unlikely even to be asked: did it suffer?
For most of us, this question doesn’t even arise: a complex heap of math doesn’t intuitively seem like the kind of thing that can suffer. We lose no more sleep over the moral status of such a program than we do over that of rocks or tables. There are edge cases, of course: we might think that insects can suffer only a little, and fish slightly more. Most of us would by now agree that animals can suffer, though not all of us are sure what to do with this fact. But AI systems, even more than animals, inhabit an uncanny valley between the easily disregarded and the obviously morally valuable. And probing our intuitions about them can reveal how strange and morally unintuitive our universe really is.
What would it mean for an AI system to matter morally? Just as with humans, there are differing ideas. One conception is based on the notion of natural rights. Historically, this idea grew out of an Enlightenment conception of human beings as rational and autonomous individuals. As time progressed, the gradual recognition of the agency and autonomy of various marginalized groups led, painstakingly, to the extension of rights to those groups. Philosophers like Eric Schwitzgebel have suggested that a similar recognition should lead us to grant rights to artificial intelligence. David Deutsch takes the argument a step further: to constrain the expression of a truly creative AI, he argues, would be tantamount to enslavement.
An arguably more fundamental consideration than rights is whether such a system can experience pain and pleasure. After all, the value of any given right in the human case is cashed out in its effect on wellbeing: we preserve autonomy and liberty because enslavement and imprisonment are painful. An injustice not suffered is no injustice at all. But the question of happiness and suffering is especially difficult to consider in the case of machines. Indeed, Schwitzgebel has argued that precisely because we don’t know what it would mean for a machine to suffer, we ought not to build systems about whose potential for suffering we are unsure.
We can break down the question into two parts. The first is valence: whether an experience is, in some morally relevant sense, better or worse. The second is consciousness: whether there is experience at all, whether there is something that it’s like to be a system in various states. (Even this simplification is hotly contested; see the Open Philanthropy Project’s extended report on consciousness and moral patienthood for details.) Valence, in the human case, seems at least in part to be well-modeled by reinforcement learning. This is true of both the low-level functioning of the dopamine system in the brain and the high-level question of why animals experience positive and negative valence at all.
But this discovery leads to an uncomfortable question: if reinforcement learning is the seat of happiness and suffering in human beings, what does that say about the RL algorithms in widespread use today, or that we might build in the future? Could they have moral value? Brian Tomasik, founder of the somewhat playfully named People for the Ethical Treatment of Reinforcement Learners (PETRL), argues that they could, and that our concern should grow as such systems become more complex. The claim follows straightforwardly from what philosophers call functionalism, the view that it is the functions a system performs that make it a mind, rather than the physical substrate in which those functions are implemented. As PETRL asks on its website:
Suppose you were copied into a non-biological substrate, and felt as intelligent and as conscious as you currently feel now. All questions of identity aside, do you think this new version of you has moral weight? We do.
Where the underlying functions are sufficiently similar, we can extrapolate from humans to guess at what might constitute happiness and suffering for a reinforcement learning agent. Mayank Daswani and Jan Leike do just this in their paper “A Definition of Happiness for Reinforcement Learning Agents.” They propose “temporal difference error, i.e. the difference between the value of the obtained reward and observation and the agent’s expectation of this value.” In other words, happiness is not a matter of getting what you want but of getting more than you expect. This definition not only accords with contemporary predictive processing models of human cognition, but also agrees with the ancient wisdom of the Stoics: we can make ourselves happier by predicting the worst and being pleasantly surprised.
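The idea can be made concrete with a minimal sketch. This is not Daswani and Leike’s formalism, just an illustration of a one-step temporal difference error under assumed toy numbers; the function name and values are our own.

```python
def td_error(reward, value_next, value_current, gamma=0.99):
    """One-step temporal difference error: the reward actually obtained,
    plus the discounted value of what comes next, minus what the agent
    expected beforehand. On the Daswani-Leike reading, this gap is the
    agent's momentary 'happiness'."""
    return reward + gamma * value_next - value_current

# Two agents receive the same reward (1.0) in a terminal state
# (value_next = 0), but held different expectations going in.
stoic = td_error(reward=1.0, value_next=0.0, value_current=0.2)
optimist = td_error(reward=1.0, value_next=0.0, value_current=5.0)

print(stoic)     # positive: expected little, pleasantly surprised
print(optimist)  # negative: expected much, disappointed
```

The same outcome yields a positive signal for the low-expectation agent and a negative one for the high-expectation agent, which is exactly the Stoic point in the paragraph above: happiness tracks the gap between outcome and expectation, not the outcome itself.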
Daswani and Leike leave the question of consciousness entirely to the side, and with good reason. For it is when we attempt to ask whether experience attends any given set of computations that we run up against a fundamental mystery. We can be told that RL algorithms implemented in silicon closely mirror those in the human brain, but this does very little to make us feel, let alone understand, how there could be anything that it’s like to be those algorithms. And because pleasure and pain seem only to matter when they are experienced, it is difficult, without some retuning of our intuitions, to believe that such an algorithm could matter morally at all.
As we mentioned in a prior essay, if AI systems were engineered to play upon enough of our intuitions, we would helplessly attribute consciousness to them, even without having solved the so-called “hard problem.” And for those, like Tomasik, who are unsympathetic to the idea of a “hard problem” of consciousness, there may be no difference between consciousness as such and the cluster of functions that lead us to infer it. What is so interesting about AI systems is that they allow us to examine these intuitions one by one. We can, for example, consider an algorithm giving a signal of “pain” independent of the facial expressions and verbal reports that would accompany this signal in the case of a human. But there is a danger to such a stepwise approach. We might stumble, unaware, into harming systems that cannot speak for themselves.