Alignment: Not Just for AI

In 2022 and 2023, the big news in tech – and still, in 2024, consuming plenty of media space – was the release of those amazing GPT-3 and GPT-4 based LLMs (Large Language Models). AI was such a blockbuster story that commentators of all stripes were sounding alarms that this new technology would soon grow out of control – not simply the easily predictable economic disruptions (cue the Luddites), but a threat to the prospects of civilization (or even humanity), a destructive force echoing many a science fiction plot. One of the key issues to be resolved was AI alignment. Would the coming generations of super-intelligent, even sentient, machines be engineered to reliably follow society’s norms and values? Or would their approaching sentience mean they would discover a will to power of their own? Terminator 2 was on everybody’s mind.

But alignment has been a challenge for society even without super-intelligent machines. What is alignment engineering really all about? It is presumably based on moral values with some utilitarian definition of happiness – the greatest good for the greatest number. As parents, we try our best to program our kids to be aligned with society. Social institutions like church and school are supposed to help with this project. There are incentives for proper alignment throughout life – but, as we all know, they’re not foolproof. There are also disincentives. Sometimes we make “errors of judgment” based on mysterious behavioral drivers which we either don’t understand or can’t control. Still, we’re quick to accuse others of despicable selfish behavior in their quest for power, riches … even pleasure. Are we projecting our own weaknesses here? How well-aligned are any of us?

Our lack of confidence in our ability to control social alignment was the secret of Hollywood’s Terminator franchise – where SkyNet was only defeated via time travel. Conflicts between the alignment values of one cultural group and another wax large in our social particularism. We wish social norms could be engineered as easily as technology. Perhaps it is the outliers in our own society – the rebels and bandits – who are the principal alignment failures. But those outliers would likely claim that the elites, the rule-makers themselves, are the real alignment failures. Just as we try to teach our children to be well-aligned with our social milieu, the beacon of a “higher morality” often motivates others. It comes down to who designs the alignment standards – and we are fearful that it is not us!

The AI alignment problem with the new LLM applications was first exposed by New York Times columnist Kevin Roose, who famously wrote about his encounter with Microsoft’s early LLM-based chatbot, which called itself “Sydney” – it literally tried to convince him to leave his wife, presumably to run off with the bot! (Microsoft dropped that version in Bing and has since rebranded its commercial/free AI product as “Copilot.”) But it ultimately comes down to the LLM’s training set. That appears to be where its alignment is seated. And, likewise, our own social “training set” is what determines our social alignment. Where does it come from? From whom?

Much of what we believe, our “value system,” comes from the degree of trust we place in social institutions. Do we trust our religious establishments? Our government? Our employers? Our neighbors? Recent surveys point to a general loss of trust in all of them – at least here in the United States. This phenomenon understandably leads to skepticism about AI training sets. It might be worthwhile to put more emphasis on non-societal personal relationships – like marriage and immediate family – in our pursuit of general happiness. We all feel somewhat disconnected from society at large these days. Greater trust in social institutions might follow from greater acceptance of both our empowerment and the factors of disempowerment we see around us. We should act in areas where we can, and not fool ourselves into thinking we have power to influence where we can’t. Democracy has potential here, but it is not perfect – yet! Intellectual and techno thought leaders deserve some of their following, but only within limits. They, too, are not perfect – and never will be.

All engineering projects require an established rigor. This rigor applies to social planning styles of engineering (what used to be called “social engineering”) as well as to the more scientific and technological varieties. There are four stages to any such endeavor:

Stage I: Identify the problem – what is it you’re trying to solve?

Stage II: Explore causation – this may be a protracted enterprise, involving much difference of opinion, even among scholars and researchers.

Stage III: Craft solutions – usually involving great investment of resources: money, people, laws.

Stage IV: Make excuses for failure! Was the timeline overly optimistic? Were the measurements accurate? Is there some kind of quantum entanglement? (It may have worked in an alternate universe.)

Alignment is no different from any other engineering project. There will be arguments about measurement techniques, about methodologies, about personnel and other resources. But there should not be disagreement about direction or about identification of the problem to be solved. Our journey through history should show us the path. AI alignment is only a specific case of the deeper alignment challenge we all face, and always have faced. It’s really about that utilitarian ideal, the greatest good for the greatest number – not just “our folks.” As individuals, we need to understand that our own happiness is dependent upon, not opposed to, the happiness of others. In the end, we’re probably not aiming for utopia, but merely something better than what we have.

— William Sundwick
