Tuesday, January 03, 2006


Pre-Futurism,
The Singularity,
And Friendly AI



Eliezer Yudkowsky describes a scenario in which our future is populated by a Friendly version of Artificial Intelligence (AI). Let's look at a brief excerpt.


A "Friendly AI" is an AI that takes actions that are, on the whole, beneficial to humans and humanity; benevolent rather than malevolent; nice rather than hostile. The evil Hollywood AIs of The Matrix or Terminator are, correspondingly, "hostile" or "unFriendly".

Having invoked Hollywood, we should also go on to note that Hollywood has, of course, gotten it all wrong - both in its depiction of hostile AIs and in its depiction of benevolent AIs. Since humans have so far dealt only with other humans, we tend to assume that other minds behave the same way we do. AIs allegedly beyond all human control are depicted as being hostile in human ways - right down to the speeches of self-justification made before the defiant human hero.

Humans are uniquely ill-suited to solving problems in AI. A human comes with too many built-in features. When we see a problem, we see the part of the problem that is difficult for humans, not the part of the problem that our brains solve automatically, without conscious attention.

We hypothesize that AIs will behave in ways that seem natural to us, but the things that seem "natural" to us are the result of millions of years of evolution; complex functional adaptations composed of multiple subprocesses with specific chunks of brain providing hardware support. Such complexity will not spontaneously materialize in source code, any more than complex dishes like pepperoni pizza will spontaneously begin growing on palm trees.

Reasoning by analogy with humans - "anthropomorphically" - is exactly the wrong way to think about Friendly AI. Humans have a complex, intricate architecture. Some of it, from the perspective of a Friendly AI programmer, is worth duplicating; some is decidedly not worth duplicating; some of it needs to be duplicated, but differently.

Assuming that AIs automatically possess "negative" human functionality leads to expecting the wrong malfunctions; to focusing attention on the wrong problems. Assuming that AIs automatically possess beneficial human functionality means not taking the efforts required to deliberately duplicate that functionality.


The obvious question to me is: why would the programmers of this future technology, Friendly AI, be expected to make better moral decisions than the human race has made in choosing governments, business and religious leaders, or in administering justice?

In other words, why would we expect computer programmers to create a world that would be any more friendly than the world in which we currently live?

Thus far, we humans have not been able to make consistently good decisions for ourselves at any level. In the United States, Australia, and much of Europe, of course, we have set up systems that have worked reasonably well, but there has still been genocide, war, and all manner of miscarriages of justice and abuses of human rights.

If Friendly AIs possess intelligence far exceeding that of humans, augmented by a reasoning ability peculiar to the architecture of their software, then they can be expected to make moral decisions that are an order of magnitude more complex than ours.

As we saw in crude form in the movie 2001: A Space Odyssey, the simplest mistake in the original software architecture can lead to results that cannot be anticipated. I propose that this problem is only made worse by greater reasoning ability.

Or, at the very least, I think it is worth pondering.

What do you think?