Eliezer Yudkowsky is an employee with the Singularity Institute, a non-profit devoted to developing human-level or smarter-than-human Artificial Intelligence from a self-improving seed. The Institute also puts on the annual Singularity Summit, a conference that draws various scientists, futurists, businesspeople, and laypeople to discuss emerging technologies and their implications, especially as they relate to the potential enhancement of human intelligence or the creation of smarter-than-human Artificial Intelligence. Yudkowsky is also known for his extensive blogging on the econblog Overcoming Bias, where he formerly blogged with GMU economist Robin Hanson. Yudkowsky’s blog posts, which delve into philosophical issues surrounding rationality, identity, metaphysics, materialism, determinism, and quantum physics, to name a few, are archived at the community weblog Less Wrong.
Yudkowsky will be speaking at the upcoming Singularity Summit 2010 in San Francisco, August 14-15.
H+ Magazine: Hi Eliezer. What do you do at the Singularity Institute?
Eliezer Yudkowsky: My job title is Research Fellow, but I often end up doing things other than research. Right now I’m working on a book on human rationality (current pace is around 10,000-13,000 words/week for a very rough first draft; I’m around 150,000 words in and halfway done with the rough draft if I’m lucky). When that’s done I should probably block out a year to study math and then go back to Artificial Intelligence theory, hopefully ever after (until the AI theory is done, then solid AI development until the AI is finished, et cetera).
H+: What are you going to talk about this time at Singularity Summit?
EY: Tentative talk title, "Simplified Humanism and Positive Futurism." "Simplified Humanism" is what I call humanism after you simplify some of the moral judgments; for example, instead of holding that life is a good thing and illness is a bad thing up until age 80, at which point death is a good thing and old age is a blessing in disguise (gotta say, that’s one heck of a disguise, sure coulda fooled me), you just flatly say "life is good, death is bad" and you keep that moral judgment out to age 500, 50,000 or, should you manage to live so long, fifty billion years.
"Positive futurism" is the extension of the Enlightenment spirit that got us this far on out ahead; the project that Sir Francis Bacon once described as "the effecting of all things possible", though this is technically all sorts of wrong (it should be "the effecting of the best things possible using available resources" or something like that). It stands in contrast to the idea that we now have enough technology, and should do the sophisticated, mature, cynical thing and be pessimistic about the value of any future progress. I’ll probably also include a couple of words on the rationality-values in this particular memeplex, like the idea that every added detail in your story is burdensome and needs to be justified, rather than just going off and spinning big complicated fun entertaining predictions; and for that matter, the idea that positive futurism is about values, not predictions — it doesn’t say the future will be positive by default, it just includes a vision of a positive future that seems achievable and which is still considered a gettable stake on the board.
H+: Some people consider "rationality" to be an uptight and boring intellectual quality to have, indicative of a lack of spontaneity, for instance. Does your definition of "rationality" match the common definition, or is it something else? Why should we bother to be rational?
EY: Well, that’s a version of the concept I call "Hollywood Rationality", and my book is going to contain some lovely contrasting quotes from Richard Feynman and Spock. The tradition of rationality as passed down among real scientists, as opposed to the television sort, is about being willing to accept reality: to find it in yourself to listen to the results of experiments, rather than the thousand voices of wishful thinking, politics, fear, et cetera. The modern technical definition of rationality breaks down into epistemic rationality (believing what is true) and instrumental rationality (steering the future where you want it to go). All of this is so far removed from the Hollywood version that it might as well be on another planet, and not one that can talk to us faster than light, either. One of the best nontechnical definitions of rationality I’ve heard is "That which can be destroyed by the truth should be," and the converse is "That which the truth nourishes should thrive." Nowhere in probability theory does it say that you’re not supposed to feel anything, but it can, perhaps, tell you the way the world really is, so that what you feel is about the way the world really is, and not about some other world that doesn’t exist. And as for "lack of spontaneity," I’m not really sure how to answer that but I will say that up the chimney factor is a happy dance puppy.
H+: In your recent work over the last few years, you’ve chosen to focus on decision theory, which seems to be a substantially different approach than much of the Artificial Intelligence mainstream, which seems to be more interested in machine learning, expert systems, neural nets, Bayes nets, and the like. Why decision theory?
EY: The mainstream is justifiably interested in getting things done immediately that they already know how to do, which means being strictly pragmatic and thinking in terms of toolboxes full of tools that would be considered "ad hoc" if you were a scientist calculating statistical significance, and would be considered invalid if you were a mathematician producing theorems. If you look at the winning system on the Netflix Prize — to predict unknown movie ratings from known movie ratings for each user — it was a gigantic conglomerate of, if I recall correctly, literally dozens of different algorithms.
No science paper would ever have been allowed to calculate statistical significance like that, even though, in modern-day practice with hundreds of teams actively competing, that gigantic mixture of systems was the most efficient at extracting information from the data. But Artificial General Intelligence is not, I think, something that happens if you can mix together a few more dozen systems into a movie recommender. It’s something that is going to take things we currently don’t know how to do, and that are probably extremely different in kind from things we currently know how to do. That’s one reason to go back to basics. And if we want to build Friendly AI rather than just any AI, we will want the AI to understand itself and probably write proofs about its self-modifications. That means going back to mathy basics, not just trying to invent lots more ad hoc tools. But on an even deeper level, we are faced with a problem of understanding something we don’t currently know, and this is a different sort of challenge from throwing lots of tools at things. It means you have to think.
H+: What do you mean by Friendly AI?
EY: Two components. First, you have to be able to make an AI that wants anything it is designed to want and will go on wanting that. Second, you have to choose the right thing for the AI to want. Both problems are a lot harder than people seem to think. If you solve both you get a Friendly AI, though I sometimes use the term to refer to the first challenge only because it is more technical.
H+: What makes you think it would be possible to program an AI that can self-modify and would still retain its original desires? Why would we even want such an AI?
EY: If Gandhi doesn’t currently want to kill people, and you offer Gandhi a pill that makes him want to kill people, and Gandhi knows this is what the pill does, Gandhi will refuse to take the pill, because he knows that if he takes it, he will kill people, and that is not currently what he wants. That is an informal argument that most decision systems with coherent utility functions automatically preserve their utility function under self-modification if they are able to do so. If I could prove it formally I would know a great deal more than I do right now. We want an AI like that because there are a gigantic number of AIs in the design space, and most of them have random utility functions. Random utility functions will not steer the future anywhere we want it to go. We would prefer one that leads to a future galactic civilization that we would regard as worthwhile, as opposed to the whole universe being transformed into paperclips or something.
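Editor's note: the Gandhi pill argument can be illustrated with a toy sketch. This is only an illustration under strong assumptions, not a formal result and not anything from the Singularity Institute's work: the two actions, the two utility functions, and the names `accepts_pill` and `best_action` are all hypothetical. The point it shows is that the agent scores the future produced by its modified self using the utility function it has now.

```python
# Toy sketch of the Gandhi pill argument. All names and numbers are hypothetical.

# Each available action maps to the number of people killed as a result.
PEOPLE_KILLED = {"stay_peaceful": 0, "kill": 1}


def pacifist_utility(action):
    """Gandhi's current preferences: fewer deaths is better."""
    return -PEOPLE_KILLED[action]


def murderous_utility(action):
    """The preferences the pill would install: more deaths is better."""
    return PEOPLE_KILLED[action]


def best_action(utility):
    """The action an agent with this utility function would choose."""
    return max(PEOPLE_KILLED, key=utility)


def accepts_pill(current_utility, pill_utility):
    """Decide whether to self-modify, judging both futures by the CURRENT utility."""
    action_without_pill = best_action(current_utility)
    action_with_pill = best_action(pill_utility)
    return current_utility(action_with_pill) > current_utility(action_without_pill)


gandhi_takes_pill = accepts_pill(pacifist_utility, murderous_utility)
print(gandhi_takes_pill)  # False
```

The refusal falls out of the comparison in `accepts_pill`: the post-pill self would choose "kill", and the pre-pill utility function rates that outcome as worse than staying peaceful.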
The main reason we want AIs with goal systems we know are stable is that, as the eminent mathematician I. J. Good pointed out a while back, there’s a potential positive feedback cycle where an AI makes itself smarter, and having become smarter, is smart enough to see new design improvements. Then it goes back and makes itself even smarter, a process sometimes colloquially referred to as a FOOM, as in, "AI go FOOM", and less colloquially called an "intelligence explosion". So, it’s entirely possible that an Artificial Intelligence sufficiently good at self-modifying will end up way smarter than we are, and at this point what happens, possibly to the whole Earth plus all the galaxies we can see in our telescopes, is determined by what the AI wants to happen, which is what makes the problem of Friendly AI arguably the most important math problem known to science. In a very fast nutshell, that’s what the Singularity Institute is about.
H+: How does your rationality writing relate to your Artificial Intelligence work?
EY: The AI problem is sufficiently complicated and sufficiently full of tempting fallacies that we were literally not making progress on talking about it until we went all the way back to basics and started talking about the rules for reasoning about the problem. To give a very simple example, there’s something called the "conjunction fallacy" which means that if you ask one group of expert forecasters about the probability of a flood that kills more than 1,000 people in the next year, and another group of expert forecasters about the probability of an earthquake in California which leads to a flood that kills more than 1,000 people, the second group will give you higher probabilities, even though you asked them about a more complicated statement which must be less probable by the conjunction rule of probability theory, P(A&B) <= P(A): the probability that "A and B both happen" is always equal to or less than the probability that "A happens whether or not B happens". Things like that. It may sound odd, but every time we tried to explain anything, we eventually hit on one tempting little fallacy that would derail the person we were talking to, and finally I wrote up everything in about a million words of daily blog posts over more than a year, and people read it, and then we started making progress. So the book is going to be some of that, in hopes it will be a popular book and we can make some more progress. Again, whether or not it sounds odd, this is what we’ve found works in practice for explaining really complicated issues in futurism where there are lots of tempty and tasty fallacies.
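Editor's note: the conjunction rule mentioned in this answer, P(A&B) <= P(A), can be checked on a small worked example. The sketch below assumes a made-up joint distribution over two events; the probabilities are invented purely for illustration and are not forecasts from the interview or from any study.

```python
# Hypothetical joint distribution over two events (numbers invented for illustration):
#   A = "a flood kills more than 1,000 people in the next year"
#   B = "an earthquake in California happens in the next year"
# Keys are (A happens, B happens); values are probabilities that sum to 1.
joint = {
    (True, True): 0.02,
    (True, False): 0.08,
    (False, True): 0.10,
    (False, False): 0.80,
}


def prob(event):
    """Total probability of all outcomes for which event(outcome) is true."""
    return sum(p for outcome, p in joint.items() if event(outcome))


# P(A): the flood happens whether or not the earthquake happens.
p_a = prob(lambda outcome: outcome[0])

# P(A&B): the flood and the earthquake both happen.
p_a_and_b = prob(lambda outcome: outcome[0] and outcome[1])

# The conjunction rule: adding a detail can never raise the probability.
assert p_a_and_b <= p_a
print(f"P(A) = {p_a:.2f}, P(A&B) = {p_a_and_b:.2f}")
```

Whatever numbers are placed in `joint`, the outcomes counted for P(A&B) are a subset of those counted for P(A), so the inequality holds as long as no probability is negative.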
H+: The Singularity Institute turned ten years old in June. Has the organization grown in the way you envisioned it would since its founding? Are you happy with where the Institute is today?
EY: Nope and nope, and if you’d asked me ten years ago whether I’d be surprised to know I would say that ten years later, the answer would once again be nope. We’ve taken way too long to get where we are now. And the main thing which I am happy about is that I am no longer scurrying around trying to find enough money to stay afloat instead of doing research. Michael Vassar is doing the scurrying and for the last few years I have actually been able to work on real things. It’s an awesome feeling. But it took years to get here and all of us are feeling increasingly nervous about timescales.
Michael Anissimov is Media Director for the Singularity Institute and Co-Organizer of the Singularity Summit, an annual conference that focuses on emerging technologies like nanotechnology, biotechnology, robotics, and Artificial Intelligence. He also writes the popular futurist blog Accelerating Future. He lives in San Francisco.