Artificial General Intelligence and its
Potential Role in the Singularity

 

Ben Goertzel, PhD
June, 2006

            I’m almost 40 years old now, and one of the changes I’ve noticed in myself as I’ve grown older is – not surprisingly – a different attitude toward time.  I’m not talking about becoming more patient with age, though this has definitely happened.  Nor about becoming more comfortable with death – that’s never going to happen!  I’m not terrified of death, but I think the same thing about it as I always have, since childhood: it’s a very unfortunate thing that should definitely be avoided if possible... and I very much hope this becomes possible during my lifetime.  I’m talking more about achieving a greater visceral understanding of the dramatic ways in which the world can change during a fairly brief period of time.  It’s been an interesting period to grow up in.

            Unlike my grandfather, I can’t remember when onions cost a nickel.  But I can remember when computers were huge things that only existed in a few major labs, when there was no World Wide Web and the Internet was for military purposes only, and music and books and research papers didn’t exist in digital form.    There was no modafinil to keep you awake all night programming or partying, no Viagra to keep Bob Dole and the other old Republicans sexually active, no artificial retinas or cochlear implants, and there were wooden legs instead of electronic artificial limbs. 

            And in science, whole belief systems have risen and fallen during the course of my research career.  The complex systems research community didn’t exist when my intellectual career started – then the Santa Fe Institute came, and before long the complex systems theme spread everywhere and became mainstream.  All the radical ideas I read in musty cybernetics books in the library stacks back in the late 70’s and early 80’s have long since become mainstream science.  Macroscopic quantum effects aren’t just science fiction anymore.  I’ve been headhunted by quantum computing companies.  Famous physicists are writing books about the possibility of time travel.  Venture capitalists are psyched about nanotechnology companies.  Automated theorem-provers haven’t yet put mathematicians out of business -- but they’re used in order to verify the correct function of computer chips, and we couldn’t have modern computers without them.  There are books on sale in Barnes and Noble – or bn.com – containing images of blood flow inside people’s brains, taken while they carry out various forms of thought, perception, action and emotion.  We’ve genetically engineered mice that can regrow severed limbs.

            The message is clear: a lot can happen in just one, two or four decades.  But there’s a catch – at least, so far.  I’m not the only one to get the subjective impression that things are going faster and faster where science and technology are concerned.  But the impact on everyday life is another story.  Computer technology has transformed science – with the Internet, data analysis and visualization, laboratory automation and so forth.  But everyday life is really not all that different from how it was when I was a kid.  MP3 players are better than Walkmans, but it’s not exactly a world-changing transformation.  The Internet makes research a lot easier and better, but I’m still limited in how fast I can ingest information by the processing rate of my brain.  All in all, in spite of the amazing progress, I can see why some people might say all this advancement isn’t such a big deal, in human terms.

            But when you look a little deeper, it’s not hard to see that this gap between science and everyday life is probably a temporary situation.  As exciting as the last century has been, the next will be even more so.  The discoveries of the last century are going to make their way into everyday life – big time.  The same overall research methodology that brought us mice with regenerating limbs and cloned sheep is very likely to end aging, and make possible the creation of human beings with intelligence vastly greater than that of Einstein or von Neumann.  The overall research and engineering methodology that’s brought us World of Warcraft, Grand Theft Auto, Microsoft Word and Mathematica is going to bring us simulated worlds with more sensory and motoric richness than the real world – virtual worlds that humans can actually live in, so that we perceive the physical world around us as just one of many possible systems for organizing our experience.  Social institutions that we take for granted – like work and school – are going to disappear, as obsolete as the pack of pygmies hunting in the jungle.  Exactly when all this will happen is hard to predict – it could be just a couple decades, or things could drag on till around the end of the century.  But, barring some major calamity or some truly bizarre science and engineering bottlenecks, it’s going to happen sooner than most people think.

            But out of all these exciting possibilities on the horizon, there’s one that I think is more exciting and more important than all the rest – and it’s this possibility that I intend to talk about here.  This is the possibility of creating minds vastly more powerful – and vastly better, in various senses – than human minds.  This is what I call Artificial General Intelligence – to distinguish it from the special-purpose “narrow AI” programs that contemporary researchers are creating to carry out various tasks like chess playing, car driving and so forth.  Ending work, living forever and making video games more real than real life – these things are amazing, for sure.  But as long as human minds are the apex of intelligence, the future is limited by the scope of human imagination.  Human imagination is broad and wonderful and powerful, but compared to what’s possible in the total space of minds, it’s tiny and insignificant.  Just as the Earth itself is probably tiny and insignificant compared to what exists in the universe as a whole.

            The creation of Artificial General Intelligence – AGI – is the grand adventure.  Whatever other achievements you can imagine, they’ll be massively easier to achieve with the help of a benevolent AGI.  This is not a religious thing; I’m not talking about some mystical savior.  I’m talking about machines that we can actually build – and I believe we have the technology to do so now.

            Imagine the benefits that would accrue to a community of cats and dogs, living in very difficult conditions, if a community of advanced humans oriented toward helping them were suddenly to arise in their midst.  And then imagine something far, far beyond that – because ultimately we’re talking about intelligences vastly more different from us than we are from cats and dogs, since all mammals largely share the same genetic material.

            There are serious scientists and engineers working on AGI, right now.  I’ve spent the last decade of my life directly working on the problem – creating a design for a generally intelligent software system, based on cognitive science and computer science, and crafting the architecture to make it simpler and more efficient so it can be implemented and run on contemporary computers.  There’s still a lot of work ahead before you can boot up your computer and run a superhuman artificial intellect – but it’s definitely not a crazy idea.  It’s a palpable thing which is achievable by a relatively small team of experts working in a focused way for considerably less than a lifetime.

 

Quantitative and Qualitative Arguments for a Coming Singularity

 

            There’s nothing very new about the idea of massive scientific and technological advances causing massive changes in human life and experience.  Anyone who grew up on science fiction has this idea in their bones.  But lately the idea has become much more mainstream, and the arguments in favor of its plausibility have been laid out more carefully.  As of mid-2006 when I’m writing this, Ray Kurzweil’s recent book The Singularity Is Near has sold around 100,000 copies and is still going strong – this says a lot regarding the ripeness of the world these days for radical futurist thinking.  Kurzweil has eloquently laid out the case that the 21st century is going to bring a massive increase in the rate of creation and utilization of new science and technology, with radically transformative consequences in every domain of human existence.  People are naturally skeptical of this kind of idea, but the number and speed of innovations making it to the mass market have started to wake people up a bit to the reality of what the future may hold.

            Clearly, a Singularity-type outcome is far from guaranteed – the future is full of uncertainties.   But after careful reflection on all the evidence, a number of serious thinkers have concluded that it seems a reasonably likely prospect.  And more and more people are starting to realize that this isn’t as insane as it seems at first, from a naive “common sense” point of view.  Common sense assumes the future will be like the past – which is a good approximation in regimes of slow change, and not a good approximation for the 21st century.

            The argument for a coming Singularity can be approached from two directions: quantitative and qualitative.   Regarding the quantitative aspect, Kurzweil and his staff have drawn hundreds of curves showing trends analogous to Moore’s Law: the speed of computer processing, the spatiotemporal accuracy of brain scanning, the cost of sequencing a genome, the size of the smallest machine, and so on and so on, are all increasing exponentially and with impressive growth rates.  They have shown that, in a surprising number of cases, the growth rate can be extrapolated to approach infinity by mid-century.  While this kind of extrapolation can’t be trusted in detail, it’s certainly evocative.

            Of course not every aspect of technology demonstrates this kind of exponential acceleration: counterexamples abound – such as the relative stagnation in dinner fork technology, the palpable but slow rate of improvement in sock and shoe technology; and, more to the point, the arguably depressing rate of progress in software reliability.  But it’s not necessary for every aspect of technology to accelerate or even advance steadily in order to make a Singularity happen rapidly.   This point follows from the qualitative argument that a Singularity is coming, which to me is more important than the quantitative argument.  The key part of the argument for a coming Singularity is the contention that a few critical, transformative technologies are likely to be developed in the next few decades, the next century or so at the latest.  If these technologies come about – nanotechnology, genetic engineering, artificial intelligence, human brain enhancement – then a technological Singularity radically transforming human life is very likely to occur.            

 

AGI May Play a Special Role in the Singularity

 

            The Singularity, if it comes, will involve a variety of converging technologies – but, as I said earlier, among all these, there is one that is likely to have a special role, due to its uniquely strong ability to accelerate the development of other technologies.  This is Artificial General Intelligence (AGI): the creation of software systems with the capability to deeply understand themselves and the world, and then to use this understanding to solve an ever-expanding variety of different problems in a variety of different domains, and to improve their own capabilities based on what they have learned.  The reason I’m personally spending most of my time working on AGI instead of other aspects of science or technology is that I think AGI has a strong potential to play a very special role during this last phase of pre-Singularity human life.

            Genetic engineering, quantum computing, nanotechnology, brain-computer interfacing, artificial intelligence – none of these Singularity-enabling technologies are discrete, separate entities.  Many of these technologies have the capability to accelerate the rate of one another’s advancement.  Nanotechnology, sufficiently developed, would enable extremely rapid advances in brain-computer interfacing, quantum computing, AI and other areas.  Quantum computing would accelerate other technologies via permitting the rapid execution of massive-scale simulations of various processes.   And so on.

            The greatest acceleration potential, however, is provided by technologies that address the problem of improving the intelligence of the scientists and technologists themselves.  Super-fast quantum computers and powerful nanomachines will be wonderful indeed, but their capability to help us will still be limited by the capabilities of the humans who use them.  Technologies like advanced brain-computer interfacing and genetic engineering, with the potential to significantly increase the maximum level of human intelligence, would be very likely to increase the rate of advance of all other sciences and technologies via the creation of smarter scientists and technologists.  And the same holds, even more so, for AGI.

            Intelligence-enhanced humans will be better able to develop the next wave of technologies than ordinary humans – and on a personal level, it is easy to see the appeal of enhancing human intelligence before moving ahead with other revolutionary technologies.  Perhaps, once we’re smarter, we’ll be able to better understand how to deploy these other things, combining advanced intelligence with human sensitivity.

            However, the dramatic enhancement of human intelligence, in the near term, is fraught with difficulties.  Biological science is nowhere near advanced enough to permit the initiation of serious experimentation in this domain.  And once bioscience is ready, one can expect a significant amount of criticism to be leveled at this sort of research, on ethical grounds. 

            There is also reason to doubt whether humans made dramatically more intelligent than current humans would still retain their fundamental human-ness – any more than a dog with a 180+ IQ would still be a dog in any fundamental sense.  Ultimately, the creation of intelligence-enhanced humans turns into the creation of artificial intelligences based on a biological rather than silicon-chip substrate, and using current humans as an initial seed.  Given the complex and in many ways suboptimal architecture of the human brain, it is not clear that the guidance of enhanced humans would necessarily be more reliable from the perspective of ordinary humans than the guidance of digital artificial intelligences.

            Enhancement of human intelligence is important and should be done – indeed, it would be unethical for us, as a society, not to pursue it.  But it seems clear to me that, as compared to Artificial General Intelligence, human intelligence enhancement has significantly less potential to accelerate the rate of advancement of science and technology in the near term.

 

What About the Destructive Potential of AGI?

 

            “With great power, comes great responsibility.”  I believe this quote comes from Spiderman’s uncle – Uncle Ben – and it aptly addresses some of the major issues that AGI brings along with it.   A number of futurists have complained, with some justice, that Ray Kurzweil’s book  gives short shrift to the potentially  hazardous nature of Singularity-enabling technologies.  Some futurist thinkers with a less optimistic bent than Kurzweil have asked some difficult questions:  Couldn’t acceleration of technological and scientific advancement be dangerous – potentially posing an “existential risk” of annihilating humanity in toto?  Couldn’t a rogue AGI turn against us – perhaps taking inspiration from one of the numerous science fiction movies and novels with this theme?

            The only honest answer is: Yes, potentially, bad things could happen as the result of AGI development.  A rogue AI like SkyNet in the Terminator movies, or Colossus in the old film Colossus: The Forbin Project could certainly occur.  However, it’s well known that science fiction is better at using the future as a medium for communicating current hopes and fears, than at actually projecting the future.  There are clear potential dangers regarding AGI, but science fiction should not be taken as a guide to discussing them. 

            There are significant dangers associated with AI, and also very significant dangers associated with human intelligence enhancement, nanotechnology, biotechnology  and a host of other radical technologies.  And, given our current level of knowledge, there is really no way to rationally assess which of these poses the greater danger.  Just as each of these awakening technologies has the capability to accelerate the others, so each one has the capability to help us defend against the dangers that the others pose.  As an example, a powerful and benevolent AGI would be a valuable ally in defending against bio and nano terrorism.

            Some futurist philosophers have taken an alarmist tack, arguing that unless AGIs are designed in such a way as to guarantee their benevolence to humans, the odds are high that they will be destructive to humans, not necessarily via malice but perhaps via reappropriating the molecules constituting our bodies for purposes they judge more valuable.  But this is an unwarranted assumption.  Right now we are simply too ignorant about AGI to know what the real dangers are.  We currently have no rational method to calculate the odds that a superhumanly intelligent AGI will be benevolent versus malicious versus indifferent.

            My own feeling is that if we play our cards in a reasonably sensible way we can create powerful AGIs that are beneficial rather than harmful.  Suppose we create an AGI that qualitatively appears to us to have a benevolent “personality” – through a combination of how we engineer it and how we teach it.  Then, suppose we engineer it so that one of its few top-level goals is: “Don’t allow any of my top-level goals to change significantly” – and then give it plenty of instruction in this area, to be sure it intuitively understands this aspect of its goal system and dynamics.  If an AI system thus architected and taught has a reasonable level of self-understanding, it seems likely that it will remain reasonably benevolent as it grows and improves itself.  I have no formal, ironclad proof that this kind of simple approach to safe AGI development will be effective, but the hypothesis seems plausible enough to be worth exploring.

            So, the situation with AGI and its dangers is as follows: We know the potential for good is tremendous, and the potential for danger equally so.  And there are simple, plausible approaches that seem reasonably likely to result in AGIs that are benevolent rather than malevolent.  Furthermore, we know there are other sorts of technologies being developed in parallel to AGI, also with great potential benefits and great potential dangers.  Now, if we were living in a society with a primitivist, back-to-nature focus, in which none of these other radical technologies were being pursued, it might make sense to restrain the rate of AGI development in order to ward off the possible dangers while studying more thoroughly how to guarantee benevolence.  But in the current situation, where these other technologies are also being pushed ahead, I believe it makes sense to push ahead with AGI as rapidly as possible.

            In a nutshell: AGI has more potential for good than anything else on the horizon, whereas regarding dangers, AGI is “merely” one among a number of potential hazards looming; and there are straightforward techniques that may minimize these potential hazards.  And most importantly, the dangers of AGI are relatively easy to study because computer programs are easier to experiment with than biological organisms or nanomachines.  There are all sorts of experiments we can carry out with our early-stage AGIs to assess the safety – or otherwise – of allowing them to get smarter and smarter and more and more powerful.

            AGI researchers should be extremely and exquisitely cognizant of the potential dangers of their work – in the same way that chemists and nuclear and genetic engineers should be.   But given the tremendous potential benefits, and the broad range of dangers facing the human race as we move forward, it would be foolish to forego or delay AGI research because of the general, poorly-understood possibility of danger.

 

How Fast Will It Happen, When It Happens?

 

            When superhuman AGI does come about, how fast will it come?  Could it happen that one day – out of the Internet or out of some nerd’s computer – a digital god just emerges, whole and mature, ready to do our bidding?  Or will the emergence of AGI happen slowly, step by step, with an increase of half an IQ point each year as a team of hundreds of AI engineers add more machines and more cognitive algorithms?

            We really don’t know of course – though I have my own opinion.   In the lingo of AGI futurism, what I’m talking about here is the distinction between “hard versus soft takeoff” scenarios.  This distinction turns out to be extremely relevant to the AGI safety issue.  A hard takeoff scenario is one in which an AGI rapidly develops its intelligence from the human level to the massively superhuman level.  A soft takeoff scenario, on the other hand, is one in which this transition takes decades or more, so that there is a long period in which roughly human-level AGIs coexist with humans, without possessing massively superhuman powers.  It’s not clear which of these is the more likely type of scenario, though I gravitate toward the hard takeoff possibility, for reasons I’ll discuss a little later. 

            How does the hard versus soft takeoff issue relate to AGI safety?  It helps to divide the safety issue into two categories: generic and system-specific.  Generic AGI safety issues apply no matter what AGI system you’re talking about.  System-specific AGI issues are extremely different depending on the specifics of the AGI system in question: the design of the system, the method of teaching it, and so forth.

            An example of a system-specific AGI safety issue is the question of safe self-modification:  How do you build an AGI that is not likely to do anything dangerous, even after it repeatedly modifies itself based on its experience?  I’ll come back to this one later.

            On the other hand, discussion of generic AGI safety issues leads immediately into politics.  One runs into questions like: What kind of organization do we want to see supervising the development of the first human-level artificial mind, or the first superhuman artificial mind?

            In a soft takeoff scenario, it doesn’t matter so much who develops AGI first.  In a soft takeoff, it’s likely that after the basic principles of AGI design, implementation and teaching become known, every major government and corporation will have an AGI of its own.  The legal issues regarding AGIs will become significant – a topic currently being explored by Martine Rothblatt, among other legal experts.   AGIs could even become a part of every home, if this was judged legal and ethical; for instance one might have specialized human-level intelligences as “non-player characters” in video games, or as the equivalent of secretaries operating within Microsoft Office 2050!  Possibly the most powerful AGIs would be held by organizations such as IBM or the US military with the access to the greatest amount of computer hardware – or by Wall Street trading firms interested in using AGIs to trade the markets (which naturally will be the only way for them to win in the markets since everyone else will be using an AGI trader too).

            In a hard takeoff scenario, on the other hand, the “first mover advantage” may be incredibly significant.  If the path from human-level AI to vastly superhuman AI takes years rather than decades – or even takes months, days, hours or minutes – then it makes a very, very big difference who gets there first.  In this type of scenario, potentially the creation of the first human-level AGI could be kept secret for the whole duration of the takeoff, so that the world would only find out about it after it had already become the most powerful being on Earth.  Or, even if it wasn’t kept secret, it might simply be too hard for any other group to catch up in time once they found out what was going on.

            The hard and soft takeoff scenarios don’t necessarily contradict each other, either: we could well have a soft takeoff for a while, with human-level AGIs integrating themselves into society, followed by a hard takeoff when some human or AGI figures out a better AGI design more capable of rapid advancement.

            In the soft takeoff scenario, whoever develops the first AGI – if they want to remain major players in the AGI domain -- must be careful to do it in a way that ensures they can continue to develop their project at a rapid rate in spite of the presence of potentially powerful competitors.  Many times in history the individuals who first developed a new technology were not the ones who profited from it, and ended up having very little input into the course of the technology’s development.  One way for the first-movers to increase their odds of retaining continued leadership is to use their AGI to make a large profit before others begin to catch up to them in the AGI race.  This may be carried out via applying their AGI to one of the many exciting application areas potentially benefiting from powerful computational intelligence – financial prediction, drug discovery, or natural language question answering, to name just a few examples.

            In a hard takeoff scenario, on the other hand, the considerations are totally different.  There are many people and organizations on this planet that I personally would not feel comfortable having control of an AGI system capable of undergoing a hard takeoff and becoming tremendously more intelligent than anything else on Earth.  There are a lot of subtle issues involved with self-improving AGI systems, both technical issues and ethical ones, and just because some individual or institution has the motivation and resources to fund the creation of an AGI doesn’t mean that they will necessarily have thought through these issues thoroughly.  (I almost want to say “lived through” rather than “thought through,” as these are extremely deep and difficult issues, and anyone who truly takes them seriously will inevitably wind up spending a large percentage of their time mulling them over from every available perspective.)

 

What Are the Bottlenecks?

 

            What are the bottlenecks standing in the way of achieving powerful AGI? The first answer is: We don’t really know, of course.  Until AGI is achieved we won’t really know for sure how AGI can be achieved.

            But still, even at this stage we can make some plausible guesses.  It’s possible that computer hardware is a bottleneck.  Some theorists have argued that hardware is a bottleneck by making rough estimates of the processing power of the human brain, and then calculating that, according to Moore’s Law, computers won’t catch up to the human brain for decades.  This is an interesting sort of computation, but it makes two poorly-justified assumptions: one, that we know enough about how the brain works to meaningfully estimate its “processing power”; and two, that a well-designed AGI won’t be able to make more efficient use of its available processing power than the brain does.
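
            Just to make the flavor of this kind of calculation concrete, here is a toy version of it in Python.  The figures are illustrative only: the estimate of 10^16 operations per second for the brain and the 18-month doubling time are both commonly cited but heavily contested assumptions, not established facts.

```python
# Toy version of the "when does hardware catch up to the brain?" estimate.
# Both figures below are illustrative assumptions, not established facts.
import math

BRAIN_OPS_PER_SEC = 1e16      # a commonly cited, much-disputed brain estimate
PC_OPS_PER_SEC_2006 = 1e10    # rough order of magnitude for a 2006 desktop
DOUBLING_TIME_YEARS = 1.5     # classic Moore's Law doubling period

ratio = BRAIN_OPS_PER_SEC / PC_OPS_PER_SEC_2006
years_to_parity = DOUBLING_TIME_YEARS * math.log2(ratio)

print(f"gap of {ratio:.0e}x closes in roughly {years_to_parity:.0f} years")
# ~30 years on these assumptions -- but shift either estimate by a couple of
# orders of magnitude, or let the software use its hardware more efficiently
# than the brain does, and the answer moves by decades.
```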

            My own opinion is that computer hardware probably isn’t the bottleneck.  It could be, of course – but I’ll only be willing to conclude that it is once we have a likely-looking AGI software system running on a sizeable network of contemporary computers, and we have a reasonable technical argument that the system might display far more intelligent behaviors if it were implemented on a bigger network or a network composed of faster computers.

            It seems to me that, until very recently, the bottleneck has been software design.  No one has had an adequate design for an AGI system.  It’s not as though people have been implementing well-designed, well-thought-out AGI systems on contemporary computer networks and then stomping up and down because their systems can’t achieve the predicted level of intelligence because the computers aren’t good enough or the network isn’t big enough.  Rather, nearly all so-called “AI” research consists of what Kurzweil calls “narrow AI” – software systems that solve some particular class of problems generally associated with intelligence, like playing chess or driving a car or diagnosing a disease based on a list of symptoms, or placing trades in the market based on databases of historical market data. Very, very few serious AGI system designs have ever been proposed – and nearly all of those that have been proposed have obvious conceptual flaws.  I think the main reason we don’t have an AGI now is that no one has known how to do it.

 

Minds and Patterns

 

            From a certain perspective, creating a workable AGI design can seem incredibly hard.  Humans have so many capabilities, and so much natural variability – so much ability to grow, create and learn.  The machines we’ve made so far are a lot less flexible, adaptive and inventive than an average human – let alone a creative genius.  If creating a new version of Microsoft Word takes a team of hundreds a couple years, then surely creating an AGI software program must be out of reach!

            But if one takes the right perspective, the job of AGI design can seem less onerous.  The first thing one has to realize is that, at its heart, intelligence isn’t such a complex thing – in fact the essence of intelligence is very simple.  A mind is a system for recognizing patterns in itself and in the world – nothing more and nothing less.  A mind learns to achieve its goals by recognizing patterns regarding which behaviors have helped it achieve similar goals in the past. 

            There you go – it’s simple – just plug in a supergoal to a pattern recognition engine, add in some sensors and actuators, and you’re done!   AGI is achieved!
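
            Of course, that one-line recipe hides all of the difficulty, but it can at least be made concrete.  Here is a minimal toy sketch in Python of a goal, a crude pattern recognizer, sensors and actuators wired into a loop.  It is offered purely to show the shape of the idea, not as anything resembling a workable AGI.

```python
# Toy "supergoal + pattern recognizer + sensors + actuators" loop.
# The recognizer just remembers which action, in which situation,
# tended to improve the goal function. Illustration only.
import random
from collections import defaultdict

ACTIONS = ["left", "right", "wait"]

class PatternRecognizer:
    def __init__(self):
        self.scores = defaultdict(float)   # (observation, action) -> usefulness

    def choose(self, observation):
        if random.random() < 0.1:                      # explore occasionally
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.scores[(observation, a)])

    def learn(self, observation, action, goal_delta):
        self.scores[(observation, action)] += goal_delta

def sense(world):       # "sensors": a crude summary of the world state
    return "near" if abs(world) <= 1 else "far"

def act(world, action):  # "actuators": actions change the world
    return world + {"left": -1, "right": +1, "wait": 0}[action]

def goal(world):         # the "supergoal": stay near position 0
    return -abs(world)

world, mind = 5, PatternRecognizer()
for _ in range(200):
    obs = sense(world)
    action = mind.choose(obs)
    new_world = act(world, action)
    mind.learn(obs, action, goal(new_world) - goal(world))
    world = new_world

print("final position:", world)
# The agent typically ends up oscillating around the goal position
# rather than drifting away, because unhelpful actions accumulate
# negative scores and get abandoned.
```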

            Recently some mathematicians have proved some nice theorems along these lines.  Probably the most impressive work has been done by Marcus Hutter and Juergen Schmidhuber in Switzerland.  What Marcus Hutter has shown is that, if you have a truly huge amount of computing power, achieving an arbitrarily high degree of intelligence is easy.  This is fairly obvious intuitively, but it’s nice to have a rigorous proof.  The only problem is that the amount of computing power required is more than could be achieved using all the particles in the observable universe.
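
            To convey the flavor of this kind of result (and this is only a cartoon of the idea, not Hutter’s actual construction), consider what unlimited computing power buys you: if you can afford to enumerate and evaluate every possible policy, picking the best one is trivial.  The toy sketch below does exactly that for a two-observation, three-action world; the catch is that the number of policies explodes as the world gets richer.

```python
# Cartoon of "intelligence via unlimited computing power": enumerate every
# possible policy over a tiny observation space and keep whichever scores
# best. This is only the flavor of the idea, not Hutter's AIXI.
from itertools import product

OBSERVATIONS = ["near", "far"]
ACTIONS = ["left", "right", "wait"]

def evaluate(policy):
    """Total reward of a fixed observation -> action lookup table."""
    world, reward = 5, 0
    for _ in range(20):
        obs = "near" if abs(world) <= 1 else "far"
        world += {"left": -1, "right": +1, "wait": 0}[policy[obs]]
        reward += -abs(world)
    return reward

all_policies = (dict(zip(OBSERVATIONS, choice))
                for choice in product(ACTIONS, repeat=len(OBSERVATIONS)))
best = max(all_policies, key=evaluate)
print("best policy:", best)
# Here there are only 3**2 = 9 policies to try.  With realistically rich
# observations and long time horizons the count is astronomical -- which is
# why this theoretical route doesn't hand us a practical AGI.
```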

            So much for theory – now what about reality?  The reason it’s not so simple to build an actual AGI system in the real world is that pattern recognition is extremely computationally expensive – and in the real world we’re dealing with processors of finite speed, and memory storage units of finite capacity.  To make intelligence work given finite computational resources isn’t so easy.  General-purpose pattern recognition methods are too inefficient.  You need to take some general pattern recognition methods and a variety of specialized pattern recognition methods and put them together.  Which specialized methods you need depends on the kind of intelligent system you’re building – there may be specialized pattern recognition methods dealing with visual patterns, or patterns in how to move parts of the body, or linguistic or logical patterns and so forth.  Then you need to create a framework in which these special and general pattern recognition methods can work together effectively and efficiently: they all may work differently, but they need to speak a common language and share information together in real time.
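
            To illustrate what speaking a common language might mean in software terms, here is a deliberately simplistic sketch.  The Pattern class and the recognizers are invented for this example and bear no direct relation to the actual Novamente representation.

```python
# Sketch of specialized recognizers feeding one shared pattern store.
# The Pattern class, the recognizers and the "shared vocabulary" strings
# are all hypothetical illustrations.
from dataclasses import dataclass

@dataclass
class Pattern:
    description: str     # what the pattern asserts, in the shared vocabulary
    strength: float      # how strongly it seems to hold (0..1)
    source: str          # which recognizer produced it

class VisualRecognizer:
    def recognize(self, scene):
        if "red" in scene and "ball" in scene:
            return [Pattern("object(ball) & color(ball, red)", 0.9, "vision")]
        return []

class LinguisticRecognizer:
    def recognize(self, utterance):
        if "ball" in utterance:
            return [Pattern("topic(conversation, ball)", 0.7, "language")]
        return []

class GeneralRecognizer:
    """Looks for cross-modal patterns in what the specialists reported."""
    def recognize(self, store):
        sources = {p.source for p in store if "ball" in p.description}
        if {"vision", "language"} <= sources:
            return [Pattern("grounded(word=ball, percept=ball)", 0.8, "general")]
        return []

store = []
store += VisualRecognizer().recognize("a red ball on the floor")
store += LinguisticRecognizer().recognize("look at the ball!")
store += GeneralRecognizer().recognize(store)

for p in store:
    print(f"[{p.source}] {p.description}  (strength {p.strength})")
```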

            All this is necessarily complicated, but there’s a sense in which it’s not that profound.  None of the parts of a mind need to be all that profound – the pattern-recognizers, the common language for describing patterns.  The brain does pattern recognition using various kinds of specialized neural circuits, and it represents patterns using patterns of connections between neurons.  AI software programs may represent and recognize patterns by different methods – they don’t need to have simulated neurons and so forth.  In either case, the really profound thing is the structures and dynamics that emerge when you put all the pieces together and let all the pattern-recognizers interact with each other to build up a body of patterns recognized in the system and the world – based on the system’s interaction with the world and with other minds in the world.   Among these patterns that emerge when you put all the pieces together are the things we call self, awareness, will and memory.  From the subjective point of view these feel like special qualities – but from an engineering or scientific point of view, they’re just emergent properties that come about when you put together certain kinds of pattern recognizers in a system and have them interact with an environment.  I spent the first decade of my research career working on theoretical cognitive science, with a focus on using the ideas of pattern and pattern recognition to explain various aspects of intelligence.

            So, the task of AGI design comes down to designing the right specialized pattern recognizers, and the right language for recognizing patterns, so that the emergent structures and dynamics associated with mind may come about.   This is not an easy task by any means, but it’s not an ineffable and mysterious task either.  It’s just hard engineering and science work, involving an integration of knowledge from cognitive science, computer science, mathematics and neuroscience.

 

Bringing Up Novababy

 

            The next part of my message is probably the most controversial one: I think I know how to do it.  I’m not absolutely sure that I do – I’m enough of a scientist and a skeptic not to be absolutely sure of anything -- but I feel pretty confident.  I’ve been working on narrow AI research, cognitive science theory and AGI design for a long time now, and it seems to me that the Novamente AI Engine design that my colleagues and I have been working on actually has what it takes.

            I should clarify what this means:  Has what it takes to do what? Creating a powerful AGI is a complex, long-term project, and our intention in the Novamente project is to approach it in multiple stages.  My colleagues and I have created a 3D simulation world similar to a video game world, and our Novamente system controls a humanoid agent in this world.  The humanoid – the Novababy – lives in a simulated apartment, much like one sees in The Sims or similar games.  The goal of the first stage of our project is to make a Novababy that can act, roughly speaking, like a simulated human baby.  Not a very young baby that just lies on its back and waves its arms and says “goo goo goo,” but a baby roughly equivalent to a human one year old.  This is where we are right now: in the middle of the first stage, working on Novababy, moving its intelligence gradually toward the one-year-old level through a combination of software engineering and instruction.

            Then, for the second stage, we’ll teach Novababy human language.  Not by feeding it information from linguistic databases or masses of Web pages, but by talking to it about the things that it sees and does in the simulation world.  It will learn language like a human toddler does – so everything it says, it will understand in the context of its own life.  No AI program has ever achieved this before.  There are AI programs that use human language, of course – there are chat bots that pretend to hold a conversation with you – but none of them really understands what they’re talking about in the sense that a human language-user – even a small child – does.  At this stage we can give it all sorts of psychological tests like the ones used by child psychologists, to assess how well it understands the world.  We’ll be able to study its development, and hopefully create the beginnings of a new science of “AGI developmental psychology.”

            In the third stage we’ll cheat a little.  Now that the system has some real understanding of its world, based on its own embodied experience and its interactions with us, it has the context to integrate knowledge from outside its experience.  We can feed it knowledge from dictionaries, encyclopedias and the Internet – it can surf the Web like anyone else, only potentially faster.  We can give it access to software like Mathematica to help it carry out calculations faster.  At this point the development of the system begins to deviate rather far from human psychology, but it will still be relatively simplistic and completely within our control in terms of the underlying dynamics of its learning and the representations in its memory.  Now we have a system that can really reason abstractly, that can reflect on itself and us in the context of the knowledge it’s achieved.  Such a system will be able to act, to an extent, as an artificial scientist and engineer, helping us make new discoveries in various domains. 

            At this stage the system should lead to some exciting things, and be able to serve as the core of some unprecedentedly powerful software applications.  I could speculate for a long time here about the various thrilling things we could do with an artificial mind like this, but the truth is we’ll only know once we’ve built it what its real strengths and weaknesses are.  One strength I’m sure it will have, though, is the ability to answer questions posed to it in English based on the knowledge it has.  So one application of this artificial mind will be to create the world’s first really effective natural language question-answering system – a software program with full access to all the information on the Net, as well as the information in any databases we choose to give it access to, and with the ability to understand our English questions and to find information on the Net addressing our questions and synthesize the information in an intelligent and relevant way.

            This is one application that I believe will be extremely effective for making money using an AGI at this intermediate level of development.  Unlike us humans with our single focus of attention, a single Novamente system should be able to answer a huge number of different questions at once – and not just regurgitating information it’s found, but creating new information by synthesizing bits and pieces of knowledge that it’s found in various places.  This application in itself will be an amazing tool for advancing human knowledge.

            And this is the point – somewhere in the middle of this third stage of development, when the Novamente system is able to read information and answer our questions – where we will stop, take a deep breath, and try to understand the system really thoroughly – trying to come as close as we can to a science of AGI development and AGI dynamics.  Because when we take the next step beyond here, moving into the fourth stage along our developmental path, we’re getting into potentially dangerous territory.  We’d like to create a powerful artificial AI scientist, able to make new discoveries in AI and modify its own internal structures accordingly, but at this point things become unpredictable.

            Given the way the Novamente architecture is set up, as long as a Novamente system can’t modify its own algorithms for learning and its own techniques for representing knowledge, it’s possible for us, as the system’s designers and teachers, to understand reasonably well what’s going on inside its mind.  Unlike the human brain, Novamente is designed for transparency – it’s designed to make it easy to look into its mind at any given stage and see what’s going on.  But once the system starts modifying its fundamental operations, this transparency will decrease a lot, and maybe disappear entirely.  So before we take the step of letting the system make this kind of modification, we’ll need to understand the likely consequences very well.  Right now we don’t have the understanding to make this kind of decision – but right now we don’t have the data that would be necessary to build such an understanding, either.  The understanding required to analyze and predict the dynamics of strongly self-modifying Novamente systems will come out of experience in observing progressively more and more intelligent Novamente systems as they grow and learn.

            Launching a superhuman AI is a really big decision.  The magnitude of the decision can hardly be stated forcefully enough.  This is bigger than nuclear weapons or human cloning or brain enhancement: it potentially changes everything.  I would be a lot more comfortable proceeding to launch an AGI system with the capability to increase its intelligence beyond the human level if I had a solid theoretical understanding of the way AGI’s value systems tend to change as they encounter new experiences and as they modify themselves progressively over time.   But I am fairly pessimistic about the possibility of achieving this kind of theoretical understanding through armchair theorizing alone.  Theory will play a role but theory works better when grounded in experiment.   Theory is what told the great mathematicians and physicists of the world that human flight was impossible, before the Wright Brothers went ahead and did it.  Understanding the properties of powerful self-modifying AGIs is going to be hard, and we need to use all the tools at our disposal: both theory and experiment.

            Now, it’s possible that what we’ll learn, during this period of study, is that it’s too unsafe and unpredictable to let a smart Novamente system radically modify itself.  It’s possible we’ll learn that the Novamente design itself is unsuitable for this next stage – maybe Novamente will help us to design a better AGI.  It’s possible that, with Novamente’s help, we’ll figure out amazing new methods for human brain enhancement and make ourselves massively smarter, so we ourselves can think more clearly about the next steps to take with AGI.  Or it’s possible we’ll decide that the best thing is to let Novamente proceed and modify itself, becoming more and more intelligent and less and less within our comprehension and control.

 

How Long Will It Take?

 

            All this may sound awfully science-fictional, and I expect you to be skeptical.  Just to be sure you understand what I’m saying, I’ll take some effort to phrase the statement I made above more carefully.  I said that the Novamente AI design has what it takes.  But I don’t want to be misleading about what the design constitutes.  I don’t have a blueprint for an AGI sufficiently detailed that I could hand it off to an outsourcing shop in Bangalore and have them return, a few years later, with a superhuman digital mind.  It’s not worked out in quite that much detail.

            What I do have, though, is a detailed design for an AGI, at a somewhat higher level of abstraction: parts of it at the “computer scientist” level of detail, and parts of it at the “programmer” level of detail.  The details are outlined in hundreds of pages of written material -- as well as tens of thousands of lines of software code written by the programmers and scientists I’ve been working with on Novamente since 2001.  I have been lucky to find a dedicated team of brilliant scientists to work together with me on transforming the Novamente design into a thinking machine, including Cassio Pennachin and Andre’ Senna and my wife Izabela from Brazil, Ari Heljakka from Finland and Moshe Looks from Israel and the USA.  We’re working hard to implement the design right now, but it’s going at a pretty slow pace because the design is big and complex and there aren’t very many of us.  There are plenty of small gaps in the design where details need to be filled in, and I’m working steadily on filling them in – but actually, most of these gaps are most easily filled in while implementing the software, teaching it in the simulation world, and seeing how well it learns.

            In my judgment the completion of the programming of the Novamente system, together with the filling-in along the way of the various small gaps in the design, is going to take a dedicated team of 10 really excellent programmer-scientists – including the current Novamente team -- a period of somewhere around 3-7 years to get to the point where the system has a strong reflective understanding, and it’s time to stop focusing on building and teaching and focus more of our attention on studying the thing we’ve made, and on working with it to solve critical problems and create useful applications.

            Three to seven years ... or, if I’m being overoptimistic, perhaps it may take 10 years of concentrated effort to get to that stage.  Whether that seems like a long time or a short time depends on your perspective.  This is a long time compared to a single PhD thesis project, or compared to the 1-3 year periods usually funded by government computer science research grants.  But those aren’t really the right comparables.  Given the magnitude of what’s at stake here, this is not really such a huge amount of work.  The Apollo rocket or the atomic bomb projects took a lot more effort than this – to name a couple really important, innovative, ambitious engineering projects in recent history.  And of course nearly any major software program written in the last decade has required a lot more effort than this – there are literally thousands of people working on the next version of Microsoft Windows!

 

Who Else is Seriously Trying?

 

            Who else besides myself and the Novamente team is seriously trying to create a powerful AGI, right now?

            Surprisingly enough, there haven’t been all that many attempts to create AGI.  If you look at the history of AI, the number of really serious efforts is quite small.  There’s Cyc, which has been around since the 1980’s, but so far they’ve focused almost entirely on building a giant database of knowledge expressed in logic format – their work on “artificial thinking” has consisted almost exclusively of work on logical reasoning, which is only one aspect of intelligence.  There are SOAR and ACT-R, which are interesting projects, but have really focused more on modeling human cognition in particular domains rather than trying to achieve autonomous artificial cognition.  The robotics community has done some great work but hasn’t really tried anything more ambitious than artificial insects or artificial cars – nothing with any potential for real self-understanding. 

            A number of projects have popped up with the long-term goal of emulating the human brain. I think this is an important direction of research, and it’s one that’s sure to succeed eventually – but realistically, it’s going to be decades before the neuroscientists tell us enough to let us make any kind of realistic simulation of the parts of the brain responsible for cognition and awareness and the really interesting aspects of human intelligence. IBM’s Blue Brain project and Artificial Development’s CCortex are building distributed infrastructures for brain simulation -- but so far these aren’t being put to very dramatic use, because the neuroscientists don’t yet know enough about how the brain works to make a useful simulation of any of the brain’s impressive aspects.  Jeff Hawkins’ Numenta is an interesting project, but the details he’s released really only seem adequate for making an artificial visual cortex – he doesn’t have any special knowledge about the brain, though he has an innovative theoretical framework for explaining some of the data neuroscience has produced.  None of these groups have the remotest idea how the brain represents and manipulates abstract knowledge, and nor do they have a sufficiently detailed map of the brain to be able to emulate it without such understanding.

All in all, no major, well-funded corporate, university or government research lab has yet made a really serious attempt to create an artificial general intelligence based on computer science and cognitive science principles.  Now, you could say this is because they have had the sense to know it’s just not a feasible thing to try at this stage.  But I don’t think this is the reason.  I think the real reason is a kind of institutionalized conservatism.  AI theorists made a lot of false promises in the 60’s and 70’s, and people got cold feet about AI – but by this point in time, computer hardware and computer science and cognitive science have advanced a long way, so that the failures from 30 or 40 years ago don’t really tell us much about what could be achieved now with serious effort.

            In the last decade a number of “start up” AGI efforts have sprung up here and there – Peter Voss’s A2I2 enterprise, Steve Omohundro’s and Michael Wilson’s Bayesian inference based approaches, Pei Wang’s uncertain logic based approach, and a handful of others.  I can’t address these approaches in detail because their creators haven’t explained them to me in detail.   Perhaps one of these will end up being adequate to get to the end goal.  My initial impression, based on what I’ve heard, is that these approaches are a bit too oversimplified to lead to truly powerful AGI – but I admit I could be wrong. 

            Among my favorite current approaches are Stan Franklin’s LIDA system and the Joshua Blue system being developed by Sam Adams at IBM.  These are fairly complex integrative systems, and they have a reasonable amount in common with my own Novamente approach – which may have something to do with why I like them.  However, my personal opinion – based on what I understand about their designs, which isn’t everything – is that their designs aren’t made in such a way as to cause the right kinds of high-level structures and dynamics to emerge.

 

Why Do I Think It Will Work?

 

            So what do I think is so good about Novamente?  The way I think about it, there are four aspects to making an AGI system, and you need to get them all right to achieve powerful intelligence – and naturally enough, they all depend on each other.  Putting these four aspects together is critical to making goal-oriented pattern recognition – the crux of intelligence -- work given realistic computational resources.           

The first aspect is what AI theorists call “knowledge representation.”   How does an AI system represent information?  Does it use simulated neurons?  Does it use logic formulas?  Novamente has a special mathematical knowledge representation that combines aspects of the brain’s neural network representation with aspects of formal logic and probability theory.  This representation seems to be adequate to compactly represent useful knowledge of every sort required for a human-level intelligence: perceptual, linguistic, conceptual, mathematical, emotional, social, motoric, declarative, procedural, episodic....  And most important, it represents these types of knowledge in a way that Novamente’s learning algorithms can handle.
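
            As a very rough illustration of the general idea (an illustration I am inventing here, not the actual Novamente data structures), a representation of this sort might annotate logic-like nodes and links with probabilistic truth values and attention values, something like this:

```python
# Illustrative hybrid representation: logic-like nodes and links, each
# annotated with a probabilistic truth value and an attention value (the
# latter loosely analogous to neural activation). Hypothetical structures,
# not the real Novamente ones.
from dataclasses import dataclass, field

@dataclass
class TruthValue:
    strength: float      # estimated probability that the item holds
    confidence: float    # how much evidence backs that estimate (0..1)

@dataclass
class Atom:
    name: str
    tv: TruthValue
    importance: float = 0.0                        # how relevant right now
    outgoing: list = field(default_factory=list)   # links to other atoms

cat = Atom("cat", TruthValue(1.0, 0.9))
mammal = Atom("mammal", TruthValue(1.0, 0.9))
inherits = Atom("Inheritance(cat, mammal)", TruthValue(0.95, 0.8),
                importance=0.6, outgoing=[cat, mammal])

# The same store can hold perceptual, procedural or episodic items, so long
# as each carries a truth value the learning algorithms know how to handle.
print(inherits.name, inherits.tv)
```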

            The second key aspect of AGI is what I call “cognitive architecture.”  This refers to the high-level breakdown of the AGI system’s function into areas like perception, language processing, general cognition, attentional focus, and so forth.  Each of these functions depends on certain other ones, forming a network of interconnected modules.  Here I haven’t tried to innovate that much, and have taken a lot of inspiration from cognitive psychology and cognitive neuroscience.  We do seem to understand a lot about the cognitive architecture of the brain – even though we’re still very ignorant about how the brain carries out each of its functions.  At this very high level, Novamente looks a fair bit like some other AI systems – for instance, Stan Franklin’s LIDA system or Sam Adams’ Joshua Blue.  The key thing about Novamente is the set of structures and dynamics that exists inside the different boxes, and the way these cause the emergence of structures and dynamics spanning the whole system.

            The third key aspect of AGI is education: teaching methodology.  I said above that we’re teaching Novamente by having it control a humanoid agent in a simulated world.  I think this is important.  Embodiment is an extremely convenient approach to teaching an AI system all sorts of useful things regarding language, perception, cognition and so forth.  The idea is to progressively lead an AGI system through a series of more and more complex situations involving embodied interaction with human-controlled intelligent agents, according to a pattern guided by human developmental psychology.  This is an educational methodology that should allow an artificial mind to grow in a spontaneous and natural way.  And then, after the mind has progressed far enough through embodied learning, it can make use of its ability to import data from linguistic and other knowledge bases.  Because of having learned to understand the world and itself in an embodied way, after a certain stage it will be mature enough to accept the knowledge from databases and process it in a meaningful way.
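
            In software terms, this kind of teaching methodology amounts to a staged curriculum: the agent advances to the next stage only once it handles the current one reliably.  The sketch below is a toy illustration; the stage names, pass criteria and the stand-in agent are all placeholders I have made up for this example.

```python
# Toy developmental curriculum: advance to the next stage only once the
# current stage's tasks are handled reliably. Stage names, tasks and the
# dummy agent are invented placeholders.
import random

CURRICULUM = [
    ("object permanence",        ["find the hidden ball", "track the moving toy"]),
    ("simple word grounding",    ["fetch the ball when asked", "name the toy"]),
    ("basic social interaction", ["show the toy to the teacher", "play hide-and-seek"]),
]

class DummyAgent:
    """Stand-in for the embodied learner: slowly gets better with practice."""
    def __init__(self):
        self.skill = 0.2
    def attempt(self, task):
        self.skill = min(1.0, self.skill + 0.005)
        return random.random() < self.skill

def teach(agent, pass_rate=0.85, window=40):
    for stage_name, tasks in CURRICULUM:
        recent, attempts = [], 0
        while True:
            task = tasks[attempts % len(tasks)]
            recent.append(agent.attempt(task))
            recent = recent[-window:]          # judge only recent performance
            attempts += 1
            if len(recent) == window and sum(recent) / window >= pass_rate:
                break
        print(f"stage '{stage_name}' passed after {attempts} attempts")

teach(DummyAgent())
```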

Finally, the fourth and most critical aspect of AGI systems is learning.  This is the most essential aspect of what happens inside the boxes on the architecture diagram.  The most critical reason why I think Novamente will work as an AGI is Novamente’s learning algorithms.  I interpret “learning” pretty generally as any kind of adaptation of internal knowledge structures based on experience.  Novamente has a number of learning algorithms but there are two particularly important ones.  One is called “Probabilistic Logic Networks,” and it’s a unique approach to making probabilistic reasoning practical in the context of integrated perception, cognition and action.  The second is a kind of evolutionary learning – sort of like John Koza’s genetic programming in that it learns complex patterns and procedures by simulating the evolutionary process, but different from standard genetic programming in the way it uses probability theory to make evolution go faster.  Each of these learning algorithms is a story unto itself ... but the key point regarding Novamente’s learning algorithms is how they fit together as a whole.
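
            To give a feel for what using probability theory to speed up evolution can mean, here is a minimal sketch in the general spirit of estimation-of-distribution methods.  It is a generic textbook illustration, not Novamente’s actual algorithm: rather than mutating and recombining individuals, it keeps a probability model of what good solutions look like, samples candidates from the model, and shifts the model toward the best samples.

```python
# Minimal estimation-of-distribution sketch: keep a probability model of what
# good solutions look like, sample candidates from it, and nudge the model
# toward the best samples. Generic illustration, not Novamente's algorithm.
import random

TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]            # the "pattern" to be learned
def fitness(bits):
    return sum(b == t for b, t in zip(bits, TARGET))

probs = [0.5] * len(TARGET)                         # model: P(bit_i = 1)
for generation in range(60):
    population = [[int(random.random() < p) for p in probs] for _ in range(50)]
    elite = sorted(population, key=fitness, reverse=True)[:10]
    for i in range(len(probs)):                     # shift model toward elite
        elite_freq = sum(ind[i] for ind in elite) / len(elite)
        probs[i] = 0.9 * probs[i] + 0.1 * elite_freq

best = max(population, key=fitness)
print("best:", best, "fitness:", fitness(best), "of", len(TARGET))
```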

            My friend Debbie Duong introduced the phrase “House of Mirrors Design Pattern” to describe software systems like Novamente and her own language processing software – systems that consist of a number of components that not only exchange information but mutually adapt to each other based on experience.  Novamente’s learning algorithms embody the House of Mirrors Design Pattern in a very carefully constructed way.  The learning algorithms are designed to learn from one another, each helping the others overcome their weaknesses.  I’m sure this kind of feedback exists in the brain too, between the brain’s different learning algorithms, which are implemented in different kinds of neurons and neurotransmitters and patterns of neural connectivity.  But we don’t know how the brain does these things yet – whereas in Novamente, because we’re designing it ourselves, we can do the math and the software engineering to make sure everything works right and the different parts all fit together intelligently.
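
            Here is a toy illustration of the House of Mirrors idea: two components that exchange their outputs and each adapt how much they rely on the other based on experience.  The example is invented for this essay; it is not Debbie Duong’s code or Novamente’s.

```python
# Toy "House of Mirrors": two components estimate the same hidden quantity,
# exchange their estimates, and each adapts how much it trusts the other
# based on which of them turned out closer to the truth. Invented example.
import random

class Estimator:
    def __init__(self, noise):
        self.noise = noise
        self.trust_in_peer = 0.5          # learned from experience

    def measure(self, truth):
        return truth + random.gauss(0, self.noise)

    def adapt(self, own_estimate, peer_estimate, truth):
        peer_was_better = abs(peer_estimate - truth) < abs(own_estimate - truth)
        self.trust_in_peer += 0.02 if peer_was_better else -0.02
        self.trust_in_peer = min(0.95, max(0.05, self.trust_in_peer))

a, b = Estimator(noise=0.5), Estimator(noise=2.0)   # a is the sharper component
for _ in range(2000):
    truth = random.uniform(-10, 10)
    ea, eb = a.measure(truth), b.measure(truth)
    a.adapt(ea, eb, truth)
    b.adapt(eb, ea, truth)

# The noisier component ends up leaning heavily on its sharper peer, and the
# sharper one learns to mostly ignore the noisy one.
print(f"a trusts b: {a.trust_in_peer:.2f}   b trusts a: {b.trust_in_peer:.2f}")
```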

            When you get all these four aspects right – then, I suggest, the magic happens.  But it’s not really magic of course – it’s just the emergence that happens when multiple general and specialized pattern recognizers act together on an appropriate common representation of patterns, figuring out the patterns a system needs to follow in order to achieve its complex goals in an environment shared with other minds.  Self, awareness, will and all that can’t be programmed into an AI system, but an AI system can be designed and programmed so that their emergence is almost inevitable.  That’s what the Novamente design is intended to achieve.

 

How Near is the Singularity?

 

            The Singularity is near – but how near? In large part this depends on how near we want it to be. Right now, our society spends vast amounts of money on the creation of tastier chocolates, sexier bikinis, funnier TV shows and movies, and so forth.  We build special machines to rip holes in newly manufactured blue jeans to give them a cooler, pre-worn look.  I like tasty chocolates and sexy bikinis as much as the next guy – though I prefer to tear my own jeans rather than have a machine do it for me.  But it seems pretty ironic to me that the creation of new forms of life, mind and experience is so far down there on the priority list. 

            And yet, I think the Singularity is very likely to happen, even if directly Singularity-relevant technologies are poorly funded, because of indirect effects.  The increase in computer processor speed may be driven largely by video games, but it will benefit AGI tremendously.  Pharmacogenomics isn’t focused on curing aging these days – but aging is closely related to cancer and other diseases, so it’s making advances toward curing aging anyway.  And so on.

            On the other hand, if more of society’s effort were explicitly focused on moving toward a positive Singularity, it could probably happen a lot faster – and potentially a lot better.  This acceleration could happen in a lot of ways, one of which obviously has to do with AGI.  If society chose to divert some of its funds from making tastier chocolates or ripping holes in new blue jeans to funding a few dozen promising, safety-conscious AGI projects, this could have a huge impact on the future of the human race.  It could result in the creation of a benevolent AGI before, rather than after, someone creates a globally dangerous bioengineered pathogen – before, rather than after, some terrorist group creates botched nanotech that accidentally turns the surface of the earth into gray goo.

            It is difficult to predict – especially the future.  According to Google, these words of eternal wisdom were uttered by the baseball player Yogi Berra, and the quantum pioneer Niels Bohr, and are also an ancient Chinese proverb.  No one knows exactly what will happen.  Not even Nostradamus.  But we can use our reason to try to understand the future, and the ways in which it may be fundamentally different from the past.  And we can use our reason, and our time and effort, to try to make the future what we want it to be.  Personally, I find myself massively scared by the idea of a future full of advanced nanotech and biotech controlled by contemporary humans with our violence-prone, rationality-impaired “legacy brains.”  The next phase in the development of science and technology should be supervised by transhuman intelligences with more rationality, morality and self-understanding than human brain architecture allows.  And the good news is, it seems reasonably likely that we can make this future happen.

 

Dr. Ben Goertzel is founder and Chief Scientist of Novamente LLC. To learn more about the Novamente AGI design, please visit:  http://www.novamente.net

