SAI) software on hybrid analog-digital hardware. Experiment will take place at levels beyond human ability to follow, and we will spend months analyzing answers already being synthesized into new research programs at the bleeding edge of machine understanding. Advances in every aspect of science will be possible simultaneously, and the effectiveness of such progress will be limited only by the resources we can pool to each task, and our willingness to share our results on a global market of ideas. We will see new trade in automated research that will allow nations to specialize and prosper as new knowledge economies arise.
Such a future is of limitless potential, but will only be possible with the focus and the effort to see it through.
There is no other single discovery which could hope to provide the same level of utility and world changing advances as that of strong artificial intelligence. It will have the unique property of entailing all other discoveries. Preeminent beyond measure, it literally and figuratively represents the ne plus ultra of human intellectual and scientific endeavor. And yet, in spite of this, it is a field that is relatively ignored when compared to the participation and funding of programs like CERN and The Human Brain Project. SAI is not only comparable in magnitude but exceeds all their possible future outcomes combined by encapsulating them.
But misconceptions regarding SAI research abound. Therefore, it is important to identify and overcome them so that this direction can be seen for the science that it is—and less like techno-fantasy. The pursuit of strong artificial intelligence is both feasible and practical, and, with a concerted, multidisciplinary effort from the international community, could be realized well before the end of this century. At the very least, we should have a major research effort in place comparable to that of brain emulation, space, or physics programs.
After clarifying the misconceptions, two basic research proposals will be presented. The first will be a program brief in the case of a major international research effort, and the second, an alternative program based on Web collaboration, which will call for the creation of a unique online platform and set of services to crowd-source the effort.
What is strong artificial intelligence? First, it is helpful to contrast it with what it is not. SAI is not plain artificial intelligence of the kind reported in the news. It is not necessarily related to the utility or importance of the task it solves. That is, a self-driving car, while a great accomplishment, is not SAI, because it is limited in its ability to perform another function. It is incapable of doing something completely different without being reconfigured or redesigned.
Sometimes, SAI is described by another term: artificial general intelligence (AGI), though they are not necessarily equivalent. And SAI should also not be confused with human-level artificial intelligence, which would be a strict subset of its full potential. Rather, SAI must be thought of as meeting and exceeding human cognitive ability in all regards, and I would further extend that common definition by requiring that SAI not be mind-blind; that it possess the ability to attribute mental states to human beings, animals, and other SAI entities, and that it possess theory-of-mind ability and affective emulation at sufficient levels to supply adequate empathy. SAI would be functionally capable without the latter, but arguably incomplete on a cognitive level.
The most probable reason why major SAI research is not already Big Science is because it is commonly believed to be infeasible to create. This belief usually comes in one of two forms: (1) that we lack sufficient computational power, and/or (2) the problem is too difficult, with too many gaps in our present understanding to even begin.
The first is not true, and is a misnomer I will address with several others later on. The second, while true, is not permanently so, and I believe a Riemann (the mathematician responsible for the geometry needed to even imagine general relativity) may be among our generation today who could bring about the necessary paradigm shifts. Barring that, I believe in the power of collaboration and consensus, so long as we can get out of our own way and shed old and ineffective ideas; surprisingly, this is hardly ever done without considerable time and effort. Elaborating on the last point, it is critical that we accept the failures of the past approaches to creating strong artificial intelligence, and understand that the dynamics and generality required of SAI are fundamentally different from those of narrow AI projects and concerns. This realization could result in a team discovery, but only if such a team can accept change and take intellectual risks.
Overcoming the infeasibility argument based on optimism alone is difficult and ineffective. Past researchers have focused on energetic displays and futurism (such as the opening to this manifesto), but the reality is that such hand-waving doesn’t provide the results policy makers need to justify the political and financial risk. Nor does it, alone, make the academic specialty of SAI research more appealing (or clearly distinguished from narrow AI research) to young minds. The argument must be presented such that it is clear that SAI is physically possible (it is), and that its inception and discovery is probable (it is). This is not speculation, but mathematical conjecture: algorithmic, classical, and quantum information theory; the relationship between entropy and the work of Boltzmann and Gibbs in statistical mechanics; the Church-Turing thesis; and several other theorems and physical laws suggest that SAI is scientifically possible, and that it is therefore probable that, if we conduct experiment and research, we will uncover the unique assembly of algorithms, heuristics, and theorems needed to entail it. As a result, and considering its potential reward, it becomes unconscionable that we do not already have a Big Science effort for SAI. To be clear: this is not a call for funding individual SAI researchers in general. This is very specific. It is the call for a large-scale international effort for developing and researching strong artificial intelligence with a large and diverse, interdisciplinary team, on par with the scale of the operations that take place at CERN. If the physical basis of reality is worth discovering, then surely the basis of abstract intellectual work is doubly so. We owe it to ourselves to unlock this ultimate accelerator of progress.
Science has benefited immensely from taking risks. The LHC was built on the chance that we would discover new physics. We weren’t certain what we would find. And let me be clear: that installation’s effectiveness is not in question, nor are its undeniable contributions to human understanding. But the reason this comparison is made is to reiterate the comparative magnitude that the discovery of SAI represents. That it literally overshadows and outstrips every breakthrough to-date. And, secondly, to point out the fact that we have already invested enormous financial risk in large scientific projects based on conjecture and theory alone. These have ranged from space exploration to weaponization. Some have even failed spectacularly. But, incredibly, we haven’t even attempted SAI on a fraction of the scale of past and current projects. It is not unreasonable that we make a serious attempt, and I believe it to be absolutely justifiable when we stop to consider the exponential reward that such an invention could bring to us. The opportunity cost of not pursuing it is incalculable, and when we factor in the loss of human (and animal) life, disease and aging, it becomes abhorrent on an ethical level not to act.
The problem has been identified. Now I will attempt to isolate some of the major causes so that they can be overcome. This was, admittedly, a most difficult section to draft, as it touches on aspects of our identity and dignity. However, I am confident that with reflection and contemplation we can exceed ourselves, acknowledge what must be acknowledged, and change our course to see the most effective future come to pass. I strongly believe this begins with a look within. Namely, with the status quo of neuro chauvinism.
I am going to be direct. There is a time to delicately play on words and there is a time to just cut through denial. And, seeing as how we are now spending billions on massive brain research programs, and controlling advanced drones at home and in space, that time is now. We can clearly afford an attempt at strong artificial intelligence, and one of the reasons why we have not done so is related to something I call supremum bias.
The supremum bias is manifested when there are two or more competing options that are perceived as sharing overlap in function or effectiveness, but are, in reality, worlds apart in both, causing loss of fitness or utility for the afflicted. This is inspired by the mathematical concept of the same name, interpreted here as the least upper bound on a counterfactual truth: one that might have been investigated were it not for an established and competing concept that is less effective or completely misguided, but nevertheless appears true to the beholder.
This leads cleanly to one explanation for our collective inaction towards SAI research on a large scale. The competing option, the one that has established this bias in our current supremum scenario, is brain emulation and neuroscience research. This is not to say that these programs are not needed; they are extremely valuable to the future of medical science and to human understanding. This cannot be overstated. However, these disciplines belong to an anthropocentric category which denotes a biologically inspired approach to understanding intelligence. One that seems good enough in the mind of most academics, but is actually a local maximum in the total pursuit of truth. And, given the belief in the intractability of SAI, and the fact that neuroscience research is paramount to understanding our very nature, it is not unreasonable to see why this approach is so prevalent. And, again, this is not to detract from the funding or educational investment in these programs. Rather, it is to highlight the fact that a supremum bias does, in fact, exist. It is also to point out the misconception that to understand intelligence we must first understand the human brain, which is a completely baseless position, as abstract intelligence isn’t dependent upon biology. And that statement takes us to the crux of the problem.
There is a very real belief among many who plan neuroscience curricula, and those who study, teach, and practice it, that a biological approach has primacy over mathematics and the information-theoretic. This is a kind of neuro chauvinism, and is an instantiation of a very old notion. It harkens back to the geocentric Ptolemaic system. Deep down, there are many very prominent academics that believe that there is some special quality to neurons that gives them sovereignty over intelligence, and many philosophers who hold the same, but towards consciousness.
That we are not aware of our tendency towards such bias is puzzling, as science (mainly physics) has continually pushed back the boundaries of our understanding, updating our place in the order of things. And, each time we took a step in the correct direction towards new knowledge, we were humbled. This issue has surfaced again, and it is the most difficult of challenges because it is the closest to us: it is us. The infinitesimally close proximity to our minds is the ultimate distraction, blocking us from the most optimal path to a full understanding of what intelligence, consciousness, and experience are, in the most abstract and technical sense of those words.
Indeed, all of neuroscience is a proper subset of the physical—and therefore, mathematical—principles that underwrite the science of strong artificial intelligence. Not just computational neuroscience. Not just cognitive science. Every approach is just convention over top the structures that make up the information processes governing conscious experience and cognition—and everything else. It shouldn’t be controversial, but it is. Even though we know the truth. It is simply a physical fact that the brain is a dissipative structure, and that such a system, though convenient to model through physiological nomenclature and convention, is nonetheless an open thermodynamic system that exhibits complex feedback between itself and its environment. And, that such a system is ultimately governed by the same principles that underwrite the connections between thermodynamics and the information theoretic.
Further, using algorithmic information theory, we will eventually create a universal formalism that will bridge the correspondence limit between soft biological science and mathematical modeling. That is to say, these fields will converge to allow an automatic interdisciplinary approach. One that is critically necessary to making ultimate progress possible. I predict that the brain, being an information processing system, will be exhaustively entailed through heuristics and non-deterministic algorithms, and that the resulting descriptions will be significantly less complex than anticipated. This reduced complexity conjecture is based on the observation that significant portions of ontogenesis exhibit dependency on feedback from the environment. From an information perspective in the brain, regions explicitly develop first that supply engrams that, for various reasons, are not specified ab initio. The algorithmic consequence of this is a vast compression (shortened description) of the total information needed to describe a mature, fitness-generating organism. That is, total information defined as what would have been recorded if one instrumented all cognitive processing between sensory and bodily feedback in the nervous system and the relevant parts of the brain that receive them. By exploiting the constraints, modalities, and functionality of the host embodiment, this future information is encapsulated as an expected input that is bounded using known, learnable structures, with only those structures, and the organ or region itself, needing to be encoded, and not the total expected information that it will produce over a lifetime. This fits neatly with Chomsky’s universal grammar (UG), in that we must clearly share the basic structures over which learning supervenes. No amount of raw, free association of learning from data will ever spontaneously give rise to the objective structures of higher-order forms and categories that exploit reuse, reflexivity, and recursion without being explicitly encoded. This is true of biological systems, and true of software systems. It is my major criticism of present approaches to machine learning. Especially those that believe hierarchy and causal chaining will give a linguistic free lunch. More likely, the evidence suggests specialization and modularity explicitly granted through adaptation.
Put another way: there was a kind of learning involved in grammar, in the most abstract sense of the term, but it was memoized through natural selection and encoded into our genes. I find it incredible that such a simple, logical concept is not already accepted by the mainstream.
One of the first steps involved in selling something of great value, say, a home, is having it appraised. You wouldn’t contact a celebrity, or an electrician, or even a renowned scientist, to perform this job unless they were qualified to do it. But qualification is perhaps the wrong term as well. What we really want is to know that someone’s valuation is going to be accurate. The major point being that the relative success or prominence of an individual has very little to do with their actual understanding of the issue at hand. We tend to take shortcuts in belief with social proof to save time, but sometimes it pays to do the homework yourself. This is a metaphor for the present problem with the misconceptions surrounding not only cognitive science, but the feasibility of strong artificial intelligence. Much of this is due to those without the relevant background jumping to conclusions.
The previous section is actually contained within this greater issue. Namely, the biomimetic approach. While bias and anthropocentrism were the focus of the arguments for that section, this section is going to come at it from another angle. If it wasn’t convincing enough already, these facts should slam the door permanently on the issue.
Perhaps the biggest misnomer in machine learning is that we need to emulate, mimic, or duplicate the functionality of the neuron. This can also be extended to network approaches in general. Numerous professionals and academics are beholden to this idea, with it possessing a firm command over their minds. But it doesn’t just stop there. This includes everything from Hierarchical Hidden Markov models to Bayesian networks to Probabilistic Hierarchical Grammars. Connectionism. Hypergraphs. Sparse Distributed Memory. Causal modeling. Parameter learning and Model-free methods. All of these, every single one, are just convention over top a singular computational reality. This is not an opinion. It’s not perspective. It’s a fact given by the concept of universal Turing machines, and proven every single time one of these conventions is executed on a computer. Underneath it all, every single one of these conventions is nothing more than a sequence of 1s and 0s in the set of all finite binary strings as some description language. This is explained by a mathematical concept called prefix complexity.
The existence of universal Turing machines, along with prefix complexity and the invariance theorem, means that any possible procedure that can be effectively computed can be described as a two-part message, with the first part being a prefix-free description language and the second being a program or message written in that description language. And when these two are literally affixed to each other end-to-end and fed to a universal Turing machine, you get the same result even if you switch description languages, with the lengths of the descriptions differing by at most an additive constant that depends only on the languages involved (practical translation: relatively small even with large input). The end result and bottom line is that whether something is an artificial neural network or a probabilistic grammar, or some other future ontology, be it formally or informally specified, modeled or learned, it is and always will be nothing more than some input into some universal Turing machine. What this reveals to us is that all of these conventions, and these systems, and methods for making machines do what we want them to do, are merely crutches for our limited ability to specify what it is we want, and that there exist upper bounds that can be calculated in the limit that show us the redundancy and bloat of our own inefficient expressions. We are out of our element, which is why parameter and model-free learning are increasingly becoming the most popular contenders for artificial intelligence research. But this is only a mirage, a temporary oasis on our path towards solving SAI.
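The two-part message can be sketched in a few lines. The following is a minimal illustration, not a construction taken from this essay: the function names and the particular self-delimiting scheme (a unary-coded length field, then the length in binary, then the payload) are assumptions chosen for brevity. It shows how a description-language prefix and a program can be affixed end-to-end as one binary string and still be separated again without any marker between them.

```python
from typing import Tuple


def self_delimit(bits: str) -> str:
    """Encode `bits` so that a decoder can tell where it ends.

    Layout (an illustrative choice): k ones, a zero, the payload length
    written in k binary digits, then the payload itself.
    """
    length_field = format(len(bits), "b")
    return "1" * len(length_field) + "0" + length_field + bits


def split_two_part(message: str) -> Tuple[str, str]:
    """Split a message into (first part, remainder).

    Assumes the message starts with a string produced by self_delimit.
    """
    k = message.index("0")                    # size of the length field
    length = int(message[k + 1:2 * k + 1], 2)  # payload length
    start = 2 * k + 1
    return message[start:start + length], message[start + length:]


# Hypothetical bit strings standing in for a description language and a program.
language = "0110"
program = "111000111"
message = self_delimit(language) + program
print(split_two_part(message))
```

The overhead of the first part grows only with the logarithm of its length, which is why affixing the two parts costs so little relative to the program itself.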
In summary, the message to the biologically inspired is that if we model cognitive processes as an information exchange we get to use these powerful proofs for free. But before we can do that we have to address the next misnomer.
Computability is a misunderstood concept, especially among some philosophers of mind. I won’t spend much time on this subject, because I believe those primarily interested in this discussion are mostly aware of the distinctions I am about to make. The issue is often a point of contention and, in an incredible twist, used as an argument against an information processing account of the mind. These arguments are misleading, and not even remotely true to the issue. To put it in the simplest terms possible: computability and computation are not synonymous. Computation is information processing and exchange. I actually generalize computation to information over an abstract communications channel. But I won’t go into that here. The important point is that anything can be effectively calculable if we release the expectation of getting a definite, deterministic answer. A computable process may halt or it may not. We have to let it play out.
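What "letting it play out" looks like in practice can be sketched with a step budget. The example below is an illustration under assumptions: the Collatz iteration is not mentioned in this essay and is used only because nobody has proven that it halts for every starting value, and the function name and `budget` parameter are hypothetical. The process is attempted step by step, and running out of budget says nothing about whether it would eventually halt.

```python
from typing import Optional


def collatz_steps(n: int, budget: int) -> Optional[int]:
    """Run the Collatz iteration from n for at most `budget` steps.

    Returns the number of steps needed to reach 1, or None if the budget
    ran out first. None means "unknown", not "never halts".
    """
    steps = 0
    while n != 1:
        if steps == budget:
            return None
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps


print(collatz_steps(27, 1000))  # halts within the budget
print(collatz_steps(27, 50))    # budget exhausted: no verdict either way
```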
Those that misunderstand this will often say that the human brain is incomputable. Well, the decimal expansion of π never terminates and can never be written out in full, yet π is effectively calculable to any precision we care to ask for. In a similar fashion, the brain is effectively calculable; if it weren’t, we wouldn’t be here.
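That π is effectively calculable can be made concrete. The sketch below is one of many possible procedures, chosen here for brevity rather than taken from this essay: it uses Machin's formula, π = 16·arctan(1/5) − 4·arctan(1/239), with integer arithmetic only. The function names and the ten guard digits are assumptions of the sketch.

```python
def arctan_inv(x: int, unity: int) -> int:
    """Return arctan(1/x) scaled by `unity`, using only integers."""
    total = term = unity // x
    x_squared = x * x
    n = 3
    sign = -1
    while term:
        term //= x_squared            # next odd power of 1/x, still scaled
        total += sign * (term // n)   # alternating Taylor series term
        n += 2
        sign = -sign
    return total


def pi_digits(digits: int) -> str:
    """Return pi truncated to `digits` decimal places as a string."""
    guard = 10                        # extra digits to absorb rounding error
    unity = 10 ** (digits + guard)
    scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    text = str(scaled // 10 ** guard)
    return text[0] + "." + text[1:]


print(pi_digits(30))
```

Asking for more digits only costs more steps; the procedure itself never changes.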
What does this mean in the practical, though? It means that, while we won’t have a specific and single neat description for human intelligence or SAI that will be easily understood by us, we will have one that entails it, along with the information needed for it to become fully functional and effective in the world. And, because of the above points from algorithmic information theory, we can conceptually combine this total information in repeated fashion, compressed, and affixed, to be taken as the formal specification by accepting the conjunction of two messages embedded in a prefix free language for some universal Turing machine.
Hence, our genetic code is a message in the description language of nucleotides. That is in turn processed by cellular machinery, which itself is another description language operated by yet another level of organization, and so forth. Unlike computational ontologies, these ontologies are functional and real, but are ultimately singular as physical information. This fact is actually used, completely incorrectly, by the philosopher Galen Strawson in his book, Consciousness and Its Place in Nature (2006), to argue against the case for strong emergence, completely missing the mark that abstract information processing has free license for strong emergence because it plays out over time in a computational space. One that is indeed influenced by, but is not materially concordant with, any one medium. Hence, the fact that information can be stored in a vinyl record or as digital encoding on a disc shows us its invariance, and that it need not even be digital.
Naturally, interference in these media causes different artifacts in the presence of noise due to the unique properties of their structure, but this says nothing of the need for the information and computation running through them to have possessed the properties of the media. For example, there is nothing within 1 or 0 that gives us the entire contents of the Wikipedia. Yet there it stands and in full representation of about 36 GiB of those 1s and 0s. Not a single high-level property is obtained by the nature of 1 or 0, except as a concatenation afforded by extension in time and space complexity.
The next big misconception is the notion that SAI is not possible without great(er) computing power. This is a position held mainly by those proponents of Deep Learning, and is, essentially, a doctrine of miners in a great rush for Big Data. This is actually another occurrence of the supremum bias: an apparently effective solution seems to be a winner, but is, in fact, woefully suboptimal in the grand scheme. While the burden of proof is typically on the one who asserts the claim, I am going to argue that this demand is unreasonable in this situation, as proving why we don’t need great computing power for reasonably effective SAI solutions would require constructing an entire framework that does not yet exist. One that is extremely advanced, and that will take years of focused, singular effort, which is the very effort this paper hopes to see made. Instead, what I will provide here is a strong indicator of the inefficiency present in our present approaches to the problem.
The strongest indicator of our inefficiency is the invariance theorem mentioned above. This tells us that different description languages (architectures, methodologies, conventions) can give us exactly the same results, up to some additive constant. The same results can therefore be reached at very different costs: some methods are more effective and more efficient than others.
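For reference, the invariance theorem is usually stated along the following lines. The symbols are assumptions introduced here, not notation used elsewhere in this essay: \(K_U(x)\) denotes the length of the shortest program that makes the universal machine \(U\) output the string \(x\), and \(c_{U,V}\) is a constant that depends only on the two machines (description languages) being compared, never on \(x\).

```latex
\[
  \lvert K_U(x) - K_V(x) \rvert \;\le\; c_{U,V}
  \qquad \text{for all finite binary strings } x .
\]
```

Note that the bound concerns the length of the shortest descriptions. It says nothing about how long a given description takes to run, which is where the practical gap between methods shows up.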
Essentially, our problem is that we think we can solve the challenge of SAI through brute force—in reality we are at slightly better than brute force, but closer to it than the optimal best case scenario. And, worse, we believe it will spontaneously give rise to generality… if only we had more computing power.
A very apt analogy would be trying to replicate the Mona Lisa by throwing paint at the wall. Theoretically, this could be achieved with indefinite paint and indefinite time. You may even get other famous paintings along the way. But you will never get a painter. And it will never be as efficient as having a skilled painter recreate it. And this analogy is so apt that no real further argumentation needs to be made, as this is precisely what the software for these approaches is forcing the computer to do when it recursively enumerates all the chains of possibilities as it categorizes and relates a corpus. Nor will it ever—in any amount of time—spontaneously generate the appropriate generalized categories for optimal reuse and efficient universal application of the information it finds. And this does not just apply to this approach, but all approaches which do not afford generality. This is why strong artificial intelligence (in this context, best noted as AGI) will be so powerful; it will not just improve the effectiveness of the system by making it more adaptable, but will dramatically reduce the computational requirements needed for it to operate and learn.
The last point on this issue is that SAI need not be continuous in operation. That is, it is not an all-or-nothing proposition the way it is being presented by those who feel the issue is about computing power. I am willing to go on record with a prediction, and here it is: I predict that a 486/DX, if you can even get one by the time SAI is solved, will be capable of running a SAI that can have a conversation with you through a simple text interface. Don’t expect visual or audio processing. But text conversation, replete with affective emulation, attribution, and theory of mind will be completely feasible. The misnomer that there will need to be immense computational power to understand and reflect is simply a byproduct of the bloated computational abstractions and semantics we use to make our approaches to intelligence more intelligible to us. They are a far cry from the most optimal solution possible, and we have only begun to explore that space.
This next misnomer is also related to, and is a sub-category of, the convention misnomer. It is the belief that parallelism has some magical property in the construction of minds. This is often seen in the form of phrases which contain superlatives like “the massive parallelism of the human brain,” and in those philosophers who straw-man computation by criticizing “serial digital computers.” The latter argument is also addressed by the computational demands misnomer above.
That parallelism is somehow necessary for minds is a category error. Parallelism is not an intrinsic property of minds but of particular implementations of minds. Further, any concurrent information process can be represented as a serial process at a higher level of abstraction, or, emulated as a single thread of execution. The ramification of this is that such serialization doesn’t prevent something from processing, it only causes it to potentially run slower.
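The serialization argument can be illustrated with a toy scheduler. This is a minimal sketch under assumptions: the generator-based "processes" and the round-robin policy are illustrative choices, and the names are hypothetical. Several logically concurrent processes are interleaved on a single thread of execution; each one still runs to completion, only more slowly than if it had a processor to itself.

```python
from collections import deque
from typing import Iterator, List


def worker(name: str, steps: int) -> Iterator[str]:
    """A toy process that yields control back after every step."""
    for i in range(steps):
        yield f"{name}{i}"


def run_serially(tasks: List[Iterator[str]]) -> List[str]:
    """Interleave all tasks round-robin on one thread; return the trace."""
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))  # run exactly one step of this task
        except StopIteration:
            continue                  # task finished: do not reschedule it
        queue.append(task)            # otherwise send it to the back
    return trace


print(run_serially([worker("a", 2), worker("b", 3)]))
# → ['a0', 'b0', 'a1', 'b1', 'b2']
```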
The other argument is to appeal to the complexity of the human brain, and then attempt to divine its computational ability. This is usually done by bulk neuron and connection estimates, which are, of course, large in the absurd. The suppositions then follow from these incredibly large numbers to estimate how much the brain must be processing, and that any modern computing system is hopelessly outmatched. But, again, this is in error, and for at least two reasons: (1) the brain is an analog computing system, not digital. This is true no matter how much one wants to interpret spike trains as binary. And, (2) the invariance theorem tells us that computational descriptions and software ontologies are purely conventional, and when we talk about neural nets and such that we are actually referring to what will later be described as early, inefficient, and nascent approaches to narrow AI on digital architectures. What works for biology need not even be necessary at all in the computational spaces described by software and algorithms.
Parallelism is not a feature of the landscape, but used either for developer convenience or efficiency in design and implementation. It is not even necessary or sufficient for strong artificial intelligence, but may be a practical need in the future for the most complex aspects of sensory processing and motor function—which, again, is an aspect of the way that we design computer chips. That said, I see concurrency as a critical aspect to modern computing, and we will continue to see languages and methodologies for finding the best way to inch towards Amdahl’s Law in practice. Present conventions and practices in software rarely come close to that theoretical optimal limit due to the aforementioned bloat and waste afforded by syntactical and semantic convention. This is perhaps why a specialized kernel will have to be written, or real-time architecture used, for efficient SAI implementations to be realized.
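Amdahl's Law, mentioned above, gives the theoretical ceiling on what parallelism can buy. As a short sketch (the function and parameter names are assumptions): if a fraction p of a workload can be parallelized across n workers, the best possible speedup is 1 / ((1 − p) + p / n), which can never exceed 1 / (1 − p) no matter how many workers are added.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# With 90% of the work parallelizable, the ceiling is a 10x speedup.
for n in (2, 10, 100, 10_000):
    print(n, round(amdahl_speedup(0.9, n), 3))
```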
The last misnomer is one that will be difficult to abolish completely. This is partially because it is a very complex subject, and because so much hype exists which doesn’t just mislead but blatantly misinforms those in the general public. There is also some controversy over the abstract equivalence of classical and quantum computing. This is unfortunate, as it has created unnecessary confusion in an already arcane field. Given the depth of these theories, only a brief clarification will be provided here on that topic, and only as it relates to SAI.
Generally, the quantum computing misnomers come in one of two forms: (1) quantum computing is necessary for SAI to be realized, and/or (2) that SAI will never be conscious (have experience, or qualia) if it is not “quantum.” The word quantum, in that context, can mean a variety of things depending on their arguments for supporting the misnomer.
The first part of the misnomer is patently false, as indicated by what is known about the theory of computation. It is known that a quantum computer is faster only at certain tasks, and is adept at simulating physical systems. The confusion and controversy arises over the difference between simulation and computation, which I’ll get to in a moment. However, from a computational perspective, quantum versus classical computation is not a difference in kind but of degree; any quantum algorithm can be executed as a classical one with up to an exponential increase in time-complexity, and, in the case of unsorted search, only a quadratic increase in time-complexity. Many algorithms are not affected whatsoever, and will, as a result, be more cost effective to run on classical digital computers for a very long time.
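The quadratic gap for unsorted search can be put into numbers. The sketch below only counts oracle queries and simulates nothing quantum; the function names are hypothetical. It relies on the standard results that a classical search over N unordered items needs up to N lookups, while Grover's algorithm needs roughly (π/4)·√N iterations.

```python
import math


def classical_queries(n: int) -> int:
    """Worst-case lookups for classical search over n unordered items."""
    return n


def grover_queries(n: int) -> int:
    """Approximate Grover iteration count, about (pi / 4) * sqrt(n)."""
    return max(1, round(math.pi / 4 * math.sqrt(n)))


for n in (10**3, 10**6, 10**9):
    print(n, classical_queries(n), grover_queries(n))
```

The gap is large, but it is a difference of degree: the classical machine still finishes the same search, only with more lookups.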
What is less certain is how quantum algorithms will impact SAI. From what we have seen so far, the benefits may affect how an SAI implementation performs information retrieval (search); however, there are known classical algorithms and data structures which confer excellent performance, even on immense sets of data. Where quantum computing may be most effective is in the possible applications for helping it process heterogeneous sensory input modalities from many sources, and with greater efficiency—however, specialized analog systems can be devised which would also do this without the need for quantum information processing, specifically.
One of the most significant sources of misunderstanding in the relationship between quantum computers and Turing machines was promulgated by David Deutsch in his 1985 paper, "Quantum theory, the Church-Turing principle and the universal quantum computer," which was later followed up by several others. In that paper he set out a "physical Turing principle." In his defense, the paper does explicitly reiterate that he defines computational equivalence only "under given labellings," but some have taken his physical Turing principle literally, as showing that quantum computers can attempt problems that classical ones cannot. And this is the source of one of the problems.
When it is said that a problem is undecidable, incalculable, or non-computable, we are referring to whether or not we can ever arrive at the answer with certainty. For some problems, like the halting problem, this doesn’t just mean that it would take a long time; it means that we can’t claim to know that it will eventually give us the answer at all. That is the nature of undecidability, and its relationship to computability has to do with enumeration and the cardinality of sets, which is well beyond the scope of this discussion. And, as mentioned in the previous misnomers section on computing power, this is not to be confused with the attempt to calculate something, even if it is labeled incalculable, undecidable, or non-computable. Finally, when it is said that two systems are Turing-equivalent, it means that any computation one of them performs on its inputs can be attempted by the other through translation or emulation. This says nothing of the efficiency of that attempt, only that it can try, step-by-step. And, while it may be hopelessly slow to the point where it would not be useful, it is critical to know the difference between what can never be known and that which is merely slow. Even more confusing is that there are entire classes of problems which have exponential complexity but nonetheless can be very accurately and efficiently estimated using heuristics.
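The distinction between attempting a computation and deciding its outcome can be sketched with a toy emulator. The following is a minimal sketch, assuming a one-tape Turing machine encoded as a Python dictionary; the function `run` and the two example machines are hypothetical and exist only for illustration.

```python
def run(program, tape, max_steps):
    """Emulate a one-tape Turing machine step by step.

    program maps (state, symbol) -> (new_symbol, move, new_state).
    Returns the final tape contents if the machine halts within
    max_steps, or None if the budget runs out (which proves nothing
    about whether it would ever halt).
    """
    cells = dict(enumerate(tape))    # sparse tape; blank cells read as "_"
    state, head = "start", 0
    for _ in range(max_steps):
        symbol = cells.get(head, "_")
        new_symbol, move, state = program[(state, symbol)]
        cells[head] = new_symbol
        head += 1 if move == "R" else -1
        if state == "halt":
            return "".join(cells[i] for i in sorted(cells)).strip("_")
    return None

# Hypothetical machine 1: flip every bit, then halt at the first blank.
flipper = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}
# Hypothetical machine 2: walk right forever.
runner = {("start", s): (s, "R", "start") for s in "01_"}

print(run(flipper, "1011", max_steps=100))
print(run(runner, "1011", max_steps=100))
```

The emulator can attempt any machine it is handed, one step at a time. What it cannot do in general is tell a machine that never halts apart from one that simply needs a larger step budget, which is the gap between emulation and decidability described above.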
The second misnomer can only be described, neither proven nor completely refuted at this time; however, it is extremely unlikely that quantum information processing is uniquely capable, in a way that classical information processing is not, of generating conscious experience. Those who uphold this position often intertwine it with arguments of parallelism, believing that the unity of experience is impossible to create on classical computers. But these are two separate issues.
While the apparent "parallelism" of quantum computing is an apt metaphor for powerful computing on certain problems (the true nature of which no one understands), it is not the same kind of parallelism as that of a spatially distributed information system or computer network. The actual comparison is closer to analog computers, which excel at the integration of multiple signals. This would seem to be like parallelism in the sense that we are taking multiple signals or sources of information and combining them, but that is not equivalent to the concept of concurrency in digital computing. Digital computers integrate information numerically, whereas analog computers combine sources through miniature physical models of the problem space and relate the result to us in a way that can be sampled. Indeed, the future may be more about analog computing than not, as analog computers consume less power, are extremely durable, and are very efficient. What we are lacking, technologically, is analog computing hardware that can reconfigure itself; we already have this technology for digital logic circuits, and, if we are open to speculation, it already exists in biological neural networks, which are, in my opinion, analog computing systems.
But even without analog computing, the unity problem need not be an intractable issue for digital computers. Philosophers of mind have been conflating an architectural issue with a computational one. This was mentioned in the above section, but I will restate it here for emphasis: it is not a law of nature that strong artificial intelligence require a distributed architecture to undergo experience. Common misconceptions and anthropocentrism have led us to believe that consciousness is only obtainable through massive information exchange or computational processing, as reflected by human anatomy. The reasoning is that, since we are the only intelligence we have observed, all intelligence must, categorically, share our physiology; or, by extension, that consciousness requires immense computational resources, as reflected by neuron counts and connections. This is a position that not only misunderstands the nature of computation and information processing, but is sending us down a path of extreme inefficiency. The instantiation of a digital mind, one that could undergo experience, need not physically resemble the human brain whatsoever. And, further, its software specification, which is the way in which it is specified and updates itself, need not model anything like what we presently understand from experience in biology.
In the most likely scenario, the architecture for SAI will be completely counter-intuitive, which is the primary reason it is eluding us today. We need a paradigm shift, and that begins with the elimination of these dogmas. We know that computing is abstract, need not be digital, and is ultimately the exchange and utilization of information. Therefore, simulating neural networks, neurons, Bayesian nets, hierarchical Markov models, or whatever software ontology satisfies, is merely redundant convention. These are merely description languages, and we utilize description languages to achieve modularity and to interface with the computational landscape. Each of these model conventions is just a layer over an abstract computational process.
The last point, related to the above, is the notion of qualia or the experience of "what it’s like," and how that relates to the information processing that SAI might undergo. My position is that this will, in fact, be qualitatively different from normal human experience. That is, in the sense that consciousness is a reflection of the way information is exchanged and processed, and how that process tolerates noise from the environment.
It is my opinion that minds will not be substrate-independent. Different information systems are supervened upon differently, as they have different underlying implementations, each with intricate subtleties based on their micro- and macro-configuration. This can range from logical differences to physical differences, such as encoding, multiplexing, or whether the system is analog or digital. This would affect the processing and experience of such minds in two ways: the first would be in the behavior under, and tolerance to, noise, which relates to the second. Noise exposes unique artifacts of each medium and the underlying implementation, but the very way in which that information process is implemented will also affect it. Even though information is an invariant, dimensionless quantity, one that can be expressed in an endless variety of media, the differences between substrates will be in the way that each implements experience. The quality of that experience will be concordant with the sampling and processing of the implementation, and the way it is impacted by noise. For digital systems, this may result in utterly unique and intractable differences from analog ones, such as the aliasing in digital sampling versus the static or fading of analog sampling.
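Aliasing, the digital artifact mentioned above, can be shown in a few lines. This is a minimal sketch using only the standard library; the helper `sample` and the chosen frequencies are illustrative assumptions.

```python
import math

def sample(freq_hz, rate_hz, n):
    """Take n evenly spaced samples of a unit sine wave."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

rate = 10                      # samples per second (Nyquist limit: 5 Hz)
fast = sample(9, rate, 20)     # 9 Hz tone, above the Nyquist limit
slow = sample(-1, rate, 20)    # 1 Hz tone with inverted phase

# After sampling, the two signals are numerically indistinguishable.
aliased = all(abs(a - b) < 1e-9 for a, b in zip(fast, slow))
print(aliased)
```

Once sampled at 10 Hz, the 9 Hz tone cannot be told apart from a phase-inverted 1 Hz tone. An analog medium degrades the same signal differently, for example through static or fading, which is the kind of substrate-specific artifact the paragraph describes.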
There are philosophers of mind who believe that machine consciousness is not only unlikely, but impossible. This is related to the unity problem, as discussed above, but is really borne out when discussing qualia, and the capacity for digital computing systems to have experience. These philosophers typically operate not in the realm of engineering or science, but in an abstract logos that involves constructed problems and questions of language that have nothing to do with the physical issues at hand. What logically follows in language may turn out to be explained away. The response here is to acknowledge the importance of these ideas as intuition pumps, and then to move on. When we reduce these philosophical claims to their real world correlates, the only possible meaning of this is the assertion that quantum information processing is accessing the fundamental reality of conscious experience, or that, somehow, experience is exclusive to our biology. I am extremely skeptical of this position, and, unfortunately, it is not one that can be tested right now. Worse, it may be unknowable—but this hasn’t stopped us from treating other people as conscious, even if we are naturally predisposed to the notion.
Basic Research Proposals
Next, a brief sketch of two research plans will be presented. The first is the case for an international research effort, and the second, an alternative, in the event of continued inaction. Included will be recommendations on team composition, materials, and some expected milestones, concluding with suggestions on program direction. Each of these could fill an entire report or series of articles themselves, and will only be outlined here as a starting point for future work.
International Research Program
One of the major pitfalls of current programs, aside from relative program size, is that they lack diversity. It is thought that AI is a discipline of programmers and engineers. This is part of the problem. A more interdisciplinary approach is going to be required in order to break through the barrier on generality. The team should be composed of several individuals from all over the world in each of the following disciplines:
- Cognitive Science
- Computer Science
- Electrical Engineering
- Mathematics
- Philosophy (of Mind)
The mathematicians should have a focus in stochastics, algorithms, complexity theory, and computational complexity theory.
There must also be a very strong developer operations core with the following specialties:
- Distributed, Cloud, and Network Communications Systems
- Real-time Systems Development
- Parser Theory and Compilers
- Data Compression, Data Structures, and Optimization
- Information Retrieval
- Device Driver and Hardware Development
- Linux Development
Small research programs tend towards group-think. This can, in part, be mitigated by having a larger and more diverse team, but leadership will be the biggest factor in how ideas are created and shared throughout. Collaboration and open discussion should be the primary focus in the beginning stages so as to maximize group communication and encourage rapid idea generation. Consider a flat hierarchy and alternative leadership structures.
Since it will not be known in advance which approaches are best, it is recommended that teams be scheduled for sessions of differing approaches throughout the week, rather than creating sub-groups that work in independent directions. This will utilize the team diversity to ensure that as many ideas and theories are discussed and presented as possible. The primary goal is to find the paradigm shift that will allow for the needed insights to begin true SAI development.
Commodity hardware will be sufficient for the initial stages of research, as the goal will be to uncover and prototype alternative frameworks to deep learning, artificial neural networks, and symbolic and connectionist architectures, each of which has so far failed to realize generalized intelligence.
Running free software platforms will significantly reduce cost and allow for a very wide range of configurations under a single framework. The same platform can be configured to be workstations, dedicated servers, or nodes in a cluster. This contrasts sharply with many proprietary operating systems, which often have several versions to do these same tasks, each with increasingly expensive licensing fees.
Agenda and Milestones
All present machine learning algorithms lack generality, and most, if not all, lack complete model transparency. A model is transparent if and only if it can be efficiently put into terms we can understand and apply in the future. And a machine learning system is only generalized if it can do exactly that without supervision. This is why I believe it is critical that machine learning algorithms be reflexive.
Thus, the first major step is to make the model transparent to us. For example, the numerical weights in artificial neural network nodes represent learned patterns. These patterns could represent anything from wear patterns on a multi-million dollar drilling system to the envelopes of an avionics control mechanism. But, in all cases, what is learned is locked into an esoteric description language, one of numbers and weights or edges. It is a model that is not only specific to that architecture or ontology, but is often specific to the implementation itself. Ideally, we need to have machine learning architectures with transparent models: those that we can translate into reusable form. This is no different than having someone show their work, as opposed to just giving us the answer. The reason for this is simple: we want to be able to see exactly what the machine learns. And not just in a cryptic numerical form, but in something efficiently intelligible. Some machine learning methods already do give us some insight into what is learned, but without transliteration this will quickly become intractable as complexity increases in dimension and depth. This should be the minimal first goal, as it would, at the very least, allow us to generalize in the absence of the machine’s ability to generalize. Such steps may aid further research by debugging what is learned to provide a reverse analysis on the algorithms and the dynamics between them and the environment.
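The gap between what a model stores and what a person can read can be seen even in a toy case. The sketch below trains a two-input perceptron on a logical AND and then restates its learned weights as a truth table. The function names (`train_perceptron`, `predict`, `transliterate`) are hypothetical, and enumerating inputs like this only works for tiny problems; it is an illustration of the goal, not a proposed method.

```python
def predict(w, b, x1, x2):
    """Fire (1) when the weighted sum exceeds zero, else 0."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

def train_perceptron(samples, max_epochs=100):
    """Classic perceptron rule; returns (weights, bias)."""
    w, b = [0, 0], 0
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), target in samples:
            delta = target - predict(w, b, x1, x2)
            if delta:
                w[0] += delta * x1
                w[1] += delta * x2
                b += delta
                errors += 1
        if errors == 0:          # every sample classified correctly
            break
    return w, b

def transliterate(w, b):
    """Restate the learned weights as a truth table a person can read."""
    rows = [f"x1={x1}, x2={x2} -> {predict(w, b, x1, x2)}"
            for x1 in (0, 1) for x2 in (0, 1)]
    return "\n".join(rows)

AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_GATE)
print("opaque form:", weights, bias)
print(transliterate(weights, bias))
```

The weights and bias are the model's native, opaque form; the truth table is the same knowledge in a form that can be inspected and reused. The approach does not scale, since the table doubles with every added input, which is precisely why efficient transliteration is posed above as a research goal.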
The second target should be the discovery of what I will label a general learning algorithm (GLA). This is actually a category of algorithms, as there are many possible ways this can be obtained. A clarification is in order: general learning algorithms are distinct from artificial general intelligence (AGI), which refers to a whole or complete system. GLAs are a much more specific and focused component, and refer to the algorithm or heuristic that allows for the property of generalized learning. Generalized learning is defined here as the ability to create reflexive, reusable information models that enhance and optimize future learning by applying past learning without supervision. Such is the contrast between this and conventional learning algorithms, which simply seek to create effectiveness by automating rote memorization of a very specific information model of a problem space.
Once these two goals are realized, however, one does not just obtain strong artificial intelligence. I believe that this will lay only a foundation, and that an ensemble of finely optimized narrow AI systems will be needed to realize a complete SAI system. Given the immense complexity of the two basic milestones, I will end this proposal here, as the exact choice of research direction beyond this point is less relevant once we already have working GLAs.
In the event that no major international SAI initiative is made, I call for the eventual creation of a non-profit that organizes and hosts the cloud services to allow for open access to the basic working API of a distributed, free software basis for strong artificial intelligence. This will be based on an entirely new framework that utilizes a RESTful API, opening it up to a plethora of programming languages and approaches.
Later, as the organization becomes more successful, it can become self-sustaining by providing this service and support to outside organizations. This will, in turn, provide further research funding to continue development, and perhaps select some of the top contributors over the life of the program to become full-time researchers.
The program would be operated using the same Creative Commons license as Wikipedia for all documentation and engineering schematics, and the GNU GPLv3 for all source code. I would volunteer to assist with the creation of the initial kernel that would power the base API and database services so that developers and hackers could begin accessing it remotely to experiment and test. The system would be designed so as to only allow secure access to the API for registered developers, and, later on, in tiers that would keep vandalism and spam from entering the AI core.
Presently, the only serious attempt at open source AGI is being made by Ben Goertzel and his team under the OpenCog framework. However, this new approach will require starting with a new basis. The reason for this is that the goal is to lower the barrier to entry for developers world-wide, and one of the easiest ways to achieve that is to make it a highly modular, network-oriented, and scalable infrastructure that is native to the cloud. This will not only allow for the application to grow without limit, but will open it up to virtually every programming language in the world. Developers will be able to interface with the system and write software in their choice of language. It is my hope that this design will allow for rapid prototyping and experimentation. Later, if a library emerges that is found to be crucial, we can always port the source and incorporate it into the main build on the server side. But allowing the quickest access to the fundamentals of the AI system will allow for immense experimentation opportunities that do not exist today.
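To make the idea of a language-neutral, token-gated API concrete, here is a minimal sketch of what one endpoint might look like. Everything in it is an assumption made for illustration: the path `/v1/experiments`, the bearer-token scheme, the in-memory `EXPERIMENTS` list, and the helper `call` are hypothetical and do not describe any existing service. It uses only Python's standard-library WSGI tools and runs in-process, without opening a network socket.

```python
import io
import json
from wsgiref.util import setup_testing_defaults

REGISTERED_TOKENS = {"demo-token"}   # hypothetical registry of developers
EXPERIMENTS = []                     # in-memory stand-in for a database

def respond(start_response, status, payload):
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]

def app(environ, start_response):
    """Hypothetical resource: GET or POST /v1/experiments."""
    auth = environ.get("HTTP_AUTHORIZATION", "")
    token = auth[7:] if auth.startswith("Bearer ") else ""
    if token not in REGISTERED_TOKENS:
        return respond(start_response, "401 Unauthorized",
                       {"error": "unregistered developer"})
    if environ["PATH_INFO"] != "/v1/experiments":
        return respond(start_response, "404 Not Found",
                       {"error": "unknown resource"})
    if environ["REQUEST_METHOD"] == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        EXPERIMENTS.append(json.loads(environ["wsgi.input"].read(length)))
        return respond(start_response, "201 Created",
                       {"id": len(EXPERIMENTS) - 1})
    return respond(start_response, "200 OK", {"experiments": EXPERIMENTS})

def call(method, path, token="", payload=None):
    """Invoke the app directly, as a client request would."""
    body = json.dumps(payload).encode("utf-8") if payload is not None else b""
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(body)),
               "wsgi.input": io.BytesIO(body)}
    if token:
        environ["HTTP_AUTHORIZATION"] = "Bearer " + token
    setup_testing_defaults(environ)   # fills in the remaining WSGI keys
    status = []
    chunks = app(environ, lambda s, headers: status.append(s))
    return status[0], json.loads(b"".join(chunks))

print(call("GET", "/v1/experiments"))
print(call("POST", "/v1/experiments", "demo-token", {"name": "first run"}))
print(call("GET", "/v1/experiments", "demo-token"))
```

Because the interface is plain HTTP and JSON, a client written in any language could issue the same requests, which is the property the proposal relies on. A real deployment would need transport security, persistent storage, and the tiered access described earlier.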
The other aspect of this idea is that we need a more flexible and dynamic system than a Wiki to manage the synthesis between documentation, coding, discussion, and collaboration. There would need to be the ability to manage multiple research directions, specifications, notes, and profiles, and to debate issues in an organized fashion. These features will have to be created for the unique complexity at hand. While Wikipedia sees success in the Wiki concept, it is not the best solution where frequent changes and discussion have primacy over reaching stasis in content; the very point of this site will be to have as many ideas and as much discussion happening as possible. That will necessarily involve a tremendous number of changes and updates that will require a more organized approach to article creation.
The same goals and milestones are suggested for this program approach as in the international research proposal outline. The only difference would be that everyone would be contributing under free software and documentation principles. Later, it is hoped that open hardware would also be included in this.
The last issue to address is the immediate potential criticism that developing SAI openly would somehow be a danger to society. This is a concept I have already refuted in the article, "Machine Ethics & The Rise of AI Eschatology." I encourage those new to the issues to refer to that paper for a very detailed breakdown of the arguments against secretive or restricted development of SAI.
We’ve gone over the roadblocks, misnomers, and misconceptions about SAI. It’s not a matter of lacking computing power or that the problem is intractable. Rather, it is a matter of focus and flexibility. We need to be open to new approaches. Most importantly: we need to at least try. There is no major SAI research program on par with some of the other Big Science programs. The objections can no longer be sustained. Our future will be determined by our scientific progress, and there is no greater enhancement to that endeavor than the discovery of strong artificial intelligence. It is my hope that leaders, educators, and researchers will see the immense opportunity this discovery has to offer us and make a decision to invest in our future.
— D.J. (2013.07.22)
Dustin Juliano is an artificial intelligence researcher, science fiction author, and entrepreneur focusing on programming languages, computational linguistics, and digital communications. He lives in Florida with his wife, Viviane.