# Reduction Considered Harmful

The goal of any science and engineering education is to give the student the ability to “perform Reduction”. Some of you may not be familiar with this term, but you have all done it. It is the most commonly used process in science and engineering and we tacitly assume we will use it at every opportunity. Therefore there has been little need to discuss Reduction as a topic outside of epistemology and philosophy of science.

In what follows, I will be making the claim that for the limited purpose of creating an Artificial General Intelligence (AGI) we must avoid this common kind of Reduction. This article (second in a series; see the first article here) will discuss what Reduction is and why it is useless in the domains where AGI is expected to operate. The third article will discuss why it is also unnecessary. The fourth article will discuss available alternatives. As a bonus, we will come to Understand what it means to Understand something.

I use some common words like “understanding” as technical terms with specific and unchanging definitions, and when I do, I spell them with a capital first letter. I define most of these terms here; for others, see Wikipedia.

When you were in grammar school the teacher taught you about numbers and simple arithmetic. This was your first introduction to a Formal System. Then you learned how to do long addition. Take this plus this, write the sum there, and if there’s a carry, you write it there… this was your first Algorithm.

Then you faced your first story problem. If Holly has three boxes of candy and there are twelve candies in a box, how many candies does Holly have? After thinking about it, perhaps making a drawing, you realized that you needed to “do 3 times 12 equals 36”. This was your first Reduction.

Nobody ever taught you how to solve any and all story problems you would encounter in your life. Nobody can, because Reduction is not an Algorithm; each Reduction differs in the details. Sure, many problems fall into classes of similar problems. We could imagine a formula for this class of story problems; we could call it “A-containers-of-B-items-how-many-items”. It would specify that we get the total number of items by multiplying A and B.

A-containers-of-B-items-how-many-items is a “Model” and it is equally valid for candies, eggs, and apples; it doesn’t specify what we are putting in the containers or what kind of containers we have, just how many containers we have and how many items there are in each. The Model is “context free”. You yourself must provide the analysis of the context. If you have cartons of eggs, you must Understand that the cartons are containers and the eggs are items, not the other way around. You must simplify – “Reduce” – the real life situation in your mind so that you can select the appropriate Model, plug in the correct values in the right places, then run the Model (by performing multiplication or whatever operation the Model uses), and finally you must interpret and apply the output of the Model to your real life situation. The answer is not “36”, it is “Holly has 36 candies”.
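To make the division of labor concrete, here is the Model sketched as a few lines of Python (the function name is my own invention). The context free Model is just the multiplication; the Reduction and the interpretation live outside it, in the comments, where the human is:

```python
def a_containers_of_b_items(num_containers, items_per_container):
    """Context-free Model: total items = A * B. Knows nothing about
    candies, eggs, cartons, or Holly."""
    return num_containers * items_per_container

# Reduction: Holly's boxes are the containers, the candies are the items.
total = a_containers_of_b_items(3, 12)

# Interpretation back into context: the answer is not "36",
# it is "Holly has 36 candies."
print(f"Holly has {total} candies.")
```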

Models of this kind, like model airplanes, are simplifications of reality. You simplify by ignoring a lot of details; model airplanes don’t have built-in instrumentation. You are in essence cutting out a tiny piece of your reality as a simpler and purer subsystem, trying to make it independent of the context you removed it from. You could say a million things about a box of candy – its size, color, materials, shape, etc. For this Model, you ignore everything except “how many items does it contain”. Reduction discards context as a nuisance.

Let us use as another example one of Newton’s laws, “F = ma”. This is an equation and therefore also a Model, not much different from A-containers-of-B-items-how-many-items: It states that whenever we have an accelerating mass we can compute the force causing this acceleration by multiplying the mass and the acceleration. If you want to use this, you need to Understand forces, masses, accelerations, and how to measure them. You are taught thousands of such Models throughout your education. Formulas. Equations. Rules. You have invented some of your own.

You may notice that some of your own Models work all the time whereas some only work most of the time. We share our most reliable and most useful Models with each other, and we call this sharing Science. The general idea of creating simple Models describing fragments of reality is called Reductionism and while under-appreciated, it is one of the greatest inventions our species has ever made. It has solved innumerable problems since the seventeenth century when people like Descartes, Galileo, Newton, and Bacon refined and formalized the known practices of incremental Model creation into what we now call the Scientific Method. Of course, Reduction itself goes back much further.

Several kinds of Reductive simplifications are possible. Above we discussed “extraction from the environment/context”. If the extracted subproblem is still too large to analyze then we can repeat the process, dividing the extracted system into subsystems. If the scientific discipline we are operating in doesn’t provide answers then we might want to analyze the system in the “next lower discipline” – for instance, biological problems might be analyzed in terms of biochemistry. If we have a number of similar Models, such as similar equations describing several phenomena, we might want to consider whether a more general Model might cover them all. When we have multiple Models describing the same phenomenon we may prefer to use the simplest of them that utilizes the information we have. When considering various structures, such as chemical substances, we might want to find a description of matter that describes them all, such as quarks. The quest for a Theory of Everything is a truly Reductionist endeavor.

All of these flavors of Reduction have been identified by philosophers and epistemologists, and they love giving them names like “Ontological Reduction” and “Methodological Reduction”. I find these names confusing and like to discuss them all as “Reduction”. Context and experience will dictate which kinds are appropriate. This larger scope is what makes “Reduction” different from “Abstraction”. In simple cases we could have used either word, but “Abstraction” is typically used only where we actually have found a correct abstraction, so its use implies the process is perfect. This article is highlighting the problematic cases and I therefore prefer “Reduction” since it won’t let us forget that the process is fallible.

To give some examples, when a Reductionist (my shorthand for someone attempting to solve a problem using a predominantly Reductionist stance) attempts to understand a frog, they take the frog into the laboratory (isolation from the environment), dissect it (subdivision), and study separate subsystems such as the digestive system or the blood circulation (again, subdivision). In order to understand what the frog’s blood does they need to drop down a discipline, from biology to biochemistry, and study how hemoglobin transports oxygen. In contrast, in a more Holistic (context utilizing) discipline such as Ecology, we would study how the frog interacts with other frogs and with its environment.

When studying a predominantly Reductionist discipline, such as physics, you often see textbook phrases like “All else being constant…” or “In a closed system…” which indicate that some degree of Reduction has already been made, so that what follows in the problem statement is all that matters. By giving you a large part of the Reduction, by means of example, the textbook is coaching you to form Intuitions about how to perform Reduction. These Intuitions will allow you to do this yourself later, in the real world, in vastly more complex situations. The key Intuitions are those that tell you which parts of the real-world context you can safely ignore.

Reduction is an Intuition-based skill. We can tell, because we get better at it with practice as we gain more experience, which is a hallmark of any Intuition-based skill. More about Intuition later; for now, let’s just say it is a subconscious problem-solving method that utilizes our past experiences without doing Reduction. This sounds confusing only because we’re starting to glimpse the core of the problem. At this level, Reduction is the goal, not the means. We cannot implement Reduction by using Reduction. Please stay with me; we’ll circle the issue until we see it clearly.

Models are created and verified by scientists. But if you are a scientist doing research then nobody tells you what Model to create next, or what hypothesis you should invent to be later tested and verified by experimental work. Graduate school teaches you the verification process, including things like confidence intervals, chi-squared tests, and peer review, but just like when learning to solve story problems in grammar school, nobody can tell you how to invent hypotheses to explore. At best, they can coach you to learn it on your own, from experience. This is another hallmark of Intuition-based skills: they cannot be taught as high-level rules; they have to be experienced bottom up. This is, incidentally, also the difference between Teaching and Coaching.

Engineers use the Models that scientists create. Scientific Models are very reliable when used correctly in the situations they were designed for. As an engineer – a Model user – you define the borders of your Reduced subsystem when you are cutting it free from its environment, from its context. If you perform this Reduction incorrectly, then the Model you use may well give you a wrong answer. You may have left behind some context that would have affected the result. I call this a Reduction Error. There are many kinds of these: Sampling errors when measuring waveforms, selecting a Model that ignores friction when it actually matters, or ignoring emergent effects in complex systems. And in Models of complex systems, such as in rule-based Expert Systems in the style of AI of decades past, most failures are caused by incompleteness of the Model. When the system does not have rules for the current situation it will fail, and often fail spectacularly and catastrophically. This tendency to fail in surprising ways at the edge of the system’s competence is called “Brittleness”. Brittleness is the main symptom of any attempt to Reduce the irreducible.
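To illustrate the Brittleness of rule-based systems, here is a deliberately tiny caricature of an Expert System (the rules and names are invented for illustration, not taken from any real system). Within its Model it is confidently correct; one step outside it, there is nothing to fall back on:

```python
# Hypothetical rule base, invented for illustration
RULES = {
    ("cough", "fever"): "influenza",
    ("fever", "rash"): "measles",
}

def diagnose(symptoms):
    """Look up the (sorted) symptom set in the rule base."""
    try:
        return RULES[tuple(sorted(symptoms))]
    except KeyError:
        # No graceful degradation: outside its rules the Model
        # fails completely, and often spectacularly
        raise RuntimeError("no rule covers this situation")

assert diagnose({"fever", "cough"}) == "influenza"   # inside the Model: correct

try:
    diagnose({"fever", "fatigue"})                   # outside the Model
except RuntimeError:
    print("Brittle failure: the system has no rules for this case")
```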

And once again, just like in grammar school, nobody can tell you where to make the separating cut, what to discard as irrelevant, or exactly what measurement or value goes where in the Model (the formula or computer program). They can’t even teach you a sure-fire way to decide which Model to use, out of thousands that you know. Perhaps you need to combine several Models or perform some algebraic transformation on some Model to get one that fits the current situation. Perhaps you should split your problem into several simpler parts and use a different Model for each. This kind of analysis is done through conscious Logical Reasoning – you manipulate Models to re-factor them into new Models that are more useful to your real-world situation. But even knowing which direction to re-factor in requires Understanding, experience, and Intuition.

To summarize so far: Reduction, Model creation, and Model use all require Intuitive Understanding of the problem at hand and of the problem domain. This Understanding cannot be achieved (implemented) by using Models of anything in the problem domain.

Computers run programs, which are all Models. What computers cannot currently do is to perform Reductions, because current computers don’t have the necessary Understanding of their target problem domains. This is the sharp line separating computer science from AGI. The difference between these disciplines is, in a nutshell, simply who is doing the Reduction. If a human is doing the Reduction, then the human is just programming (creating a new Model) or running a program (using a Model). If a computer could perform a Reduction without human help, then it would have demonstrated that it actually, truly Understands this problem domain, and thus should be labeled an Artificial Intelligence.

It is very tempting to use these observations to create a formal definition of Intelligence:

Intelligence is the ability to perform Reduction. Artificial General Intelligence is a computer program with the ability to automatically perform Reduction.

Given the discussion above about what Reduction is, I have to say this could be a pretty clear-cut definition of Intelligence, at least if we compare it to commonly used informal definitions. We will remember this and put it away in our epistemological toolkit. It makes a worthy goal for AGI research.

We note that many historical AGI projects did not have this as a goal; they used Reduction as the means. This was (and is) a major mistake. If we want to create an AGI capable of doing Reduction, then we should not be doing this Reduction for it while we are writing the code for the AGI. Consider projects like CYC, the most ambitious Reductionist AI project in the world. Their ultimate charter is to create Ontologies describing everything in the world. The CYClists (the Ontological Engineers at CYCorp) have entered into the CYC system millions of human-made Models, expressed as statements in a language roughly equivalent to First Order Predicate Calculus. But these high-level concepts are unconnected – not “grounded” – to any web of low level experience that could provide the Understanding that the system would require in order to do Reduction on its own. Without Understanding to support it, CYC will forever be unable to add to its own knowledge base. It will always be dependent on humans doing all necessary Reduction before it can even start Reasoning about a new problem. Any sufficiently Reductionist AI is indistinguishable from programming.

A more recent example: If asked, the Semantic Web enthusiasts are divided about whether they are working on an AGI or not. Some say it’s just a system to streamline web-based commerce. Others work hard at creating snippets of Ontology about the web world (and to some extent, the real world) in OWL and other ontology languages, hoping to create a CYC-like but larger-than-CYC distributed Ontology. This is clearly nothing but Modeling and Reduction. Anyone hoping that the Semantic Web will become a significant enabling component in a future AGI is going to be disappointed.

Note that all programming is Reduction. Even when writing an AGI, we’re using nothing but Reductionist methods. The main issue is “are we modeling learning or are we modeling the world”? In other words, what domain are we Reducing from? If you want to write an AGI, then it is perfectly reasonable to invent a theory for how Intelligences learn about their world from experience, to create a Model of Intelligence itself if you will, and to implement that Model as a program. You can test it by letting it experience some input in its target domain, such as language, vision, or the real world through sensors. But you must let it experience the world, learn from its experiences, and to figure out its own abstractions all by itself. Eventually the system may become experienced enough to start doing Reduction and to make its own naive Models of its target domain. Do not attempt to do any Reductions on its behalf; it would be counterproductive cheating, like cramming questions from past exams rather than trying to learn the subject.

Further, the “Let’s start by giving it enough Models so that it can bootstrap from there” argument used by CYCorp and others is counterproductive. The easiest Reductions are the ones we can do directly on the low level input stream. As an example, if we are writing an AGI program to learn to Understand a language like English, then we must refrain from parsing the text into words before giving it to the learning algorithm. If the system cannot figure out that separators like spaces separate the text into “recurring subsequences” then the system won’t be able to figure out higher level concepts either. Testing and debugging will always be easiest at the lowest levels of complexity. Do not waste this opportunity by Modeling “what words are” just because writing a word detecting parser is something you know how to do. Instead, feed it characters one by one and use the constructed system’s ability to discover the Model of “words are recurring sequences of non-separator characters” as a test case for whether your theories of learning are correct; at Syntience, this typically takes about four pages of reading Jane Austen. Same goes for movement detection in vision and for graceful locomotion in robotics.
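To sketch what such a test case might look like (this is my own toy illustration, not Syntience’s algorithm), here is a counter over character sequences that is fed strictly one character at a time. The notion of a “word” is never built in; a word-detecting parser is never written; yet recurring words surface in the counts:

```python
from collections import Counter

def discover_recurring_sequences(stream, max_len=8, min_count=3):
    """Feed characters strictly one at a time; count every character
    sequence (up to max_len) that ends at the current position."""
    counts = Counter()
    window = ""
    for ch in stream:
        window = (window + ch)[-max_len:]   # bounded memory of recent input
        for i in range(len(window)):
            counts[window[i:]] += 1         # every sequence ending here
    # Recurring multi-character sequences are candidate "words"
    return {seq: n for seq, n in counts.items()
            if n >= min_count and len(seq) > 1}

text = "it is a truth universally acknowledged that a single man " * 5
found = discover_recurring_sequences(text)
assert "truth" in found      # discovered without any space-based parsing
```

Separator characters then show up as the low-count boundaries between high-count runs; a real learning system would of course need far more than frequency counts, but even this sketch finds word-like structure without being told what a word is.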

Learning can’t be very difficult since frogs can do it. I estimate that the most advanced theory of general learning could be coded in 10,000 lines of Java or C; the code base at Syntience for our experimental AGIs typically clocks in at that magnitude including debugging and evaluation subsystems. The details may be tricky and few people claim to have implementable theories of general learning but it is not going to be a big coding effort. We might learn how learning works from Neuroscience; Numenta is pursuing this. But it will likely be much faster to derive this information directly from epistemology; that’s the path taken at Syntience. Contrast either of these to any attempt to describe the world, a task that by definition will never be done. If a cheap and clever way to create an AGI exists, then it’s going to be to create an Understanding machine that can learn on its own, rather than one that has to be taught what the world looks like. Perhaps I should create a little rule of thumb:

If your AGI requires more than 10,000 lines of code then you are doing it wrong.

Now I’d like to examine the limits of Reduction and Scientific Models, and the limits of Science. Actually, it is not so much about the limits of Science as a whole as it is about the limits of physics, mathematics, and computer science. Physics is the most Reductionist “main” discipline, and the other two are very, very Reductionist “support” disciplines for all of science. The best known examination of the limitations of physics is Erwin Schrödinger’s “What is Life?”, which observed that physics could not explain biology; in other words, that life and living phenomena could not be Reduced to the simple principles that physics had identified without losing the essence of life itself. I often joke that as the Reductionist is cutting apart a living frog to “see what makes it tick”, the “ticking” disappears, but all the pieces of the frog are still there. Schrödinger’s book gets re-read and re-debated by each generation of life scientists. The Life Sciences have found ways to make progress without using Reductionist Models. We will discuss these methods in the fourth article in this series, but for a preview, watch the video of my talk “Science Beyond Reductionism” at http://videos.syntience.com.

This examination of the limits of Reductionist science has been done many times by many people; the result is often a list of meta-types of problem domains where Models cannot be made or reliably used. These lists end up rather similar; see for instance the Wikipedia page about Complex Systems.

I like to divide this list into four parts:

## 1. Chaotic Systems

Chaotic systems are generally unpredictable, are sensitive to initial conditions, and are easily perturbed at any time. The whole point of a Model is the ability to predict the future behavior of the modeled system, and Chaotic Systems are unpredictable in the long term; this is the definition of a Chaotic System. Chaotic behavior can result from many things but most commonly we find some combination of components with multiple interactions with numerous other components, components with hidden state (memory), and components with non-linear responses to inputs, which may lead to race conditions and indeterminism. Examples of such components could be neurons in the brain, animals in ecologies, cells or even organs in multicellular bodies, humans in societies, or corporations and other agents in market economies. We immediately recognize that all these problem domains are really difficult to model.
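The classic minimal demonstration of sensitivity to initial conditions is the logistic map. Two trajectories that start a ten-billionth apart agree for a while and then become completely unrelated, so no measurement of the initial state, however precise, yields a long-term prediction:

```python
def logistic(x, r=4.0):
    """One step of the logistic map x -> r*x*(1-x); fully chaotic at r = 4."""
    return r * x * (1.0 - x)

def trajectory(x0, steps, r=4.0):
    """Iterate the map, keeping the whole history."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic(xs[-1], r))
    return xs

# Two initial conditions differing by one part in ten billion
a = trajectory(0.3, 50)
b = trajectory(0.3 + 1e-10, 50)

assert abs(a[5] - b[5]) < 1e-6                        # early on: indistinguishable
assert max(abs(x - y) for x, y in zip(a, b)) > 0.1    # later: unrelated
```

The error roughly doubles every iteration, so even a Model with a perfect equation and a near-perfect measurement is useless beyond a few dozen steps.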

## 2. Irreducible Systems

These are systems that behave differently if you attempt to split them up or isolate parts of them from their environment. All Models are always simplifications, but for Irreducible Systems all possible simplifications discard something vital which means the predictions made by the Model will be incorrect. Take a frog out of its habitat into a laboratory and it will behave very differently. Blood in circulation in a body behaves very differently from blood in a test tube. The price of a company’s stock a month from now depends not only on the company and its actions but on the economy at large. Consider the laws of Thermodynamics; they only apply in closed systems. Any attempt to use them in the open world will fail because the energy interactions with the environment cannot be fully tracked.

John McCarthy and Patrick Hayes, two famous pioneers of Reductionist AI, observed in 1969 that Irreducibility was a fact of life and named it “The Frame Problem”. Except for some narrowly defined special cases, no real progress has been made on this problem. This is not surprising, since the problem is the Reductionist stance itself.

## 3. Emergent Effects

Emergent effects are system-level effects that cannot be observed at the component level. As an example, a single water molecule doesn’t have a temperature since temperature is defined only for groups of molecules interacting with each other. And depending on the temperature, these water molecules will form water vapor, liquid water, or solid ice. These three have very different properties. Could these different behaviors be predicted from the properties of individual molecules, such as van der Waals forces? It is difficult, and the reason we would even go looking for the connection is because we would have observed these emergent effects at the system level.

Consider a car. Its quality, lifespan, drive-ability, and beauty are not located in any single component of the car. They are in every component: in the design of each part and of the whole, the materials used, the effort, precision, and conscientiousness that went into the manufacture of all parts and their assembly, the experience of the designers, and so on. All of these things matter. The same goes for human lifespan and beauty. These are all emergent phenomena that cannot be taken apart or simplified, since everything matters. To manipulate them, for instance to improve the quality of the next generation of a car, requires a Holistic (context utilizing) stance, Understanding, and experience.

There is also downward causation. Consider a single word, like “like”, alone in the middle of a blank page. What does it mean? “like” has about a dozen major meanings and hundreds of shades of meaning. In language, words rarely stand for unique concepts. Words get their meaning from context – from surrounding words, from the topic of the page, the language used, and the shared experience of the writer and the reader. As you read a page, the words build up a high level context in your mind that influences how you interpret each individual word (a low-level observation) that follows. This high level context exerts downward causation on the lower level word disambiguation process. As another (rather silly) example of downward causation, if some set of neurons in your brain decides that you should take a break and walk to Starbucks, then the whole brain has to come along. The emergent effects, which cannot be observed in the individual components, nevertheless affect the behavior of the components. This makes some Reductionists uncomfortable, since in Reductionist systems all causation is of the upward kind and is known in detail; downward causation sounds like some kind of ghost in the machine.

Getting back to AGI… intelligence is an emergent phenomenon. It must emerge from the interactions of non-intelligent components. In some sense, this statement is trivially true, since if we had intelligent components we’d be done before we started. But the design of low-level components like simulated neurons to generate emergent effects like intelligence is still largely unexplored territory. Researchers at companies like Syntience and Numenta are exploring this nascent field, working with what I have named “Connectome Algorithms”. The ability to understand and manipulate emergence should be a required skill for any AGI researcher.

## 4. Unreliable Information

This one may be easiest to understand. Reductionist Models, Logic, and in general any scientific approach require good and solid input data. But in many problem domains, such data is not available, and worse, can never be made available to a degree sufficient to allow us to reliably use Reductionist Models. In our everyday life, information is incomplete, ambiguous, incorrect, and sometimes patently misleading. How well is a Reductionist Model going to predict the future if we lie to it? In SF novels you sometimes find the term “GIGO”, which stands for Garbage In, Garbage Out. It is amazing how seldom you hear programmers and other Reductionists use that term. It is rarely discussed since in many situations, nothing can be done about it.

A cute detail: The brain is internally unreliable. Neural signals are propagated by neurotransmitter diffusion across a synaptic gap. This means there is an indeterminate delay before the receiving end gets enough molecules to notice the signal; given the high parallelism of the brain, we get race conditions everywhere. No sane Reductionist would design a system like this. But apparently the brain has enough Holistic (context utilizing) redundancy and other checks and balances (as opposed to Reductionist solutions like checksums and retransmission) to create an emergent robustness. And this robustness actually extends outside of the brain. If your neurons can sort out their collective mistakes then they are likely to be able to use the same mechanisms to guard against contradictions and lies in the input data as it is received by the senses. Emergent Robustness is the cure for Reductionist Brittleness. You’ll know you are on the right track when your AGI system makes human-like mistakes.
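Here is a toy sketch of what robustness through redundancy means. Majority voting is of course a vast simplification of whatever the brain actually does, but it shows the principle: each component is wrong 30% of the time, yet the ensemble is essentially always right, with no checksums or retransmission anywhere:

```python
import random

def noisy_unit(true_signal, error_rate=0.3, rng=random):
    """A component that reports the wrong answer 30% of the time."""
    return true_signal if rng.random() > error_rate else not true_signal

def redundant_readout(true_signal, n_units=101, rng=random):
    """Majority vote over many unreliable components."""
    votes = sum(noisy_unit(true_signal, rng=rng) for _ in range(n_units))
    return votes > n_units // 2

rng = random.Random(0)              # fixed seed for reproducibility
trials = 1000
correct = sum(redundant_readout(True, rng=rng) for _ in range(trials))

# Each unit is right only 70% of the time; the ensemble virtually always is
assert correct >= 990
```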

So we have four flavors of impossible-to-Model systems: Chaotic Systems, Irreducible Systems, Emergent Effects, and Unreliable Information. Each one of these will prevent Reductionist Models from being made (or in the last case, from providing useful results). But in many of the hardest problem domains we find all four of these at once. Following the lead of Dr. Kirstie Bellman at Aerospace Corporation I call systems that we cannot Model “Bizarre Systems” and their problem domains “Bizarre Domains”. It is good to have a memorable label for these; we need to be on the lookout for telltale signs of these kinds of systems so that we don’t waste our time attempting to create Models of them.

We find examples of Bizarre Domains in many places outside of the Reductionist haven of physics, math, and computer science. Life is Bizarre, and this affects the Life Sciences. Genomics, Physiology, Ecology, Psychology, and Biology in general are full of situations where Models won’t work. Other Bizarre Domains: We gave the Nobel Prize in Economics to Friedrich Hayek for telling us that the Economy cannot be modeled. It doesn’t stop people from trying to make computer based Models of the stock market, and small gains may be possible for a while, but like all Reductionist contraptions these trading programs will fail. And when they do, they fail catastrophically, which we already said is the hallmark of Brittle Reductionist Models.

Learning a human language takes a lot of time and effort because we need to gather a lot of experience before we Understand it. But human languages are Bizarre. This is why the use of word frequency based data mining algorithms like TF-IDF, grammars, taxonomies, and ontologies (which are all Models, at best still somewhat useful today) will never lead to true Understanding of language; this requires a Holistic (context utilizing) approach.

For AGI the most important Bizarre Domain (besides language) is our everyday mundane reality. It is deeply complex, ever changing, and contains many agents with goals at odds with our own. This is where AGI must operate, performing the simple everyday tasks that people with normal intelligence do so effortlessly and that no hypothetical Reductionist AGI could ever analyze. How come humans can do this? We learn Reduction in school, and use it in science and engineering, but in our everyday lives we operate Holistically and Intuitively. We don’t Reduce, we just Understand. This will be the focus of the next article. The point I wanted to make here is:

In the very domains where AGI has to operate, Reduction is impossible. The confusions about what Understanding is, who should be doing the Reduction, and when Reduction is even required are the main reasons we don’t already have working Artificial General Intelligence.

1. “David you are about the third poster in the past month that ignores (consciously or not) the fact that 10K lines can describe the functionality of a neuron, a synapse, an axonal tree, and a dendritic tree. Then we instantiate those a few billion times, until memory is full. Where’s the conflict?”

My apologies, I didn’t read too far down the comment section. So now I know what 10K of lines looks like, i.e. a synapse, an axonal tree, and a dendritic tree; and the culture inside the structure, i.e. memory, is most likely made up of particle waves coming from experiences (actions).

So AGI must depend on understanding the relationship between all the capabilities of the organization (human), i.e. culture, structure, procedures, and assets.

I mean, how else is the computer going to learn?

2. I have no concept what 10,000 lines of code looks like, but a computer probably does. What is the relationship between structure (what 10,000 lines of code “looks” like to a computer) and culture (how the code is reduced inside that structure), if there is one, and you haven’t answered that question already in your article? I am basically a real-world mechanic, so this is a little over my head.

That said, I was watching a football game on TV and they now use computer generated lines to show where the 10 yard lines are on first downs, so the computer is identifying the structure of the field, reducing it to objects instead of code, or perhaps the code is the object?

After reading a paper on emergent technologies, it looked to me like building a structure with a best guess of the forces that are present and that need to be countered by the structure, would be a useful strategy. Once the structure was built it then would let the technologies emerge as they pounded on the structure. So its more than just hardware, but the difference between the shape and form of what is emerging and what is in place.

Maybe that is just a long form of saying that a picture is worth a thousand words. As the new F35 airplane needs over a million lines of code before it actually works, as it is supposed to, I imagine your reduction-ism theory would be very scary for the people trying to build it, as it sounds to me like it will never “learn” anything.

3. I like you article very much and share your enthusiasm about brain like reasoning and knowledge stores. I work at Saffron Technologies the only true competitor of Numenta. We use associative memories too but our SW is enterprise grade being used by companies like Boeing, GE, national security and the DoD.

I agree the era of reductionist models (outside of physics, chemistry, and biology) is over, in the sense that all the low-hanging fruit has been harvested.

I would like to point out that the world of machine learning and models is not black and white; it's not reductionist models here and brain-like AGI there. Surely you have made this distinction for the sake of argument. Models show emergent effects, chaos, and all the properties that you have copied from Wikipedia characterizing irreducible systems.

I’d say one of the first mathematicians to attempt to formalize this counterintuitive fact was Poincaré (long before Schrödinger). Next in line are Russian mathematicians like Landau, Kolmogorov (K), Arnold (A), Moser (M), and many more. The famous KAM theorem formalizes different types of chaos. Schrödinger was less concerned with reductionism itself than with the question of how irreversible processes like life (aging) can be explained from equations that are symmetric in time. Prigogine gave an answer to how irreversible processes can emerge from reversible (time-symmetric) equations (models); he got the Nobel Prize for it. For example, he described a “simple” system (the Brusselator, a complex autocatalytic chemical reaction) that shows emergent qualities at its fixed point, colloquially called “order out of chaos”. A more popular but theoretically not yet so well understood example of order out of chaos is the Belousov–Zhabotinsky reaction. This chemical reaction shows something like un-mixing coffee and milk despite constant stirring of the reactants. It even shows fixed points in time, i.e. an oscillation between being mixed and un-mixed. This is an example of brittleness, which usually comes from nonlinear effects. Brittleness is not something to be avoided; it is very important to understand (as the Russian mathematicians tried to do some 100 years ago). Life is brittle; customers are too.

We want to understand when a model has stable solution(s) and when it becomes brittle and shows chaos.
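The Brusselator mentioned above can be simulated in a few lines. This is my own minimal sketch (not from the comment), using plain Euler integration; the function name and parameter values are my illustration. For B > 1 + A², the fixed point (x, y) = (A, B/A) becomes unstable and a sustained oscillation (limit cycle) emerges:

```python
def simulate_brusselator(A=1.0, B=3.0, x=1.2, y=3.1, dt=0.001, steps=200_000):
    """Euler-integrate dx/dt = A + x^2*y - (B+1)*x, dy/dt = B*x - x^2*y."""
    xs = []
    for _ in range(steps):
        dx = A + x * x * y - (B + 1.0) * x
        dy = B * x - x * x * y
        x += dt * dx
        y += dt * dy
        xs.append(x)
    return xs

# With B = 3 > 1 + A^2 = 2, the trajectory never settles at x = A = 1;
# even late in the run, x keeps swinging over a wide range (the limit cycle).
xs = simulate_brusselator()
late = xs[100_000:]
print(min(late), max(late))
```

Dropping B below 1 + A² makes the same code converge quietly to the fixed point, which is exactly the stable-versus-brittle distinction the comment asks about.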

In short, there are reductionist models that show chaos and emergent properties. Many of those models cannot be solved analytically but still show properties which remind us of reductionism, i.e. we can characterize such systems with very few time constants, for example. The question is when models become brittle; this can be a meta-question (stemming from the “wrong” reduction) or intrinsic to the model.

Poincaré was the first to recognize this fascinating field (chaos theory): models (mathematical equations) that have no analytical solution but can nevertheless be solved approximately by computers, like the three-body problem (Moon, Sun, and Earth coupled by gravitational forces).

BTW, Saffron’s Associative Memory is not an opaque approach. It’s a fully deterministic white-box approach. Modeling the brain can be white-box.

4. Making an artificial brain wouldn’t be that hard given the right hardware. What you would need is either a) a logic-based computer with an extremely powerful processing core, or b) a chip built on the same principles as a natural brain. For (a) you might need a planet full of computers. (b) is your best bet in my opinion.

In a neural-net chip you would need to allow some kind of natural evolution in an artificial environment, because you have NO HOPE of programming a brain yourselves, especially considering that we do not know, on the macro scale, how the brain works at all (I can say that with confidence because I am a Neuroscientist). The chip (which would actually have billions of artificial neurons and interconnections) would need functions allowing a programmer to set starting conditions where a small subset of the network connections were active, resembling the first ever brain, i.e. possibly a simple single-cell stimulus-response function that allows either avoidance of danger or moving towards food.

The next step would be to allow some of the other neurons to randomly switch on, maybe one or two at a time, and for these neurons to take on some useful function in the program. If the brain performs worse than its predecessor, the brain reverts to its previous best condition. In other words, this is a genetic algorithm, but the neural network is an actual chip, not slow program code. In order to allow switching the nodes in the network on and off, the artificial brain would need to have mediators. This would allow not only the programmer but also the brain itself (once it has reached sufficient complexity) to switch regions on and off. This would eventually resemble the various networks of the brain. Rather than programming this type of brain in the traditional way, the brain would be developed by its environment.

It would be a bit like restricting the growth of a plant by pruning branches that grow in the wrong direction: you didn’t tell the plant where to go, you told it where not to go. Eventually you have something more complex and powerful than you could possibly imagine. Not being limited by biological processes would make such a brain much more powerful (given the correct environment to grow in) than a human brain of the same size and number of connections, and I am talking millions of times more powerful. The human brain is hindered by slow chemical signal processing; an electrical brain would not be. If anyone thinks I am wrong about this, let me know why.
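The “revert to its previous best condition” scheme described above is essentially a (1+1) evolutionary strategy. Here is a minimal software sketch of that idea; the XOR task, the tiny 2-2-1 network, and all names are my own assumptions, not from the comment:

```python
import math
import random

random.seed(0)

CASES = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # XOR truth table

def forward(w, x1, x2):
    # A 2-2-1 network with tanh units; w is a flat list of 9 weights.
    h1 = math.tanh(w[0] * x1 + w[1] * x2 + w[2])
    h2 = math.tanh(w[3] * x1 + w[4] * x2 + w[5])
    return math.tanh(w[6] * h1 + w[7] * h2 + w[8])

def loss(w):
    return sum((forward(w, a, b) - t) ** 2 for a, b, t in CASES)

def hill_climb(steps=10_000):
    # Mutate the current best; if the child performs worse, revert
    # ("the brain reverts to its previous best condition").
    w = [random.uniform(-1, 1) for _ in range(9)]
    l = loss(w)
    for _ in range(steps):
        child = [wi + random.gauss(0, 0.2) for wi in w]
        cl = loss(child)
        if cl <= l:
            w, l = child, cl
    return w, l

# A few independent runs, keeping the overall best; restarts are the
# software analogue of trying several starting brains.
best_w, best_loss = min((hill_climb() for _ in range(5)), key=lambda r: r[1])
print(best_loss)
```

No gradient, no model of the task: the “environment” (the loss) only prunes branches that grow in the wrong direction, much like the plant analogy above.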

5. Hi Monica,

It is hard for me to tell you this, but looking at the presented AI methodology and your web pages, you simply have no chance. Simply speaking, creating human-level intelligence (Real Artificial Intelligence) requires a completely different paradigm shift, much more dramatic than your model-free methods. The good news is that none of the methods presented here at hplusmagazine and on the singularity lists have any chance either, so you still have a chance.

Sorry if this is too harsh; take it, leave it, or ignore it.

AI-Expert.

6. Pingback: Quora

7. Excellent article. Can you give a better example of “downward causation”?

8. This is an excellent article. Thanks. I am not sure that I understand “downward causation.” Can you give a clearer example?

9. At one point in the article, Monica claims that beauty is “irreducible” because it is an emergent property. I’d just like to reference the work done by Jürgen Schmidhuber on that topic. He defines beauty (loosely) as an input that allows a person to simplify his/her method for processing inputs. Check out Jürgen at http://hplusmagazine.com/2010/01/05/build-optimal-scientist-then-retire/

10. I have an issue with the entire idea of emergentism. Generally, we say that a property is emergent when no form of it exists without certain other criteria being met, i.e., it is a secondary product of some prior occurrence or interaction. This seems particularly problematic when discussing consciousness. After all, we do not yet know whether consciousness is a measurable product of neuronal activity (that is to say, we do not know whether individual neurons are conscious or not); further, we do not really know what consciousness is. Until we do, we cannot accurately ascribe other qualities to consciousness, emergentism being one of them.
As a philosophy student, I can’t deny that I use reductionism to make my life simpler, but I tend to think that when it comes to AGI and consciousness, we simply haven’t gotten to the irreducible components of reality, and therefore cannot fully emulate this thing which is among the most complex structures we know of.
That reductionism implies that nothing is irreducible is a problem for another day….

11. Isn’t an example of a hybrid MFM-reductionist model a kindergarten teacher teaching a child what a sentence is? I mean, probably the kid could have figured it out eventually, but the teacher helps it along.

• Piaget and others have advocated bottom-up, experience-based “Constructionist” (sometimes “Constructivist”) learning, based on the student’s own explorations and experience. The kindergarten teacher is using Instructionist methods (top-down, high-level knowledge taught as rules), and those are only effective if there is sufficient knowledge present (acquired by Constructionist means) to anchor them to, and if the Instructionist scaffold is subsequently replaced by more Constructionist experience. Patrick Winston said “You can only learn what you already almost know”. Consider learning French grammar top-down in beginning French classes. As you learn more French, you internalize and discard these Reductionist/Instructionist crutches, and with fluency you no longer bother with the grammar for understanding or production. You have replaced the rules, which take conscious reasoning to apply, with a Holistic understanding of French, interwoven with your understanding of English and your experiences of the world. So yes, Instructionist teaching has some value, but unless complemented with discovery-based learning (often done as “exercises” in schools) it has no “staying power” in your brain. If you forgot everything you learned in class a week after the exam, then you were likely being taught Instructionistically and failed to replace the scaffold with true (Holistic) Understanding.

12. Suppose one builds a logical symbol manipulation system that behaves with human intelligence. In particular, assume that it has the intelligence to do higher mathematics at the level of Kurt Gödel. Such a system will be able to construct a statement and prove both that the statement is true and that the system itself cannot prove it to be true.

Admittedly, Gödel’s conception of logic worked in a closed system to which computer programs with arbitrary i/o cannot be readily compared, but I think the argument is interesting nonetheless.

• Gödel’s Theorem, P-vs-NP, and other interesting discussions only matter where Mathematics works. If we have an Intuition-based system, then all of these discussions about limitations are irrelevant. We’re not making Models, we’re not using Mathematics, we’re not doing Reduction. None of the limitations of Reductionism apply. It is true that we trade this freedom for some very important features, such as infallibility, repeatability, optimality, completeness, scrutability, and transparency. But human brains don’t use or need any of those either, so we’re probably on the right track.

13. I agree with Likeadog that in the case of consciousness, or indeed all complex learning and “Understanding”, we are more likely to succeed by trying to simulate (in code), or possibly emulate, entire neural networks. The brain in all its glory needs roughly 80-90 billion neurons patched together in incredibly complex patterns to produce “Understanding”, as you put it. Trying to recreate it through a “Holistic” approach using under 10K lines of code is not feasible. In order to even begin serious discussion of whether code can produce humanlike qualities (i.e. intelligence, learning, “Understanding”, etc.) at all, we need to know a lot more about how the brain works. I fear that our only hope of strong AI lies with complete simulation of the human brain, something that just isn’t going to happen for a very long while.

• I disagree somewhat with the statement that we cannot create AGI through the under-10K-lines-of-code “Holistic” approach. After thinking about the problem: we know how neurons work. We know how communication through pathways works. The point at which neurons and pathways create consciousness is unknown to us; however, we can create an algorithm that models neurons and pathways and then structures them randomly.

We can take a large selection of randomly generated “neural maps” and then test them for fitness of a desired outcome based on input. We would essentially be doing what biology does in that we will allow the “neural maps” that demonstrate a higher order of understanding to continue on in the genetic algorithm.

You would have to guide the fitness of each generation much in the same way that a “teacher” guides a child learning to communicate, but if you guide the program in the correct way it could result in a system that displays aspects of intelligence.

The downside is that you can’t just turn on the 10k lines of code and get AGI, but it could be the path that leads to AGI.

• At Syntience, we prefer Epistemology over Neuroscience. Yes, we’ll use hints from Neuroscience, but in general Epistemology yields cleaner implementations, since there is no evolutionary baggage to account for. We have a complete and consistent theory, soup-to-nuts, for how learning happens in intelligent systems at the Connectome level, and we can implement all relevant neuron-level behaviors in 10K lines of code. Why does that sound impossible? Neurons signal other neurons via Synapses. The code practically writes itself.

• David, you are about the third poster in the past month who ignores (consciously or not) the fact that 10K lines can describe the functionality of a neuron, a synapse, an axonal tree, and a dendritic tree. Then we instantiate those a few billion times, until memory is full. Where’s the conflict?
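As a sketch of the point (my own toy illustration; all class and method names are assumptions): the per-neuron rules fit in a handful of lines, and the rest of the network is just instantiation.

```python
import random

class Synapse:
    def __init__(self, target, weight):
        self.target, self.weight = target, weight

class Neuron:
    def __init__(self, threshold=1.0):
        self.threshold = threshold
        self.activation = 0.0
        self.axon = []  # outgoing synapses: the "axonal tree"

    def receive(self, signal):
        # Incoming synapses form the "dendritic tree": integrate the
        # signal, and fire down the axonal tree when the threshold is met.
        self.activation += signal
        if self.activation >= self.threshold:
            fired = self.activation
            self.activation = 0.0
            for s in self.axon:
                s.target.receive(fired * s.weight)

# The mechanism above is a few dozen lines; the network is just many
# instantiations of it, "until memory is full":
random.seed(1)
neurons = [Neuron() for _ in range(10_000)]
for n in neurons:
    for target in random.sample(neurons, 3):
        n.axon.append(Synapse(target, random.uniform(0.0, 0.4)))
```

The code that *describes* a neuron stays tiny no matter how many neurons are instantiated, which is exactly the distinction being made here.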

And Alex the parrot had a brain the size of a hazelnut but knew significant amounts of language. Clearly not everything in the brain is involved in language. If we wanted to *just* understand language, how large a “brain” would we need? Nobody knows. Syntience is betting that the number of neurons is small enough to fit in a state-of-the-art computer. Everyone betting against us is in practice sitting on their hands, thereby contributing very little to research in this area.

14. Irreducibility is not a property of a system, it’s a symptom that you did a reduction wrong. Cut reality at the seams, and take the blame when you don’t. A similar argument resolves “emergent effects”.

• Where would you make the cut if the task was to predict the price of Apple stock a month from now? Around the stock price? Around Apple? Around the US economy? Around the earth, including things like global warming and earthquakes? There are practically irreducible systems that cannot be reduced unless you fill a pocket universe with computronium.

• That muddles the issue into the categories of complex systems and unreliable information.

There are many sensible seam choices, any of which might be correct depending on the information available. The fact that the correct choices are not intuitively obvious to us does not mean there is no model whose seam choices (or ’emergent effects’) avoid introducing significant error.

• You are claiming that in every system, an error free decomposition exists.

I claim that there exist irreducible systems where no error-free decomposition exists; I don’t claim it can’t be found, I claim it isn’t there.

The “muddling” is not intentional, but the issues of unreliable information, irreducibility, chaotic unpredictability, and emergence exist simultaneously in all Bizarre systems, by my definition. At some point it doesn’t matter which we feel is most problematic, since we would need to solve all of them in order to use Models. So even if you could find a Reduction-error-minimizing decomposition, the unavailability of input data would get you, etc. Switching to Model Free Methods is much easier than hitting your head against all four walls at once.

• I have other arguments addressing the Unreliable Information and Chaotic System claims. Those would take another form, but there is no point in proceeding to those arguments if we cannot reach agreement on the validity of this one.

The goalpost is not an ‘error free’ decomposition, it is one that works. In the article you cite component systems that behave ‘very differently’ in different environments, and posit models that do not predict this. You claim, it seems, that there is no possible model that does. That every decomposition will make significantly wrong predictions of the behavior of frogs or blood. This is a strong claim.

It remains to be seen whether Bizarre systems exist at all. To support this, you need to meet challenges to each of the four constituent issues. Likewise a reductionist needs to show that each of the four issues are not truly impediments to effective reduction. Being a reductionist, it makes sense to me to do this one issue at a time.

The fact that I’ve not yet done all four does not mean we should abandon establishing #1 on the grounds that reduction would be impossible either way because of 2, 3, and 4. It is conceivable that an agent might overcome any one or two problems but that no reduction-based system can solve all four, so a reductionist also needs to show that an agent can deal with all four problems at the same time. Again, it makes sense to me to do this after establishing the solutions to the constituent elements.

15. I’m bothered by this article, as it seems to use a completely wrong definition of reduction. Reducing neither entails “model making”, nor “loss of information”.

In fact, a successful reduction is merely translation from a high-level language (like psychology) to a low-level language (like a computer programming language), without any loss of detail. Therefore, all context in the statements reduced is preserved. Reduction does not simplify. It usually results in much more complex descriptions than the high-level description that was made.

The example reduction I gave above, i.e. from psychology to a computer program, is in fact, the very objective of AI research. Therefore AI can succeed if and only if such a reduction is possible.

The definition of intelligence that you use, that it is a reduction, errr, is also quite unsatisfactory, to say the least.

I repeat, model making is something else entirely. Of course, modeling is a very important ability of an intelligent system, for obvious reasons, I am sure. However, reduction and modeling are quite different concepts, as a quick look in the dictionary should reveal.

Holism does not mean “context-aware”.

There is no such thing as Reductionist AI. CYC isn’t reductionist or non-reductionist. Computer programs usually aren’t reductionist, by the way, philosophers are. Since we don’t have artificial philosophers yet, I guess you are using the term in a completely wrong way.

Errr. So what is CYC? CYC is a model of human common sense and reasoning. CYC itself is a model, and obviously a quite limited one. However, it is not AGI, so, well, what did you expect? You are making a strawman argument, as others have pointed out.

If you want to criticize AI projects that do not have a general-purpose learning component, please, by all means do it, but please, I beg you, do it without making a lot of false assertions and misusing language. Mainly because it makes it much more difficult to understand what you are trying to say. A non-expert would be thoroughly confused and misled by these, and the expert would suffer through your article.

So I can’t even read this article because it’s built on so many false definitions, assumptions and claims, sorry.

I think you really want to use something other than Reductionist, at least. Perhaps something like a static AI? An AI that cannot learn? A simpleton AI? Or you can invent some neologism, as I suggested elsewhere.

I am really writing all this so that it may make a positive contribution to your future articles.

Best Regards,

Eray

16. How to write a program that solves a Reduction problem:

1. Write code that structures a neural net.
2. Integrate a genetic algorithm that will evolve the neural net.
3. Test fitness by posing a simple reduction question. (Q: How many candies does Sally have?)
4. When the program successfully comes up with the right answer, pose increasingly more complex questions.
5. Repeat until it can successfully answer the question, “Who are you?”
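A runnable toy version of steps 1-4 (entirely my own stand-in: each individual here is just three coefficients of f(a, b) = c0·a + c1·b + c2·a·b rather than a real neural net, and the “reduction question” is the candy problem):

```python
import random

random.seed(0)

# (boxes, candies per box); the correct answer is boxes * candies_per_box.
QUESTIONS = [(3, 12), (2, 5), (4, 7), (6, 6)]

def answer(ind, a, b):
    c0, c1, c2 = ind
    return c0 * a + c1 * b + c2 * a * b

def fitness(ind):
    # Lower is better: squared error over all posed questions (step 3).
    return sum((answer(ind, a, b) - a * b) ** 2 for a, b in QUESTIONS)

# Steps 1-2: a random population, evolved by elitist selection plus mutation.
pop = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(30)]
for generation in range(1_000):
    pop.sort(key=fitness)
    survivors = pop[:10]
    pop = [list(s) for s in survivors]
    while len(pop) < 30:
        parent = random.choice(survivors)
        pop.append([c + random.gauss(0, 0.02) for c in parent])

# Step 4: the evolved best individual answers the original candy question.
best = min(pop, key=fitness)
print(answer(best, 3, 12))  # close to 36
```

Nothing in the loop knows that multiplication is the answer; the questions alone push the population toward c2 ≈ 1, which is the evolutionary flavor of Reduction the recipe describes. Step 5 is, of course, left as an exercise.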

Happy Gregorian calendar shenanigans day!

• Are you available to hire? Syntience Inc. is a small startup in Silicon Valley, currently in the process of looking for funding. Please get in touch with me in a couple of months. Are there any more where you came from?

• Not really available for hire as it would interfere with my schooling and I’d have to move to California, I’m assuming? Here’s a little something to think about:

Consciousness is like a phone network in that the result is greater than the sum of its parts. We don’t know at what point a bunch of linked neurons becomes a conscious being, so don’t even try to quantify that point. Write a program that will find that point the same way that nature does.

• Albeit a Reduction approach.

17. Monica, I’m tired of your uninsightful articles, your strawman arguments (yeah, nobody has criticized reductionism and adopted a complex-systems perspective before), your capitalized terms, your overgeneralized descriptions which say absolutely nothing about anything, and your absent philosophical understanding (time to graduate from the Reductionism-vs-Holism debate?), which seems to be derived from some shallow reading of GEB.

If all it takes is 10,000 lines of C code, just Show us some Cool Things you can do with your Holistic Intelligence thing already.

• Hehe. Yes, many have said it all before. How come nobody listens? Because the naive view that Reductionism will work for AGI is too seductive? I will continue to argue for these ideas until I get some purchase and/or funding. It’s not easy. I need to put together a short, convincing argument based on a large meme-package that’s largely alien to almost everyone working on AI research today. Most of all, I’d like to prevent newcomers to the field (students, for instance) from falling into the Reductionist AGI trap.

• Doesn’t everything in subsymbolic computation (i.e. neural networks, evolutionary algorithms, etc.) fit your description of model-free methods? Those are decades-old concepts, with huge bodies of research around them. So what do you mean by “how come nobody listens”? What do you think the machine learning people are doing?

• These are indeed (often) Model Free Methods, and I approve of all of them for the purpose of creating an AGI and for more specialized purposes (narrow AI). Yes, Subsymbolic is the way to go. What I’ve done is point out that it is exactly the Model Free-ness that makes these work, and this will (hopefully) stop people from making mistakes like mixing machine learning results with hand-crafted Reductionist Models. In the short run (before someone demonstrates a true Holistic AGI), my insights can be used to improve the results in data mining and other machine learning tasks.

I have often bought books with promising titles about Subsymbolic computation, and I typically get to page ten or so before they introduce their model. Time to put away the book. Hybrids are typically a mistake. I’d be happy if someone showed me a system with distinct Model Based and Model Free parts where the parts interacted productively, but I’m not holding my breath. I prefer to simplify my life by not introducing domain models at all.

• Hey, Anon…

Have some money? We’ll show you some “cool stuff”. Otherwise, try to steer clear of the Ad Hominem attacks on the edge of which you are now precariously teetering.

GEB is not even on our reading list, fyi, though it is very interesting.

19. How does this line of thinking apply to Reductionist systems like economics?

20. If an AGI can be produced along these lines using less than 10,000 lines of code, then do it. Personally, I think the idea is preposterous.

Moreover, despite three decades now of blather about emergence and irreducibility, neither concept has given rise to any useful result. For example, how do the global warming scientists study climate? Answer: using reductive computer simulations. There is no other way. And what is the source of all progress in biology? Answer: reductive understanding of the molecular machines from which life is built. There is no other way.

Complex systems are, of course, hard to model reductively, and perhaps essentially impossible in some cases. But the fallacy of emergence is that there is something else useful that can be said about these cases. The truth is that systems that cannot be modeled reductively, cannot be fully understood, period.

Sure, we can “understand” them intuitively in terms of some statistical properties – but one could derive those statistical properties from a reductive model. What can’t be derived is the detailed behavior, and no other mode of thought can get around this.

• When you say “understand” you mean understand in the Reductionist’s sense – to be able to reduce the system to a Model of some kind. Statistics provide weak Models, but they are still Models.

But suppose we could somehow, through some Connectome Algorithm that manipulates emergent effects, create a system that nobody understands but that behaves in interesting ways; interesting enough that people would pay money for these effects. Like, being able to tell that two documents discuss the same topic in spite of them using largely different vocabularies. Would you insist that such a system is useless? Clearly not, since people will pay to use it. Do you understand how it does what it does? No, it’s opaque. Well, brains are also opaque. Funny, that.

I don’t think this is impossible; systems like this can be designed, and we’ve seen hints of these effects. And it’s certainly a very different way compared to what Reductionists do.

Reductionist understanding – through Models – is overrated. Results are what counts. We’ll get there shortly.

• To put it more succinctly, it is exactly my point that what we are after is an AGI, i.e. a *system* that Understands some problem domain, Holistically (and if it means the system is going to be opaque, so be it).

The desire that some *human* understand the system (or even the problem domain) in a Reductionist fashion (through Models like equations, formulas, algorithms) is totally unimportant.

• “The desire that some *human* understand the system (or even the problem domain) in a Reductionist fashion (through Models like equations, formulas, algorithms) is totally unimportant.”

Perhaps in some abstract world, but in this one it is very important, in many contexts, for a human to understand how the system came to the conclusions it reached. Are you really suggesting that you can just say “Trust me” à la Han Solo and the human to whom you’ve just given advice should accept it blindly?

I wouldn’t accept that answer from a human expert whose credentials I know; I’m certainly not going to accept it from a piece of software.

21. Pingback: Quora

22. Great article, Monica. I think I really understand the distinction between what you are saying and the typical view of AI. It really makes clear how no one will ever be able to say “Oh, that’s just a calculation” about true AI.

Thanks for your great thinking and sharing!

Dave

23. What does it mean to understand understanding?

I don’t understand!

24. Monica this is a wonderful article!!