X-Message-Number: 4367
From:
Date: Sun, 7 May 1995 15:13:19 -0700
Subject: SCI.CRYONICS neural nets revisited

In message #4331, Joseph Strout quotes a FAQ that states: "In principle, Neural Nets can compute any computable function, i.e., they can do everything a normal digital computer can do."

I believe that the first half of this statement is FALSE, while the second half is TRUE. The two statements are NOT equivalent. To support my claim, I will give more background than I gave in my previous post #4326 on this subject.

*Recursive functions: Abstract models of computation.*

In the 1930's, the two most important developments of this century in mathematical logic occurred. They were:

1. Proof of the justly celebrated Goedel incompleteness theorems, and the related limitative results of Church, Tarski, and Turing.

2. Finding an adequate definition of the informal notion of "algorithm".

At about the same time (ca. 1936), the formal definitions of Church's lambda-definable functions, Post's notion of 1-definability, Herbrand-Goedel-Kleene systems of equations, and Turing machines were developed. These notions were all soon proved to be equivalent; the Church-Turing thesis is that they each capture the intuitive notion of "effectively computable function." Further, each of these definitions led to the description of a *universal* model -- a model that can compute every function that any instance of the definition can compute. Since that time, many other formulations of the notion of "computable function" have been offered; all of them have turned out to be provably equivalent to those listed above.

Note that these are all ABSTRACT models of computation. Whether any of these models can be implemented on a physical device is quite a different matter. In fact, the first digital computer was not produced for another decade.

Of the computation models listed above, the Turing machine is closest to describing an actual physical computer. But an essential feature of a Turing machine is that it has an INFINITELY long tape, used both for input and for scratch memory. If the machine T is designed to compute the particular recursive function f, the amount of tape required to compute a particular value f(n) is finite; but as n varies, in general there is no upper bound on the amount of tape required -- hence the requirement of an infinite tape.

*Definitions*

A function that maps words over an alphabet into words over an alphabet is *partial computable* (partial recursive) iff it is partial Turing-computable (equivalently, lambda-definable, or computable under any of the other provably equivalent definitions). "Partial" means that the function may be undefined for some inputs.

A model computer is *computationally universal* iff it will compute every partial computable function.

*Neural nets*

In 1943, McCulloch and Pitts [1] offered nerve nets as a formal model of the nervous system, and later researchers developed a number of different formulations of essentially the same model. Researchers also focused on formulating these machines as "acceptors", wherein a particular machine (or net) signals either "yes" or "no" to each finite input tape over an alphabet, thus determining the set of tapes that the machine accepts.

Kleene [2] showed that the sets of input tapes accepted by his finite automata, and also by the McCulloch and Pitts nerve nets, are exactly the REGULAR sets of tapes, a class which he defined. Later Medvedev [3] extended this equivalence: automata defined by semigroups, and Turing machines with FINITE tapes, also accept exactly the class of regular sets of tapes.

It is easy to specify McCulloch and Pitts neurons that act as the logic gates AND, OR, and NOT. So we can use neural nets to design any logic circuit, presumably including a Cray computer. This confirms the claim that neural nets can do anything that digital computers can do.
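To make this concrete, here is a minimal sketch in Python of McCulloch-Pitts threshold neurons wired up as the three gates. (The code is my own illustration, not taken from [1]; the particular weights and thresholds are just one conventional choice.) Such a neuron fires -- outputs 1 -- exactly when the weighted sum of its binary inputs reaches its threshold:

    def mp_neuron(inputs, weights, threshold):
        # Fire (1) iff the weighted sum of the inputs reaches the threshold.
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total >= threshold else 0

    def AND(x, y):
        return mp_neuron([x, y], [1, 1], 2)   # fires only when both inputs fire

    def OR(x, y):
        return mp_neuron([x, y], [1, 1], 1)   # fires when either input fires

    def NOT(x):
        return mp_neuron([x], [-1], 0)        # an inhibitory weight inverts the input

    # The gates compose into arbitrary Boolean circuits, e.g.:
    def XOR(x, y):
        return AND(OR(x, y), NOT(AND(x, y)))

Since AND, OR, and NOT suffice to build any logic circuit, a large enough net can realize the combinational logic of any fixed piece of digital hardware.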
But finite automata are considerably weaker in computational power than Turing machines with an infinite tape. Regular sets have fairly simple structures. In particular, the complement of a regular set is regular, whereas of course the complement of a recursively enumerable set need not be recursively enumerable.

The set of tapes {01, 0011, 000111, ...} (n 0's followed by n 1's) is NOT a regular set over the alphabet {0, 1}, and hence there is no finite automaton that accepts exactly its members. But it is easy to design a Turing machine that accepts it.
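To see the difference concretely, here is a sketch (again my own illustration) of an acceptor for this set. It uses a single unbounded counter -- unbounded scratch memory, which is just what a Turing machine's tape supplies and a finite automaton lacks. A finite automaton with k states cannot tell the prefixes 0, 00, ..., 0^(k+1) apart, so by a pigeonhole argument it must misclassify some tape:

    def accepts_0n1n(tape):
        # Accept exactly the tapes 0^n 1^n, n >= 1.  The counter below is
        # unbounded, so this is NOT a finite automaton: its memory grows
        # with the input, as a Turing machine's tape does.
        count = 0
        i = 0
        while i < len(tape) and tape[i] == '0':   # count the leading 0's
            count += 1
            i += 1
        if count == 0:
            return False
        while i < len(tape) and tape[i] == '1' and count > 0:
            count -= 1                            # match each 1 against a counted 0
            i += 1
        return count == 0 and i == len(tape)      # all matched, nothing left over

    # accepts_0n1n("0011")   -> True
    # accepts_0n1n("0101")   -> False
    # accepts_0n1n("000111") -> True

Replace the counter by any fixed bound k, and the machine collapses back into a finite automaton -- one that misclassifies 0^n 1^n for some n > k.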
Let a *language* be a set of words over a given alphabet. In order of increasing power of the machines, we have the Chomsky hierarchy:

    Computing Model            Language Class Accepted
    ---------------            -----------------------
    Finite automata            Regular languages
    Pushdown automata          Context-free languages
    Linear bounded automata    Context-sensitive languages
    Turing machines            Recursively enumerable languages

*Physical computers*

Real-world digital computers, such as PCs or Crays, are often thought of as being able to compute all recursive functions. But of course that is not true. Every machine that has ever been built has only FINITE memory. That means that for any given piece of hardware, there are recursive functions f for which the shortest program to compute f is too large even to load into the hardware's memory. There are other recursive functions g whose program will load into memory, but which for some n will run out of memory in trying to compute g(n).

*"Potentially universal" physical computers*

Think of your desktop computer as consisting of a central processing unit (CPU) together with a finite amount of random access memory (RAM). Imagine that you live next to a RAM factory. When the machine runs out of memory in a particular computation, it signals "add on another memory cell" -- a register large enough to hold any symbol of the alphabet. Such an extensible machine is "potentially universal".

But suppose a particular computation requires a GOOGOLPLEX of memory cells to complete. This number of cells is VASTLY more than the number of atoms in the known universe, and clearly, as a matter of physics, our RAM factory ran "out of stock" long ago. NO PHYSICAL COMPUTER CAN COMPUTE ALL COMPUTABLE FUNCTIONS!

Thus an actual hardware machine, loaded with an appropriate program, is closer to realizing a finite automaton than a Turing machine. Indeed, the Medvedev result cited above shows that the hardware machine is equivalent to a Turing machine with some FINITE tape.

Many more recent textbooks, such as [4] and [5], cover most of the above ideas. See also the breezier [6, pg. 164], where Dewdney (who used to write the Computer Recreations column in Scientific American) states: "Moreover, as we shall show now, neural nets are no more powerful than finite automata, the humblest inhabitants of the Chomsky hierarchy."

*Conclusion*

Neural nets are equivalent in computational power to finite automata. You can do anything with neural nets that you can do with conventional von Neumann digital computers. But neural nets are NOT computationally universal.

I wish to thank Prof. Marvin Minsky for several very helpful comments on an earlier draft of this post.

Art Quaife
Trans Time, Inc.

*References*

[1] McCulloch, W.S., and Pitts, W. A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biophysics 5: 115-133 (1943).

[2] Kleene, S.C. Representation of Events in Nerve Nets and Finite Automata. In Automata Studies, C.E. Shannon and J. McCarthy, editors, Princeton University Press (1956).

[3] Medvedev, Y.T. On the Class of Events Representable in a Finite Automaton. Reprinted in Sequential Machines, E.F. Moore, editor, Addison-Wesley (1964).

[4] Davis, M.D., and Weyuker, E.J. Computability, Complexity, and Languages. Academic Press (1983).

[5] Lewis, H.R., and Papadimitriou, C.H. Elements of the Theory of Computation. Prentice-Hall (1981).

[6] Dewdney, A.K. The Turing Omnibus: 61 Excursions in Computer Science. Computer Science Press (1989).