Tag Archives: Marvin Minsky

An explanation of neural networks from the Massachusetts Institute of Technology (MIT)

I always enjoy the MIT ‘explainers’ and have been a little sad that I haven’t stumbled across one in a while. Until now, that is. Here’s an April 14, 2017 neural network ‘explainer’ (in its entirety) by Larry Hardesty (?),

In the past 10 years, the best-performing artificial-intelligence systems — such as the speech recognizers on smartphones or Google’s latest automatic translator — have resulted from a technique called “deep learning.”

Deep learning is in fact a new name for an approach to artificial intelligence called neural networks, which have been going in and out of fashion for more than 70 years. Neural networks were first proposed in 1944 by Warren McCulloch and Walter Pitts, two University of Chicago researchers who moved to MIT in 1952 as founding members of what’s sometimes called the first cognitive science department.

Neural nets were a major area of research in both neuroscience and computer science until 1969, when, according to computer science lore, they were killed off by the MIT mathematicians Marvin Minsky and Seymour Papert, who a year later would become co-directors of the new MIT Artificial Intelligence Laboratory.

The technique then enjoyed a resurgence in the 1980s, fell into eclipse again in the first decade of the new century, and has returned like gangbusters in the second, fueled largely by the increased processing power of graphics chips.

“There’s this idea that ideas in science are a bit like epidemics of viruses,” says Tomaso Poggio, the Eugene McDermott Professor of Brain and Cognitive Sciences at MIT, an investigator at MIT’s McGovern Institute for Brain Research, and director of MIT’s Center for Brains, Minds, and Machines. “There are apparently five or six basic strains of flu viruses, and apparently each one comes back with a period of around 25 years. People get infected, and they develop an immune response, and so they don’t get infected for the next 25 years. And then there is a new generation that is ready to be infected by the same strain of virus. In science, people fall in love with an idea, get excited about it, hammer it to death, and then get immunized — they get tired of it. So ideas should have the same kind of periodicity!”

Weighty matters

Neural nets are a means of doing machine learning, in which a computer learns to perform some task by analyzing training examples. Usually, the examples have been hand-labeled in advance. An object recognition system, for instance, might be fed thousands of labeled images of cars, houses, coffee cups, and so on, and it would find visual patterns in the images that consistently correlate with particular labels.

Modeled loosely on the human brain, a neural net consists of thousands or even millions of simple processing nodes that are densely interconnected. Most of today’s neural nets are organized into layers of nodes, and they’re “feed-forward,” meaning that data moves through them in only one direction. An individual node might be connected to several nodes in the layer beneath it, from which it receives data, and several nodes in the layer above it, to which it sends data.

To each of its incoming connections, a node will assign a number known as a “weight.” When the network is active, the node receives a different data item — a different number — over each of its connections and multiplies it by the associated weight. It then adds the resulting products together, yielding a single number. If that number is below a threshold value, the node passes no data to the next layer. If the number exceeds the threshold value, the node “fires,” which in today’s neural nets generally means sending the number — the sum of the weighted inputs — along all its outgoing connections.
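As an aside, here’s a quick Python sketch of my own showing the arithmetic a single node performs as described above; the inputs, weights, and threshold are made-up illustrative numbers, not anything from the MIT piece.

```python
def node_output(inputs, weights, threshold):
    """One node: multiply each incoming value by its weight, add them up,
    and 'fire' (pass the sum along) only if the total exceeds the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return total if total > threshold else 0.0

# Illustrative values only
inputs = [0.5, 0.9, 0.1]     # data arriving on three incoming connections
weights = [0.8, -0.2, 0.4]   # one weight per incoming connection
print(node_output(inputs, weights, threshold=0.2))  # 0.26 -> the node fires
```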

When a neural net is being trained, all of its weights and thresholds are initially set to random values. Training data is fed to the bottom layer — the input layer — and it passes through the succeeding layers, getting multiplied and added together in complex ways, until it finally arrives, radically transformed, at the output layer. During training, the weights and thresholds are continually adjusted until training data with the same labels consistently yield similar outputs.
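Again as an aside, here’s a minimal sketch of my own (with made-up layer sizes and a made-up training example) of how data would pass forward through such a layered network with randomly initialized weights and thresholds; the adjustment step itself isn’t specified in the passage, so only the forward pass is shown.

```python
import random

def layer_forward(values, layer_weights, layer_thresholds):
    """Compute the next layer: each node takes a weighted sum of all incoming
    values and passes it on only if it exceeds that node's threshold."""
    outputs = []
    for node_weights, threshold in zip(layer_weights, layer_thresholds):
        total = sum(v * w for v, w in zip(values, node_weights))
        outputs.append(total if total > threshold else 0.0)
    return outputs

random.seed(0)
layer_sizes = [3, 4, 2]  # input layer, one hidden layer, output layer (illustrative)

# Untrained network: weights and thresholds start out random, as described above
weights = [[[random.uniform(-1, 1) for _ in range(layer_sizes[i])]
            for _ in range(layer_sizes[i + 1])]
           for i in range(len(layer_sizes) - 1)]
thresholds = [[random.uniform(0, 1) for _ in range(layer_sizes[i + 1])]
              for i in range(len(layer_sizes) - 1)]

values = [0.2, 0.7, 0.5]  # one training example fed to the input layer
for w, t in zip(weights, thresholds):
    values = layer_forward(values, w, t)
print(values)  # the (radically transformed) output of the untrained network
```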

Minds and machines

The neural nets described by McCulloch and Pitts in 1944 had thresholds and weights, but they weren’t arranged into layers, and the researchers didn’t specify any training mechanism. What McCulloch and Pitts showed was that a neural net could, in principle, compute any function that a digital computer could. The result was more neuroscience than computer science: The point was to suggest that the human brain could be thought of as a computing device.

Neural nets continue to be a valuable tool for neuroscientific research. For instance, particular network layouts or rules for adjusting weights and thresholds have reproduced observed features of human neuroanatomy and cognition, an indication that they capture something about how the brain processes information.

The first trainable neural network, the Perceptron, was demonstrated by the Cornell University psychologist Frank Rosenblatt in 1957. The Perceptron’s design was much like that of the modern neural net, except that it had only one layer with adjustable weights and thresholds, sandwiched between input and output layers.
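For readers who want to see what a single layer of adjustable weights and thresholds looks like in practice, here’s a minimal sketch of the classic perceptron error-correction rule; the OR task, learning rate, and epoch count are my own illustrative choices, not details from the article.

```python
def train_perceptron(samples, n_inputs, epochs=20, lr=0.1):
    """Classic single-layer perceptron: one set of adjustable weights plus a bias
    (threshold), trained with Rosenblatt's error-correction rule."""
    w = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:  # target is 0 or 1
            prediction = 1 if sum(xi * wi for xi, wi in zip(x, w)) + bias > 0 else 0
            error = target - prediction
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            bias += lr * error
    return w, bias

# Illustrative, linearly separable task: logical OR
samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, bias = train_perceptron(samples, n_inputs=2)
print(w, bias)  # weights and bias that classify all four OR examples correctly
```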

Perceptrons were an active area of research in both psychology and the fledgling discipline of computer science until 1969, when Minsky and Papert published a book titled “Perceptrons,” which demonstrated that executing certain fairly common computations on Perceptrons would be impractically time consuming.

“Of course, all of these limitations kind of disappear if you take machinery that is a little more complicated — like, two layers,” Poggio says. But at the time, the book had a chilling effect on neural-net research.
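The article doesn’t name a specific computation, but the textbook example of something a single layer of adjustable weights cannot represent is the XOR function, which a second layer handles easily; here’s a hand-wired two-layer threshold network of my own (the weights are illustrative, not from the article) that computes it.

```python
def step(z):
    """Threshold unit: fires (1) only if its weighted input exceeds zero."""
    return 1 if z > 0 else 0

def xor_two_layer(a, b):
    """XOR needs a second layer: no single set of weights and a threshold can compute it."""
    h_or  = step(a + b - 0.5)        # hidden unit fires if a OR b
    h_and = step(a + b - 1.5)        # hidden unit fires only if a AND b
    return step(h_or - h_and - 0.5)  # output fires if OR but not AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_two_layer(a, b))  # 0, 1, 1, 0
```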

“You have to put these things in historical context,” Poggio says. “They were arguing for programming — for languages like Lisp. Not many years before, people were still using analog computers. It was not clear at all at the time that programming was the way to go. I think they went a little bit overboard, but as usual, it’s not black and white. If you think of this as this competition between analog computing and digital computing, they fought for what at the time was the right thing.”

Periodicity

By the 1980s, however, researchers had developed algorithms for modifying neural nets’ weights and thresholds that were efficient enough for networks with more than one layer, removing many of the limitations identified by Minsky and Papert. The field enjoyed a renaissance.
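The article doesn’t name the algorithm, but the multi-layer training methods of the 1980s are usually identified with backpropagation; here’s a minimal sketch of my own, with made-up layer sizes and learning rate, of a one-hidden-layer network trained by gradient descent with backpropagation on the XOR task mentioned above. Whether it converges depends on the random initialization, so treat it as an illustration rather than a recipe.

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(1)
n_hidden = 3  # illustrative size
# Each hidden unit: one weight per input plus a bias; the output unit likewise
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
w_out = [random.uniform(-1, 1) for _ in range(n_hidden + 1)]
lr = 0.5
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR again

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    y = sigmoid(sum(w_out[i] * h[i] for i in range(n_hidden)) + w_out[-1])
    return h, y

for _ in range(20000):
    for x, target in data:
        h, y = forward(x)
        # Backward pass: push the output error back through both layers
        d_y = (y - target) * y * (1 - y)
        d_h = [d_y * w_out[i] * h[i] * (1 - h[i]) for i in range(n_hidden)]
        # Gradient-descent weight updates
        for i in range(n_hidden):
            w_out[i] -= lr * d_y * h[i]
            w_hidden[i][0] -= lr * d_h[i] * x[0]
            w_hidden[i][1] -= lr * d_h[i] * x[1]
            w_hidden[i][2] -= lr * d_h[i]
        w_out[-1] -= lr * d_y

for x, target in data:
    print(x, round(forward(x)[1], 2), "target:", target)
```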

But intellectually, there’s something unsatisfying about neural nets. Enough training may revise a network’s settings to the point that it can usefully classify data, but what do those settings mean? What image features is an object recognizer looking at, and how does it piece them together into the distinctive visual signatures of cars, houses, and coffee cups? Looking at the weights of individual connections won’t answer that question.

In recent years, computer scientists have begun to come up with ingenious methods for deducing the analytic strategies adopted by neural nets. But in the 1980s, the networks’ strategies were indecipherable. So around the turn of the century, neural networks were supplanted by support vector machines, an alternative approach to machine learning that’s based on some very clean and elegant mathematics.

The recent resurgence in neural networks — the deep-learning revolution — comes courtesy of the computer-game industry. The complex imagery and rapid pace of today’s video games require hardware that can keep up, and the result has been the graphics processing unit (GPU), which packs thousands of relatively simple processing cores on a single chip. It didn’t take long for researchers to realize that the architecture of a GPU is remarkably like that of a neural net.

Modern GPUs enabled the one-layer networks of the 1960s and the two- to three-layer networks of the 1980s to blossom into the 10-, 15-, even 50-layer networks of today. That’s what the “deep” in “deep learning” refers to — the depth of the network’s layers. And currently, deep learning is responsible for the best-performing systems in almost every area of artificial-intelligence research.

Under the hood

The networks’ opacity is still unsettling to theorists, but there’s headway on that front, too. In addition to directing the Center for Brains, Minds, and Machines (CBMM), Poggio leads the center’s research program in Theoretical Frameworks for Intelligence. Recently, Poggio and his CBMM colleagues have released a three-part theoretical study of neural networks.

The first part, which was published last month in the International Journal of Automation and Computing, addresses the range of computations that deep-learning networks can execute and when deep networks offer advantages over shallower ones. Parts two and three, which have been released as CBMM technical reports, address the problems of global optimization, or guaranteeing that a network has found the settings that best accord with its training data, and overfitting, or cases in which the network becomes so attuned to the specifics of its training data that it fails to generalize to other instances of the same categories.

There are still plenty of theoretical questions to be answered, but CBMM researchers’ work could help ensure that neural networks finally break the generational cycle that has brought them in and out of favor for seven decades.

This image from MIT illustrates a ‘modern’ neural network,

Most applications of deep learning use “convolutional” neural networks, in which the nodes of each layer are clustered, the clusters overlap, and each cluster feeds data to multiple nodes (orange and green) of the next layer. Image: Jose-Luis Olivares/MIT
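As a last aside, the overlapping ‘clusters’ in that caption are easier to see in code; here’s a one-dimensional convolution sketch of my own, with made-up values and shared weights (the kernel).

```python
def conv1d(values, kernel):
    """Each output node looks at an overlapping cluster of neighbouring inputs
    and applies the same shared weights (the kernel) to every cluster."""
    k = len(kernel)
    return [sum(values[i + j] * kernel[j] for j in range(k))
            for i in range(len(values) - k + 1)]

signal = [0, 1, 3, 2, 0, 1]    # illustrative input layer
kernel = [0.5, 1.0, 0.5]       # shared weights for every cluster of three nodes
print(conv1d(signal, kernel))  # [2.5, 4.5, 3.5, 1.5] -- one output per overlapping cluster
```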

h/t phys.org April 17, 2017

One final note: I wish the folks at MIT had an ‘explainer’ archive. I’m not sure how to find any more ‘explainers’ on MIT’s website.

Events announced for 2010 World Science Festival

The program for the 2010 World Science Festival in New York City, which runs June 2 – 6, 2010, is available here. Do check regularly as it is being added to and changed. Here’s a sampling of what’s available. For the astronomy buff,

The James Webb Space Telescope

FREE

Tuesday, June 1, 2010, 9:00 AM – Sunday, June 6, 2010, 9:00 PM
Battery Park

The world’s most powerful future space telescope is coming to New York City as part of the World Science Festival. NASA’s James Webb Space Telescope will allow us to unveil the very first galaxies formed in the Universe and discover hidden worlds around distant stars when the mission launches in 2014. For six days in June, a full-scale model of this successor to the famed Hubble Space Telescope will be on public view in Battery Park.

This detailed scale model, at 80 feet long, 37 feet wide and nearly 40 feet high, is as big as a tennis court. It’s as close to a first-hand look at the telescope as most people will ever get.

There’s more to do than just marvel. Once you’ve taken in the awe-inspiring sight, play with interactive exhibits, watch videos showing what we will learn from the Webb, and ask the scientists on hand about how the telescope works.

And don’t miss our Friday June 4th party, “From the City to the Stars,” at the base of a spectacularly lit telescope, where leading scientists will join us to talk about the anticipated discoveries. Bring your telescope if you have one, or just yourself, and come congregate with amateur astronomers and novices alike for a festive evening of marveling at the wonders of the cosmos.

This program is made possible with the support of Northrop Grumman, and presented in collaboration with The Battery Conservancy.

If music and artificial intelligence interest you,

Machover and Minsky: Making Music in the Dome

Tickets must be purchased.

Thursday, June 3, 2010, 6:00 PM – 7:30 PM

Hayden Planetarium Space Theater

How does music help order emerge from the mind’s chaos? How does it create and conjure thoughts, emotions and memories? Legendary composer and inventor Tod Machover will explore these mysteries with Artificial Intelligence visionary Marvin Minsky. The two iconoclasts will revisit their landmark musical experiment, the Brain Opera, and offer an exclusive sneak peek at Machover’s upcoming opera, Death and the Powers, a groundbreaking MIT Media Lab production that explores what we leave behind for the world and our loved ones, using specially designed technology, including a chorus of robots.

This program is presented in collaboration with the American Museum of Natural History.

Participants:

Tod Machover

Tod Machover, called “America’s Most Wired Composer” by the Los Angeles Times, is celebrated for creating music that breaks traditional artistic and cultural boundaries. He is acclaimed for inventing new technologies for music, such as his Hyperinstruments, which augment musical expression for everyone, from virtuosi like Yo-Yo Ma and Prince to players of Guitar Hero, which grew out of his lab.

Marvin Minsky

Minsky is one of the pioneers of artificial intelligence and has made numerous contributions to the fields of AI, cognitive science, mathematics and robotics. His current work focuses on trying to imbue machines with a capacity for common sense. Minsky is a professor at MIT, where he co-founded the artificial intelligence lab.

This one seems pretty self-explanatory,

Eye Candy: Science, Sight, Art

Tickets must be purchased.

Thursday, June 3, 2010, 7:00 PM – 8:30 PM

Are you drawn to Impressionism? Or more toward 3D computer art? Beauty is in the eye of the beholder. Or is it? Contrary to the old adage, there may be universal biological principles that drive art’s appeal, and its capacity to engage our brains and our interest. Through artworks ranging from post-modernism to political caricature to 3D film, we’ll examine newly understood principles of visual perception.

Participants:

Patrick Cavanagh

Cavanagh helped change vision research by creating the Vision Sciences Lab at Harvard and the Centre of Attention & Vision in Paris. He is currently researching the problems of attention as a frequent component of mental illnesses, learning difficulties at school, and workplace accidents.

Ken Nakayama

No details.

Jules Feiffer

Cartoonist, playwright, screenwriter and children’s book author & illustrator Jules Feiffer has had a remarkable creative career turning contemporary urban anxiety into witty and revealing commentary for over fifty years. From his Village Voice editorial cartoons to his plays and screenplays, including Little Murders and Carnal Knowledge, Feiffer’s satirical outlook has helped define us politically, sexually and socially.

Buzz Hays

No details.

Margaret S. Livingstone

Livingstone is best known for her work on visual processing, which has led to a deeper understanding of how we see color, motion, and depth, and how these processes are involved in generating percepts of objects as distinct from their background.

Christopher W. Tyler

Tyler has spent his research career exploring how the eyes and brain work together to produce meaningful vision. Dr. Tyler, director of The Smith-Kettlewell Brain Imaging Center, has developed rapid tests for the diagnosis of visual processing disorders in infants and of retinal and optic nerve diseases in adults. He has also studied visual processing and photoreceptor dynamics in other species such as monkeys, butterflies and fish.

Finally, there’s the importance of sound,

Good Vibrations
The Science of Sound

Tickets must be purchased.

Thursday, June 3, 2010, 8:00 PM – 9:30 PM

The Kaye Playhouse at Hunter College

We look around us – constantly. But how often do we listen around us? Sound is critically important to our bodies and brains, and to the wider natural world. In the womb, we hear before we see. Join neuroscientists, biophysicists, astrophysicists, composers and musicians for a fascinating journey through the nature of sound—how we perceive it, how it acts upon us and how it profoundly affects our well-being—including a demonstration of sounds produced by sources as varied as the human inner ear and gargantuan black holes in space.

Moderator: John Schaefer

Participants:

Jamshed Bharucha

Bharucha conducts research in cognitive psychology and neuroscience, focusing on the cognitive and neural basis of the perception of music. He is a past editor of the interdisciplinary journal Music Perception.

Jacob Kirkegaard

Danish sound artist Jacob Kirkegaard explores sound in art with a scientific approach. He focuses on the scientific and aesthetic aspects of resonance, time, sound and hearing. His installations, compositions and performances deal with acoustic spaces and phenomena that usually remain imperceptible.

John Schaefer

John Schaefer is the host of WNYC’s innovative music/talk show Soundcheck, which features live performances and interviews with a variety of guests. Schaefer, Executive Producer, Music Programming, WNYC Radio, has also hosted and produced WNYC’s radio series New Sounds since 1982 (which Billboard called “The #1 radio show for the Global Village”) and the New Sounds Live concert series since 1986.

Christopher Shera

Shera has done extensive research in solving fundamental problems in the mechanics and physiology of the peripheral auditory system. His work focuses on how the ear amplifies, analyzes, and emits sound, and his research combines physiological measurements with theoretical modeling of the peripheral auditory system.

Michael Turner

Turner is the Bruce V. and Diana M. Rauner Distinguished Service Professor at the University of Chicago. He is a theoretical cosmologist who coined the term “dark energy.” He has made seminal contributions to the understanding of inflationary cosmology, particle dark matter, and the theory of the Big Bang.

Mark Whittle

Whittle uses large optical and radio telescopes, including the Hubble Space Telescope, to study processes occurring within 1,000 light years of the central supermassive black hole in Active Galaxies. His most recent interests focus on the way in which fast moving jets of gas, which are driven out of the active nucleus, subsequently crash into, accelerate, and generally “damage” the surrounding galactic material.

If you can’t make it to the festival in June, there are always the videos.