Tag Archives: Garrett Kenyon

Trying to understand how neural networks work

It seems no one (not even the experts) really understands how ‘artificial intelligence that learns’ actually works. This is the second time I’ve stumbled onto a similar statement made by experts. Last time (see my July 28, 2022 posting [part 1 of 2] and scroll down to the “A deep dive into AI?” subhead for a quote from a recent Council of Canadian Academies report on AI for Science and Engineering) they were unable to gave a stable definition for artificial intelligence.

Researchers at Los Alamos are looking at new ways to compare neural networks. This image was created with an artificial intelligence software called Stable Diffusion, using the prompt “Peeking into the black box of neural networks.” Courtesy: Los Alamos National Laboratorory

A September 13, 2022 Los Alamos National Laboratory (LANL) news release (also on EurekAlert) provides detail about how researchers are addressing the problem,

A team at Los Alamos National Laboratory has developed a novel approach for comparing neural networks that looks within the “black box” of artificial intelligence to help researchers understand neural network behavior. Neural networks recognize patterns in datasets; they are used everywhere in society, in applications such as virtual assistants, facial recognition systems and self-driving cars.

“The artificial intelligence research community doesn’t necessarily have a complete understanding of what neural networks are doing; they give us good results, but we don’t know how or why,” said Haydn Jones, a researcher in the Advanced Research in Cyber Systems group at Los Alamos. “Our new method does a better job of comparing neural networks, which is a crucial step toward better understanding the mathematics behind AI.”

Jones is the lead author of the paper “If You’ve Trained One You’ve Trained Them All: Inter-Architecture Similarity Increases With Robustness,” which was presented recently at the Conference on Uncertainty in Artificial Intelligence. In addition to studying network similarity, the paper is a crucial step toward characterizing the behavior of robust neural networks.

Neural networks are high performance, but fragile. For example, self-driving cars use neural networks to detect signs. When conditions are ideal, they do this quite well. However, the smallest aberration — such as a sticker on a stop sign — can cause the neural network to misidentify the sign and never stop.

To improve neural networks, researchers are looking at ways to improve network robustness. One state-of-the-art approach involves “attacking” networks during their training process. Researchers intentionally introduce aberrations and train the AI to ignore them. This process is called adversarial training and essentially makes it harder to fool the networks.

Jones, Los Alamos collaborators Jacob Springer and Garrett Kenyon, and Jones’ mentor Juston Moore, applied their new metric of network similarity to adversarially trained neural networks, and found, surprisingly, that adversarial training causes neural networks in the computer vision domain to converge to very similar data representations, regardless of network architecture, as the magnitude of the attack increases.

“We found that when we train neural networks to be robust against adversarial attacks, they begin to do the same things,” Jones said.

There has been extensive effort in industry and in the academic community searching for the “right architecture” for neural networks, but the Los Alamos team’s findings indicate that the introduction of adversarial training narrows this search space substantially. As a result, the AI research community may not need to spend as much time exploring new architectures, knowing that adversarial training causes diverse architectures to converge to similar solutions.

“By finding that robust neural networks are similar to each other, we’re making it easier to understand how robust AI might really work. We might even be uncovering hints as to how perception occurs in humans and other animals,” Jones said.

Should you be curious about future events, the Association for Uncertainty in Artificial Intelligence (AUAI), a non-profit, organizes an annual conference.

Dr. Wei Lu and bio-inspired ‘memristor’ chips

It’s been a while since I’ve featured Dr. Wei Lu’s work here. This April  15, 2010 posting features Lu’s most relevant previous work.) Here’s his latest ‘memristor’ work , from a May 22, 2017 news item on Nanowerk (Note: A link has been removed),

Inspired by how mammals see, a new “memristor” computer circuit prototype at the University of Michigan has the potential to process complex data, such as images and video orders of magnitude, faster and with much less power than today’s most advanced systems.

Faster image processing could have big implications for autonomous systems such as self-driving cars, says Wei Lu, U-M professor of electrical engineering and computer science. Lu is lead author of a paper on the work published in the current issue of Nature Nanotechnology (“Sparse coding with memristor networks”).

Lu’s next-generation computer components use pattern recognition to shortcut the energy-intensive process conventional systems use to dissect images. In this new work, he and his colleagues demonstrate an algorithm that relies on a technique called “sparse coding” to coax their 32-by-32 array of memristors to efficiently analyze and recreate several photos.

A May 22, 2017 University of Michigan news release (also on EurekAlert), which originated the news item, provides more information about memristors and about the research,

Memristors are electrical resistors with memory—advanced electronic devices that regulate current based on the history of the voltages applied to them. They can store and process data simultaneously, which makes them a lot more efficient than traditional systems. In a conventional computer, logic and memory functions are located at different parts of the circuit.

“The tasks we ask of today’s computers have grown in complexity,” Lu said. “In this ‘big data’ era, computers require costly, constant and slow communications between their processor and memory to retrieve large amounts data. This makes them large, expensive and power-hungry.”

But like neural networks in a biological brain, networks of memristors can perform many operations at the same time, without having to move data around. As a result, they could enable new platforms that process a vast number of signals in parallel and are capable of advanced machine learning. Memristors are good candidates for deep neural networks, a branch of machine learning, which trains computers to execute processes without being explicitly programmed to do so.

“We need our next-generation electronics to be able to quickly process complex data in a dynamic environment. You can’t just write a program to do that. Sometimes you don’t even have a pre-defined task,” Lu said. “To make our systems smarter, we need to find ways for them to process a lot of data more efficiently. Our approach to accomplish that is inspired by neuroscience.”

A mammal’s brain is able to generate sweeping, split-second impressions of what the eyes take in. One reason is because they can quickly recognize different arrangements of shapes. Humans do this using only a limited number of neurons that become active, Lu says. Both neuroscientists and computer scientists call the process “sparse coding.”

“When we take a look at a chair we will recognize it because its characteristics correspond to our stored mental picture of a chair,” Lu said. “Although not all chairs are the same and some may differ from a mental prototype that serves as a standard, each chair retains some of the key characteristics necessary for easy recognition. Basically, the object is correctly recognized the moment it is properly classified—when ‘stored’ in the appropriate category in our heads.”

Image of a memristor chip Image of a memristor chip Similarly, Lu’s electronic system is designed to detect the patterns very efficiently—and to use as few features as possible to describe the original input.

In our brains, different neurons recognize different patterns, Lu says.

“When we see an image, the neurons that recognize it will become more active,” he said. “The neurons will also compete with each other to naturally create an efficient representation. We’re implementing this approach in our electronic system.”

The researchers trained their system to learn a “dictionary” of images. Trained on a set of grayscale image patterns, their memristor network was able to reconstruct images of famous paintings and photos and other test patterns.

If their system can be scaled up, they expect to be able to process and analyze video in real time in a compact system that can be directly integrated with sensors or cameras.

The project is titled “Sparse Adaptive Local Learning for Sensing and Analytics.” Other collaborators are Zhengya Zhang and Michael Flynn of the U-M Department of Electrical Engineering and Computer Science, Garrett Kenyon of the Los Alamos National Lab and Christof Teuscher of Portland State University.

The work is part of a $6.9 million Unconventional Processing of Signals for Intelligent Data Exploitation project that aims to build a computer chip based on self-organizing, adaptive neural networks. It is funded by the [US] Defense Advanced Research Projects Agency [DARPA].

Here’s a link to and a citation for the paper,

Sparse coding with memristor networks by Patrick M. Sheridan, Fuxi Cai, Chao Du, Wen Ma, Zhengya Zhang, & Wei D. Lu. Nature Nanotechnology (2017) doi:10.1038/nnano.2017.83 Published online 22 May 2017

This paper is behind a paywall.

For the interested, there are a number of postings featuring memristors here (just use ‘memristor’ as your search term in the blog search engine). You might also want to check out ‘neuromorphic engineeering’ and ‘neuromorphic computing’ and ‘artificial brain’.