
New chip for neuromorphic computing runs at a fraction of the energy of today’s systems

An August 17, 2022 news item on Nanowerk announces big (so to speak) claims from a team researching neuromorphic (brainlike) computer chips,

An international team of researchers has designed and built a chip that runs computations directly in memory and can run a wide variety of artificial intelligence (AI) applications–all at a fraction of the energy consumed by computing platforms for general-purpose AI computing.

The NeuRRAM neuromorphic chip brings AI a step closer to running on a broad range of edge devices, disconnected from the cloud, where they can perform sophisticated cognitive tasks anywhere and anytime without relying on a network connection to a centralized server. Applications abound in every corner of the world and every facet of our lives, and range from smart watches, to VR headsets, smart earbuds, smart sensors in factories and rovers for space exploration.

The NeuRRAM chip is not only twice as energy efficient as the state-of-the-art “compute-in-memory” chips, an innovative class of hybrid chips that runs computations in memory, it also delivers results that are just as accurate as conventional digital chips. Conventional AI platforms are a lot bulkier and typically are constrained to using large data servers operating in the cloud.

In addition, the NeuRRAM chip is highly versatile and supports many different neural network models and architectures. As a result, the chip can be used for many different applications, including image recognition and reconstruction as well as voice recognition.

..

An August 17, 2022 University of California at San Diego (UCSD) news release (also on EurekAlert), which originated the news item, provides more detail than usually found in a news release,

“The conventional wisdom is that the higher efficiency of compute-in-memory is at the cost of versatility, but our NeuRRAM chip obtains efficiency while not sacrificing versatility,” said Weier Wan, the paper’s first corresponding author and a recent Ph.D. graduate of Stanford University who worked on the chip while at UC San Diego, where he was co-advised by Gert Cauwenberghs in the Department of Bioengineering. 

The research team, co-led by bioengineers at the University of California San Diego, presents their results in the Aug. 17 [2022] issue of Nature.

Currently, AI computing is both power hungry and computationally expensive. Most AI applications on edge devices involve moving data from the devices to the cloud, where the AI processes and analyzes it. Then the results are moved back to the device. That’s because most edge devices are battery-powered and as a result only have a limited amount of power that can be dedicated to computing. 

By reducing power consumption needed for AI inference at the edge, this NeuRRAM chip could lead to more robust, smarter and accessible edge devices and smarter manufacturing. It could also lead to better data privacy as the transfer of data from devices to the cloud comes with increased security risks. 

On AI chips, moving data from memory to computing units is one major bottleneck. 

“It’s the equivalent of doing an eight-hour commute for a two-hour work day,” Wan said. 

To solve this data transfer issue, researchers used what is known as resistive random-access memory, a type of non-volatile memory that allows for computation directly within memory rather than in separate computing units. RRAM and other emerging memory technologies used as synapse arrays for neuromorphic computing were pioneered in the lab of Philip Wong, Wan’s advisor at Stanford and a main contributor to this work. Computation with RRAM chips is not necessarily new, but generally it leads to a decrease in the accuracy of the computations performed on the chip and a lack of flexibility in the chip’s architecture. 
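As a rough illustration of what "computation directly within memory" can mean (my own sketch, not taken from the news release or the paper): in an idealized resistive crossbar, each cell's conductance stores a weight, voltages applied to the rows act as inputs, and the current collected on each column is the sum of conductance times voltage (Ohm's law plus Kirchhoff's current law). That sum is a multiply-accumulate, so the array performs a matrix-vector product where the weights are stored. The function name and all numeric values below are hypothetical.

```python
def crossbar_column_currents(conductances, row_voltages):
    """Idealized crossbar read (illustrative assumption, not the NeuRRAM circuit).

    conductances[i][j] is the conductance in siemens of the cell at row i, column j.
    row_voltages[i] is the voltage in volts applied to row i.
    Returns the current in amperes collected on each column.
    """
    n_rows = len(row_voltages)
    n_cols = len(conductances[0])
    return [
        sum(conductances[i][j] * row_voltages[i] for i in range(n_rows))
        for j in range(n_cols)
    ]


# Hypothetical 2x2 array and inputs, chosen only for illustration.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
V = [0.1, 0.2]
I = crossbar_column_currents(G, V)
print(I)
```

Each column current here is one dot product between the input vector and a column of stored weights, which is why no separate trip from memory to a computing unit is needed in this picture.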

“Compute-in-memory has been common practice in neuromorphic engineering since it was introduced more than 30 years ago,” Cauwenberghs said.  “What is new with NeuRRAM is that the extreme efficiency now goes together with great flexibility for diverse AI applications with almost no loss in accuracy over standard digital general-purpose compute platforms.”

A carefully crafted methodology was key to the work with multiple levels of “co-optimization” across the abstraction layers of hardware and software, from the design of the chip to its configuration to run various AI tasks. In addition, the team made sure to account for various constraints that span from memory device physics to circuits and network architecture. 

“This chip now provides us with a platform to address these problems across the stack from devices and circuits to algorithms,” said Siddharth Joshi, an assistant professor of computer science and engineering at the University of Notre Dame, who started working on the project as a Ph.D. student and postdoctoral researcher in Cauwenberghs’ lab at UC San Diego.

Chip performance

Researchers measured the chip’s energy efficiency by a measure known as energy-delay product, or EDP. EDP combines both the amount of energy consumed for every operation and the amount of time it takes to complete the operation. By this measure, the NeuRRAM chip achieves 1.6 to 2.3 times lower EDP (lower is better) and 7 to 13 times higher computational density than state-of-the-art chips.
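For readers who want the arithmetic spelled out, here is a minimal sketch of how an energy-delay product comparison works: multiply the energy per operation by the time per operation, then take the ratio between two chips. The function name and the numbers are hypothetical, picked only to make the ratio easy to follow; they are not measurements from the paper.

```python
def energy_delay_product(energy_joules, delay_seconds):
    """EDP = energy per operation x time per operation (lower is better)."""
    return energy_joules * delay_seconds


# Hypothetical values for illustration only, not data from the NeuRRAM paper.
baseline_edp = energy_delay_product(energy_joules=2.0e-12, delay_seconds=10.0e-9)
candidate_edp = energy_delay_product(energy_joules=1.0e-12, delay_seconds=10.0e-9)

# How many times lower the candidate's EDP is than the baseline's.
improvement = baseline_edp / candidate_edp
print(f"EDP is {improvement:.1f} times lower")
```

Because EDP multiplies the two quantities, a chip can improve it by saving energy, by finishing sooner, or both.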

Researchers ran various AI tasks on the chip. It achieved 99% accuracy on a handwritten digit recognition task; 85.7% on an image classification task; and 84.7% on a Google speech command recognition task. In addition, the chip also achieved a 70% reduction in image-reconstruction error on an image-recovery task. These results are comparable to existing digital chips that perform computation under the same bit-precision, but with drastic savings in energy. 

Researchers point out that one key contribution of the paper is that all the results featured are obtained directly on the hardware. In many previous works of compute-in-memory chips, AI benchmark results were often obtained partially by software simulation. 

Next steps include improving architectures and circuits and scaling the design to more advanced technology nodes. Researchers also plan to tackle other applications, such as spiking neural networks.

“We can do better at the device level, improve circuit design to implement additional features and address diverse applications with our dynamic NeuRRAM platform,” said Rajkumar Kubendran, an assistant professor for the University of Pittsburgh, who started work on the project while a Ph.D. student in Cauwenberghs’ research group at UC San Diego.

In addition, Wan is a founding member of a startup that works on productizing the compute-in-memory technology. “As a researcher and an engineer, my ambition is to bring research innovations from labs into practical use,” Wan said.

New architecture 

The key to NeuRRAM’s energy efficiency is an innovative method to sense output in memory. Conventional approaches use voltage as input and measure current as the result. But this leads to the need for more complex and more power hungry circuits. In NeuRRAM, the team engineered a neuron circuit that senses voltage and performs analog-to-digital conversion in an energy efficient manner. This voltage-mode sensing can activate all the rows and all the columns of an RRAM array in a single computing cycle, allowing higher parallelism. 
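The release does not give circuit equations, so the following is only an idealized picture of voltage-mode sensing, under my own assumptions: if an output wire is left floating and is connected to each driven input line through one RRAM conductance, Kirchhoff's current law makes the wire settle at the conductance-weighted average of the input voltages. The function name and values are hypothetical, and the real neuron circuit is certainly more involved.

```python
def settled_output_voltage(conductances, input_voltages):
    """Voltage a floating wire settles to (idealized assumption).

    Kirchhoff's current law: sum(g_i * (v_i - v_out)) = 0,
    so v_out = sum(g_i * v_i) / sum(g_i).
    """
    total_conductance = sum(conductances)
    if total_conductance <= 0:
        raise ValueError("at least one conductance must be positive")
    weighted_sum = sum(g * v for g, v in zip(conductances, input_voltages))
    return weighted_sum / total_conductance


# Hypothetical conductances (siemens) and input voltages (volts).
g = [1e-6, 3e-6]
v = [0.0, 0.4]
v_out = settled_output_voltage(g, v)
print(v_out)
```

In this picture the result is read as a voltage rather than as a summed current, so every input line can be driven at once without the output growing with the number of active lines.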

In the NeuRRAM architecture, CMOS neuron circuits are physically interleaved with RRAM weights. It differs from conventional designs where CMOS circuits are typically on the periphery of RRAM weights. The neuron’s connections with the RRAM array can be configured to serve as either input or output of the neuron. This allows neural network inference in various data flow directions without incurring overheads in area or power consumption. This in turn makes the architecture easier to reconfigure.

To make sure that accuracy of the AI computations can be preserved across various neural network architectures, researchers developed a set of hardware algorithm co-optimization techniques. The techniques were verified on various neural networks including convolutional neural networks, long short-term memory, and restricted Boltzmann machines. 

As a neuromorphic AI chip, NeuRRAM performs parallel distributed processing across 48 neurosynaptic cores. To simultaneously achieve high versatility and high efficiency, NeuRRAM supports data-parallelism by mapping a layer in the neural network model onto multiple cores for parallel inference on multiple data. Also, NeuRRAM offers model-parallelism by mapping different layers of a model onto different cores and performing inference in a pipelined fashion.
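To make the two mapping strategies concrete, here is a toy sketch of the bookkeeping involved. Only the core count of 48 comes from the release; the function names, the layer names and the dictionary representation are my own hypothetical choices and say nothing about how the team's actual software toolchain works.

```python
NUM_CORES = 48  # core count stated in the news release


def map_data_parallel(layer, copies):
    """Replicate one layer onto several cores so each copy handles different inputs."""
    if not 1 <= copies <= NUM_CORES:
        raise ValueError("copies must be between 1 and NUM_CORES")
    return {core: layer for core in range(copies)}


def map_model_parallel(layers):
    """Place successive layers on successive cores to form a pipeline."""
    if len(layers) > NUM_CORES:
        raise ValueError("more layers than cores")
    return {core: layer for core, layer in enumerate(layers)}


# Hypothetical layer names, for illustration only.
data_mapping = map_data_parallel("conv1", copies=4)
model_mapping = map_model_parallel(["conv1", "conv2", "fc"])
print(data_mapping)
print(model_mapping)
```

In the first mapping, four cores hold the same weights and work on four different inputs at once; in the second, each core holds a different layer and passes its output to the next core.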

An international research team

The work is the result of an international team of researchers. 

The UC San Diego team designed the CMOS circuits that implement the neural functions interfacing with the RRAM arrays to support the synaptic functions in the chip’s architecture, for high efficiency and versatility. Wan, working closely with the entire team, implemented the design; characterized the chip; trained the AI models; and executed the experiments. Wan also developed a software toolchain that maps AI applications onto the chip. 

The RRAM synapse array and its operating conditions were extensively characterized and optimized at Stanford University. 

The RRAM array was fabricated and integrated onto CMOS at Tsinghua University. 

The team at Notre Dame contributed to both the design and architecture of the chip and the subsequent machine learning model design and training.

The research started as part of the National Science Foundation funded Expeditions in Computing project on Visual Cortex on Silicon at Penn State University, with continued funding support from the Office of Naval Research Science of AI program, the Semiconductor Research Corporation and DARPA [US Defense Advanced Research Projects Agency] JUMP program, and Western Digital Corporation.

Here’s a link to and a citation for the paper,

A compute-in-memory chip based on resistive random-access memory by Weier Wan, Rajkumar Kubendran, Clemens Schaefer, Sukru Burc Eryilmaz, Wenqiang Zhang, Dabin Wu, Stephen Deiss, Priyanka Raina, He Qian, Bin Gao, Siddharth Joshi, Huaqiang Wu, H.-S. Philip Wong & Gert Cauwenberghs. Nature volume 608, pages 504–512 (2022) DOI: https://doi.org/10.1038/s41586-022-04992-8 Published: 17 August 2022 Issue Date: 18 August 2022

This paper is open access.

Of sleep, electric sheep, and thousands of artificial synapses on a chip

A close-up view of a new neuromorphic “brain-on-a-chip” that includes tens of thousands of memristors, or memory transistors. Credit: Peng Lin Courtesy: MIT

It’s hard to believe that a brain-on-a-chip might need sleep but that seems to be the case as far as the US Dept. of Energy’s Los Alamos National Laboratory is concerned. Before pursuing that line of thought, here’s some work from the Massachusetts Institute of Technology (MIT) involving memristors and a brain-on-a-chip. From a June 8, 2020 news item on ScienceDaily,

MIT engineers have designed a “brain-on-a-chip,” smaller than a piece of confetti, that is made from tens of thousands of artificial brain synapses known as memristors — silicon-based components that mimic the information-transmitting synapses in the human brain.

The researchers borrowed from principles of metallurgy to fabricate each memristor from alloys of silver and copper, along with silicon. When they ran the chip through several visual tasks, the chip was able to “remember” stored images and reproduce them many times over, in versions that were crisper and cleaner compared with existing memristor designs made with unalloyed elements.

Their results, published today in the journal Nature Nanotechnology, demonstrate a promising new memristor design for neuromorphic devices — electronics that are based on a new type of circuit that processes information in a way that mimics the brain’s neural architecture. Such brain-inspired circuits could be built into small, portable devices, and would carry out complex computational tasks that only today’s supercomputers can handle.

This ‘metallurgical’ approach differs somewhat from the protein nanowire approach used by the University of Massachusetts at Amherst team mentioned in my June 15, 2020 posting. Scientists are pursuing multiple pathways and we may find that we arrive not with a single artificial brain but with many types of artificial brains.

A June 8, 2020 MIT news release (also on EurekAlert) provides more detail about this brain-on-a-chip,

“So far, artificial synapse networks exist as software. We’re trying to build real neural network hardware for portable artificial intelligence systems,” says Jeehwan Kim, associate professor of mechanical engineering at MIT. “Imagine connecting a neuromorphic device to a camera on your car, and having it recognize lights and objects and make a decision immediately, without having to connect to the internet. We hope to use energy-efficient memristors to do those tasks on-site, in real-time.”

Wandering ions

Memristors, or memory transistors [Note: Memristors are usually described as memory resistors; this is the first time I’ve seen ‘memory transistor’], are an essential element in neuromorphic computing. In a neuromorphic device, a memristor would serve as the transistor in a circuit, though its workings would more closely resemble a brain synapse — the junction between two neurons. The synapse receives signals from one neuron, in the form of ions, and sends a corresponding signal to the next neuron.

A transistor in a conventional circuit transmits information by switching between one of only two values, 0 and 1, and doing so only when the signal it receives, in the form of an electric current, is of a particular strength. In contrast, a memristor would work along a gradient, much like a synapse in the brain. The signal it produces would vary depending on the strength of the signal that it receives. This would enable a single memristor to have many values, and therefore carry out a far wider range of operations than binary transistors.

Like a brain synapse, a memristor would also be able to “remember” the value associated with a given current strength, and produce the exact same signal the next time it receives a similar current. This could ensure that the answer to a complex equation, or the visual classification of an object, is reliable — a feat that normally involves multiple transistors and capacitors.

Ultimately, scientists envision that memristors would require far less chip real estate than conventional transistors, enabling powerful, portable computing devices that do not rely on supercomputers, or even connections to the Internet.

Existing memristor designs, however, are limited in their performance. A single memristor is made of a positive and negative electrode, separated by a “switching medium,” or space between the electrodes. When a voltage is applied to one electrode, ions from that electrode flow through the medium, forming a “conduction channel” to the other electrode. The received ions make up the electrical signal that the memristor transmits through the circuit. The size of the ion channel (and the signal that the memristor ultimately produces) should be proportional to the strength of the stimulating voltage.

Kim says that existing memristor designs work pretty well in cases where voltage stimulates a large conduction channel, or a heavy flow of ions from one electrode to the other. But these designs are less reliable when memristors need to generate subtler signals, via thinner conduction channels.

The thinner a conduction channel, and the lighter the flow of ions from one electrode to the other, the harder it is for individual ions to stay together. Instead, they tend to wander from the group, disbanding within the medium. As a result, it’s difficult for the receiving electrode to reliably capture the same number of ions, and therefore transmit the same signal, when stimulated with a certain low range of current.

Borrowing from metallurgy

Kim and his colleagues found a way around this limitation by borrowing a technique from metallurgy, the science of melding metals into alloys and studying their combined properties.

“Traditionally, metallurgists try to add different atoms into a bulk matrix to strengthen materials, and we thought, why not tweak the atomic interactions in our memristor, and add some alloying element to control the movement of ions in our medium,” Kim says.

Engineers typically use silver as the material for a memristor’s positive electrode. Kim’s team looked through the literature to find an element that they could combine with silver to effectively hold silver ions together, while allowing them to flow quickly through to the other electrode.

The team landed on copper as the ideal alloying element, as it is able to bind both with silver, and with silicon.

“It acts as a sort of bridge, and stabilizes the silver-silicon interface,” Kim says.

To make memristors using their new alloy, the group first fabricated a negative electrode out of silicon, then made a positive electrode by depositing a slight amount of copper, followed by a layer of silver. They sandwiched the two electrodes around an amorphous silicon medium. In this way, they patterned a millimeter-square silicon chip with tens of thousands of memristors.

As a first test of the chip, they recreated a gray-scale image of the Captain America shield. They equated each pixel in the image to a corresponding memristor in the chip. They then modulated the conductance of each memristor in proportion to the strength of the color in the corresponding pixel.
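A simple way to picture that pixel-to-memristor mapping is a linear scale from gray level to target conductance. The sketch below is my own illustration under that assumption; the function name, the conductance range and the tiny image are hypothetical, and the release does not say what mapping the MIT team actually used.

```python
def pixel_to_conductance(pixel, g_min=1e-6, g_max=1e-4):
    """Linearly map an 8-bit gray level (0 to 255) to a target conductance in siemens.

    g_min and g_max are hypothetical device limits chosen for illustration.
    """
    if not 0 <= pixel <= 255:
        raise ValueError("pixel must be in the range 0..255")
    return g_min + (g_max - g_min) * pixel / 255


# A tiny hypothetical 2x2 gray-scale image; one memristor per pixel.
image = [[0, 128],
         [255, 64]]
targets = [[pixel_to_conductance(p) for p in row] for row in image]
print(targets)
```

Reading the conductances back and inverting the same scale would recover the image, which is why drift or scatter in the stored conductance shows up directly as a blurrier picture.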

The chip produced the same crisp image of the shield, and was able to “remember” the image and reproduce it many times, compared with chips made of other materials.

The team also ran the chip through an image processing task, programming the memristors to alter an image, in this case of MIT’s Killian Court, in several specific ways, including sharpening and blurring the original image. Again, their design produced the reprogrammed images more reliably than existing memristor designs.

“We’re using artificial synapses to do real inference tests,” Kim says. “We would like to develop this technology further to have larger-scale arrays to do image recognition tasks. And some day, you might be able to carry around artificial brains to do these kinds of tasks, without connecting to supercomputers, the internet, or the cloud.”

Here’s a link to and a citation for the paper,

Alloying conducting channels for reliable neuromorphic computing by Hanwool Yeon, Peng Lin, Chanyeol Choi, Scott H. Tan, Yongmo Park, Doyoon Lee, Jaeyong Lee, Feng Xu, Bin Gao, Huaqiang Wu, He Qian, Yifan Nie, Seyoung Kim & Jeehwan Kim. Nature Nanotechnology (2020) DOI: https://doi.org/10.1038/s41565-020-0694-5 Published: 08 June 2020

This paper is behind a paywall.

Electric sheep and sleeping androids

I find it impossible to mention that androids might need sleep without reference to Philip K. Dick’s 1968 novel, “Do Androids Dream of Electric Sheep?”; its Wikipedia entry is here.

June 8, 2020 Intelligent machines of the future may need to sleep as much as we do. Courtesy: Los Alamos National Laboratory

As it happens, I’m not the only one who felt the need to reference the novel, from a June 8, 2020 news item on ScienceDaily,

No one can say whether androids will dream of electric sheep, but they will almost certainly need periods of rest that offer benefits similar to those that sleep provides to living brains, according to new research from Los Alamos National Laboratory.

“We study spiking neural networks, which are systems that learn much as living brains do,” said Los Alamos National Laboratory computer scientist Yijing Watkins. “We were fascinated by the prospect of training a neuromorphic processor in a manner analogous to how humans and other biological systems learn from their environment during childhood development.”

Watkins and her research team found that the network simulations became unstable after continuous periods of unsupervised learning. When they exposed the networks to states that are analogous to the waves that living brains experience during sleep, stability was restored. “It was as though we were giving the neural networks the equivalent of a good night’s rest,” said Watkins.

A June 8, 2020 Los Alamos National Laboratory (LANL) news release (also on EurekAlert), which originated the news item, describes the research team’s presentation,

The discovery came about as the research team worked to develop neural networks that closely approximate how humans and other biological systems learn to see. The group initially struggled with stabilizing simulated neural networks undergoing unsupervised dictionary training, which involves classifying objects without having prior examples to compare them to.

“The issue of how to keep learning systems from becoming unstable really only arises when attempting to utilize biologically realistic, spiking neuromorphic processors or when trying to understand biology itself,” said Los Alamos computer scientist and study coauthor Garrett Kenyon. “The vast majority of machine learning, deep learning, and AI researchers never encounter this issue because in the very artificial systems they study they have the luxury of performing global mathematical operations that have the effect of regulating the overall dynamical gain of the system.”

The researchers characterize the decision to expose the networks to an artificial analog of sleep as nearly a last ditch effort to stabilize them. They experimented with various types of noise, roughly comparable to the static you might encounter between stations while tuning a radio. The best results came when they used waves of so-called Gaussian noise, which includes a wide range of frequencies and amplitudes. They hypothesize that the noise mimics the input received by biological neurons during slow-wave sleep. The results suggest that slow-wave sleep may act, in part, to ensure that cortical neurons maintain their stability and do not hallucinate.
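The title of the team's paper refers to sinusoidally-modulated noise as the stand-in for slow-wave sleep. As a loose sketch of what such a signal could look like (my own assumption about its form, not the authors' code), one can scale Gaussian noise by a slowly varying sine envelope. The function name and every parameter below are hypothetical.

```python
import math
import random


def sinusoidally_modulated_noise(n_samples, period, sigma=1.0, seed=0):
    """Gaussian noise whose amplitude rises and falls with a sine envelope.

    The envelope runs between 0 and 1 and repeats every `period` samples.
    This is an illustrative guess at the signal's form, not the paper's method.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    samples = []
    for t in range(n_samples):
        envelope = 0.5 * (1.0 + math.sin(2.0 * math.pi * t / period))
        samples.append(envelope * rng.gauss(0.0, sigma))
    return samples


noise = sinusoidally_modulated_noise(n_samples=200, period=100)
print(len(noise))
```

In a simulation, a signal like this would replace the normal training input for a while, standing in for the "sleep" phase described above.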

The group’s next goal is to implement their algorithm on Intel’s Loihi neuromorphic chip. They hope allowing Loihi to sleep from time to time will enable it to stably process information from a silicon retina camera in real time. If the findings confirm the need for sleep in artificial brains, we can probably expect the same to be true of androids and other intelligent machines that may come about in the future.

Watkins will be presenting the research at the Women in Computer Vision Workshop on June 14 [2020] in Seattle.

The 2020 Women in Computer Vision Workshop (WICV) website is here. As is becoming standard practice for these times, the workshop was held in a virtual environment. Here’s a link to and a citation for the poster presentation paper,

Using Sinusoidally-Modulated Noise as a Surrogate for Slow-Wave Sleep to Accomplish Stable Unsupervised Dictionary Learning in a Spike-Based Sparse Coding Model
by Yijing Watkins, Edward Kim, Andrew Sornborger and Garrett T. Kenyon. Women in Computer Vision Workshop on June 14, 2020 in Seattle, Washington (state)

This paper is open access for now.