Tag Archives: H.-S. Philip Wong

New chip for neuromorphic computing runs at a fraction of the energy of today’s systems

An August 17, 2022 news item on Nanowerk announces big (so to speak) claims from a team researching neuromorphic (brainlike) computer chips,

An international team of researchers has designed and built a chip that runs computations directly in memory and can run a wide variety of artificial intelligence (AI) applications, all at a fraction of the energy consumed by general-purpose AI computing platforms.

The NeuRRAM neuromorphic chip brings AI a step closer to running on a broad range of edge devices, disconnected from the cloud, where they can perform sophisticated cognitive tasks anywhere and anytime without relying on a network connection to a centralized server. Applications abound in every corner of the world and every facet of our lives, and range from smart watches to VR headsets, smart earbuds, smart sensors in factories, and rovers for space exploration.

The NeuRRAM chip is not only twice as energy efficient as state-of-the-art “compute-in-memory” chips, an innovative class of hybrid chips that run computations in memory; it also delivers results that are just as accurate as conventional digital chips. Conventional AI platforms are a lot bulkier and typically are constrained to using large data servers operating in the cloud.

In addition, the NeuRRAM chip is highly versatile and supports many different neural network models and architectures. As a result, the chip can be used for many different applications, including image recognition and reconstruction as well as voice recognition.

…

An August 17, 2022 University of California at San Diego (UCSD) news release (also on EurekAlert), which originated the news item, provides more detail than is usually found in a news release,

“The conventional wisdom is that the higher efficiency of compute-in-memory is at the cost of versatility, but our NeuRRAM chip obtains efficiency while not sacrificing versatility,” said Weier Wan, the paper’s first corresponding author and a recent Ph.D. graduate of Stanford University who worked on the chip while at UC San Diego, where he was co-advised by Gert Cauwenberghs in the Department of Bioengineering. 

The research team, co-led by bioengineers at the University of California San Diego, presents their results in the Aug. 17 [2022] issue of Nature.

Currently, AI computing is both power hungry and computationally expensive. Most AI applications on edge devices involve moving data from the devices to the cloud, where the AI processes and analyzes it. Then the results are moved back to the device. That’s because most edge devices are battery-powered and as a result only have a limited amount of power that can be dedicated to computing. 

By reducing power consumption needed for AI inference at the edge, this NeuRRAM chip could lead to more robust, smarter and accessible edge devices and smarter manufacturing. It could also lead to better data privacy as the transfer of data from devices to the cloud comes with increased security risks. 

On AI chips, moving data from memory to computing units is one major bottleneck. 

“It’s the equivalent of doing an eight-hour commute for a two-hour work day,” Wan said. 

To solve this data transfer issue, researchers used what is known as resistive random-access memory, a type of non-volatile memory that allows for computation directly within memory rather than in separate computing units. RRAM and other emerging memory technologies used as synapse arrays for neuromorphic computing were pioneered in the lab of Philip Wong, Wan’s advisor at Stanford and a main contributor to this work. Computation with RRAM chips is not necessarily new, but generally it leads to a decrease in the accuracy of the computations performed on the chip and a lack of flexibility in the chip’s architecture. 
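As an aside, for readers who want a more concrete picture of what “computation directly within memory” means, here is a minimal sketch (my own illustration, not the NeuRRAM implementation) of how an RRAM crossbar can perform a matrix-vector multiplication in the analog domain: each memory cell’s conductance acts as a weight, and the current summed along each column is a multiply-accumulate. The array size and values below are hypothetical.

```python
import numpy as np

# Hypothetical 4x3 crossbar: each RRAM cell stores a weight as a conductance (siemens).
conductances = np.array([
    [1e-6, 5e-6, 2e-6],
    [3e-6, 1e-6, 4e-6],
    [2e-6, 2e-6, 1e-6],
    [4e-6, 3e-6, 5e-6],
])

# Input activations are applied as voltages on the rows (volts).
input_voltages = np.array([0.2, 0.0, 0.3, 0.1])

# Ohm's law plus Kirchhoff's current law: the current collected on each column
# is the dot product of the row voltages with that column's conductances,
# so the crossbar computes a matrix-vector product in place, without moving
# the weights out of memory.
column_currents = input_voltages @ conductances  # shape (3,), in amperes

print(column_currents)  # one multiply-accumulate result per output column
```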

“Compute-in-memory has been common practice in neuromorphic engineering since it was introduced more than 30 years ago,” Cauwenberghs said.  “What is new with NeuRRAM is that the extreme efficiency now goes together with great flexibility for diverse AI applications with almost no loss in accuracy over standard digital general-purpose compute platforms.”

A carefully crafted methodology was key to the work with multiple levels of “co-optimization” across the abstraction layers of hardware and software, from the design of the chip to its configuration to run various AI tasks. In addition, the team made sure to account for various constraints that span from memory device physics to circuits and network architecture. 

“This chip now provides us with a platform to address these problems across the stack from devices and circuits to algorithms,” said Siddharth Joshi, an assistant professor of computer science and engineering at the University of Notre Dame, who started working on the project as a Ph.D. student and postdoctoral researcher in Cauwenberghs’ lab at UC San Diego. 

Chip performance

Researchers measured the chip’s energy efficiency using a metric known as the energy-delay product, or EDP. EDP combines both the amount of energy consumed for every operation and the amount of time it takes to complete the operation. By this measure, the NeuRRAM chip achieves 1.6 to 2.3 times lower EDP (lower is better) and 7 to 13 times higher computational density than state-of-the-art chips. 
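As a quick illustration of the metric (with made-up numbers, not figures from the paper), the energy-delay product simply multiplies the energy an operation consumes by the time the operation takes, so a chip improves its EDP by reducing either or both:

```python
# Toy energy-delay product (EDP) comparison; the numbers are hypothetical,
# not the figures reported for NeuRRAM.
def edp(energy_joules: float, delay_seconds: float) -> float:
    """EDP = energy consumed per operation x time to complete the operation."""
    return energy_joules * delay_seconds

baseline = edp(energy_joules=2e-12, delay_seconds=10e-9)   # a conventional digital design
candidate = edp(energy_joules=1e-12, delay_seconds=8e-9)   # a compute-in-memory design

print(f"EDP improvement: {baseline / candidate:.1f}x lower (lower is better)")
```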

Researchers ran various AI tasks on the chip. It achieved 99% accuracy on a handwritten digit recognition task; 85.7% on an image classification task; and 84.7% on a Google speech command recognition task. The chip also achieved a 70% reduction in image-reconstruction error on an image-recovery task. These results are comparable to existing digital chips that perform computation at the same bit precision, but with drastic savings in energy. 

Researchers point out that one key contribution of the paper is that all the results featured were obtained directly on the hardware. In many previous works on compute-in-memory chips, AI benchmark results were often obtained partially by software simulation. 

Next steps include improving architectures and circuits and scaling the design to more advanced technology nodes. Researchers also plan to tackle other applications, such as spiking neural networks.

“We can do better at the device level, improve circuit design to implement additional features and address diverse applications with our dynamic NeuRRAM platform,” said Rajkumar Kubendran, an assistant professor at the University of Pittsburgh, who started work on the project while a Ph.D. student in Cauwenberghs’ research group at UC San Diego.

In addition, Wan is a founding member of a startup that works on productizing the compute-in-memory technology. “As a researcher and  an engineer, my ambition is to bring research innovations from labs into practical use,” Wan said. 

New architecture 

The key to NeuRRAM’s energy efficiency is an innovative method to sense output in memory. Conventional approaches use voltage as input and measure current as the result, but this leads to the need for more complex and more power-hungry circuits. In NeuRRAM, the team engineered a neuron circuit that senses voltage and performs analog-to-digital conversion in an energy-efficient manner. This voltage-mode sensing can activate all the rows and all the columns of an RRAM array in a single computing cycle, allowing higher parallelism. 

In the NeuRRAM architecture, CMOS neuron circuits are physically interleaved with RRAM weights. This differs from conventional designs, where CMOS circuits are typically on the periphery of the RRAM weights. The neuron’s connections with the RRAM array can be configured to serve as either input or output of the neuron. This allows neural network inference in various data flow directions without incurring overheads in area or power consumption. This in turn makes the architecture easier to reconfigure. 

To make sure that accuracy of the AI computations can be preserved across various neural network architectures, researchers developed a set of hardware algorithm co-optimization techniques. The techniques were verified on various neural networks including convolutional neural networks, long short-term memory, and restricted Boltzmann machines. 

As a neuromorphic AI chip, NeuRRAM performs parallel distributed processing across 48 neurosynaptic cores. To simultaneously achieve high versatility and high efficiency, NeuRRAM supports data parallelism by mapping a layer of the neural network model onto multiple cores for parallel inference on multiple data. NeuRRAM also offers model parallelism by mapping different layers of a model onto different cores and performing inference in a pipelined fashion.
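The difference between those two mapping strategies is easier to see in a toy sketch. The one below is my own simplified illustration of data-parallel versus model-parallel (pipelined) mapping across cores; the layer names, core count and scheduling are hypothetical and not taken from the NeuRRAM design.

```python
# Conceptual sketch only; everything named here is hypothetical.
layers = ["conv1", "conv2", "fc"]           # layers of some neural network model
cores = [f"core_{i}" for i in range(6)]     # neurosynaptic cores available on the chip

# Data parallelism: replicate one layer across several cores so that
# different input samples are processed at the same time.
def map_data_parallel(layer, batch, core_group):
    return {core: (layer, sample) for core, sample in zip(core_group, batch)}

# Model parallelism: give each layer its own core and stream inputs through
# them like an assembly line (pipelined inference).
def map_model_parallel(layer_list, core_group):
    return {core: layer for core, layer in zip(core_group, layer_list)}

print(map_data_parallel("conv1", ["img_0", "img_1", "img_2"], cores[:3]))
print(map_model_parallel(layers, cores[3:]))
```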

An international research team

The work is the result of an international team of researchers. 

The UC San Diego team designed the CMOS circuits that implement the neural functions interfacing with the RRAM arrays to support the synaptic functions in the chip’s architecture, for high efficiency and versatility. Wan, working closely with the entire team, implemented the design; characterized the chip; trained the AI models; and executed the experiments. Wan also developed a software toolchain that maps AI applications onto the chip. 

The RRAM synapse array and its operating conditions were extensively characterized and optimized at Stanford University. 

The RRAM array was fabricated and integrated onto CMOS at Tsinghua University. 

The team at Notre Dame contributed to both the design and architecture of the chip and the subsequent machine learning model design and training.

The research started as part of the National Science Foundation funded Expeditions in Computing project on Visual Cortex on Silicon at Penn State University, with continued funding support from the Office of Naval Research Science of AI program, the Semiconductor Research Corporation and DARPA [US Defense Advanced Research Projects Agency] JUMP program, and Western Digital Corporation. 

Here’s a link to and a citation for the paper,

A compute-in-memory chip based on resistive random-access memory by Weier Wan, Rajkumar Kubendran, Clemens Schaefer, Sukru Burc Eryilmaz, Wenqiang Zhang, Dabin Wu, Stephen Deiss, Priyanka Raina, He Qian, Bin Gao, Siddharth Joshi, Huaqiang Wu, H.-S. Philip Wong & Gert Cauwenberghs. Nature volume 608, pages 504–512 (2022) DOI: https://doi.org/10.1038/s41586-022-04992-8 Published: 17 August 2022 Issue Date: 18 August 2022

This paper is open access.

3-D integration of nanotechnologies on a single computer chip

By integrating nanomaterials, researchers have developed a new technique for building a 3D computer chip capable of handling today’s huge volumes of data. Weirdly, the first two paragraphs of a July 5, 2017 news item on Nanowerk do not convey the main point (Note: A link has been removed),

As embedded intelligence is finding its way into ever more areas of our lives, fields ranging from autonomous driving to personalized medicine are generating huge amounts of data. But just as the flood of data is reaching massive proportions, the ability of computer chips to process it into useful information is stalling.

Now, researchers at Stanford University and MIT have built a new chip to overcome this hurdle. The results are published today in the journal Nature (“Three-dimensional integration of nanotechnologies for computing and data storage on a single chip”), by lead author Max Shulaker, an assistant professor of electrical engineering and computer science at MIT. Shulaker began the work as a PhD student alongside H.-S. Philip Wong and his advisor Subhasish Mitra, professors of electrical engineering and computer science at Stanford. The team also included professors Roger Howe and Krishna Saraswat, also from Stanford.

This image helps to convey the main points,

Instead of relying on silicon-based devices, a new chip uses carbon nanotubes and resistive random-access memory (RRAM) cells. The two are built vertically over one another, making a new, dense 3-D computer architecture with interleaving layers of logic and memory. Courtesy MIT

As I have been quite impressed with their science writing, it was a bit surprising to find that the Massachusetts Institute of Technology (MIT) had issued this news release (news item), as it didn’t follow the ‘rules’, i.e., cover as many of the journalistic questions (Who, What, Where, When, Why, and, sometimes, How) as possible in the first sentence/paragraph. This is written more in the style of a magazine article and so the details take a while to emerge, from a July 5, 2017 MIT news release, which originated the news item,

Computers today comprise different chips cobbled together. There is a chip for computing and a separate chip for data storage, and the connections between the two are limited. As applications analyze increasingly massive volumes of data, the limited rate at which data can be moved between different chips is creating a critical communication “bottleneck.” And with limited real estate on the chip, there is not enough room to place them side-by-side, even as they have been miniaturized (a phenomenon known as Moore’s Law).

To make matters worse, the underlying devices, transistors made from silicon, are no longer improving at the historic rate that they have for decades.

The new prototype chip is a radical change from today’s chips. It uses multiple nanotechnologies, together with a new computer architecture, to reverse both of these trends.

Instead of relying on silicon-based devices, the chip uses carbon nanotubes, which are sheets of 2-D graphene formed into nanocylinders, and resistive random-access memory (RRAM) cells, a type of nonvolatile memory that operates by changing the resistance of a solid dielectric material. The researchers integrated over 1 million RRAM cells and 2 million carbon nanotube field-effect transistors, making the most complex nanoelectronic system ever made with emerging nanotechnologies.

The RRAM and carbon nanotubes are built vertically over one another, making a new, dense 3-D computer architecture with interleaving layers of logic and memory. By inserting ultradense wires between these layers, this 3-D architecture promises to address the communication bottleneck.

However, such an architecture is not possible with existing silicon-based technology, according to the paper’s lead author, Max Shulaker, who is a core member of MIT’s Microsystems Technology Laboratories. “Circuits today are 2-D, since building conventional silicon transistors involves extremely high temperatures of over 1,000 degrees Celsius,” says Shulaker. “If you then build a second layer of silicon circuits on top, that high temperature will damage the bottom layer of circuits.”

The key in this work is that carbon nanotube circuits and RRAM memory can be fabricated at much lower temperatures, below 200 C. “This means they can be built up in layers without harming the circuits beneath,” Shulaker says.

This provides several simultaneous benefits for future computing systems. “The devices are better: Logic made from carbon nanotubes can be an order of magnitude more energy-efficient compared to today’s logic made from silicon, and similarly, RRAM can be denser, faster, and more energy-efficient compared to DRAM,” Wong says, referring to a conventional memory known as dynamic random-access memory.

“In addition to improved devices, 3-D integration can address another key consideration in systems: the interconnects within and between chips,” Saraswat adds.

“The new 3-D computer architecture provides dense and fine-grained integration of computing and data storage, drastically overcoming the bottleneck from moving data between chips,” Mitra says. “As a result, the chip is able to store massive amounts of data and perform on-chip processing to transform a data deluge into useful information.”

To demonstrate the potential of the technology, the researchers took advantage of the ability of carbon nanotubes to also act as sensors. On the top layer of the chip they placed over 1 million carbon nanotube-based sensors, which they used to detect and classify ambient gases.

Due to the layering of sensing, data storage, and computing, the chip was able to measure each of the sensors in parallel, and then write directly into its memory, generating huge bandwidth, Shulaker says.

Three-dimensional integration is the most promising approach to continue the technology scaling path set forth by Moore’s Law, allowing an increasing number of devices to be integrated per unit volume, according to Jan Rabaey, a professor of electrical engineering and computer science at the University of California at Berkeley, who was not involved in the research.

“It leads to a fundamentally different perspective on computing architectures, enabling an intimate interweaving of memory and logic,” Rabaey says. “These structures may be particularly suited for alternative learning-based computational paradigms such as brain-inspired systems and deep neural nets, and the approach presented by the authors is definitely a great first step in that direction.”

“One big advantage of our demonstration is that it is compatible with today’s silicon infrastructure, both in terms of fabrication and design,” says Howe.

“The fact that this strategy is both CMOS [complementary metal-oxide-semiconductor] compatible and viable for a variety of applications suggests that it is a significant step in the continued advancement of Moore’s Law,” says Ken Hansen, president and CEO of the Semiconductor Research Corporation, which supported the research. “To sustain the promise of Moore’s Law economics, innovative heterogeneous approaches are required as dimensional scaling is no longer sufficient. This pioneering work embodies that philosophy.”

The team is working to improve the underlying nanotechnologies, while exploring the new 3-D computer architecture. For Shulaker, the next step is working with Massachusetts-based semiconductor company Analog Devices to develop new versions of the system that take advantage of its ability to carry out sensing and data processing on the same chip.

So, for example, the devices could be used to detect signs of disease by sensing particular compounds in a patient’s breath, says Shulaker.

“The technology could not only improve traditional computing, but it also opens up a whole new range of applications that we can target,” he says. “My students are now investigating how we can produce chips that do more than just computing.”

“This demonstration of the 3-D integration of sensors, memory, and logic is an exceptionally innovative development that leverages current CMOS technology with the new capabilities of carbon nanotube field–effect transistors,” says Sam Fuller, CTO emeritus of Analog Devices, who was not involved in the research. “This has the potential to be the platform for many revolutionary applications in the future.”

This work was funded by the Defense Advanced Research Projects Agency [DARPA], the National Science Foundation, Semiconductor Research Corporation, STARnet SONIC, and member companies of the Stanford SystemX Alliance.

Here’s a link to and a citation for the paper,

Three-dimensional integration of nanotechnologies for computing and data storage on a single chip by Max M. Shulaker, Gage Hills, Rebecca S. Park, Roger T. Howe, Krishna Saraswat, H.-S. Philip Wong, & Subhasish Mitra. Nature 547, 74–78 (06 July 2017) doi:10.1038/nature22994 Published online 05 July 2017

This paper is behind a paywall.

Boosting chip speeds with graphene

There’s a certain hysteria associated with chip speeds as engineers and computer scientists try to achieve the ever-improving speeds that consumers have enjoyed for some decades. The question looms: is there some point at which we can no longer improve the speed? Well, we haven’t reached that point yet, according to a June 18, 2015 news item on Nanotechnology Now,

Stanford engineers find a simple yet clever way to boost chip speeds: Inside each chip are millions of tiny wires to transport data; wrapping them in a protective layer of graphene could boost speeds by up to 30 percent. [emphasis mine]

A June 16, 2015 Stanford University news release by Tom Abate (also on EurekAlert but dated June 17, 2015), which originated the news item, describes how computer chips are currently designed and the redesign which yields more speed,

A typical computer chip includes millions of transistors connected with an extensive network of copper wires. Although chip wires are unimaginably short and thin compared to household wires, both have one thing in common: in each case the copper is wrapped within a protective sheath.

For years a material called tantalum nitride has formed the protective layer in chip wires.

Now Stanford-led experiments demonstrate that a different sheathing material, graphene, can help electrons scoot through tiny copper wires in chips more quickly.

Graphene is a single layer of carbon atoms arranged in a strong yet thin lattice. Stanford electrical engineer H.-S. Philip Wong says this modest fix, using graphene to wrap wires, could allow transistors to exchange data faster than is currently possible. And the advantages of using graphene would become greater in the future as transistors continue to shrink.

Wong led a team of six researchers, including two from the University of Wisconsin-Madison, who will present their findings at the Symposia of VLSI Technology and Circuits in Kyoto, a leading venue for the electronics industry.

Ling Li, a graduate student in electrical engineering at Stanford and first author of the research paper, explained why changing the exterior wrapper on connecting wires can have such a big impact on chip performance.

It begins with understanding the dual role of this protective layer: it isolates the copper from the silicon on the chip and also serves to conduct electricity.

On silicon chips, the transistors act like tiny gates to switch electrons on or off. That switching function is how transistors process data.

The copper wires between the transistors transport this data once it is processed.

The isolating material–currently tantalum nitride–keeps the copper from migrating into the silicon transistors and rendering them non-functional.

Why switch to graphene?

Two reasons, starting with the ceaseless desire to keep making electronic components smaller.

When the Stanford team used the thinnest possible layer of tantalum nitride needed to perform this isolating function, they found that the industry-standard layer was eight times thicker than the graphene layer that did the same work.

Graphene had a second advantage as a protective sheathing, and here it’s important to differentiate how this outer layer functions in chip wires versus household wires.

In household wires the outer layer insulates the copper to prevent electrocution or fires.

In a chip the layer around the wires is a barrier to prevent copper atoms from infiltrating the silicon. Were that to happen, the transistors would cease to function. So the protective layer isolates the copper from the silicon.

The Stanford experiment showed that graphene could perform this isolating role while also serving as an auxiliary conductor of electrons. Its lattice structure allows electrons to leap from carbon atom to carbon atom straight down the wire, while effectively containing the copper atoms within the copper wire.

These benefits–the thinness of the graphene layer and its dual role as isolator and auxiliary conductor–allow this new wire technology to carry more data between transistors, speeding up overall chip performance in the process.

In today’s chips the benefits are modest; a graphene isolator would boost wire speeds by four percent to 17 percent, depending on the length of the wire. [emphasis mine]

But as transistors and wires continue to shrink in size, the benefits of the ultrathin yet conductive graphene isolator become greater. [emphasis mine] The Stanford engineers estimate that their technology could increase wire speeds by 30 percent in the next two generations.

The Stanford researchers think the promise of faster computing will induce other researchers to get interested in wires, and help to overcome some of the hurdles needed to take this proof of principle into common practice.

This would include techniques to grow graphene, especially growing it directly onto wires while chips are being mass-produced. In addition to his University of Wisconsin collaborator Professor Michael Arnold, Wong cited Purdue University Professor Zhihong Chen. Wong noted that the idea of using graphene as an isolator was inspired by Cornell University Professor Paul McEuen and his pioneering research on the basic properties of this marvelous material. Alexander Balandin of the University of California-Riverside has also made contributions to using graphene in chips.

“Graphene has been promised to benefit the electronics industry for a long time, and using it as a copper barrier is perhaps the first realization of this promise,” Wong said.

I gather they’ve decided to highlight the most optimistic outcomes.

Carbon nanotubes a second way: Cedric, the carbon nanotube computer

On the heels of yesterday’s (Sept. 26, 2013) posting about carbon nanotubes as flexible gas sensors, I have this item about a computer fashioned from carbon nanotubes.

This wafer contains tiny computers using carbon nanotubes, a material that could lead to smaller, more energy-efficient processors. Courtesy Stanford University

To me this looks more like a ping pong bat than a computer wafer. Regardless, here’s more about it from a Sept. 25, 2013 news item by James Morgan for BBC (British Broadcasting Corporation) News online,

The first computer built entirely with carbon nanotubes has been unveiled, opening the door to a new generation of digital devices.

“Cedric” is only a basic prototype but could be developed into a machine which is smaller, faster and more efficient than today’s silicon models.

Nanotubes have long been touted as the heir to silicon’s throne, but building a working computer has proven awkward.

Cedric is the most complex carbon-based electronic system yet realised.

So is it fast? Not at all. It might have been in 1955.

The computer operates on just one bit of information, and can only count to 32.

“In human terms, Cedric can count on his hands and sort the alphabet. But he is, in the full sense of the word, a computer,” says co-author [of the paper published in Nature] Max Shulaker.

Tom Abate’s Sept. 26, 2013 article for Stanford Report provides more detail about carbon nanotubes, their potential for replacing silicon chips and associated problems,

“Carbon nanotubes [CNTs] have long been considered as a potential successor to the silicon transistor,” said Professor Jan Rabaey, a world expert on electronic circuits and systems at the University of California-Berkeley.

Why worry about a successor to silicon?

Such concerns arise from the demands that designers place upon semiconductors and their fundamental workhorse unit, those on-off switches known as transistors.

For decades, progress in electronics has meant shrinking the size of each transistor to pack more transistors on a chip. But as transistors become tinier, they waste more power and generate more heat – all in a smaller and smaller space, as evidenced by the warmth emanating from the bottom of a laptop.

Many researchers believe that this power-wasting phenomenon could spell the end of Moore’s Law, named for Intel Corp. co-founder Gordon Moore, who predicted in 1965 that the density of transistors would double roughly every two years, leading to smaller, faster and, as it turned out, cheaper electronics.

But smaller, faster and cheaper has also meant smaller, faster and hotter.

“CNTs could take us at least an order of magnitude in performance beyond where you can project silicon could take us,” Wong [another co-author of the paper]  said.

But inherent imperfections have stood in the way of putting this promising material to practical use.

First, CNTs do not necessarily grow in neat parallel lines, as chipmakers would like.

Over time, researchers have devised tricks to grow 99.5 percent of CNTs in straight lines. But with billions of nanotubes on a chip, even a tiny degree of misaligned tubes could cause errors, so that problem remained.

A second type of imperfection has also stymied CNT technology.

Depending on how the CNTs grow, a fraction of these carbon nanotubes can end up behaving like metallic wires that always conduct electricity, instead of acting like semiconductors that can be switched off.

Since mass production is the eventual goal, researchers had to find ways to deal with misaligned and/or metallic CNTs without having to hunt for them like needles in a haystack.

“We needed a way to design circuits without having to look for imperfections or even know where they were,” Mitra said.

The researchers have dubbed their solution an “imperfection-immune design,” from the Abate article,

To eliminate the wire-like or metallic nanotubes, the Stanford team switched off all the good CNTs. Then they pumped the semiconductor circuit full of electricity. All of that electricity concentrated in the metallic nanotubes, which grew so hot that they burned up and literally vaporized into tiny puffs of carbon dioxide. This sophisticated technique eliminated the metallic CNTs in the circuit.

Bypassing the misaligned nanotubes required even greater subtlety.

The Stanford researchers created a powerful algorithm that maps out a circuit layout that is guaranteed to work no matter whether or where CNTs might be askew.

“This ‘imperfections-immune design’ [technique] makes this discovery truly exemplary,” said Sankar Basu, a program director at the National Science Foundation.

The Stanford team used this imperfection-immune design to assemble a basic computer with 178 transistors, a limit imposed by the fact that they used the university’s chip-making facilities rather than an industrial fabrication process.

Their CNT computer performed tasks such as counting and number sorting. It runs a basic operating system that allows it to swap between these processes. In a demonstration of its potential, the researchers also showed that the CNT computer could run MIPS, a commercial instruction set developed in the early 1980s by then Stanford engineering professor and now university President John Hennessy.

Though it could take years to mature, the Stanford approach points toward the possibility of industrial-scale production of carbon nanotube semiconductors, according to Naresh Shanbhag, a professor at the University of Illinois at Urbana-Champaign and director of SONIC, a consortium of next-generation chip design research.

Here’s a link to and a citation for the paper,

Carbon nanotube computer by Max M. Shulaker, Gage Hills, Nishant Patil, Hai Wei, Hong-Yu Chen, H.-S. Philip Wong, & Subhasish Mitra. Nature 501, 526–530 (26 September 2013) doi:10.1038/nature12502

This article is behind a paywall but you can gain temporary access via ReadCube.