Tag Archives: Kevin G. Yager

Chatbot with expertise in nanomaterials

This December 1, 2023 news item on phys.org starts with a story,

A researcher has just finished writing a scientific paper. She knows her work could benefit from another perspective. Did she overlook something? Or perhaps there’s an application of her research she hadn’t thought of. A second set of eyes would be great, but even the friendliest of collaborators might not be able to spare the time to read all the required background publications to catch up.

Kevin Yager—leader of the electronic nanomaterials group at the Center for Functional Nanomaterials (CFN), a U.S. Department of Energy (DOE) Office of Science User Facility at DOE’s Brookhaven National Laboratory—has imagined how recent advances in artificial intelligence (AI) and machine learning (ML) could aid scientific brainstorming and ideation. To accomplish this, he has developed a chatbot with knowledge in the kinds of science he’s been engaged in.

A December 1, 2023 DOE/Brookhaven National Laboratory news release by Denise Yazak (also on EurekAlert), which originated the news item, describes a research project with a chatbot that has nanomaterial-specific knowledge (Note: Links have been removed),

Rapid advances in AI and ML have given way to programs that can generate creative text and useful software code. These general-purpose chatbots have recently captured the public imagination. Existing chatbots—based on large, diverse language models—lack detailed knowledge of scientific sub-domains. By leveraging a document-retrieval method, Yager’s bot is knowledgeable in areas of nanomaterial science that other bots are not. The details of this project and how other scientists can leverage this AI colleague for their own work have recently been published in Digital Discovery.

Rise of the Robots

“CFN has been looking into new ways to leverage AI/ML to accelerate nanomaterial discovery for a long time. Currently, it’s helping us quickly identify, catalog, and choose samples, automate experiments, control equipment, and discover new materials. Esther Tsai, a scientist in the electronic nanomaterials group at CFN, is developing an AI companion to help speed up materials research experiments at the National Synchrotron Light Source II (NSLS-II).” NSLS-II is another DOE Office of Science User Facility at Brookhaven Lab.

At CFN, there has been a lot of work on AI/ML that can help drive experiments through the use of automation, controls, robotics, and analysis, but having a program that was adept with scientific text was something that researchers hadn’t explored as deeply. Being able to quickly document, understand, and convey information about an experiment can help in a number of ways—from breaking down language barriers to saving time by summarizing larger pieces of work.

Watching Your Language

To build a specialized chatbot, the program required domain-specific text—language taken from areas the bot is intended to focus on. In this case, the text is scientific publications. Domain-specific text helps the AI model understand new terminology and definitions and introduces it to frontier scientific concepts. Most importantly, this curated set of documents enables the AI model to ground its reasoning using trusted facts.

To emulate natural human language, AI models are trained on existing text, enabling them to learn the structure of language, memorize various facts, and develop a primitive sort of reasoning. Rather than laboriously retrain the AI model on nanoscience text, Yager gave it the ability to look up relevant information in a curated set of publications. Providing it with a library of relevant data was only half of the battle. To use this text accurately and effectively, the bot would need a way to decipher the correct context.

“A challenge that’s common with language models is that sometimes they ‘hallucinate’ plausible sounding but untrue things,” explained Yager. “This has been a core issue to resolve for a chatbot used in research as opposed to one doing something like writing poetry. We don’t want it to fabricate facts or citations. This needed to be addressed. The solution for this was something we call ‘embedding,’ a way of categorizing and linking information quickly behind the scenes.”

Embedding is a process that transforms words and phrases into numerical values. The resulting “embedding vector” quantifies the meaning of the text. When a user asks the chatbot a question, it’s also sent to the ML embedding model to calculate its vector value. This vector is used to search through a pre-computed database of text chunks from scientific papers that were similarly embedded. The bot then uses text snippets it finds that are semantically related to the question to get a more complete understanding of the context.
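The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the CFN chatbot's code: a real system would use a trained ML embedding model, while the hypothetical `embed` function here stands in for it with simple word counts, and the sample chunks are invented for the example.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an ML embedding model: a word-count vector.
    (Assumption for illustration; real embeddings are dense learned vectors.)"""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (1.0 = same direction)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_index(chunks):
    """Pre-compute one vector per text chunk, done once ahead of time."""
    return [(chunk, embed(chunk)) for chunk in chunks]

def search(question, index, k=2):
    """Embed the question and return the k most similar chunks."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# Made-up example chunks, not taken from any real paper.
CHUNKS = [
    "Block copolymer thin films self-assemble into ordered nanoscale patterns.",
    "X-ray scattering measures the ordering of nanoparticle superlattices.",
    "The cafeteria menu changes every week.",
]
INDEX = build_index(CHUNKS)
best = search("How is ordering of nanoparticle superlattices measured?", INDEX, k=1)
```

Word counts only match exact words, so this toy version misses synonyms and paraphrases; capturing meaning rather than wording is what a learned embedding model adds.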

The user’s query and the text snippets are combined into a “prompt” that is sent to a large language model, an expansive program that creates text modeled on natural human language, that generates the final response. The embedding ensures that the text being pulled is relevant in the context of the user’s question. By providing text chunks from the body of trusted documents, the chatbot generates answers that are factual and sourced.
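As a rough sketch of the prompt-assembly step, the Python function below joins a question with retrieved snippets. The function name `build_prompt` and the instruction wording are assumptions made for illustration; the actual prompt used by the CFN chatbot is not given in the news release, and the call to the language model itself is left out.

```python
def build_prompt(question, snippets):
    """Combine the user's question with retrieved text snippets.
    The instruction wording is hypothetical, not the chatbot's real prompt."""
    # Number each snippet so the model can point to the one it relied on.
    numbered = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    return (
        "Answer the question using only the numbered excerpts below, "
        "and cite the excerpts you rely on.\n\n"
        f"Excerpts:\n{numbered}\n\n"
        f"Question: {question}\n"
    )

# Made-up snippet for illustration.
prompt = build_prompt(
    "How are nanoparticle superlattices characterized?",
    ["X-ray scattering measures the ordering of nanoparticle superlattices."],
)
```

Numbering the excerpts is one simple way to let an answer refer back to its sources, in keeping with the "reference librarian" goal described in the release.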

“The program needs to be like a reference librarian,” said Yager. “It needs to heavily rely on the documents to provide sourced answers. It needs to be able to accurately interpret what people are asking and be able to effectively piece together the context of those questions to retrieve the most relevant information. While the responses may not be perfect yet, it’s already able to answer challenging questions and trigger some interesting thoughts while planning new projects and research.”

Bots Empowering Humans

CFN is developing AI/ML systems as tools that can liberate human researchers to work on more challenging and interesting problems and to get more out of their limited time while computers automate repetitive tasks in the background. There are still many unknowns about this new way of working, but these questions are the start of important discussions scientists are having right now to ensure AI/ML use is safe and ethical.

“There are a number of tasks that a domain-specific chatbot like this could clear from a scientist’s workload. Classifying and organizing documents, summarizing publications, pointing out relevant info, and getting up to speed in a new topical area are just a few potential applications,” remarked Yager. “I’m excited to see where all of this will go, though. We never could have imagined where we are now three years ago, and I’m looking forward to where we’ll be three years from now.”

For researchers interested in trying this software out for themselves, the source code for CFN’s chatbot and associated tools can be found in this github repository.

Brookhaven National Laboratory is supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit science.energy.gov.

Here’s a link to and a citation for the paper,

Domain-specific chatbots for science using embeddings by Kevin G. Yager.
Digital Discovery, 2023, 2, 1850-1861 DOI: https://doi.org/10.1039/D3DD00112A
First published 10 Oct 2023

This paper appears to be open access.

DNA (deoxyribonucleic acid) scaffolding for nonbiological construction

DNA (deoxyribonucleic acid) is being exploited in ways that would have seemed unimaginable to me when I was in high school. Earlier today (June 3, 2015), I ran a piece about DNA and data storage as imagined in an art/science project (DNA (deoxyribonucleic acid), music, and data storage) and now I have this work from the US Department of Energy’s (DOE) Brookhaven National Laboratory, from a June 1, 2015 news item on Nanowerk,

You’re probably familiar with the role of DNA as the blueprint for making every protein on the planet and passing genetic information from one generation to the next. But researchers at Brookhaven Lab’s Center for Functional Nanomaterials have shown that the twisted ladder molecule made of complementary matching strands can also perform a number of decidedly non-biological construction jobs: serving as a scaffold and programmable “glue” for linking up nanoparticles. This work has resulted in a variety of nanoparticle assemblies, including composite structures with switchable phases whose optical, magnetic, or other properties might be put to use in dynamic energy-harvesting or responsive optical materials. Three recent studies showcase different strategies for using synthetic strands of this versatile building material to link and arrange different types of nanoparticles in predictable ways.

The researchers have provided an image of the DNA building blocks,

Controlling the self-assembly of nanoparticles into superlattices is an important approach to build functional materials. The Brookhaven team used nanosized building blocks—cubes or octahedrons—decorated with DNA tethers to coordinate the assembly of spherical nanoparticles coated with complementary DNA strands.

A June 1, 2015 article (which originated the news item) in DOE Pulse Number 440 goes on to highlight three recent DNA papers published by researchers at Brookhaven National Laboratory,

The first [leads to a news release], published in Nature Communications, describes how scientists used the shape of nanoscale building blocks decorated with single strands of DNA to orchestrate the arrangement of spheres decorated with complementary strands (where bases on the two strands pair up according to the rules of DNA binding, A to T, G to C). For example, nano-cubes coated with DNA tethers on all six sides formed regular arrays of cubes surrounded by six nano-spheres. The attractive force of the DNA “glue” keeps these two dissimilar objects from self-separating to give scientists a reliable way to assemble composite materials in which the synergistic properties of different types of nanoparticles might be put to use.

In another study [leads to a news release], published in Nature Nanotechnology, the team used ropelike configurations of the DNA double helix to form a rigid geometrical framework, and added dangling pieces of single-stranded DNA to glue nanoparticles in place on the vertices of the scaffold. Controlling the code of the dangling strands and adding complementary strands to the nanoparticles gives scientists precision control over particle placement. These arrays of nanoparticles with predictable geometric configurations are somewhat analogous to molecules made of atoms, and can even be linked end-to-end to form polymer-like chains, or arrayed as flat sheets. Using this approach, the scientists can potentially orchestrate the arrangements of different types of nanoparticles to design materials that regulate energy flow, rotate light, or deliver biomolecules.

“We may be able to design materials that mimic nature’s machinery to harvest solar energy, or manipulate light for telecommunications applications, or design novel catalysts for speeding up a variety of chemical reactions,” said Oleg Gang, the Brookhaven physicist who leads this work on DNA-mediated nano-assembly.

Perhaps most exciting is a study [leads to a news release] published in Nature Materials in which the scientists added “reprogramming” strands of DNA after assembly to rearrange and change the phase of nanoparticle arrays. This is a change at the nanoscale that in some ways resembles an atomic phase change—like the shift in the atomic crystal lattice of carbon that transforms graphite into diamond—potentially producing a material with completely new properties from the same already assembled nanoparticle array. Inputting different types of attractive and repulsive reprogramming DNA strands, scientists could selectively trigger the transformation to the different resulting structures.

“The ability to dynamically switch the phase of an entire superlattice array will allow the creation of reprogrammable and switchable materials wherein multiple, different functions can be activated on demand,” Gang said.

Here are links to and citations for all three papers,

Superlattices assembled through shape-induced directional binding by Fang Lu, Kevin G. Yager, Yugang Zhang, Huolin Xin, & Oleg Gang. Nature Communications 6, Article number: 6912 doi:10.1038/ncomms7912 Published 23 April 2015

Prescribed nanoparticle cluster architectures and low-dimensional arrays built using octahedral DNA origami frames by Ye Tian, Tong Wang, Wenyan Liu, Huolin L. Xin, Huilin Li, Yonggang Ke, William M. Shih, & Oleg Gang. Nature Nanotechnology (2015) doi:10.1038/nnano.2015.105 Published online 25 May 2015

Selective transformations between nanoparticle superlattices via the reprogramming of DNA-mediated interactions by Yugang Zhang, Suchetan Pal, Babji Srinivasan, Thi Vo, Sanat Kumar & Oleg Gang. Nature Materials (2015) doi:10.1038/nmat4296 Published online 25 May 2015

The first study is open access, the second is behind a paywall but there is a free preview via ReadCube Access, and the third is behind a paywall.

Mixing and matching your nanoparticles

An Oct. 20, 2013 Brookhaven National Laboratory (BNL; US Dept. of Energy) news release (also on EurekAlert) describes a technique for combining different kinds of nanoparticles into a single nanocomposite,

Scientists at the U.S. Department of Energy’s Brookhaven National Laboratory have developed a general approach for combining different types of nanoparticles to produce large-scale composite materials. The technique, described in a paper published online by Nature Nanotechnology on October 20, 2013, opens many opportunities for mixing and matching particles with different magnetic, optical, or chemical properties to form new, multifunctional materials or materials with enhanced performance for a wide range of potential applications.

The approach takes advantage of the attractive pairing of complementary strands of synthetic DNA—based on the molecule that carries the genetic code in its sequence of matched bases known by the letters A, T, G, and C. After coating the nanoparticles with a chemically standardized “construction platform” and adding extender molecules to which DNA can easily bind, the scientists attach complementary lab-designed DNA strands to the two different kinds of nanoparticles they want to link up. The natural pairing of the matching strands then “self-assembles” the particles into a three-dimensional array consisting of billions of particles. Varying the length of the DNA linkers, their surface density on particles, and other factors gives scientists the ability to control and optimize different types of newly formed materials and their properties.
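The base-pairing rule behind this "glue" (A with T, G with C, with the two strands running in opposite directions) can be shown with a small Python sketch. The sequences are invented for the example, and the function names are hypothetical. Real linker design also depends on strand length, temperature, and salt conditions, none of which are modeled here.

```python
# Watson-Crick pairing rule: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand):
    """Return the strand that would pair with `strand`, read 5' to 3'.
    The order is reversed because paired strands run antiparallel."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))

def can_pair(strand_a, strand_b):
    """True if the two strands are fully complementary along their length."""
    return strand_b.upper() == reverse_complement(strand_a)

# Made-up "sticky end" sequences for two kinds of particles.
linker_on_particle_a = "AAGGTC"
linker_on_particle_b = reverse_complement(linker_on_particle_a)
```

In this picture, two particles stick only when their linkers satisfy `can_pair`, which is why choosing the sequences gives control over which particle types bind to each other.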

The news release details some of the challenges the researchers faced,

… the scientists explored the effect of particle shape. “In principle, differently shaped particles don’t want to coexist in one lattice,” said Gang [Brookhaven physicist Oleg Gang]. “They either tend to separate into different phases like oil and water refusing to mix or form disordered structures.” The scientists discovered that DNA not only helps the particles mix, but it can also improve order for such systems when a thicker DNA shell around the particles is used.

They also investigated how the DNA-pairing mechanism and other intrinsic physical forces, such as magnetic attraction among particles, might compete during the assembly process. For example, magnetic particles tend to clump to form aggregates that can hinder the binding of DNA from another type of particle. “We show that shorter DNA strands are more effective at competing against magnetic attraction,” Gang said.

For the particular composite of gold and magnetic nanoparticles they created, the scientists discovered that applying an external magnetic field could “switch” the material’s phase and affect the ordering of the particles. “This was just a demonstration that it can be done, but it could have an application—perhaps magnetic switches, or materials that might be able to change shape on demand,” said Zhang [Yugang Zhang, first author of the paper].

The third fundamental factor the scientists explored was how the particles were ordered in the superlattice arrays: Does one type of particle always occupy the same position relative to the other type—like boys and girls sitting in alternating seats in a movie theater—or are they interspersed more randomly? “This is what we call a compositional order, which is important for example for quantum dots because their optical properties—e.g., their ability to glow—depend on how many gold nanoparticles are in the surrounding environment,” said Gang. “If you have compositional disorder, the optical properties would be different.” In the experiments, increasing the thickness of the soft DNA shells around the particles increased compositional disorder.

These fundamental principles give scientists a framework for designing new materials. The specific conditions required for a particular application will be dependent on the particles being used, Zhang emphasized, but the general assembly approach would be the same.

Said Gang, “We can vary the lengths of the DNA strands to change the distance between particles from about 10 nanometers to under 100 nanometers—which is important for applications because many optical, magnetic, and other properties of nanoparticles depend on the positioning at this scale. We are excited by the avenues this research opens up in terms of future directions for engineering novel classes of materials that exploit collective effects and multifunctionality.”

Here’s a link to and a citation for the research paper,

A general strategy for the DNA-mediated self-assembly of functional nanoparticles into heterogeneous systems by Yugang Zhang, Fang Lu, Kevin G. Yager, Daniel van der Lelie, & Oleg Gang. Nature Nanotechnology (2013) doi:10.1038/nnano.2013.209 Published online 20 October 2013.

This article can be viewed/previewed on ReadCube or purchased.