Automated science writing?

It seems that automated science writing is not ready—yet. Still, an April 18, 2019 news item on ScienceDaily suggests that progress is being made,

The work of a science writer, including this one, includes reading journal papers filled with specialized technical terminology, and figuring out how to explain their contents in language that readers without a scientific background can understand.

Now, a team of scientists at MIT [Massachusetts Institute of Technology] and elsewhere has developed a neural network, a form of artificial intelligence (AI), that can do much the same thing, at least to a limited extent: It can read scientific papers and render a plain-English summary in a sentence or two.

An April 17, 2019 MIT news release, which originated the news item, delves into the research and its implications,

Even in this limited form, such a neural network could be useful for helping editors, writers, and scientists [emphasis mine] scan a large number of papers to get a preliminary sense of what they’re about. But the approach the team developed could also find applications in a variety of other areas besides language processing, including machine translation and speech recognition.

The work is described in the journal Transactions of the Association for Computational Linguistics, in a paper by Rumen Dangovski and Li Jing, both MIT graduate students; Marin Soljačić, a professor of physics at MIT; Preslav Nakov, a principal scientist at the Qatar Computing Research Institute, HBKU; and Mićo Tatalović, a former Knight Science Journalism fellow at MIT and a former editor at New Scientist magazine.

From AI for physics to natural language

The work came about as a result of an unrelated project, which involved developing new artificial intelligence approaches based on neural networks, aimed at tackling certain thorny problems in physics. However, the researchers soon realized that the same approach could be used to address other difficult computational problems, including natural language processing, in ways that might outperform existing neural network systems.

“We have been doing various kinds of work in AI for a few years now,” Soljačić says. “We use AI to help with our research, basically to do physics better. And as we got to be more familiar with AI, we would notice that every once in a while there is an opportunity to add to the field of AI because of something that we know from physics: a certain mathematical construct or a certain law in physics. We noticed that hey, if we use that, it could actually help with this or that particular AI algorithm.”

This approach could be useful in a variety of specific kinds of tasks, he says, but not all. “We can’t say this is useful for all of AI, but there are instances where we can use an insight from physics to improve on a given AI algorithm.”

Neural networks in general are an attempt to mimic the way humans learn certain new things: The computer examines many different examples and “learns” what the key underlying patterns are. Such systems are widely used for pattern recognition, such as learning to identify objects depicted in photos.

But neural networks in general have difficulty correlating information from a long string of data, such as is required in interpreting a research paper. Various tricks have been used to improve this capability, including techniques known as long short-term memory (LSTM) and gated recurrent units (GRU), but these still fall well short of what’s needed for real natural-language processing, the researchers say.

The team came up with an alternative system, which instead of being based on the multiplication of matrices, as most conventional neural networks are, is based on vectors rotating in a multidimensional space. The key concept is something they call a rotational unit of memory (RUM).

Essentially, the system represents each word in the text by a vector in multidimensional space — a line of a certain length pointing in a particular direction. Each subsequent word swings this vector in some direction, represented in a theoretical space that can ultimately have thousands of dimensions. At the end of the process, the final vector or set of vectors is translated back into its corresponding string of words.
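[An aside from me: for readers who like to see an idea in code, here is a minimal sketch of the 'rotating vector' notion described above. It is my own toy illustration and not the researchers' RUM implementation. The function names (rotate_toward, encode), the three-dimensional word vectors, the starting vector, and the half-angle 'fraction' are all assumptions made up for the example. The only point is that each new word swings a memory vector within the plane the two vectors share, and that a rotation changes the vector's direction without changing its length.]

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def rotate_toward(state, target, fraction=0.5):
    """Rotate `state` toward `target` inside the plane the two vectors span.

    `fraction` is the share of the angle between them to rotate through
    (1.0 lands exactly on the direction of `target`). The length of
    `state` is preserved, as with any rotation.
    """
    s_len = norm(state)
    u = [x / s_len for x in state]            # unit vector along state
    proj = dot(target, u)                     # part of target along u
    w = [t - proj * x for t, x in zip(target, u)]
    w_len = norm(w)
    if w_len < 1e-12:
        # Parallel or anti-parallel vectors do not define a unique plane,
        # so this sketch simply leaves the state unchanged.
        return list(state)
    v = [x / w_len for x in w]                # unit vector orthogonal to u
    theta = fraction * math.atan2(w_len, proj)
    return [s_len * (math.cos(theta) * a + math.sin(theta) * b)
            for a, b in zip(u, v)]


# Hypothetical toy word vectors; real systems learn vectors with hundreds
# or thousands of dimensions.
TOY_EMBEDDINGS = {
    "raccoons": [0.2, 0.9, 0.1],
    "carry": [0.7, 0.1, 0.6],
    "roundworm": [0.1, 0.3, 0.9],
}


def encode(words, embeddings=TOY_EMBEDDINGS):
    """Swing a unit-length memory vector once for each word in turn."""
    state = [1.0, 0.0, 0.0]                   # arbitrary starting direction
    for word in words:
        state = rotate_toward(state, embeddings[word])
    return state
```

[Calling encode(["raccoons", "carry", "roundworm"]) returns a single three-number vector whose direction depends on the words and their order, while its length stays at 1. The real system, per the paper, also has to learn how to rotate and how to turn the final vector back into words, none of which this sketch attempts.]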

“RUM helps neural networks to do two things very well,” Nakov says. “It helps them to remember better, and it enables them to recall information more accurately.”

After developing the RUM system to help with certain tough physics problems such as the behavior of light in complex engineered materials, “we realized one of the places where we thought this approach could be useful would be natural language processing,” says Soljačić, recalling a conversation with Tatalović, who noted that such a tool would be useful for his work as an editor trying to decide which papers to write about. Tatalović was at the time exploring AI in science journalism as his Knight fellowship project.

“And so we tried a few natural language processing tasks on it,” Soljačić says. “One that we tried was summarizing articles, and that seems to be working quite well.”

The proof is in the reading

As an example, they fed the same research paper through a conventional LSTM-based neural network and through their RUM-based system. The resulting summaries were dramatically different.

The LSTM system yielded this highly repetitive and fairly technical summary: “Baylisascariasis,” kills mice, has endangered the allegheny woodrat and has caused disease like blindness or severe consequences. This infection, termed “baylisascariasis,” kills mice, has endangered the allegheny woodrat and has caused disease like blindness or severe consequences. This infection, termed “baylisascariasis,” kills mice, has endangered the allegheny woodrat.

Based on the same paper, the RUM system produced a much more readable summary, and one that did not include the needless repetition of phrases: Urban raccoons may infect people more than previously assumed. 7 percent of surveyed individuals tested positive for raccoon roundworm antibodies. Over 90 percent of raccoons in Santa Barbara play host to this parasite.

Already, the RUM-based system has been expanded so it can “read” through entire research papers, not just the abstracts, to produce a summary of their contents. The researchers have even tried using the system on their own research paper describing these findings — the paper that this news story is attempting to summarize.

Here is the new neural network’s summary: Researchers have developed a new representation process on the rotational unit of RUM, a recurrent memory that can be used to solve a broad spectrum of the neural revolution in natural language processing.

It may not be elegant prose, but it does at least hit the key points of information.

Çağlar Gülçehre, a research scientist at the British AI company DeepMind Technologies, who was not involved in this work, says this research tackles an important problem in neural networks, having to do with relating pieces of information that are widely separated in time or space. “This problem has been a very fundamental issue in AI due to the necessity to do reasoning over long time-delays in sequence-prediction tasks,” he says. “Although I do not think this paper completely solves this problem, it shows promising results on the long-term dependency tasks such as question-answering, text summarization, and associative recall.”

Gülçehre adds, “Since the experiments conducted and model proposed in this paper are released as open-source on Github, as a result many researchers will be interested in trying it on their own tasks. … To be more specific, potentially the approach proposed in this paper can have very high impact on the fields of natural language processing and reinforcement learning, where the long-term dependencies are very crucial.”

The research received support from the Army Research Office, the National Science Foundation, the MIT-SenseTime Alliance on Artificial Intelligence, and the Semiconductor Research Corporation. The team also had help from the ScienceDaily website, whose articles were used in training some of the AI models in this research.

As usual, this ‘automated writing system’ is framed as a ‘helper’, not a usurper of anyone’s job. However, its potential for changing the nature of the work is there. About five years ago I featured another ‘automated writing’ story in a July 16, 2014 posting titled: ‘Writing and AI or is a robot writing this blog?’ You may have been reading ‘automated’ news stories for years. At the time, the focus was on sports and business.

Getting back to 2019 and science writing, here’s a link to and a citation for the paper,

Rotational Unit of Memory: A Novel Representation Unit for RNNs with Scalable Applications by Rumen Dangovski, Li Jing, Preslav Nakov, Mićo Tatalović and Marin Soljačić. Transactions of the Association for Computational Linguistics Volume 07, 2019 pp.121-138 DOI: https://doi.org/10.1162/tacl_a_00258 Posted Online 2019

This paper is open access.

2015 Science & You, a science communication conference in France

Science communicators can choose to celebrate June 2015 in Nancy, France and acquaint themselves with the latest and greatest in communication at the Science & You conference being held from June 1 – 6, 2015.

The 2015 conference home page (ETA May 5, 2015 1045 hours PDT: the home page features change) offers this sampling of the workshops on offer,

No fewer than 180 communicators will be lined up to hold workshop sessions, from the 2nd to the 5th June in Nancy’s Centre Prouvé. In the meantime, here is an exclusive peek at some of the main themes which will be covered:

– Science communication and journalism. Abdellatif Bensfia will focus on the state of science communication in a country where major social changes are playing out, Morocco, while Olivier Monod will be speaking about “Chercheurs d’actu” (News Researchers), a system linking science with the news. Finally, Matthieu Ravaud and Fabrice Impériali from the CNRS (Centre National de Recherche Scientifique) will be presenting “CNRS Le journal”, the new on-line media for the general public.

– Using animals in biomedical research. This round-table, chaired by Victor Demaria-Pesce, from the Groupement Interprofessionnel de Réflexion et de Communication sur la Recherche (Gircor) will provide an opportunity to spotlight one of society’s great debates: the use of animals in research. Different actors working in biomedical research will present their point of view on the subject, and the results of an analysis of public perception of animal experimentation will be presented. What are the norms in this field? What are the living conditions of the animals in laboratories? How is this research to be made legitimate? This session will centre on all these questions.

– Science communication and the arts. This session will cover questions such as the relational interfaces between art and science, with in particular the presentation of “Pulse Project” with Michelle Lewis-King, and the Semaine du Cerveau (Brain Week) in Grenoble (Isabelle Le Brun).
Music will also be there with the talk by Milla Karvonen from the University of Oulu, who will be speaking about the interaction between science and music, while Philippe Berthelot will talk about the art of telling the story of science as a communication tool.

– Science on television. This workshop will also be in the form of a round table, with representatives from TVV (Vigyan Prasar, India), and Irene Lapuente (La Mandarina de Newton), Mico Tatalovic and Elizabeth Vidal (University of Cordoba), discussing how the world of science is represented in a mass medium like television. Many questions will be debated, for example the changing image of science on television, its historical context, and the impact these programmes have on audiences’ perceptions of science.

To learn more, you will find the detailed list of all the workshops and plenaries in the provisional programme on-line.

Science & You seems to be an ‘umbrella brand’ for the “Journées Hubert Curien” conference with plenaries and workshops and the “Science and Culture” forum, which may explain the variety of dates (June 1 – 6, June 2 – 5, and June 2 – 6) on the Science & You home page.

Here’s information about the Science & You organizers and more conference dates (from the Patrons page),

At the invitation of the President of the Université de Lorraine, the professors Etienne Klein, Cédric Villani and Brigitte Kieffer agreed to endorse Science & You. It is an honour to be able to associate them with this major event in science communication, in which they are particularly involved.

Cédric Villani, Fields Medal 2010

Cédric Villani is a French mathematician, the Director of the Institut Henri Poincaré and a professor at the Université Claude Bernard Lyon 1.
His main research interests are in kinetic theory (Boltzmann and Vlasov equations and their variants), and optimal transport and its applications (Monge equation).
He has received several national and international awards for his research, in particular the Fields Medal, which he received from the hands of the President of India at the 2010 International Congress of Mathematicians in Hyderabad (India). Since then he has played the role of spokesperson for the French mathematical community in media and political circles.
Cédric Villani regularly invests in science communication aiming at various audiences: conferences in schools, public conferences in France and abroad, regular participation in broadcasts and current affairs programmes and in science festivals.

Etienne Klein, physicist and philosopher

Etienne Klein is a French physicist, Director of Research at the CEA (Commissariat à l’énergie atomique et aux énergies alternatives – Alternative Energies and Atomic Energy Commission) and has a Ph.D. in philosophy of science. He teaches at the Ecole Centrale in Paris and is head of the Laboratoire de Recherche sur les Sciences de la Matière (LARSIM) at the CEA.

He has taken part in several major projects, such as developing a method of isotope separation involving the use of lasers, and the study of a particle accelerator with superconducting cavities. He was involved in the design of the Large Hadron Collider (LHC) at CERN.
He taught quantum physics and particle physics at Ecole Centrale in Paris for several years and currently teaches philosophy of science. He is a specialist on time in physics and is the author of a number of essays.
He is also a member of the OPECST (Conseil de l’Office parlementaire d’évaluation des choix scientifiques et technologiques – Parliamentary Office for the Evaluation of Scientific and Technological Choices), of the French Academy of Technologies, and of the Conseil d’Orientation (Advisory Board) of the Institut Diderot.
Until June 2014, he presented a weekly radio chronicle, Le Monde selon Etienne Klein, on the French national radio France Culture.

Brigitte Kieffer, Campaigner for women in science

B. L. Kieffer is Professor at McGill University and at the Université de Strasbourg, France. She is also Visiting Professor at UCLA (Los Angeles, USA). She develops her research activity at IGBMC, one of the leading European centres of biomedical research. She is the recipient of the Jules Martin (French Academy of Science, 2001) and the Lounsbery (French and US Academies of Science, 2004) Awards, and became an EMBO Member in 2009.
In 2012 she received the Lamonica Award of Neurology (French Academy of Science) and was nominated Chevalier de la Légion d’honneur. In December 2013 she was elected as a member of the French Academy of Sciences.
In March 2014, she received the International L’OREAL-UNESCO Award for Women in Science (European Laureate). She started as the Scientific Director of the Douglas Hospital Research Centre, affiliated to McGill University in January 2014, and remains Professor at the University of Strasbourg, France.

Here’s more about the conference at the heart of Science & You (from The Journées Hubert Curien International Conference webpage),

Following on the 2012 conference, this project will bring together all those interested in science communication: researchers, PhD students, science communicators, journalists, professionals from associations and museums, business leaders, politicians… A high-level scientific committee has been set up for this international conference, chaired by Professor Joëlle Le Marec, University of Paris 7, and counting among its members leading figures in science communication such as Bernard Schiele (Canada) or Hester du Plessis (South Africa).

The JHC Conference will take place from June 2nd to 6th at the Centre Prouvé, Nancy. These four days will be dedicated to a varied programme of plenary conferences and workshops on the theme of science communication today and tomorrow.

The Registration webpage has more information about the process and gives access to the registration form.