Beer and wine reviews, the American Chemical Society’s (ACS) AI editors, and the Turing Test

The Turing test, first known as the ‘Imitation Game’, was designed by scientist Alan Turing in 1950 to see if a machine’s behaviour (in this case, a ‘conversation’) could fool someone into believing it was human. It’s a basic test to help determine whether a machine exhibits true artificial intelligence.

These days ‘artificial intelligence’ seems to be everywhere, although I’m not sure that all these algorithms would pass the Turing test. Some of the latest material I’ve seen suggests that writers and editors may have to rethink their roles in future. Let’s start with the beer and wine reviews.

Writing

An April 25, 2022 Dartmouth College news release by David Hirsch announces the AI reviewer (Note: Links have been removed),

In mid-2020, the computer science team of Keith Carlson, Allen Riddell and Dan Rockmore was stuck on a problem. It wasn’t a technical challenge. The computer code they had developed to write product reviews was working beautifully. But they were struggling with a practical question.

“Getting the code to write reviews was only the first part of the challenge,” says Carlson, Guarini ’21, a doctoral research fellow at the Tuck School of Business. “The remaining challenge was figuring out how and where it could be used.”

The original study took on two challenges: to design code that could write original, human-quality product reviews using a small set of product features and to see if the algorithm could be adapted to write “synthesis reviews” for products from a large number of existing reviews.

Review writing can be challenging because of the overwhelming number of products available. The team wanted to see if artificial intelligence was up to the task of writing opinionated text about vast product classes.

They focused on wine and beer reviews because of the extensive availability of material to train the algorithm. The relatively narrow vocabularies used to describe the products also makes it open to the techniques of AI systems and natural language processing tools.

The project was kickstarted by Riddell, a former fellow at the Neukom Institute for Computational Science, and developed with Carlson under the guidance of Rockmore, the William H. Neukom 1964 Distinguished Professor of Computational Science.

The code couldn’t taste the products, but it did ingest reams of written material. After training the algorithm on hundreds of thousands of published wine and beer reviews, the team found that the code could complete both tasks.

One result read: “This is a sound Cabernet. It’s very dry and a little thin in blackberry fruit, which accentuates the acidity and tannins. Drink up.”

Another read: “Pretty dark for a rosé, and full-bodied, with cherry, raspberry, vanilla and spice flavors. It’s dry with good acidity.”

“But now what?” Carlson explains as a question that often gnaws at scientists. The team wondered, “Who else would care?”

“I didn’t want to quit there,” says Rockmore. “I was sure that this work could be interesting to a wider audience.”

Sensing that the paper could have relevance in marketing, the team walked the study to Tuck Drive to see what others would think.

“Brilliant,” Praveen Kopalle, the Signal Companies’ Professor of Management at Tuck School of Business, recalls thinking when first reviewing the technical study.

Kopalle knew that the research was important. It could even “disrupt” the online review industry, a huge marketplace of goods and services.

“The paper has a lot of marketing applications, particularly in the context of online reviews where we can create reviews or descriptions of products when they may not already exist,” adds Kopalle. “In fact, we can even think about summarizing reviews for products and services as well.”

With the addition of Prasad Vana, assistant professor of business administration at Tuck, the team was complete. Vana reframed the technical feat of creating review-writing code into that of a market-friendly tool that can assist consumers, marketers, and professional reviewers.

The resulting research, published in International Journal of Research in Marketing, surveyed independent participants to confirm that the AI system wrote human-like reviews in both challenges.

“Using artificial intelligence to write and synthesize reviews can create efficiencies on both sides of the marketplace,” said Vana. “The hope is that AI can benefit reviewers facing larger writing workloads and consumers who have to sort through so much content about products.”

The paper also dwells on the ethical concerns raised by computer-generated content. It notes that marketers could get better acceptance by falsely attributing the reviews to humans. To address this, the team advocates for transparency when computer-generated text is used.

They also address the issue of computers taking human jobs. Code should not replace professional product reviewers, the team insists in the paper. The technology is meant to make the tasks of producing and reading the material more efficient. [emphasis mine]

“It’s interesting to imagine how this could benefit restaurants that cannot afford sommeliers or independent sellers on online platforms who may sell hundreds of products,” says Vana.

According to Carlson, the paper’s first author, the project demonstrates the potential of AI, the power of innovative thinking, and the promise of cross-campus collaboration.

“It was wonderful to work with colleagues with different expertise to take a theoretical idea and bring it closer to the marketplace,” says Carlson. “Together we showed how our work could change marketing and how people could use it. That could only happen with collaboration.”

A revised April 29, 2022 version was published on EurekAlert and some of the differences are interesting (to me, if no one else). As you’ll see, there’s a less ‘friendly’ style and the ‘jobs’ issue has been approached differently (Note: Links have been removed),

Artificial intelligence systems can be trained to write human-like product reviews that assist consumers, marketers and professional reviewers, according to a study from Dartmouth College, Dartmouth’s Tuck School of Business, and Indiana University.

The research, published in the International Journal of Research in Marketing, also identifies ethical challenges raised by the use of the computer-generated content.

“Review writing is challenging for humans and computers, in part, because of the overwhelming number of distinct products,” said Keith Carlson, a doctoral research fellow at the Tuck School of Business. “We wanted to see how artificial intelligence can be used to help people that produce and use these reviews.”

For the research, the Dartmouth team set two challenges. The first was to determine whether a machine can be taught to write original, human-quality reviews using only a small number of product features after being trained on a set of existing content. Secondly, the team set out to see if machine learning algorithms can be used to write syntheses of reviews of products for which many reviews already exist.

“Using artificial intelligence to write and synthesize reviews can create efficiencies on both sides of the marketplace,” said Prasad Vana, assistant professor of business administration at Tuck School of Business. “The hope is that AI can benefit reviewers facing larger writing workloads and consumers that have to sort through so much content about products.”

The researchers focused on wine and beer reviews because of the extensive availability of material to train the computer algorithms. Write-ups of these products also feature relatively focused vocabularies, an advantage when working with AI systems.

To determine whether a machine could write useful reviews from scratch, the researchers trained an algorithm on about 180,000 existing wine reviews. Metadata tags for factors such as product origin, grape variety, rating, and price were also used to train the machine-learning system.

When comparing the machine-generated reviews against human reviews for the same wines, the research team found agreement between the two versions. The results remained consistent even as the team challenged the algorithms by changing the amount of input data that was available for reference.

The machine-written material was then assessed by non-expert study participants to test if they could determine whether the reviews were written by humans or a machine. According to the research paper, the participants were unable to distinguish between the human and AI-generated reviews with any statistical significance. Furthermore, their intent to purchase a wine was similar across human versus machine generated reviews of the wine. 

Having found that artificial intelligence can write credible wine reviews, the research team turned to beer reviews to determine the effectiveness of using AI to write “review syntheses.” Rather than being trained to write new reviews, the algorithm was tasked with aggregating elements from existing reviews of the same product. This tested AI’s ability to identify and provide limited but relevant information about products based on a large volume of varying opinions.

“Writing an original review tests the computer’s expressive ability based on a relatively narrow set of data. Writing a synthesis review is a related but distinct task where the system is expected to produce a review that captures some of the key ideas present in an existing set of reviews for a product,” said Carlson, who conducted the research while a PhD candidate in computer science at Dartmouth.

To test the algorithm’s ability to write review syntheses, researchers trained it on 143,000 existing reviews of over 14,000 beers. As with the wine dataset, the text of each review was paired with metadata including the product name, alcohol content, style, and scores given by the original reviewers.

As with the wine reviews, the research used independent study participants to judge whether the machine-written summaries captured and summarized the opinions of numerous reviews in a useful, human-like manner.

According to the paper, the model was successful at taking the reviews of a product as input and generating a synthesis review for that product as output.

“Our modeling framework could be useful in any situation where detailed attributes of a product are available and a written summary of the product is required,” said Vana. “It’s interesting to imagine how this could benefit restaurants that cannot afford sommeliers or independent sellers on online platforms who may sell hundreds of products.”

Both challenges used a deep learning neural net based on transformer architecture to ingest, process and output review language.

According to the research team, the computer systems are not intended to replace professional writers and marketers, but rather to assist them in their work. A machine-written review, for instance, could serve as a time-saving first draft of a review that a human reviewer could then revise. [emphasis mine]

The research can also help consumers. Syntheses reviews—like those on beer in the study—can be expanded to the constellation of products and services in online marketplaces to assist people who have limited time to read through many product reviews.

In addition to the benefits of machine-written reviews, the research team highlights some of the ethical challenges presented by using computer algorithms to influence human consumer behavior.

Noting that marketers could get better acceptance of machine-generated reviews by falsely attributing them to humans, the team advocates for transparency when computer-generated reviews are offered.

“As with other technology, we have to be cautious about how this advancement is used,” said Carlson. “If used responsibly, AI-generated reviews can be both a productivity tool and can support the availability of useful consumer information.”

Researchers contributing to the study include Praveen Kopalle, Dartmouth’s Tuck School of Business; Allen Riddell, Indiana University, and Daniel Rockmore, Dartmouth College.

I wonder if the second news release was written by an AI agent.
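
For anyone curious how a claim like “unable to distinguish between the human and AI-generated reviews with any statistical significance” might be checked, here is a minimal sketch. It is not the researchers’ analysis. It assumes each participant judgment is a simple human-or-machine guess and applies an exact two-sided binomial test against a 50% guessing rate. The function name and the counts (104 correct guesses out of 200) are hypothetical.

```python
from math import comb


def binomial_two_sided_p(correct: int, trials: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value against a fixed guessing rate."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")

    def pmf(k: int) -> float:
        # Probability of exactly k correct guesses under pure guessing.
        return comb(trials, k) * chance**k * (1 - chance) ** (trials - k)

    observed = pmf(correct)
    # Add up every outcome that is no more likely than the observed one
    # (the small factor guards against floating-point ties).
    total = sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts, for illustration only (not taken from the paper).
p_value = binomial_two_sided_p(correct=104, trials=200)
print(f"p = {p_value:.3f}")
```

A p-value well above 0.05, which is what these made-up counts give, would mean the guesses can’t be told apart from coin flips.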

Here’s a link to and a citation for the paper,

Complementing human effort in online reviews: A deep learning approach to automatic content generation and review synthesis by Keith Carlson, Praveen K. Kopalle, Allen Riddell, Daniel Rockmore, Prasad Vana. International Journal of Research in Marketing DOI: https://doi.org/10.1016/j.ijresmar.2022.02.004 Available online 12 February 2022 In Press, Corrected Proof

This paper is behind a paywall.
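
Since the paper is paywalled, the exact way the metadata tags (product origin, grape variety, rating, and price) were fed to the model isn’t shown here. One common approach with text-generation models is to flatten such tags into a short text prefix that the model learns to continue with a review. The sketch below assumes that approach. The class, field names, separator, and sample values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class WineMetadata:
    # Hypothetical fields mirroring the tags named in the news release.
    origin: str
    variety: str
    rating: int
    price: float


def to_prompt(meta: WineMetadata) -> str:
    """Flatten the tags into one line that a text model could be asked to continue."""
    return (
        f"origin: {meta.origin} | variety: {meta.variety} | "
        f"rating: {meta.rating} | price: {meta.price:.2f} | review:"
    )


# Made-up sample values, for illustration only.
sample = WineMetadata(origin="Napa Valley", variety="Cabernet Sauvignon", rating=88, price=25.0)
print(to_prompt(sample))
```

During training, each existing review would follow its own prefix. To generate a new review, only the prefix would be supplied.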

Daniel (Dan) Rockmore was mentioned here in a May 6, 2016 posting about a competition he’d set up through Dartmouth College’s Neukom Institute. The competition, which doesn’t seem to have been run since 2018, was called Turing Tests in Creative Arts.

Editing

It seems the American Chemical Society (ACS) has decided to further automate some of its editing. From an April 28, 2022 Digital Science business announcement (also on EurekAlert) by David Ellis,

Writefull’s world-leading AI-based language services have been integrated into the American Chemical Society’s (ACS) Publications workflow.

In a partnership that began almost two years ago, ACS has now progressed to a full integration of Writefull’s application programming interfaces (APIs) for three key uses.

One of the world’s largest scientific societies, ACS publishes more than 300,000 research manuscripts in more than 60 scholarly journals per year.

Writefull’s proprietary AI technology is trained on millions of scientific papers using Deep Learning. It identifies potential language issues with written texts, offers solutions to those issues, and automatically assesses texts’ language quality. Thanks to Writefull’s APIs, its tech can be applied at all key points in the editorial workflows.

Writefull’s Manuscript Categorization API is now used by ACS before copyediting to automatically classify all accepted manuscripts by their language quality. Using ACS’s own classification criteria, the API assigns a level-of-edit grade to manuscripts at scale without editors having to open documents and review the text. After thorough benchmarking alongside human editors, Writefull reached more than 95% alignment in grading texts, significantly reducing the time ACS spends on manuscript evaluation.

The same Manuscript Categorization API is now part of ACS’s quality control program, to evaluate the language in manuscripts after copyediting.

Writefull’s Metadata API is also being used to automate aspects of manuscript review, ensuring that all elements of an article are complete prior to publication. The same API is used by Open Access publisher Hindawi as a pre-submission structural checks tool for authors.

Juan Castro, co-founder and CEO of Writefull, says: “Our partnership with the American Chemical Society over the past two years has been aimed at thoroughly vetting and shaping our services to meet ACS’s needs. Writefull’s AI-based language services empower publishers to increase their workflow efficiency and positively impact production costs, while also maintaining the quality and integrity of the manuscript.”

Digital Science is a technology company working to make research more efficient. We invest in, nurture and support innovative businesses and technologies that make all parts of the research process more open and effective. Our portfolio includes admired brands including Altmetric, Dimensions, Figshare, ReadCube, Symplectic, IFI CLAIMS, GRID, Overleaf, Ripeta and Writefull. We believe that together, we can help researchers make a difference. Visit www.digital-science.com and follow @digitalsci on Twitter.

Writefull is a technology startup that creates tools to help researchers improve their writing in English. The first version of the Writefull product allowed researchers to discover patterns in academic language, such as frequent word combinations and synonyms in context. The new version utilises Natural Language Processing and Deep Learning algorithms that will give researchers feedback on their full texts. Visit writefull.com and follow @writefullapp on Twitter.

The American Chemical Society (ACS) is a nonprofit organization chartered by the U.S. Congress. ACS’ mission is to advance the broader chemistry enterprise and its practitioners for the benefit of Earth and all its people. The Society is a global leader in promoting excellence in science education and providing access to chemistry-related information and research through its multiple research solutions, peer-reviewed journals, scientific conferences, eBooks and weekly news periodical Chemical & Engineering News. ACS journals are among the most cited, most trusted and most read within the scientific literature; however, ACS itself does not conduct chemical research. As a leader in scientific information solutions, its CAS division partners with global innovators to accelerate breakthroughs by curating, connecting and analyzing the world’s scientific knowledge. ACS’ main offices are in Washington, D.C., and Columbus, Ohio. Visit www.acs.org and follow @AmerChemSociety on Twitter.

So what?

An artificial intelligence (AI) agent being used for writing assignments is not new (see my July 16, 2014 posting titled, “Writing and AI or is a robot writing this blog?”). The argument that these agents will assist rather than replace (pick an occupation: e.g., writers, doctors, programmers, scientists, etc.) is almost always used as scientists explain that AI agents will take over the boring work, giving you (the human) more opportunities to do interesting work. The AI-written beer and wine reviews described here support at least part of the argument, for the time being.

It’s true that an AI agent can’t taste beer or wine, but that could change, as this August 8, 2019 article by Alice Johnston for CNN hints (Note: Links have been removed),

An artificial “tongue” that can taste minute differences between varieties of Scotch whisky could be the key to identifying counterfeit alcohol, scientists say.

Engineers from the universities of Glasgow and Strathclyde in Scotland created a device made of gold and aluminum and measured how it absorbed light when submerged in different kinds of whisky.

Analysis of the results allowed the scientists to identify the samples from Glenfiddich, Glen Marnoch and Laphroaig with more than 99% accuracy.

BTW, my earliest piece on artificial tongues is a July 28, 2011 posting, “Bio-inspired electronic tongue replaces sommelier?,” about research in Spain.

By contrast, this is the first time I can recall seeing anything about an artificial intelligence agent that edits. Writefull’s use at the ACS falls quite neatly into the ‘doing all the boring work’ category and narrative.
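
The announcement’s figure of more than 95% alignment with human editors is easier to picture with a small example. The sketch below shows one plausible way such a figure could be computed: the share of manuscripts where the machine’s level-of-edit grade matches the human editor’s. The release doesn’t describe the actual benchmarking method, so the function name, the grade labels, and the sample data are all hypothetical.

```python
def alignment_rate(human_grades: list[str], machine_grades: list[str]) -> float:
    """Fraction of manuscripts where the machine grade equals the human grade."""
    if len(human_grades) != len(machine_grades):
        raise ValueError("both lists must grade the same manuscripts")
    if not human_grades:
        raise ValueError("at least one graded manuscript is required")
    matches = sum(h == m for h, m in zip(human_grades, machine_grades))
    return matches / len(human_grades)


# Made-up grades for four manuscripts, for illustration only.
human = ["light", "medium", "heavy", "light"]
machine = ["light", "medium", "light", "light"]
print(f"{alignment_rate(human, machine):.0%}")
```

A simple match rate like this says nothing about how far off the mismatches are. Grading a ‘heavy’ edit as ‘light’ counts the same as a near miss.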

Having looked at a definition of the various forms of editing and core skills, I’m guessing that AI will take over every aspect (from the Editors’ Association of Canada, Definitions of Editorial Skills webpage),

CORE SKILLS

Structural Editing

Assessing and shaping draft material to improve its organization and content. Changes may be suggested to or drafted for the writer. Structural editing may include:

revising, reordering, cutting, or expanding material

writing original material

determining whether permissions are necessary for third-party material

recasting material that would be better presented in another form, or revising material for a different medium (such as revising print copy for web copy)

clarifying plot, characterization, or thematic elements

Also known as substantive editing, manuscript editing, content editing, or developmental editing.

Stylistic Editing

Editing to clarify meaning, ensure coherence and flow, and refine the language. It includes:

eliminating jargon, clichés, and euphemisms

establishing or maintaining the language level appropriate for the intended audience, medium, and purpose

adjusting the length and structure of sentences and paragraphs

establishing or maintaining tone, mood, style, and authorial voice or level of formality

Also known as line editing (which may also include copy editing).

Copy Editing

Editing to ensure correctness, accuracy, consistency, and completeness. It includes:

editing for grammar, spelling, punctuation, and usage

checking for consistency and continuity of mechanics and facts, including anachronisms, character names, and relationships

editing tables, figures, and lists

notifying designers of any unusual production requirements

developing a style sheet or following one that is provided

correcting or querying general information that should be checked for accuracy 

It may also include:

marking levels of headings and the approximate placement of art

Canadianizing or other localizing

converting measurements

providing or changing the system of citations

editing indexes

obtaining or listing permissions needed

checking front matter, back matter, and cover copy

checking web links

Note that “copy editing” is often loosely used to include stylistic editing, structural editing, fact checking, or proofreading. Editors Canada uses it only as defined above.

Proofreading

Examining material after layout or in its final format to correct errors in textual and visual elements. The material may be read in isolation or against a previous version. It includes checking for:

adherence to design

minor mechanical errors (such as spelling mistakes or deviations from style sheet)

consistency and accuracy of elements in the material (such as cross-references, running heads, captions, web page heading tags, hyperlinks, and metadata)

It may also include:

distinguishing between printer’s, designer’s, or programmer’s errors and writer’s or editor’s alterations

copyfitting

flagging or checking locations of art

inserting page numbers or checking them against content and page references

Note that proofreading is checking a work after editing; it is not a substitute for editing.

I’m just as happy as anyone else to get rid of the ‘boring’ parts of my work, but doing them is how I learned in the first place, and I haven’t seen any discussion about the importance of boring, repetitive tasks for learning.