How to get people to trust artificial intelligence

Vyacheslav Polonski’s January 10, 2018 piece on phys.org (originally published Jan. 9, 2018 on The Conversation) isn’t a gossip article, although there are parts that could be read that way. Polonski is a University of Oxford researcher. Before getting to what I consider the juicy bits, here is how the piece opens (Note: Links have been removed),

Artificial intelligence [AI] can already predict the future. Police forces are using it to map when and where crime is likely to occur [Note: See my Nov. 23, 2017 posting about predictive policing in Vancouver for details about the first Canadian municipality to introduce the technology]. Doctors can use it to predict when a patient is most likely to have a heart attack or stroke. Researchers are even trying to give AI imagination so it can plan for unexpected consequences.

Many decisions in our lives require a good forecast, and AI agents are almost always better at forecasting than their human counterparts. Yet for all these technological advances, we still seem to deeply lack confidence in AI predictions. Recent cases show that people don’t like relying on AI and prefer to trust human experts, even if these experts are wrong.

The juicy bits, the part that satisfied some of my long-held curiosity, are in this section on Watson and its life as a medical adjunct (Note: Links have been removed),

IBM’s attempt to promote its supercomputer programme to cancer doctors (Watson for Oncology) was a PR [public relations] disaster. The AI promised to deliver top-quality recommendations on the treatment of 12 cancers that accounted for 80% of the world’s cases. As of today, over 14,000 patients worldwide have received advice based on its calculations.

But when doctors first interacted with Watson they found themselves in a rather difficult situation. On the one hand, if Watson provided guidance about a treatment that coincided with their own opinions, physicians did not see much value in Watson’s recommendations. The supercomputer was simply telling them what they already know, and these recommendations did not change the actual treatment. This may have given doctors some peace of mind, providing them with more confidence in their own decisions. But IBM has yet to provide evidence that Watson actually improves cancer survival rates.

On the other hand, if Watson generated a recommendation that contradicted the experts’ opinion, doctors would typically conclude that Watson wasn’t competent. And the machine wouldn’t be able to explain why its treatment was plausible because its machine learning algorithms were simply too complex to be fully understood by humans. Consequently, this has caused even more mistrust and disbelief, leading many doctors to ignore the seemingly outlandish AI recommendations and stick to their own expertise.

As a result, IBM Watson’s premier medical partner, the MD Anderson Cancer Center, recently announced it was dropping the programme. Similarly, a Danish hospital reportedly abandoned the AI programme after discovering that its cancer doctors disagreed with Watson in over two thirds of cases.

The problem with Watson for Oncology was that doctors simply didn’t trust it. Human trust is often based on our understanding of how other people think and having experience of their reliability. …

It seems to me there might be a bit more to the doctors’ trust issues and I was surprised it didn’t seem to have occurred to Polonski. Then I did some digging (from Polonski’s webpage on the Oxford Internet Institute website),

Vyacheslav Polonski (@slavacm) is a DPhil [PhD] student at the Oxford Internet Institute. His research interests are located at the intersection of network science, media studies and social psychology. Vyacheslav’s doctoral research examines the adoption and use of social network sites, focusing on the effects of social influence, social cognition and identity construction.

Vyacheslav is a Visiting Fellow at Harvard University and a Global Shaper at the World Economic Forum. He was awarded the Master of Science degree with Distinction in the Social Science of the Internet from the University of Oxford in 2013. He also obtained the Bachelor of Science degree with First Class Honours in Management from the London School of Economics and Political Science (LSE) in 2012.

Vyacheslav was honoured at the British Council International Student of the Year 2011 awards, and was named UK’s Student of the Year 2012 and national winner of the Future Business Leader of the Year 2012 awards by TARGETjobs.

Previously, he has worked as a management consultant at Roland Berger Strategy Consultants and gained further work experience at the World Economic Forum, PwC, Mars, Bertelsmann and Amazon.com. Besides, he was involved in several start-ups as part of the 2012 cohort of Entrepreneur First and as part of the founding team of the London office of Rocket Internet. Vyacheslav was the junior editor of the bi-lingual book ‘Inspire a Nation‘ about Barack Obama’s first presidential election campaign. In 2013, he was invited to be a keynote speaker at the inaugural TEDx conference of IE University in Spain to discuss the role of a networked mindset in everyday life.

Vyacheslav is fluent in German, English and Russian, and is passionate about new technologies, social entrepreneurship, philanthropy, philosophy and modern art.

Research interests

Network science, social network analysis, online communities, agency and structure, group dynamics, social interaction, big data, critical mass, network effects, knowledge networks, information diffusion, product adoption

Positions held at the OII

DPhil student, October 2013 –

MSc Student, October 2012 – August 2013

Polonski doesn’t seem to have any experience dealing with, participating in, or studying the medical community. Getting a doctor to admit that his or her approach to a particular patient’s condition was wrong or misguided runs counter to their training and, by extension, the institution of medicine. Also, one of the biggest problems in any field is getting people to change and it’s not always about trust. In this instance, you’re asking a doctor to back someone else’s opinion after he or she has rendered theirs. This is difficult even when the other party is another human doctor let alone a form of artificial intelligence.

If you want to get a sense of just how hard it is to get someone to back down after they’ve committed to a position, read this January 10, 2018 essay by Lara Bazelon, an associate professor at the University of San Francisco School of Law. This is just one of the cases (Note: Links have been removed),

Davontae Sanford was 14 years old when he confessed to murdering four people in a drug house on Detroit’s East Side. Left alone with detectives in a late-night interrogation, Sanford says he broke down after being told he could go home if he gave them “something.” On the advice of a lawyer whose license was later suspended for misconduct, Sanford pleaded guilty in the middle of his March 2008 trial and received a sentence of 39 to 92 years in prison.

Sixteen days after Sanford was sentenced, a hit man named Vincent Smothers told the police he had carried out 12 contract killings, including the four Sanford had pleaded guilty to committing. Smothers explained that he’d worked with an accomplice, Ernest Davis, and he provided a wealth of corroborating details to back up his account. Smothers told police where they could find one of the weapons used in the murders; the gun was recovered and ballistics matched it to the crime scene. He also told the police he had used a different gun in several of the other murders, which ballistics tests confirmed. Once Smothers’ confession was corroborated, it was clear Sanford was innocent. Smothers made this point explicitly in a 2015 affidavit, emphasizing that Sanford hadn’t been involved in the crimes “in any way.”

Guess what happened? (Note: Links have been removed),

But Smothers and Davis were never charged. Neither was Leroy Payne, the man Smothers alleged had paid him to commit the murders. …

Davontae Sanford, meanwhile, remained behind bars, locked up for crimes he very clearly didn’t commit.

Police failed to turn over all the relevant information in Smothers’ confession to Sanford’s legal team, as the law required them to do. When that information was leaked in 2009, Sanford’s attorneys sought to reverse his conviction on the basis of actual innocence. Wayne County Prosecutor Kym Worthy fought back, opposing the motion all the way to the Michigan Supreme Court. In 2014, the court sided with Worthy, ruling that actual innocence was not a valid reason to withdraw a guilty plea [emphasis mine]. Sanford would remain in prison for another two years.

…

Doctors are just as invested in their opinions and professional judgments as lawyers are, with the prosecutor and the judges on the Michigan Supreme Court being cases in point.

There is one more problem. From the doctor’s (or anyone else’s) perspective, if the AI is making the decisions, why does he or she need to be there? At best, it’s as if AI were turning the doctor into its servant or, at worst, replacing the doctor. Polonski alludes to the problem in one of his solutions to the ‘trust’ issue (Note: A link has been removed),

…

Research suggests involving people more in the AI decision-making process could also improve trust and allow the AI to learn from human experience. For example, one study showed that people who were given the freedom to slightly modify an algorithm felt more satisfied with its decisions, more likely to believe it was superior and more likely to use it in the future.

…

Having input into the AI decision-making process somewhat addresses one of the problems but the commitment to one’s own judgment even when there is overwhelming evidence to the contrary is a perennially thorny problem. The legal case mentioned here earlier is clearly one where the contrarian is wrong but it’s not always that obvious. As well, sometimes, people who hold out against the majority are right.

US Army

Getting back to building trust, it turns out the US Army Research Laboratory is also interested in transparency where AI is concerned (from a January 11, 2018 US Army news release on EurekAlert),

U.S. Army Research Laboratory [ARL] scientists developed ways to improve collaboration between humans and artificially intelligent agents in two projects recently completed for the Autonomy Research Pilot Initiative supported by the Office of Secretary of Defense. They did so by enhancing the agent transparency [emphasis mine], which refers to a robot, unmanned vehicle, or software agent’s ability to convey to humans its intent, performance, future plans, and reasoning process.

“As machine agents become more sophisticated and independent, it is critical for their human counterparts to understand their intent, behaviors, reasoning process behind those behaviors, and expected outcomes so the humans can properly calibrate their trust [emphasis mine] in the systems and make appropriate decisions,” explained ARL’s Dr. Jessie Chen, senior research psychologist.

The U.S. Defense Science Board, in a 2016 report, identified six barriers to human trust in autonomous systems, with ‘low observability, predictability, directability and auditability’ as well as ‘low mutual understanding of common goals’ being among the key issues.

In order to address these issues, Chen and her colleagues developed the Situation awareness-based Agent Transparency, or SAT, model and measured its effectiveness on human-agent team performance in a series of human factors studies supported by the ARPI. The SAT model deals with the information requirements from an agent to its human collaborator in order for the human to obtain effective situation awareness of the agent in its tasking environment. At the first SAT level, the agent provides the operator with the basic information about its current state and goals, intentions, and plans. At the second level, the agent reveals its reasoning process as well as the constraints/affordances that the agent considers when planning its actions. At the third SAT level, the agent provides the operator with information regarding its projection of future states, predicted consequences, likelihood of success/failure, and any uncertainty associated with the aforementioned projections.

In one of the ARPI projects, IMPACT, a research program on human-agent teaming for management of multiple heterogeneous unmanned vehicles, ARL’s experimental effort focused on examining the effects of levels of agent transparency, based on the SAT model, on human operators’ decision making during military scenarios. The results of a series of human factors experiments collectively suggest that transparency on the part of the agent benefits the human’s decision making and thus the overall human-agent team performance. More specifically, researchers said the human’s trust in the agent was significantly better calibrated (accepting the agent’s plan when it is correct and rejecting it when it is incorrect) when the agent had a higher level of transparency.

The other project related to agent transparency that Chen and her colleagues performed under the ARPI was Autonomous Squad Member, on which ARL collaborated with Naval Research Laboratory scientists. The ASM is a small ground robot that interacts with and communicates with an infantry squad. As part of the overall ASM program, Chen’s group developed transparency visualization concepts, which they used to investigate the effects of agent transparency levels on operator performance. Informed by the SAT model, the ASM’s user interface features an at-a-glance transparency module where user-tested iconographic representations of the agent’s plans, motivator, and projected outcomes are used to promote transparent interaction with the agent. A series of human factors studies on the ASM’s user interface have investigated the effects of agent transparency on the human teammate’s situation awareness, trust in the ASM, and workload. The results, consistent with the IMPACT project’s findings, demonstrated the positive effects of agent transparency on the human’s task performance without increase of perceived workload. The research participants also reported that they felt the ASM was more trustworthy, intelligent, and human-like when it conveyed greater levels of transparency.

Chen and her colleagues are currently expanding the SAT model into bidirectional transparency between the human and the agent.

“Bidirectional transparency, although conceptually straightforward–human and agent being mutually transparent about their reasoning process–can be quite challenging to implement in real time. However, transparency on the part of the human should support the agent’s planning and performance–just as agent transparency can support the human’s situation awareness and task performance, which we have demonstrated in our studies,” Chen hypothesized.

The challenge is to design the user interfaces, which can include visual, auditory, and other modalities, that can support bidirectional transparency dynamically, in real time, while not overwhelming the human with too much information and burden.
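For anyone who finds it easier to think in code, here is a minimal sketch of my own (not ARL’s software) of how the three SAT levels and the idea of calibrated trust might be represented. Everything in it is an assumption made for illustration: the `TransparencyReport` class, its field names, the `view` method, and the `calibration_rate` function are hypothetical, and treating the levels as cumulative (level 2 includes level 1, and so on) is my reading rather than something the news release states. The calibration measure is simply the share of trials in which the operator accepted a correct plan or rejected an incorrect one; the researchers may well have measured it differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TransparencyReport:
    """Hypothetical container for what an agent tells its human operator."""

    # SAT level 1: current state, goals, intentions, and plans
    state: str
    goals: List[str]
    plan: List[str]
    # SAT level 2: reasoning process and the constraints/affordances considered
    reasoning: str = ""
    constraints: List[str] = field(default_factory=list)
    # SAT level 3: projected outcome, likelihood of success, and uncertainty
    projected_outcome: str = ""
    success_likelihood: float = 0.0
    uncertainty_note: str = ""

    def view(self, level: int) -> Dict[str, object]:
        """Return the fields disclosed at a given SAT level (1, 2, or 3).

        Assumption: levels are cumulative, so a higher level shows
        everything the lower levels show plus its own fields.
        """
        if level not in (1, 2, 3):
            raise ValueError("SAT level must be 1, 2, or 3")
        shown: Dict[str, object] = {
            "state": self.state,
            "goals": self.goals,
            "plan": self.plan,
        }
        if level >= 2:
            shown["reasoning"] = self.reasoning
            shown["constraints"] = self.constraints
        if level >= 3:
            shown["projected_outcome"] = self.projected_outcome
            shown["success_likelihood"] = self.success_likelihood
            shown["uncertainty_note"] = self.uncertainty_note
        return shown


def calibration_rate(trials: List[Tuple[bool, bool]]) -> float:
    """Share of trials where the operator's decision matched the plan's quality.

    Each trial is (plan_was_correct, operator_accepted). A trial counts as
    calibrated when a correct plan was accepted or an incorrect plan rejected.
    """
    if not trials:
        raise ValueError("need at least one trial")
    matches = sum(1 for correct, accepted in trials if correct == accepted)
    return matches / len(trials)
```

Calling `view(1)` on a report would show only the agent’s state, goals, and plan, while `view(3)` would add its reasoning, constraints, projected outcome, and uncertainty. Comparing `calibration_rate` across groups of operators shown different levels is one simple way to picture the kind of comparison the news release describes.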

Interesting, yes? Here’s a link and a citation for the paper,

Situation Awareness-based Agent Transparency and Human-Autonomy Teaming Effectiveness by Jessie Y.C. Chen, Shan G. Lakhmani, Kimberly Stowers, Anthony R. Selkowitz, Julia L. Wright, and Michael Barnes. Theoretical Issues in Ergonomics Science May 2018. DOI 10.1080/1463922X.2017.1315750

This paper is behind a paywall.

4 thoughts on “How to get people to trust artificial intelligence”

Susan Baxter April 2, 2018 at 3:03 pm

I found this statement fascinating:
“And the machine wouldn’t be able to explain why its treatment was plausible because its machine learning algorithms were simply too complex to be fully understood by humans. ”
Well, as a human, I would rather have liked to know at least what factors the Watson/some AI system takes into account. For example are patient factors considered? I realize that medicine has apparently become a linear, deductive process in the minds of the tech crowd, but the medical dynamic includes patient, doctor, treatment options, evidence and a host of other things. Like who the patient is, what his or her risk tolerance might be, their socio economic status, etc etc.
And why would anyone – doctor, tailour, tinker, spy – trust a system that refuses to let one in on its reasoning?

Maryse de la Giroday Post author April 3, 2018 at 6:37 pm

Hi Susan!
Yes, it does seem astounding that anyone would design a system that is completely opaque to the user while apparently offering that user an expert opinion. At a guess, some of the designers have unconsciously applied the same design guidelines used to create refrigerators, washing machines, etc. We don’t need to know how those machines work; we press a button and it does what we expect. On the other hand, Watson (and others of its ilk) introduces a completely different paradigm and the developers and designers are being slow to realize how it affects their own work. I have a brief story illustrating my point: a tech firm I was working for hired a graphic design firm to create marketing collateral. The tech firm had developed a process that was going to completely change how collateral was printed. The graphic design firm failed to recognize that if the printing process was going to change then their preparation process would also have to change. And they didn’t have a problem just once; it occurred with several different pieces of collateral.

Thank you for emphasizing that point as I think it got lost in the other things I was trying to say. And, most especially, thank you for dropping by to leave a comment.

Cheers,
Maryse

Shaun January 23, 2019 at 10:11 am

Thanks for the article and for the pointer to the next evolution of the SAT model; very helpful. One of the areas that I’ve been looking at is Mary Cummings’ work on the relative strengths of computers and human information processing with varying uncertainty (the citation to the paper is below). The model suggests that the line between what humans are better at and what machines are better at is linear, yet it would appear that the model requires another dimension, that of SA. When then considering how this SA between the human and the computer impacts the linear line drawn, it may remove the linearity, increasing the relative strength of computers. Any thoughts?

Thanks Shaun

Cummings, M. (2018) Informing Autonomous System Design Through the Lens of Skills-, Rule-, and Knowledge-Based Behaviours. Journal of Cognitive Engineering and Decision Making 2018, Volume 12, Number 1, March 2018, pp. 58-61.

Maryse de la Giroday Post author January 23, 2019 at 3:41 pm

Hello Shaun! Thanks for dropping by and leaving this note. I wasn’t familiar with Cummings’ work and it is fascinating. I particularly appreciate this bit, “Moreover, judgment and intuition, which often make traditional engineers uncomfortable because these concepts lack a mathematical formal representation, are the key behaviors that allow experts to quickly assess a situation in a fast and frugal method …” from the paper you cited. On another note, I spent a fair chunk of time trying to find out what ‘SA’ stands for and wasn’t particularly successful. Consequently, I’m unable to respond to your question in any meaningful fashion. If you have the time, I’d be thrilled if you could clear up my confusion about ‘SA’ and, perhaps then, I could respond. Cheers, Maryse

FrogHeart

Commentary about nanotech, science policy and communication, society, and the arts
