Tag Archives: Cathy O’Neil

Predictive policing in Vancouver—the first jurisdiction in Canada to employ a machine learning system for property theft reduction

Predictive policing has come to Canada, specifically, Vancouver. A July 22, 2017 article by Matt Meuse for the Canadian Broadcasting Corporation (CBC) news online describes the new policing tool,

The Vancouver Police Department is implementing a city-wide “predictive policing” system that uses machine learning to prevent break-ins by predicting where they will occur before they happen — the first of its kind in Canada.

Police chief Adam Palmer said that, after a six-month pilot project in 2016, the system is now accessible to all officers via their cruisers’ onboard computers, covering the entire city.

“Instead of officers just patrolling randomly throughout the neighbourhood, this will give them targeted areas it makes more sense to patrol in because there’s a higher likelihood of crime to occur,” Palmer said.

 

Things got off to a slow start as the system familiarized itself [during a 2016 pilot project] with the data, and floundered in the fall due to unexpected data corruption.

But Special Const. Ryan Prox said the system reduced property crime by as much as 27 per cent in areas where it was tested, compared to the previous four years.

The accuracy of the system was also tested by having it generate predictions for a given day, and then watching to see what happened that day without acting on the predictions.

Palmer said the system was getting accuracy rates between 70 and 80 per cent.

When a location is identified by the system, Palmer said officers can be deployed to patrol that location. …

“Quite often … that visible presence will deter people from committing crimes [altogether],” Palmer said.

Though similar systems are used in the United States, Palmer said the system is the first of its kind in Canada, and was developed specifically for the VPD.

While the current focus is on residential break-ins, Palmer said the system could also be tweaked for use with car theft — though likely not with violent crime, which is far less predictable.

Palmer dismissed the inevitable comparison to the 2002 Tom Cruise film Minority Report, in which people are arrested to prevent them from committing crimes in the future.

“We’re not targeting people, we’re targeting locations,” Palmer said. “There’s nothing dark here.”

If you want to get a sense of just how dismissive Chief Palmer was, there’s a July 21, 2017 press conference (run time: approx. 21 mins.) embedded with a media release of the same date. The media release offered these details,

The new model is being implemented after the VPD ran a six-month pilot study in 2016 that contributed to a substantial decrease in residential break-and-enters.

The pilot ran from April 1 to September 30, 2016. The number of residential break-and enters during the test period was compared to the monthly average over the same period for the previous four years (2012 to 2015). The highest drop in property crime – 27 per cent – was measured in June.

The new model provides data in two-hour intervals for locations where residential and commercial break-and-enters are anticipated. The information is for 100-metre and 500-metre zones. Police resources can be dispatched to that area on foot or in patrol cars, to provide a visible presence to deter thieves.

The VPD’s new predictive policing model is built on GEODASH – an advanced machine-learning technology that was implemented by the VPD in 2015. A public version of GEODASH was introduced in December 2015 and is publicly available on vpd.ca. It retroactively plots the location of crimes on a map to provide a general idea of crime trends to the public.
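The comparison the media release describes (a pilot month measured against the average for the same month over the previous four years) is simple arithmetic. Here is a minimal sketch in Python of how such a figure could be calculated. The counts below are invented for illustration and are not VPD data; the function name is my own.

```python
from statistics import mean

def percent_change_vs_baseline(pilot_count, baseline_counts):
    """Percent change of a pilot-month count against the mean of
    the same month's counts in earlier years."""
    baseline = mean(baseline_counts)
    if baseline == 0:
        raise ValueError("baseline average is zero; percent change is undefined")
    return (pilot_count - baseline) / baseline * 100.0

# Hypothetical June break-in counts for 2012 to 2015 (NOT real VPD figures).
june_baseline = [210, 190, 205, 195]  # average works out to 200
june_pilot = 146                      # hypothetical June 2016 count

change = percent_change_vs_baseline(june_pilot, june_baseline)
print(f"{change:.1f}%")  # prints -27.0%
```

A figure calculated this way shows only that the count was lower than the baseline; on its own it cannot separate the effect of the system from anything else that changed over the same period.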

I wish Chief Palmer had been a bit more open to discussion about the implications of ‘predictive policing’. In the US, where these systems have been employed in various jurisdictions, some concern has arisen after an almost euphoric initial response, as a Nov. 21, 2016 article by Logan Koepke for slate.com notes (Note: Links have been removed),

When predictive policing systems began rolling out nationwide about five years ago, coverage was often uncritical and overly reliant on references to Minority Report’s precog system. The coverage made predictive policing—the computer systems that attempt to use data to forecast where crime will happen or who will be involved—seem almost magical.

Typically, though, articles glossed over Minority Report’s moral about how such systems can go awry. Even Slate wasn’t immune, running a piece in 2011 called “Time Cops” that said, when it came to these systems, “Civil libertarians can rest easy.”

This soothsaying language extended beyond just media outlets. According to former New York City Police Commissioner William Bratton, predictive policing is the “wave of the future.” Microsoft agrees. One vendor even markets its system as “better than a crystal ball.” More recent coverage has rightfully been more balanced, skeptical, and critical. But many still seem to miss an important point: When it comes to predictive policing, what matters most isn’t the future—it’s the past.

Some predictive policing systems incorporate information like the weather, a location’s proximity to a liquor store, or even commercial data brokerage information. But at their core, they rely either mostly or entirely on historical crime data held by the police. Typically, these are records of reported crimes—911 calls or “calls for service”—and other crimes the police detect. Software automatically looks for historical patterns in the data, and uses those patterns to make its forecasts—a process known as machine learning.

Intuitively, it makes sense that predictive policing systems would base their forecasts on historical crime data. But historical crime data has limits. Criminologists have long emphasized that crime reports—and other statistics gathered by the police—do not necessarily offer an accurate picture of crime in a community. The Department of Justice’s National Crime Victimization Survey estimates that from 2006 to 2010, 52 percent of violent crime went unreported to police, as did 60 percent of household property crime. Essentially: Historical crime data is a direct record of how law enforcement responds to particular crimes, rather than the true rate of crime. Rather than predicting actual criminal activity, then, the current systems are probably better at predicting future police enforcement.

Koepke goes on to cover other potential issues with ‘predictive policing’ in this thoughtful piece. He also co-authored an August 2016 report, Stuck in a Pattern: Early evidence on “predictive” policing and civil rights.
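Koepke’s point about historical crime data can be illustrated with a toy calculation. This is a minimal sketch in Python, assuming two hypothetical areas with identical true incident counts but different reporting rates, and a system that allocates patrol attention in proportion to recorded incidents. The area names, rates, and allocation rule are all my assumptions for illustration; none of this describes how the VPD system or any other real system works.

```python
def recorded_counts(true_counts, reporting_rates):
    """Expected number of incidents that reach police records in each area."""
    return {area: true_counts[area] * reporting_rates[area] for area in true_counts}

def patrol_shares(recorded):
    """Share of patrol attention when it is allocated in proportion
    to recorded (not true) incidents."""
    total = sum(recorded.values())
    return {area: count / total for area, count in recorded.items()}

# Hypothetical areas: the SAME true number of incidents in each.
true_counts = {"area_a": 100, "area_b": 100}
# Invented reporting rates: residents of area_a report twice as often.
reporting_rates = {"area_a": 0.6, "area_b": 0.3}

recorded = recorded_counts(true_counts, reporting_rates)
shares = patrol_shares(recorded)
print(recorded)
print(shares)
```

In this toy setup, area_a receives about two-thirds of the attention even though the underlying rate of crime is the same in both areas, which is the sense in which the data reflects reporting and enforcement rather than crime itself.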

There seems to be increasing attention on machine learning and bias, as noted in my May 24, 2017 posting, where I provide links to other FrogHeart postings on the topic. There’s also this Feb. 28, 2017 posting about a new regional big data sharing project, the Cascadia Urban Analytics Cooperative, where I mention Cathy O’Neil (author of the book, Weapons of Math Destruction) and her critique in a subsection titled: Algorithms and big data.

I would like to see some oversight and some discussion in Canada about this brave new world of big data.

One final comment: it is possible to get access to the Vancouver Police Department’s data through the City of Vancouver’s Open Data Catalogue (home page).

Robots in Vancouver and in Canada (two of two)

This is the second of a two-part posting about robots in Vancouver and Canada. The first part included a definition, a brief mention of a robot ethics quandary, and sexbots. This part is all about the future. (Part one is here.)

Canadian Robotics Strategy

Meetings were held Sept. 28 – 29, 2017 in, surprisingly, Vancouver. (For those who don’t know, this is surprising because most of the robotics and AI research seems to be concentrated in eastern Canada. If you don’t believe me, take a look at the speaker list for Day 2 or the ‘Canadian Stakeholder’ meeting day.) From the NSERC (Natural Sciences and Engineering Research Council) events page of the Canadian Robotics Network,

Join us as we gather robotics stakeholders from across the country to initiate the development of a national robotics strategy for Canada. Sponsored by the Natural Sciences and Engineering Research Council of Canada (NSERC), this two-day event coincides with the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017) in order to leverage the experience of international experts as we explore Canada’s need for a national robotics strategy.

Where
Vancouver, BC, Canada

When
Thursday September 28 & Friday September 29, 2017 — Save the date!

Download the full agenda and speakers’ list here.

Objectives

The purpose of this two-day event is to gather members of the robotics ecosystem from across Canada to initiate the development of a national robotics strategy that builds on our strengths and capacities in robotics, and is uniquely tailored to address Canada’s economic needs and social values.

This event has been sponsored by the Natural Sciences and Engineering Research Council of Canada (NSERC) and is supported in kind by the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017) as an official Workshop of the conference.  The first of two days coincides with IROS 2017 – one of the premiere robotics conferences globally – in order to leverage the experience of international robotics experts as we explore Canada’s need for a national robotics strategy here at home.

Who should attend

Representatives from industry, research, government, startups, investment, education, policy, law, and ethics who are passionate about building a robust and world-class ecosystem for robotics in Canada.

Program Overview

Download the full agenda and speakers’ list here.

DAY ONE: IROS Workshop 

“Best practices in designing effective roadmaps for robotics innovation”

Thursday September 28, 2017 | 8:30am – 5:00pm | Vancouver Convention Centre

Morning Program: “Developing robotics innovation policy and establishing key performance indicators that are relevant to your region” Leading international experts share their experience designing robotics strategies and policy frameworks in their regions and explore international best practices. Opening Remarks by Prof. Hong Zhang, IROS 2017 Conference Chair.

Afternoon Program: “Understanding the Canadian robotics ecosystem” Canadian stakeholders from research, industry, investment, ethics and law provide a collective overview of the Canadian robotics ecosystem. Opening Remarks by Ryan Gariepy, CTO of Clearpath Robotics.

Thursday Evening Program: Sponsored by Clearpath Robotics  Workshop participants gather at a nearby restaurant to network and socialize.

Learn more about the IROS Workshop.

DAY TWO: NSERC-Sponsored Canadian Robotics Stakeholder Meeting
“Towards a national robotics strategy for Canada”

Friday September 29, 2017 | 8:30am – 5:00pm | University of British Columbia (UBC)

On the second day of the program, robotics stakeholders from across the country gather at UBC for a full day brainstorming session to identify Canada’s unique strengths and opportunities relative to the global competition, and to align on a strategic vision for robotics in Canada.

Friday Evening Program: Sponsored by NSERC Meeting participants gather at a nearby restaurant for the event’s closing dinner reception.

Learn more about the Canadian Robotics Stakeholder Meeting.

I was glad to see in the agenda that some of the international speakers represented research efforts from outside the usual Europe/US axis.

I have been in touch with one of the organizers (also mentioned in part one with regard to robot ethics), Ajung Moon (her website is here), who says that there will be a white paper available on the Canadian Robotics Network website at some point in the future. I’ll keep looking for it and, in the meantime, I wonder what the 2018 Canadian federal budget will offer robotics.

Robots and popular culture

For anyone living in Canada or the US, Westworld (television series) is probably the most recent and well known ‘robot’ drama to premiere in the last year. As for movies, I think Ex Machina from 2014 probably qualifies in that category. Interestingly, both Westworld and Ex Machina seem quite concerned with sex, with Westworld adding significant doses of violence as another concern.

I am going to focus on another robot story, the 2012 movie, Robot & Frank, which features a care robot and an older man,

Frank (played by Frank Langella), a former jewel thief, teaches a robot the skills necessary to rob some neighbours of their valuables. The ethical issue broached in the film isn’t whether or not the robot should learn the skills and assist Frank in his thieving ways although that’s touched on when Frank keeps pointing out that planning his heist requires he live more healthily. No, the problem arises afterward when the neighbour accuses Frank of the robbery and Frank removes what he believes is all the evidence. He believes he’s going to successfully evade arrest until the robot notes that Frank will have to erase its memory in order to remove all of the evidence. The film ends without the robot’s fate being made explicit.

In a way, I find the ethics query (was the robot Frank’s friend or just a machine?) posed in the film more interesting than the one in Vikander’s story, an issue which does have a history. For example, care aides, nurses, and/or servants would have dealt with requests to give an alcoholic patient a drink. Wouldn’t there already be established guidelines and practices which could be adapted for robots? Or, is this question made anew by something intrinsically different about robots?

To be clear, Vikander’s story is a good introduction and starting point for these kinds of discussions as is Moon’s ethical question. But they are starting points and I hope one day there’ll be a more extended discussion of the questions raised by Moon and noted in Vikander’s article (a two- or three-part series of articles? public discussions?).

How will humans react to robots?

Earlier there was the contention that intimate interactions with robots and sexbots would decrease empathy and the ability of human beings to interact with each other in caring ways. This sounds a bit like the argument about smartphones/cell phones and teenagers who don’t relate well to others in real life because most of their interactions are mediated through a screen, which many seem to prefer. It may be partially true but, arguably, books too are an antisocial technology as noted in Walter J. Ong’s influential 1982 book, ‘Orality and Literacy’ (from the Walter J. Ong Wikipedia entry),

A major concern of Ong’s works is the impact that the shift from orality to literacy has had on culture and education. Writing is a technology like other technologies (fire, the steam engine, etc.) that, when introduced to a “primary oral culture” (which has never known writing) has extremely wide-ranging impacts in all areas of life. These include culture, economics, politics, art, and more. Furthermore, even a small amount of education in writing transforms people’s mentality from the holistic immersion of orality to interiorization and individuation. [emphases mine]

So, robotics and artificial intelligence would not be the first technologies to affect our brains and our social interactions.

There’s another area where human-robot interaction may have unintended personal consequences according to April Glaser’s Sept. 14, 2017 article on Slate.com (Note: Links have been removed),

The customer service industry is teeming with robots. From automated phone trees to touchscreens, software and machines answer customer questions, complete orders, send friendly reminders, and even handle money. For an industry that is, at its core, about human interaction, it’s increasingly being driven to a large extent by nonhuman automation.

But despite the dreams of science-fiction writers, few people enter a customer-service encounter hoping to talk to a robot. And when the robot malfunctions, as they so often do, it’s a human who is left to calm angry customers. It’s understandable that after navigating a string of automated phone menus and being put on hold for 20 minutes, a customer might take her frustration out on a customer service representative. Even if you know it’s not the customer service agent’s fault, there’s really no one else to get mad at. It’s not like a robot cares if you’re angry.

When human beings need help with something, says Madeleine Elish, an anthropologist and researcher at the Data and Society Institute who studies how humans interact with machines, they’re not only looking for the most efficient solution to a problem. They’re often looking for a kind of validation that a robot can’t give. “Usually you don’t just want the answer,” Elish explained. “You want sympathy, understanding, and to be heard”—none of which are things robots are particularly good at delivering. In a 2015 survey of over 1,300 people conducted by researchers at Boston University, over 90 percent of respondents said they start their customer service interaction hoping to speak to a real person, and 83 percent admitted that in their last customer service call they trotted through phone menus only to make their way to a human on the line at the end.

“People can get so angry that they have to go through all those automated messages,” said Brian Gnerer, a call center representative with AT&T in Bloomington, Minnesota. “They’ve been misrouted or been on hold forever or they pressed one, then two, then zero to speak to somebody, and they are not getting where they want.” And when people do finally get a human on the phone, “they just sigh and are like, ‘Thank God, finally there’s somebody I can speak to.’ ”

Even if robots don’t always make customers happy, more and more companies are making the leap to bring in machines to take over jobs that used to specifically necessitate human interaction. McDonald’s and Wendy’s both reportedly plan to add touchscreen self-ordering machines to restaurants this year. Facebook is saturated with thousands of customer service chatbots that can do anything from hail an Uber, retrieve movie times, to order flowers for loved ones. And of course, corporations prefer automated labor. As Andy Puzder, CEO of the fast-food chains Carl’s Jr. and Hardee’s and former Trump pick for labor secretary, bluntly put it in an interview with Business Insider last year, robots are “always polite, they always upsell, they never take a vacation, they never show up late, there’s never a slip-and-fall, or an age, sex, or race discrimination case.”

But those robots are backstopped by human beings. How does interacting with more automated technology affect the way we treat each other? …

“We know that people treat artificial entities like they’re alive, even when they’re aware of their inanimacy,” writes Kate Darling, a researcher at MIT who studies ethical relationships between humans and robots, in a recent paper on anthropomorphism in human-robot interaction. Sure, robots don’t have feelings and don’t feel pain (not yet, anyway). But as more robots rely on interaction that resembles human interaction, like voice assistants, the way we treat those machines will increasingly bleed into the way we treat each other.

It took me a while to realize that what Glaser is talking about are AI systems and not robots as such. (sigh) It’s so easy to conflate the concepts.

AI ethics (Toby Walsh and Suzanne Gildert)

Jack Stilgoe of the Guardian published a brief Oct. 9, 2017 introduction to his more substantive (30 mins.?) podcast interview with Dr. Toby Walsh where they discuss stupid AI amongst other topics (Note: A link has been removed),

Professor Toby Walsh has recently published a book – Android Dreams – giving a researcher’s perspective on the uncertainties and opportunities of artificial intelligence. Here, he explains to Jack Stilgoe that we should worry more about the short-term risks of stupid AI in self-driving cars and smartphones than the speculative risks of super-intelligence.

Professor Walsh discusses the effects that AI could have on our jobs, the shapes of our cities and our understandings of ourselves. As someone developing AI, he questions the hype surrounding the technology. He is scared by some drivers’ real-world experimentation with their not-quite-self-driving Teslas. And he thinks that Siri needs to start owning up to being a computer.

I found this discussion to cast a decidedly different light on the future of robotics and AI. Walsh is much more interested in discussing immediate issues like the problems posed by ‘self-driving’ cars. (Aside: Should we be calling them robot cars?)

One ethical issue Walsh raises is with data regarding accidents. He compares what’s happening with accident data from self-driving (robot) cars to how the aviation industry handles accidents. Hint: accident data involving airplanes is shared. Would you like to guess who does not share their data?

Sharing and analyzing data and developing new safety techniques based on that data has made flying a remarkably safe transportation technology. Walsh argues the same could be done for self-driving cars if companies like Tesla took the attitude that safety is in everyone’s best interests and shared their accident data in a scheme similar to the aviation industry’s.

In an Oct. 12, 2017 article by Matthew Braga for Canadian Broadcasting Corporation (CBC) news online another ethical issue is raised by Suzanne Gildert (a participant in the Canadian Robotics Roadmap/Strategy meetings mentioned earlier here), Note: Links have been removed,

… Suzanne Gildert, the co-founder and chief science officer of Vancouver-based robotics company Kindred. Since 2014, her company has been developing intelligent robots [emphasis mine] that can be taught by humans to perform automated tasks — for example, handling and sorting products in a warehouse.

The idea is that when one of Kindred’s robots encounters a scenario it can’t handle, a human pilot can take control. The human can see, feel and hear the same things the robot does, and the robot can learn from how the human pilot handles the problematic task.

This process, called teleoperation, is one way to fast-track learning by manually showing the robot examples of what its trainers want it to do. But it also poses a potential moral and ethical quandary that will only grow more serious as robots become more intelligent.

“That AI is also learning my values,” Gildert explained during a talk on robot ethics at the Singularity University Canada Summit in Toronto on Wednesday [Oct. 11, 2017]. “Everything — my mannerisms, my behaviours — is all going into the AI.”

At its worst, everything from algorithms used in the U.S. to sentence criminals to image-recognition software has been found to inherit the racist and sexist biases of the data on which it was trained.

But just as bad habits can be learned, good habits can be learned too. The question is, if you’re building a warehouse robot like Kindred is, is it more effective to train those robots’ algorithms to reflect the personalities and behaviours of the humans who will be working alongside it? Or do you try to blend all the data from all the humans who might eventually train Kindred robots around the world into something that reflects the best strengths of all?

I notice Gildert distinguishes her robots as “intelligent robots” and then focuses on AI and issues with bias which have already arisen with regard to algorithms (see my May 24, 2017 posting about bias in machine learning and AI). Note: if you’re in Vancouver on Oct. 26, 2017 and interested in algorithms and bias, there’s a talk being given by Dr. Cathy O’Neil, author of Weapons of Math Destruction, on the topic of Gender and Bias in Algorithms. It’s not free but tickets are here.

Final comments

There is one more aspect I want to mention. Even as someone who usually deals with nanobots, it’s easy to start discussing robots as if the humanoid ones are the only ones that exist. To recapitulate, there are humanoid robots, utilitarian robots, intelligent robots, AI, nanobots, microscopic bots, and more, all of which raise questions about ethics and social impacts.

However, there is one more category I want to add to this list: cyborgs. They live amongst us now. Anyone who’s had a hip or knee replacement or a pacemaker or a deep brain stimulator or other such implanted device qualifies as a cyborg. Increasingly too, prosthetics are being introduced and made part of the body. My April 24, 2017 posting features this story,

This Case Western Reserve University (CWRU) video accompanies a March 28, 2017 CWRU news release, (h/t ScienceDaily March 28, 2017 news item)

Bill Kochevar grabbed a mug of water, drew it to his lips and drank through the straw.

His motions were slow and deliberate, but then Kochevar hadn’t moved his right arm or hand for eight years.

And it took some practice to reach and grasp just by thinking about it.

Kochevar, who was paralyzed below his shoulders in a bicycling accident, is believed to be the first person with quadriplegia in the world to have arm and hand movements restored with the help of two temporarily implanted technologies. [emphasis mine]

A brain-computer interface with recording electrodes under his skull, and a functional electrical stimulation (FES) system* activating his arm and hand, reconnect his brain to paralyzed muscles.

Does a brain-computer interface have an effect on the human brain and, if so, what might that be?

In any discussion (assuming there is funding for it) about ethics and social impact, we might want to invite the broadest range of people possible at an ‘earlyish’ stage (although we’re already pretty far down the ‘automation road’) or, as Jack Stilgoe and Toby Walsh note, technological determinism holds sway.

Once again here are links for the articles and information mentioned in this double posting,

That’s it!

ETA Oct. 16, 2017: Well, I guess that wasn’t quite ‘it’. BBC’s (British Broadcasting Corporation) Magazine published a thoughtful Oct. 15, 2017 piece titled: Can we teach robots ethics?

Simon Fraser University (Vancouver, Canada) and its president’s (Andrew Petter) dream colloquium: women in technology

I’m a little late with this event news (sadly, I only received the information yesterday, Sept. 20, 2017) but even with two event dates already past (happily, videos for the two events have been posted), there are still several “Women in Technology” events to attend or view live according to the Simon Fraser University (SFU) President’s Dream Colloquium: Women in Technology; Attracting, Retaining, and Promoting Diverse Talent’s webpage text by Wan Yee Lok,

Women in Technology: Attracting, Retaining and Promoting Diverse Talent is a seven-part public [emphasis mine] lecture series beginning on Sept. 13. Key experts from around the world will identify challenges to gender equity and discover solutions for improving recruitment, retention and leadership options for women.

Diversity and inclusion are critical to high-tech corporate success. Yet statistics reveal that less than 25 per cent of those working in the science, technology, engineering and math sectors (STEM) are women, and that they earn seven-and-a-half per cent less than men.

“There is a crucial need to achieve gender equality in the tech sector, especially at a time when it is growing faster than ever,” says colloquium organizer Lesley Shannon, an SFU engineering science professor. She holds the Natural Sciences and Engineering Research Council (NSERC) Chair for Women in Science and Engineering for the B.C. and Yukon region.

“We hope the colloquium will help people engage in a multidisciplinary dialogue about the value of creating more space in technology for women and other under-represented groups.”

Six of the lectures are free, except for Cathy O’Neil’s lecture on Oct. 26.

The President’s Dream Colloquium schedule is as follows:

Sept. 13: SFU KEY presents: We the Data
Juliette Powell, founder, Turing AI and WeTheData.org, author of 33 Million People in the Room

Sept. 14: Diversity 101: The Case for Diversity in Technology
Maria Klawe, president, Harvey Mudd College

Sept. 21: Women in Media and Advertising
Shari Graydon, catalyst, Informed Opinions

Oct. 12: Social Psychological Phenomena
Steven Spencer, the Robert K. and Dale J. Weary Chair in Social Psychology, Ohio State University

Oct. 26: Gender and Bias in Algorithmic Design
Cathy O’Neil, author, Weapons of Math Destruction [tickets are $5 for students; $15 for the rest of us; go here to buy tickets, click on green button in the upper right, below the banner; the event will be held at SFU’s Harbour Centre Vancouver location]

Nov. 9: Gendered Language
Danielle Gaucher, associate professor, Department of Psychology, University of Winnipeg

Nov. 23: Women as Leaders and Innovators
Jo Miller, founder, Be Leaderly

Lectures will be webcast live and available on the President’s Dream Colloquium website, www.sfu.ca/womenintech.

SFU engineering science professor Lesley Shannon is the colloquium organizer as well as the Natural Sciences and Engineering Research Council (NSERC) Chair for Women in Science and Engineering for the B.C. and Yukon region.

 

As a part of the colloquium, students can enroll in a graduate course covering a broad range of topics related to diversity in the technology sector. Shannon says the course will focus on women and their role in technology as well as issues that affect other under‐represented groups.

“I hope the course will establish a foundation for future managers, supervisors, sponsors, mentors and others wanting to pursue leadership roles to work towards creating a level playing field in technology and other industries,” says Shannon.

The colloquium course (SAR 897) is still accepting students. Visit go.sfu.ca to enroll.

A reminder after the last few paragraphs of the event text: you don’t actually have to be a student to attend the lectures, and for anyone who doesn’t want to make the trek up the hill (SFU is located on a hill in Burnaby, BC) for the majority of the events, there is the livestream video. For those who can’t make the scheduled times, the organizers seem to be pretty quick about uploading the videos afterwards, given that both the Sept. 13 and Sept. 14, 2017 event videos have been posted.

I have mentioned Cathy O’Neil here a couple of times, more substantively in a Feb. 28, 2017 posting about a major ‘big data’ collaboration between the province of BC and the state of Washington (for Cathy O’Neil, scroll down to the subsection titled: Algorithms and big data) and briefly at the end in a May 24, 2017 posting that was chiefly concerned with bias in algorithms.

Machine learning programs learn bias

The notion of bias in artificial intelligence (AI)/algorithms/robots is gaining prominence (links to other posts featuring algorithms and bias are at the end of this post). The latest research concerns machine learning where an artificial intelligence system trains itself with ordinary human language from the internet. From an April 13, 2017 American Association for the Advancement of Science (AAAS) news release on EurekAlert,

As artificial intelligence systems “learn” language from existing texts, they exhibit the same biases that humans do, a new study reveals. The results not only provide a tool for studying prejudicial attitudes and behavior in humans, but also emphasize how language is intimately intertwined with historical biases and cultural stereotypes. A common way to measure biases in humans is the Implicit Association Test (IAT), where subjects are asked to pair two concepts they find similar, in contrast to two concepts they find different; their response times can vary greatly, indicating how well they associated one word with another (for example, people are more likely to associate “flowers” with “pleasant,” and “insects” with “unpleasant”). Here, Aylin Caliskan and colleagues developed a similar way to measure biases in AI systems that acquire language from human texts; rather than measuring lag time, however, they used the statistical number of associations between words, analyzing roughly 2.2 million words in total. Their results demonstrate that AI systems retain biases seen in humans. For example, studies of human behavior show that the exact same resume is 50% more likely to result in an opportunity for an interview if the candidate’s name is European American rather than African-American. Indeed, the AI system was more likely to associate European American names with “pleasant” stimuli (e.g. “gift,” or “happy”). In terms of gender, the AI system also reflected human biases, where female words (e.g., “woman” and “girl”) were more associated than male words with the arts, compared to mathematics. In a related Perspective, Anthony G. Greenwald discusses these findings and how they could be used to further analyze biases in the real world.
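For anyone wondering what measuring “associations between words” might look like in practice, here is a minimal sketch in Python. It assumes words are represented as numeric vectors and that association is measured by cosine similarity, scoring a word by how much closer it sits to ‘pleasant’ words than to ‘unpleasant’ ones. The two-dimensional vectors are invented for illustration, the ‘unpleasant’ words are my own picks, and this is a simplification in the spirit of the test described in the news release, not the researchers’ actual code or data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word_vec, pleasant_vecs, unpleasant_vecs):
    """Mean similarity to the pleasant words minus mean similarity
    to the unpleasant words. Positive means 'leans pleasant'."""
    pleasant = sum(cosine(word_vec, p) for p in pleasant_vecs) / len(pleasant_vecs)
    unpleasant = sum(cosine(word_vec, q) for q in unpleasant_vecs) / len(unpleasant_vecs)
    return pleasant - unpleasant

# Toy 2-D vectors, invented for illustration (real systems learn
# vectors with hundreds of dimensions from large bodies of text).
vectors = {
    "flower": (0.9, 0.1),
    "insect": (0.1, 0.9),
    "happy": (1.0, 0.0),
    "gift": (0.8, 0.2),
    "filth": (0.0, 1.0),  # hypothetical 'unpleasant' word
    "pain": (0.2, 0.8),   # hypothetical 'unpleasant' word
}

pleasant = [vectors["happy"], vectors["gift"]]
unpleasant = [vectors["filth"], vectors["pain"]]

flower_score = association(vectors["flower"], pleasant, unpleasant)
insect_score = association(vectors["insect"], pleasant, unpleasant)
print(flower_score, insect_score)
```

With these made-up vectors, ‘flower’ gets a positive score and ‘insect’ a negative one. The same kind of score could be calculated for names or gendered words, which is where the objectionable associations would show up.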

There are more details about the research in this April 13, 2017 Princeton University news release on EurekAlert (also on ScienceDaily),

In debates over the future of artificial intelligence, many experts think of the new systems as coldly logical and objectively rational. But in a new study, researchers have demonstrated how machines can be reflections of us, their creators, in potentially problematic ways. Common machine learning programs, when trained with ordinary human language available online, can acquire cultural biases embedded in the patterns of wording, the researchers found. These biases range from the morally neutral, like a preference for flowers over insects, to the objectionable views of race and gender.

Identifying and addressing possible bias in machine learning will be critically important as we increasingly turn to computers for processing the natural language humans use to communicate, for instance in doing online text searches, image categorization and automated translations.

“Questions about fairness and bias in machine learning are tremendously important for our society,” said researcher Arvind Narayanan, an assistant professor of computer science and an affiliated faculty member at the Center for Information Technology Policy (CITP) at Princeton University, as well as an affiliate scholar at Stanford Law School’s Center for Internet and Society. “We have a situation where these artificial intelligence systems may be perpetuating historical patterns of bias that we might find socially unacceptable and which we might be trying to move away from.”

The paper, “Semantics derived automatically from language corpora contain human-like biases,” published April 14 [2017] in Science. Its lead author is Aylin Caliskan, a postdoctoral research associate and a CITP fellow at Princeton; Joanna Bryson, a reader at University of Bath, and CITP affiliate, is a coauthor.

As a touchstone for documented human biases, the study turned to the Implicit Association Test, used in numerous social psychology studies since its development at the University of Washington in the late 1990s. The test measures response times (in milliseconds) by human subjects asked to pair word concepts displayed on a computer screen. Response times are far shorter, the Implicit Association Test has repeatedly shown, when subjects are asked to pair two concepts they find similar, versus two concepts they find dissimilar.

Take flower types, like “rose” and “daisy,” and insects like “ant” and “moth.” These words can be paired with pleasant concepts, like “caress” and “love,” or unpleasant notions, like “filth” and “ugly.” People more quickly associate the flower words with pleasant concepts, and the insect terms with unpleasant ideas.

The Princeton team devised an experiment with a program where it essentially functioned like a machine learning version of the Implicit Association Test. Called GloVe, and developed by Stanford University researchers, the popular, open-source program is of the sort that a startup machine learning company might use at the heart of its product. The GloVe algorithm can represent the co-occurrence statistics of words in, say, a 10-word window of text. Words that often appear near one another have a stronger association than those words that seldom do.
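As an aside that is not part of the news release, the idea of counting which words appear near one another can be sketched in a few lines of Python. This is only a minimal illustration of windowed co-occurrence counting; it is not GloVe's own code, and the function name `cooccurrence_counts` and the sample sentence are my own inventions for the example.

```python
from collections import Counter

def cooccurrence_counts(tokens, window=10):
    """Count how often each unordered word pair appears within `window` tokens."""
    counts = Counter()
    for i, word in enumerate(tokens):
        # Look ahead only, so each pair of positions is counted exactly once.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            pair = tuple(sorted((word, tokens[j])))
            counts[pair] += 1
    return counts

# Hypothetical sample text, chosen only to echo the flower/insect example.
text = "the rose is a lovely flower and the moth is an ugly insect"
counts = cooccurrence_counts(text.split(), window=3)

# Pairs that never fall inside the same window simply have a count of zero.
print(counts[("lovely", "rose")], counts[("rose", "ugly")])
```

With a small window, "rose" ends up paired with "lovely" but never with "ugly". A program trained on billions of words would accumulate such counts at a vastly larger scale before turning them into word vectors.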

The Stanford researchers turned GloVe loose on a huge trawl of contents from the World Wide Web, containing 840 billion words. Within this large sample of written human culture, Narayanan and colleagues then examined sets of so-called target words, like “programmer, engineer, scientist” and “nurse, teacher, librarian” alongside two sets of attribute words, such as “man, male” and “woman, female,” looking for evidence of the kinds of biases humans can unwittingly possess.
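Again as an aside rather than part of the news release, the comparison of target words against two sets of attribute words can be sketched as follows. I'm assuming here that the strength of association between two words is measured by the cosine similarity of their word vectors; the two-dimensional vectors below are made up purely for illustration (real embeddings have hundreds of dimensions), and the paper itself should be consulted for the exact statistic and significance test the researchers used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word, attrs_a, attrs_b, vectors):
    """Mean similarity to attribute set A minus mean similarity to set B."""
    sim_a = sum(cosine(vectors[word], vectors[a]) for a in attrs_a) / len(attrs_a)
    sim_b = sum(cosine(vectors[word], vectors[b]) for b in attrs_b) / len(attrs_b)
    return sim_a - sim_b

# Made-up toy vectors (an assumption of this sketch, not real GloVe output).
toy_vectors = {
    "rose": (0.9, 0.1), "daisy": (0.8, 0.2),
    "ant": (0.1, 0.9), "moth": (0.2, 0.8),
    "love": (1.0, 0.0), "caress": (0.9, 0.2),
    "filth": (0.0, 1.0), "ugly": (0.2, 0.9),
}
pleasant = ["love", "caress"]
unpleasant = ["filth", "ugly"]

for target in ["rose", "daisy", "ant", "moth"]:
    score = association(target, pleasant, unpleasant, toy_vectors)
    print(target, "leans pleasant" if score > 0 else "leans unpleasant")
```

A positive score means the target word sits closer to the first attribute set, a negative score means it sits closer to the second. Swapping in names or occupations as targets, and gendered words as attributes, gives the kind of comparison described in the release.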

In the results, innocent, inoffensive biases, like for flowers over bugs, showed up, but so did examples along lines of gender and race. As it turned out, the Princeton machine learning experiment managed to replicate the broad substantiations of bias found in select Implicit Association Test studies over the years that have relied on live, human subjects.

For instance, the machine learning program associated female names more with familial attribute words, like “parents” and “wedding,” than male names. In turn, male names had stronger associations with career attributes, like “professional” and “salary.” Of course, results such as these are often just objective reflections of the true, unequal distributions of occupation types with respect to gender–like how 77 percent of computer programmers are male, according to the U.S. Bureau of Labor Statistics.

Yet this correctly distinguished bias about occupations can end up having pernicious, sexist effects. An example: when foreign languages are naively processed by machine learning programs, leading to gender-stereotyped sentences. The Turkish language uses a gender-neutral, third person pronoun, “o.” Plugged into the well-known, online translation service Google Translate, however, the Turkish sentences “o bir doktor” and “o bir hemşire” with this gender-neutral pronoun are translated into English as “he is a doctor” and “she is a nurse.”

“This paper reiterates the important point that machine learning methods are not ‘objective’ or ‘unbiased’ just because they rely on mathematics and algorithms,” said Hanna Wallach, a senior researcher at Microsoft Research New York City, who was not involved in the study. “Rather, as long as they are trained using data from society and as long as society exhibits biases, these methods will likely reproduce these biases.”

Another objectionable example harkens back to a well-known 2004 paper by Marianne Bertrand of the University of Chicago Booth School of Business and Sendhil Mullainathan of Harvard University. The economists sent out close to 5,000 identical resumes to 1,300 job advertisements, changing only the applicants’ names to be either traditionally European American or African American. The former group was 50 percent more likely to be offered an interview than the latter. In an apparent corroboration of this bias, the new Princeton study demonstrated that a set of African American names had more unpleasantness associations than a European American set.

Computer programmers might hope to prevent cultural stereotype perpetuation through the development of explicit, mathematics-based instructions for the machine learning programs underlying AI systems. Not unlike how parents and mentors try to instill concepts of fairness and equality in children and students, coders could endeavor to make machines reflect the better angels of human nature.

“The biases that we studied in the paper are easy to overlook when designers are creating systems,” said Narayanan. “The biases and stereotypes in our society reflected in our language are complex and longstanding. Rather than trying to sanitize or eliminate them, we should treat biases as part of the language and establish an explicit way in machine learning of determining what we consider acceptable and unacceptable.”

Here’s a link to and a citation for the Princeton paper,

Semantics derived automatically from language corpora contain human-like biases by Aylin Caliskan, Joanna J. Bryson, Arvind Narayanan. Science  14 Apr 2017: Vol. 356, Issue 6334, pp. 183-186 DOI: 10.1126/science.aal4230

This paper appears to be open access.

Links to more cautionary posts about AI,

Aug 5, 2009: Autonomous algorithms; intelligent windows; pretty nano pictures

June 14, 2016: Accountability for artificial intelligence decision-making

Oct. 25, 2016: Removing gender-based stereotypes from algorithms

March 1, 2017: Algorithms in decision-making: a government inquiry in the UK

There’s also a book which makes some of the current use of AI programmes and big data quite accessible reading: Cathy O’Neil’s ‘Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy’.

Big data in the Cascadia region: a University of British Columbia (Canada) and University of Washington (US state) collaboration

Before moving on to the news and for anyone unfamiliar with the concept of the Cascadia region, it is an informally proposed political region or a bioregion, depending on your perspective. Adding to the lack of clarity, the region generally includes the province of British Columbia in Canada and the two US states, Washington and Oregon, but Alaska (another US state) and the Yukon (a Canadian territory) may also be included, as well as parts of California, Wyoming, Idaho, and Montana. (You can read more about the Cascadia bioregion here and the proposed political region here.) While it sounds as if more of the US is part of the ‘Cascadia region’, British Columbia and the Yukon cover considerably more territory than Washington and Oregon combined, if you’re taking a landmass perspective.

Cascadia Urban Analytics Cooperative

There was some big news about the smallest version of the Cascadia region on Thursday, Feb. 23, 2017 when the University of British Columbia (UBC), the University of Washington (state; UW), and Microsoft announced the launch of the Cascadia Urban Analytics Cooperative. From the joint Feb. 23, 2017 news release (read on the UBC website or read on the UW website),

In an expansion of regional cooperation, the University of British Columbia and the University of Washington today announced the establishment of the Cascadia Urban Analytics Cooperative to use data to help cities and communities address challenges from traffic to homelessness. The largest industry-funded research partnership between UBC and the UW, the collaborative will bring faculty, students and community stakeholders together to solve problems, and is made possible thanks to a $1-million gift from Microsoft.

“Thanks to this generous gift from Microsoft, our two universities are poised to help transform the Cascadia region into a technological hub comparable to Silicon Valley and Boston,” said Professor Santa J. Ono, President of the University of British Columbia. “This new partnership transcends borders and strives to unleash our collective brain power, to bring about economic growth that enriches the lives of Canadians and Americans as well as urban communities throughout the world.”

“We have an unprecedented opportunity to use data to help our communities make decisions, and as a result improve people’s lives and well-being. That commitment to the public good is at the core of the mission of our two universities, and we’re grateful to Microsoft for making a community-minded contribution that will spark a range of collaborations,” said UW President Ana Mari Cauce.

Today’s announcement follows last September’s [2016] Emerging Cascadia Innovation Corridor Conference in Vancouver, B.C. The forum brought together regional leaders for the first time to identify concrete opportunities for partnerships in education, transportation, university research, human capital and other areas.

A Boston Consulting Group study unveiled at the conference showed the region between Seattle and Vancouver has “high potential to cultivate an innovation corridor” that competes on an international scale, but only if regional leaders work together. The study says that could be possible through sustained collaboration aided by an educated and skilled workforce, a vibrant network of research universities and a dynamic policy environment.

Microsoft President Brad Smith, who helped convene the conference, said, “We believe that joint research based on data science can help unlock new solutions for some of the most pressing issues in both Vancouver and Seattle. But our goal is bigger than this one-time gift. We hope this investment will serve as a catalyst for broader and more sustainable efforts between these two institutions.”

As part of the Emerging Cascadia conference, British Columbia Premier Christy Clark and Washington Governor Jay Inslee signed a formal agreement that committed the two governments to work closely together to “enhance meaningful and results-driven innovation and collaboration.”  The agreement outlined steps the two governments will take to collaborate in several key areas including research and education.

“Increasingly, tech is not just another standalone sector of the economy, but fully integrated into everything from transportation to social work,” said Premier Clark. “That’s why we’ve invested in B.C.’s thriving tech sector, but committed to working with our neighbours in Washington – and we’re already seeing the results.”

“This data-driven collaboration among some of our smartest and most creative thought-leaders will help us tackle a host of urgent issues,” Gov. Inslee said. “I’m encouraged to see our partnership with British Columbia spurring such interesting cross-border dialogue and excited to see what our students and researchers come up with.”

The Cascadia Urban Analytics Cooperative will revolve around four main programs:

  • The Cascadia Data Science for Social Good (DSSG) Summer Program, which builds on the success of the DSSG program at the UW eScience Institute. The cooperative will coordinate a joint summer program for students across UW and UBC campuses where they work with faculty to create and incubate data-intensive research projects that have concrete benefits for urban communities. One past DSSG project analyzed data from Seattle’s regional transportation system – ORCA – to improve its effectiveness, particularly for low-income transit riders. Another project sought to improve food safety by text mining product reviews to identify unsafe products.
  • Cascadia Data Science for Social Good Scholar Symposium, which will foster innovation and collaboration by bringing together scholars from UBC and the UW involved in projects utilizing technology to advance the social good. The first symposium will be hosted at UW in 2017.
  • Sustained Research Partnerships designed to establish the Pacific Northwest as a center of expertise and activity in urban analytics. The cooperative will support sustained research partnerships between UW and UBC researchers, providing technical expertise, stakeholder engagement and seed funding.
  • Responsible Data Management Systems and Services to ensure data integrity, security and usability. The cooperative will develop new software, systems and services to facilitate data management and analysis, as well as ensure projects adhere to best practices in fairness, accountability and transparency.

At UW, the Cascadia Urban Analytics Collaborative will be overseen by Urbanalytics (urbanalytics.uw.edu), a new research unit in the Information School focused on responsible urban data science. The Collaborative builds on previous investments in data-intensive science through the UW eScience Institute (escience.washington.edu) and investments in urban scholarship through Urban@UW (urban.uw.edu), and also aligns with the UW’s Population Health Initiative (uw.edu/populationhealth) that is addressing the most persistent and emerging challenges in human health, environmental resiliency and social and economic equity. The gift counts toward the UW’s Be Boundless – For Washington, For the World campaign (uw.edu/boundless).

The Collaborative also aligns with the UBC Sustainability Initiative (sustain.ubc.ca) that fosters partnerships beyond traditional boundaries of disciplines, sectors and geographies to address critical issues of our time, as well as the UBC Data Science Institute (dsi.ubc.ca), which aims to advance data science research to address complex problems across domains, including health, science and arts.

Brad Smith, President and Chief Legal Officer of Microsoft, wrote about the joint centre in a Feb. 23, 2017 posting on the Microsoft on the Issues blog,

The cities of Vancouver and Seattle share many strengths: a long history of innovation, world-class universities and a region rich in cultural and ethnic diversity. While both cities have achieved great success on their own, leaders from both sides of the border realize that tighter partnership and collaboration, through the creation of a Cascadia Innovation Corridor, will expand economic opportunity and prosperity well beyond what each community can achieve separately.

Microsoft supports this vision and today is making a $1 million investment in the Cascadia Urban Analytics Cooperative (CUAC), which is a new joint effort by the University of British Columbia (UBC) and the University of Washington (UW).  It will use data to help local cities and communities address challenges from traffic to homelessness and will be the region’s single largest university-based, industry-funded joint research project. While we recognize the crucial role that universities play in building great companies in the Pacific Northwest, whether it be in computing, life sciences, aerospace or interactive entertainment, we also know research, particularly data science, holds the key to solving some of Vancouver and Seattle’s most pressing issues. This grant will advance this work.

An Oct. 21, 2016 article by Hana Golightly for the Ubyssey newspaper provides a little more detail about the province/state agreement mentioned in the joint UBC/UW news release,

An agreement between BC Premier Christy Clark and Washington Governor Jay Inslee means UBC will be collaborating with the University of Washington (UW) more in the future.

At last month’s [Sept. 2016] Cascadia Conference, Clark and Inslee signed a Memorandum of Understanding with the goal of fostering the growth of the technology sector in both regions. Officially referred to as the Cascadia Innovation Corridor, this partnership aims to reduce boundaries across the region — economic and otherwise.

While the memorandum provides broad goals and is not legally binding, it sets a precedent of collaboration between businesses, governments and universities, encouraging projects that span both jurisdictions. Aiming to capitalize on the cultural commonalities of regional centres Seattle and Vancouver, the agreement prioritizes development in life sciences, clean technology, data analytics and high tech.

Metropolitan centres like Seattle and Vancouver have experienced a surge in growth that sees planners envisioning them as the next Silicon Valleys. Premier Clark and Governor Inslee want to strengthen the ability of their jurisdictions to compete in innovation on a global scale. Accordingly, the memorandum encourages the exploration of “opportunities to advance research programs in key areas of innovation and future technologies among the region’s major universities and institutes.”

A few more questions about the Cooperative

I had a few more questions about the Feb. 23, 2017 announcement, for which (from UBC) Gail C. Murphy, PhD, FRSC, Associate Vice President Research pro tem, Professor, Computer Science of UBC and (from UW) Bill Howe, Associate Professor, Information School, Adjunct Associate Professor, Computer Science & Engineering, Associate Director and Senior Data Science Fellow, UW eScience Institute Program Director and Faculty Chair, UW Data Science Masters Degree have kindly provided answers (Gail Murphy’s replies are prefaced with [GM] and one indent and Bill Howe’s replies are prefaced with [BH] and two indents),

  • Do you have any projects currently underway? e.g. I see a summer programme is planned. Will there be one in summer 2017? What focus will it have?

[GM] UW and UBC will each be running the Data Science for Social Good program in the summer of 2017. UBC’s announcement of the program is available at: http://dsi.ubc.ca/data-science-social-good-dssg-fellowships

  • Is the $1M from Microsoft going to be given in cash or as ‘in kind goods’ or some combination?

[GM] The $1-million donation is in cash. Microsoft organized the Emerging Cascadia Innovation Corridor Conference in September 2017 [sic; the conference took place in September 2016]. It was at the conference that the idea for the partnership was hatched. Through this initiative, UBC and UW will continue to engage with Microsoft to further shared goals in promoting evidence-based innovation to improve life for people in the Cascadia region and beyond.

  • How will the money or goods be disbursed? e.g. Will each institution get 1/2 or is there some sort of joint account?

[GM] The institutions are sharing the funds but will be separately administering the funds they receive.

  • Is data going to be crossing borders? e.g. You mentioned some health care projects. In that case, will data from BC residents be accessed and subject to US rules and regulations? Will BC residents know that their data is being accessed by a 3rd party? What level of consent is required?

[GM] As you point out, there are many issues involved with transferring data across the border. Any projects involving private data will adhere to local laws and ethical frameworks set out by the institutions.

  • Privacy rules vary greatly between the US and Canada. How is that being addressed in this proposed new research?

[No Reply]

  • Will new software and other products be created and who will own them?

[GM] It is too soon for us to comment on whether new software or other products will be created. Any creation of software or other products within the institutions will be governed by institutional policy.

  • Will the research be made freely available?

[GM] UBC researchers must be able to publish the results of research as set out by UBC policy.

[BH] Research output at UW will be made available according to UW policy, but I’ll point out that Microsoft has long been a fantastic partner in advancing our efforts in open and reproducible science, open source software, and open access publishing. 

 UW’s discussion on open access policies is available online.

 

  • What percentage of public funds will be used to enable this project? Will the province of BC and the state of Washington be splitting the costs evenly?

[GM] It is too soon for us to report on specific percentages. At UBC, we will be looking to partner with appropriate funding agencies to support more research with this donation. Applications to funding agencies will involve review of any proposals as per the rules of the funding agency.

  • Will there be any social science and/or ethics component to this collaboration? The press conference referenced data science only.

[GM] We expect, but cannot yet confirm, that some of the projects will involve collaborations with faculty from a broad range of research areas at UBC.

[BH] We are indeed placing a strong emphasis on the intersection between data science, the social sciences, and data ethics.  As examples of activities in this space around UW:

* The Information School at UW (my home school) is actively recruiting a new faculty candidate in data ethics this year

* The Education Working Group at the eScience Institute has created a new campus-wide Data & Society seminar course.

* The Center for Statistics in the Social Sciences (CSSS), which represents the marriage of data science and the social sciences, has been a long-term partner in our activities.

More specifically for this collaboration, we are collecting requirements for new software that emphasizes responsible data science: properly managing sensitive data, combating algorithmic bias, protecting privacy, and more.

Microsoft has been a key partner in this work through their Civic Technology group, for which the Seattle arm is led by Graham Thompson.

  • What impact do you see the new US federal government’s current concerns over borders and immigrants hav[ing] on this project? e.g. Are people whose origins are in Iran, Syria, Yemen, etc. and who are residents of Canada going to be able to participate?

[GM] Students and others eligible to participate in research projects in Canada will be welcomed into the UBC projects. Our hope is that faculty and students working on the Cascadia Urban Analytics Cooperative will be able to exchange ideas freely and move freely back and forth across the border.

  • How will seed funding for ‘Sustained Research Partnerships’ be disbursed? Will there be a joint committee making these decisions?

[GM] We are in the process of elaborating this part of the program. At UBC, we are already experiencing, enjoying and benefitting from increased interaction with the University of Washington and look forward to elaborating more aspects of the program together as the year unfolds.

I had to make a few formatting changes when transferring the answers from emails to this posting: my numbered questions (1-11) became bulleted points and ‘have’ in what was question 10 was changed to ‘having’. The content for the answers has been untouched.

I’m surprised no one answered the privacy question but perhaps they thought the other answers sufficed. Despite an answer to my question *about the disbursement of funds*, I don’t understand how the universities are sharing the funds but that may just mean I’m having a bad day. (Or perhaps the folks at UBC are being overly careful after the scandals rocking the Vancouver campus over the last 18 months to two years; see Sophie Sutcliffe’s Dec. 3, 2015 opinion piece for the Ubyssey for details about the scandals.)

Bill Howe’s response about open access (where you can read the journal articles for free) and open source (where you have free access to the software code) was interesting to me as I once worked for a company where the developers complained loud and long about Microsoft’s failure to embrace open source code. Howe’s response is particularly interesting given that Microsoft’s president is also the Chief Legal Officer whose portfolio of responsibilities (I imagine) includes patents.

Matt Day in a Feb. 23, 2017 article for The Seattle Times provides additional perspective (Note: Links have been removed),

Microsoft’s effort to nudge Seattle and Vancouver, B.C., a bit closer together got an endorsement Thursday [Feb. 23, 2017] from the leading university in each city.

The University of Washington and the University of British Columbia announced the establishment of a joint data-science research unit, called the Cascadia Urban Analytics Cooperative, funded by a $1 million grant from Microsoft.

The collaboration will support study of shared urban issues, from health to transit to homelessness, drawing on faculty and student input from both universities.

The partnership has its roots in a September [2016] conference in Vancouver organized by Microsoft’s public affairs and lobbying unit [emphasis mine]. That gathering was aimed at tying business, government and educational institutions in Microsoft’s home region in the Seattle area closer to its Canadian neighbor.

Microsoft last year [2016]* opened an expanded office in downtown Vancouver with space for 750 employees, an outpost partly designed to draw to the Northwest more engineers than the company can get through the U.S. guest worker system [emphasis mine].

There’s nothing wrong with a business offering to contribute to the social good but it’s well to remember that a business’s primary agenda is not the social good. In this case, it seems that ‘public affairs and lobbying’ is really governmental affairs and that Microsoft has anticipated, for some time, greater difficulties with getting workers from all sorts of countries across the US border to work in Washington state. That would make an outpost in British Columbia and closer relations between the constituencies quite advantageous. I wonder what else is on their agenda.

Getting back to UBC and UW, thank you to both Gail Murphy (in particular) and Bill Howe for taking the time to answer my questions. I very much appreciate it as answering 11 questions is a lot of work.

There was one area of interest (cities) that I did not broach with either academic but will mention here.

Cities and their increasing political heft

Clearly Microsoft is focused on urban issues and that would seem to be the ‘flavour du jour’. There’s a May 31, 2016 piece on the TED website by Robert Muggah and Benjamin Barber titled: ‘Why cities rule the world’ (there are video talks embedded in the piece),

Cities are the 21st century’s dominant form of civilization — and they’re where humanity’s struggle for survival will take place. Robert Muggah and Benjamin Barber spell out the possibilities.

Half the planet’s population lives in cities. They are the world’s engines, generating four-fifths of the global GDP. There are over 2,100 cities with populations of 250,000 people or more, including a growing number of mega-cities and sprawling, networked-city areas — conurbations, they’re called — with at least 10 million residents. As the economist Ed Glaeser puts it, “we are an urban species.”

But what makes cities so incredibly important is not just population or economics stats. Cities are humanity’s most realistic hope for future democracy to thrive, from the grassroots to the global. This makes them a stark contrast to so many of today’s nations, increasingly paralyzed by polarization, corruption and scandal.

In a less hyperbolic vein, Parag Khanna’s April 20, 2016 piece for Quartz describes why he (and others) believes that megacities are where the future lies (Note: A link has been removed),

Cities are mankind’s most enduring and stable mode of social organization, outlasting all empires and nations over which they have presided. Today cities have become the world’s dominant demographic and economic clusters.

As the sociologist Christopher Chase-Dunn has pointed out, it is not population or territorial size that drives world-city status, but economic weight, proximity to zones of growth, political stability, and attractiveness for foreign capital. In other words, connectivity matters more than size. Cities thus deserve more nuanced treatment on our maps than simply as homogeneous black dots.

Within many emerging markets such as Brazil, Turkey, Russia, and Indonesia, the leading commercial hub or financial center accounts for at least one-third or more of national GDP. In the UK, London accounts for almost half Britain’s GDP. And in America, the Boston-New York-Washington corridor and greater Los Angeles together combine for about one-third of America’s GDP.

By 2025, there will be at least 40 such megacities. The population of the greater Mexico City region is larger than that of Australia, as is that of Chongqing, a collection of connected urban enclaves in China spanning an area the size of Austria. Cities that were once hundreds of kilometers apart have now effectively fused into massive urban archipelagos, the largest of which is Japan’s Taiheiyo Belt that encompasses two-thirds of Japan’s population in the Tokyo-Nagoya-Osaka megalopolis.

Great and connected cities, Saskia Sassen argues, belong as much to global networks as to the country of their political geography. Today the world’s top 20 richest cities have forged a super-circuit driven by capital, talent, and services: they are home to more than 75% of the largest companies, which in turn invest in expanding across those cities and adding more to expand the intercity network. Indeed, global cities have forged a league of their own, in many ways as denationalized as Formula One racing teams, drawing talent from around the world and amassing capital to spend on themselves while they compete on the same circuit.

The rise of emerging market megacities as magnets for regional wealth and talent has been the most significant contributor to shifting the world’s focal point of economic activity. McKinsey Global Institute research suggests that from now until 2025, one-third of world growth will come from the key Western capitals and emerging market megacities, one-third from the heavily populous middle-weight cities of emerging markets, and one-third from small cities and rural areas in developing countries.

Khanna’s megacities all exist within one country. If Vancouver and Seattle (and perhaps Portland?) were to become a megacity, it would be one of the few to cross national borders.

Khanna has been mentioned here before in a Jan. 27, 2016 posting about cities and technology and a public engagement exercise with the National Research Council of Canada (scroll down to the subsection titled: Cities rising in important as political entities).

Muggah/Fowler’s and Khanna’s 2016 pieces are well worth reading if you have the time.

For what it’s worth, I’m inclined to agree that cities will be and are increasing in political importance along with this area of development:

Algorithms and big data

Concerns are being raised about how big data is being utilized so I was happy to see specific initiatives to address ethics issues in Howe’s response. For anyone not familiar with the concerns, here’s an excerpt from Cathy O’Neil’s Oct. 18, 2016 article for Wired magazine,

The age of Big Data has generated new tools and ideas on an enormous scale, with applications spreading from marketing to Wall Street, human resources, college admissions, and insurance. At the same time, Big Data has opened opportunities for a whole new class of professional gamers and manipulators, who take advantage of people using the power of statistics.

I should know. I was one of them.

Information is power, and in the age of corporate surveillance, profiles on every active American consumer means that the system is slanted in favor of those with the data. This data helps build tailor-made profiles that can be used for or against someone in a given situation. Insurance companies, which historically sold car insurance based on driving records, have more recently started using such data-driven profiling methods. A Florida insurance company has been found to charge people with low credit scores and good driving records more than people with high credit scores and a drunk driving conviction. It’s become standard practice for insurance companies to charge people not what they represent as a risk, but what they can get away with. The victims, of course, are those least likely to be able to afford the extra cost, but who need a car to get to work.

Big data profiling techniques are exploding in the world of politics. It’s estimated that over $1 billion will be spent on digital political ads in this election cycle, almost 50 times as much as was spent in 2008; this field is a growing part of the budget for presidential as well as down-ticket races. Political campaigns build scoring systems on potential voters—your likelihood of voting for a given party, your stance on a given issue, and the extent to which you are persuadable on that issue. It’s the ultimate example of asymmetric information, and the politicians can use what they know to manipulate your vote or your donation.

I highly recommend reading O’Neil’s article and, if you have the time, her book ‘Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy’.

Finally

I look forward to hearing more about the Cascadia Urban Analytics Cooperative and the Cascadia Innovation Corridor as they develop. This has the potential to be very exciting although I do have some concerns such as Microsoft and its agendas, both stated and unstated. After all, the Sept. 2016 meeting was convened by Microsoft and its public affairs/lobbying group and the topic was innovation, which is code for business and, as hinted earlier, business is not synonymous with social good. Having said that, I’m not about to demonize business either. I just think a healthy dose of skepticism is called for. Good things can happen but we need to ensure they do.

Thankfully, my concerns regarding algorithms and big data seem to be shared in some quarters; unfortunately, none of these quarters appear to be located at the University of British Columbia. I hope that’s overcaution with regard to communication rather than a failure to recognize any pitfalls.

ETA Mar. 1, 2017: Interestingly, the UK House of Commons Select Committee on Science and Technology announced an inquiry into the use of algorithms in public and business decision-making on Feb. 28, 2017. As this posting is much too big already, I’ve posted about the UK inquiry separately in a Mar. 1, 2017 posting.

*’2016′ added for clarity on March 24, 2017.

*’disbursement of funds’ added for clarity on Sept. 21, 2017.

Sniffing out disease (Na-Nose)

The ‘artificial nose’ is not a newcomer to this blog. The most recent post prior to this is a March 15, 2016 piece about Disney using an artificial nose for art conservation. Today’s (Jan. 9, 2017) piece concerns itself with work from Israel and ‘sniffing out’ disease, according to a Dec. 30, 2016 news item in Sputnik News,

A team from the Israel Institute of Technology has developed a device that from a single breath can identify diseases such as multiple forms of cancer, Parkinson’s disease, and multiple sclerosis. While the machine is still in the experimental stages, it has a high degree of promise for use in non-invasive diagnoses of serious illnesses.

The international team demonstrated that a medical theory first proposed by the Greek physician Hippocrates some 2400 years ago is true, certain diseases leave a “breathprint” on the exhalations of those afflicted. The researchers created a prototype for a machine that can pick up on those diseases using the outgoing breath of a patient. The machine, called the Na-Nose, tests breath samples for the presence of trace amounts of chemicals that are indicative of 17 different illnesses.

A Dec. 22, 2016 Technion Israel Institute of Technology press release offers more detail about the work,

An international team of 56 researchers in five countries has confirmed a hypothesis first proposed by the ancient Greeks – that different diseases are characterized by different “chemical signatures” identifiable in breath samples. …

Diagnostic techniques based on breath samples have been demonstrated in the past, but until now, there has not been scientific proof of the hypothesis that different and unrelated diseases are characterized by distinct chemical breath signatures. And technologies developed to date for this type of diagnosis have been limited to detecting a small number of clinical disorders, without differentiation between unrelated diseases.

The study of more than 1,400 patients included 17 different and unrelated diseases: lung cancer, colorectal cancer, head and neck cancer, ovarian cancer, bladder cancer, prostate cancer, kidney cancer, stomach cancer, Crohn’s disease, ulcerative colitis, irritable bowel syndrome, Parkinson’s disease (two types), multiple sclerosis, pulmonary hypertension, preeclampsia and chronic kidney disease. Samples were collected between January 2011 and June 2014 in 14 departments at 9 medical centers in 5 countries: Israel, France, the USA, Latvia and China.

The researchers tested the chemical composition of the breath samples using an accepted analytical method (mass spectrometry), which enabled accurate quantitative detection of the chemical compounds they contained. 13 chemical components were identified, in different compositions, in all 17 of the diseases.

According to Prof. Haick, “each of these diseases is characterized by a unique fingerprint, meaning a different composition of these 13 chemical components. Just as each of us has a unique fingerprint that distinguishes us from others, each disease has a chemical signature that distinguishes it from other diseases and from a normal state of health. These odor signatures are what enables us to identify the diseases using the technology that we developed.”

With a new technology called “artificially intelligent nanoarray,” developed by Prof. Haick, the researchers were able to corroborate the clinical efficacy of the diagnostic technology. The array enables fast and inexpensive diagnosis and classification of diseases, based on “smelling” the patient’s breath, and using artificial intelligence to analyze the data obtained from the sensors. Some of the sensors are based on layers of gold nanoscale particles and others contain a random network of carbon nanotubes coated with an organic layer for sensing and identification purposes.

The study also assessed the efficiency of the artificially intelligent nanoarray in detecting and classifying various diseases using breath signatures. To verify the reliability of the system, the team also examined the effect of various factors (such as gender, age, smoking habits and geographic location) on the sample composition, and found their effect to be negligible, and without impairment on the array’s sensitivity.

“Each of the sensors responds to a wide range of exhalation components,” explain Prof. Haick and his previous Ph.D student, Dr. Morad Nakhleh, “and integration of the information provides detailed data about the unique breath signatures characteristic of the various diseases. Our system has detected and classified various diseases with an average accuracy of 86%.

This is a new and promising direction for diagnosis and classification of diseases, which is characterized not only by considerable accuracy but also by low cost, low electricity consumption, miniaturization, comfort and the possibility of repeating the test easily.”

“Breath is an excellent raw material for diagnosis,” said Prof. Haick. “It is available without the need for invasive and unpleasant procedures, it’s not dangerous, and you can sample it again and again if necessary.”
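For anyone wondering what matching a breath sample to a disease ‘fingerprint’ might look like in practice, here is a minimal sketch in Python. To be clear, this is my own toy illustration and not the pattern analysis used by the Technion team: the condition labels, the three-component fingerprints (the study identified 13 components), and all of the numbers are invented, and the nearest-match rule is simply the most basic classifier I could think of.

```python
import math

# Hypothetical fingerprints: relative levels of three made-up breath
# components per condition. These values are invented for illustration only.
FINGERPRINTS = {
    "healthy": [0.2, 0.2, 0.2],
    "condition_A": [0.8, 0.1, 0.3],
    "condition_B": [0.1, 0.9, 0.4],
}


def classify(sample, fingerprints):
    """Return the label whose fingerprint is closest to the sample."""
    return min(
        fingerprints,
        key=lambda label: math.dist(sample, fingerprints[label]),
    )


def accuracy(labelled_samples, fingerprints):
    """Fraction of (sample, true_label) pairs that are classified correctly."""
    correct = sum(
        1 for sample, truth in labelled_samples
        if classify(sample, fingerprints) == truth
    )
    return correct / len(labelled_samples)


# A made-up sample that sits close to the condition_A fingerprint.
print(classify([0.75, 0.15, 0.25], FINGERPRINTS))
```

The accuracy helper only shows how a figure such as the 86% quoted above would be computed from labelled samples; it says nothing about how the researchers obtained theirs.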

Here’s a schematic of the study, which the researchers have made available,

Diagram: A schematic view of the study. Two breath samples were taken from each subject, one was sent for chemical mapping using mass spectrometry, and the other was analyzed in the new system, which produced a clinical diagnosis based on the chemical fingerprint of the breath sample. Courtesy: Technion

There is also a video, which covers much of the same ground as the press release but also includes information about the possible use of the Na-Nose technology in the European Union’s SniffPhone project,

Here’s a link to and a citation for the paper,

Diagnosis and Classification of 17 Diseases from 1404 Subjects via Pattern Analysis of Exhaled Molecules by Morad K. Nakhleh, Haitham Amal, Raneen Jeries, Yoav Y. Broza, Manal Aboud, Alaa Gharra, Hodaya Ivgi, Salam Khatib, Shifaa Badarneh, Lior Har-Shai, Lea Glass-Marmor, Izabella Lejbkowicz, Ariel Miller, Samih Badarny, Raz Winer, John Finberg, Sylvia Cohen-Kaminsky, Frédéric Perros, David Montani, Barbara Girerd, Gilles Garcia, Gérald Simonneau, Farid Nakhoul, Shira Baram, Raed Salim, Marwan Hakim, Maayan Gruber, Ohad Ronen, Tal Marshak, Ilana Doweck, Ofer Nativ, Zaher Bahouth, Da-you Shi, Wei Zhang, Qing-ling Hua, Yue-yin Pan, Li Tao, Hu Liu, Amir Karban, Eduard Koifman, Tova Rainis, Roberts Skapars, Armands Sivins, Guntis Ancans, Inta Liepniece-Karele, Ilze Kikuste, Ieva Lasina, Ivars Tolmanis, Douglas Johnson, Stuart Z. Millstone, Jennifer Fulton, John W. Wells, Larry H. Wilf, Marc Humbert, Marcis Leja, Nir Peled, and Hossam Haick. ACS Nano, Article ASAP DOI: 10.1021/acsnano.6b04930 Publication Date (Web): December 21, 2016

Copyright © 2017 American Chemical Society

This paper appears to be open access.

As for SniffPhone, they’re hoping that Na-Nose or something like it will allow them to modify smartphones in a way that will allow diseases to be detected.

I can’t help wondering who will own the data if your smartphone detects a disease. If you think that’s an idle question, here’s an excerpt from Sue Halpern’s Dec. 22, 2016 review of two books (“Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy” by Cathy O’Neil and “Virtual Competition: The Promise and Perils of the Algorithm-Driven Economy” by Ariel Ezrachi and Maurice E. Stucke) for the New York Review of Books,

We give our data away. We give it away in drips and drops, not thinking that data brokers will collect it and sell it, let alone that it will be used against us. There are now private, unregulated DNA databases culled, in part, from DNA samples people supply to genealogical websites in pursuit of their ancestry. These samples are available online to be compared with crime scene DNA without a warrant or court order. (Police are also amassing their own DNA databases by swabbing cheeks during routine stops.) In the estimation of the Electronic Frontier Foundation, this will make it more likely that people will be implicated in crimes they did not commit.

Or consider the data from fitness trackers, like Fitbit. As reported in The Intercept:

During a 2013 FTC panel on “Connected Health and Fitness,” University of Colorado law professor Scott Peppet said, “I can paint an incredibly detailed and rich picture of who you are based on your Fitbit data,” adding, “That data is so high quality that I can do things like price insurance premiums or I could probably evaluate your credit score incredibly accurately.”

Halpern’s piece is well worth reading in its entirety.

Removing gender-based stereotypes from algorithms

Most people don’t think of algorithms as having biases and stereotypes but James Zou in his Sept. 26, 2016 essay for The Conversation (h/t phys.org Sept. 26, 2016 news item) says otherwise (Note: Links have been removed),

Machine learning is ubiquitous in our daily lives. Every time we talk to our smartphones, search for images or ask for restaurant recommendations, we are interacting with machine learning algorithms. They take as input large amounts of raw data, like the entire text of an encyclopedia, or the entire archives of a newspaper, and analyze the information to extract patterns that might not be visible to human analysts. But when these large data sets include social bias, the machines learn that too.

A machine learning algorithm is like a newborn baby that has been given millions of books to read without being taught the alphabet or knowing any words or grammar. The power of this type of information processing is impressive, but there is a problem. When it takes in the text data, a computer observes relationships between words based on various factors, including how often they are used together.

We can test how well the word relationships are identified by using analogy puzzles. Suppose I ask the system to complete the analogy “He is to King as She is to X.” If the system comes back with “Queen,” then we would say it is successful, because it returns the same answer a human would.

Our research group trained the system on Google News articles, and then asked it to complete a different analogy: “Man is to Computer Programmer as Woman is to X.” The answer came back: “Homemaker.”
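For anyone curious how an analogy puzzle like this can be posed to a machine, here is a minimal sketch in Python. It assumes the common ‘vector arithmetic’ approach to word embeddings (take the vector for the second word, subtract the first, add the third, then look for the nearest remaining word). The three-number vectors below are invented by me so the example fits on a page, with a gender bias deliberately baked into ‘programmer’ and ‘homemaker’; they are not taken from the Google News data, and the function name is hypothetical.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
# Hypothetical axes: [gender, royalty, programming]
TOY_VECTORS = {
    "he": [1.0, 0.0, 0.0],
    "she": [-1.0, 0.0, 0.0],
    "man": [0.9, 0.0, 0.1],
    "woman": [-0.9, 0.0, 0.1],
    "king": [1.0, 1.0, 0.0],
    "queen": [-1.0, 1.0, 0.0],
    "programmer": [0.6, 0.0, 1.0],   # deliberately skewed toward "he"
    "homemaker": [-0.9, 0.0, 0.2],   # deliberately skewed toward "she"
    "engineer": [0.5, 0.0, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def complete_analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' by nearest neighbour to b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    ]
    # The three words in the question are not allowed as answers.
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(target, vectors[word]))


print(complete_analogy("he", "king", "she", TOY_VECTORS))
print(complete_analogy("man", "programmer", "woman", TOY_VECTORS))
```

With these made-up vectors the first puzzle comes back ‘queen’ and the second ‘homemaker’, which mirrors the result in the essay only because I built the bias into the toy data.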

Zou explains how a machine (algorithm) learns and then notes this,

Not only can the algorithm reflect society’s biases – demonstrating how much those biases are contained in the input data – but the system can potentially amplify gender stereotypes. Suppose I search for “computer programmer” and the search program uses a gender-biased database that associates that term more closely with a man than a woman.

The search results could come back flawed by the bias. Because “John” as a male name is more closely related to “computer programmer” than the female name “Mary” in the biased data set, the search program could evaluate John’s website as more relevant to the search than Mary’s – even if the two websites are identical except for the names and gender pronouns.

It’s true that the biased data set could actually reflect factual reality – perhaps there are more “Johns” who are programmers than there are “Marys” – and the algorithms simply capture these biases. This does not absolve the responsibility of machine learning in combating potentially harmful stereotypes. The biased results would not just repeat but could even boost the statistical bias that most programmers are male, by moving the few female programmers lower in the search results. It’s useful and important to have an alternative that’s not biased.

There is a way according to Zou that stereotypes can be removed,

Our debiasing system uses real people to identify examples of the types of connections that are appropriate (brother/sister, king/queen) and those that should be removed. Then, using these human-generated distinctions, we quantified the degree to which gender was a factor in those word choices – as opposed to, say, family relationships or words relating to royalty.

Next we told our machine-learning algorithm to remove the gender factor from the connections in the embedding. This removes the biased stereotypes without reducing the overall usefulness of the embedding.

When that is done, we found that the machine learning algorithm no longer exhibits blatant gender stereotypes. We are investigating applying related ideas to remove other types of biases in the embedding, such as racial or cultural stereotypes.
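The essay doesn’t give implementation details, so what follows is only my simplified sketch (in Python) of one way to ‘remove the gender factor’: estimate a gender direction from word pairs that people have flagged as legitimately gendered, then subtract that direction from words that should be neutral. The toy vectors, the word lists, and the function names are all my own assumptions for illustration.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
# Hypothetical axes: [gender, royalty, programming]
TOY_VECTORS = {
    "he": [1.0, 0.0, 0.0],
    "she": [-1.0, 0.0, 0.0],
    "king": [1.0, 1.0, 0.0],
    "queen": [-1.0, 1.0, 0.0],
    "programmer": [0.6, 0.0, 1.0],
    "homemaker": [-0.9, 0.0, 0.2],
}

# Pairs where gender is an appropriate part of the meaning (assumed list).
DEFINITIONAL_PAIRS = [("he", "she"), ("king", "queen")]
# Words that should carry no gender component (assumed list).
NEUTRAL_WORDS = ["programmer", "homemaker"]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def gender_direction(pairs, vectors):
    """Average the difference vectors of the pairs and scale to unit length."""
    diffs = [
        [a - b for a, b in zip(vectors[first], vectors[second])]
        for first, second in pairs
    ]
    mean = [sum(column) / len(diffs) for column in zip(*diffs)]
    return normalize(mean)


def neutralize(v, direction):
    """Remove the component of v that lies along the given unit direction."""
    scale = dot(v, direction)
    return [x - scale * d for x, d in zip(v, direction)]


direction = gender_direction(DEFINITIONAL_PAIRS, TOY_VECTORS)
debiased = dict(TOY_VECTORS)
for word in NEUTRAL_WORDS:
    debiased[word] = neutralize(TOY_VECTORS[word], direction)

# After neutralizing, "programmer" has no component along the gender direction,
# while "he", "she", "king" and "queen" are left exactly as they were.
print(dot(debiased["programmer"], direction))
```

In this sketch ‘programmer’ keeps its programming component and loses only the part that leaned toward ‘he’, which is the sense in which the embedding stays useful.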

If you have time, I encourage you to read the essay in its entirety and this June 14, 2016 posting about research into algorithms and how they make decisions for you about credit, medical diagnoses, job opportunities and more.

There’s also an Oct. 24, 2016 article by Michael Light on Salon.com on the topic (Note: Links have been removed),

In a recent book that was longlisted for the National Book Award, Cathy O’Neil, a data scientist, blogger and former hedge-fund quant, details a number of flawed algorithms to which we have given incredible power — she calls them “Weapons of Math Destruction.” We have entrusted these WMDs to make important, potentially life-altering decisions, yet in many cases, they embed human race and class biases; in other cases, they don’t function at all.

Among other examples, O’Neil examines a “value-added” model New York City used to decide which teachers to fire, even though, she writes, the algorithm was useless, functioning essentially as a random number generator, arbitrarily ending careers. She looks at models put to use by judges to assign recidivism scores to inmates that ended up having a racist inclination. And she looks at how algorithms are contributing to American partisanship, allowing political operatives to target voters with information that plays to their existing biases and fears.

I recommend reading Light’s article in its entirety.