Tag Archives: Microsoft Research

Ghosts, mechanical turks, and pseudo-AI (artificial intelligence)—Is it all a con game?

There’s been more than one artificial intelligence (AI) story featured here on this blog, but the ones featured in this posting are the first I’ve stumbled across that suggest the hype is even more exaggerated than the most cynical might have thought. (BTW, the 2019 material appears later, as I have taken a chronological approach to this posting.)

It seems a lot of companies touting their AI algorithms and capabilities are relying on human beings to do the work, from a July 6, 2018 article by Olivia Solon for the Guardian (Note: A link has been removed),

It’s hard to build a service powered by artificial intelligence. So hard, in fact, that some startups have worked out it’s cheaper and easier to get humans to behave like robots than it is to get machines to behave like humans.

“Using a human to do the job lets you skip over a load of technical and business development challenges. It doesn’t scale, obviously, but it allows you to build something and skip the hard part early on,” said Gregory Koberger, CEO of ReadMe, who says he has come across a lot of “pseudo-AIs”.

“It’s essentially prototyping the AI with human beings,” he said.

In 2017, the business expense management app Expensify admitted that it had been using humans to transcribe at least some of the receipts it claimed to process using its “smartscan technology”. Scans of the receipts were being posted to Amazon’s Mechanical Turk crowdsourced labour tool, where low-paid workers were reading and transcribing them.

“I wonder if Expensify SmartScan users know MTurk workers enter their receipts,” said Rochelle LaPlante, a “Turker” and advocate for gig economy workers on Twitter. “I’m looking at someone’s Uber receipt with their full name, pick-up and drop-off addresses.”

Even Facebook, which has invested heavily in AI, relied on humans for its virtual assistant for Messenger, M.

In some cases, humans are used to train the AI system and improve its accuracy. …

The Turk

Fooling people with machines that seem intelligent is not new, according to a Sept. 10, 2018 article by Seth Stevenson for Slate.com (Note: Links have been removed),

It’s 1783, and Paris is gripped by the prospect of a chess match. One of the contestants is François-André Philidor, who is considered the greatest chess player in Paris, and possibly the world. Everyone is so excited because Philidor is about to go head-to-head with the other biggest sensation in the chess world at the time.

But his opponent isn’t a man. And it’s not a woman, either. It’s a machine.

This story may sound a lot like Garry Kasparov taking on Deep Blue, IBM’s chess-playing supercomputer. But that was only a couple of decades ago, and this chess match in Paris happened more than 200 years ago. It doesn’t seem like a robot that can play chess would even be possible in the 1780s. This machine playing against Philidor was making an incredible technological leap—playing chess, and not only that, but beating humans at chess.

In the end, it didn’t quite beat Philidor, but the chess master called it one of his toughest matches ever. It was so hard for Philidor to get a read on his opponent, which was a carved wooden figure—slightly larger than life—wearing elaborate garments and offering a cold, mean stare.

It seems like the minds of the era would have been completely blown by a robot that could nearly beat a human chess champion. Some people back then worried that it was black magic, but many folks took the development in stride. …

Debates about the hottest topic in technology today—artificial intelligence—didn’t start in the 1940s, with people like Alan Turing and the first computers. It turns out that the arguments about AI go back much further than you might imagine. The story of the 18th-century chess machine turns out to be one of those curious tales from history that can help us understand technology today, and where it might go tomorrow.

[In future episodes of our podcast, Secret History of the Future,] we’re going to look at the first cyberattack, which happened in the 1830s, and find out how the Victorians invented virtual reality.

Philidor’s opponent was known as The Turk or Mechanical Turk, and that ‘machine’ was in fact a masterful hoax: The Turk concealed a hidden compartment from which a human being directed its moves.

People pretending to be AI agents

It seems that today’s AI has something in common with the 18th century Mechanical Turk: there are often humans lurking in the background making things work. From a Sept. 4, 2018 article by Janelle Shane for Slate.com (Note: Links have been removed),

Every day, people are paid to pretend to be bots.

In a strange twist on “robots are coming for my job,” some tech companies that boast about their artificial intelligence have found that at small scales, humans are a cheaper, easier, and more competent alternative to building an A.I. that can do the task.

Sometimes there is no A.I. at all. The “A.I.” is a mockup powered entirely by humans, in a “fake it till you make it” approach used to gauge investor interest or customer behavior. Other times, a real A.I. is combined with human employees ready to step in if the bot shows signs of struggling. These approaches are called “pseudo-A.I.” or sometimes, more optimistically, “hybrid A.I.”

Although some companies see the use of humans for “A.I.” tasks as a temporary bridge, others are embracing pseudo-A.I. as a customer service strategy that combines A.I. scalability with human competence. They’re advertising these as “hybrid A.I.” chatbots, and if they work as planned, you will never know if you were talking to a computer or a human. Every remote interaction could turn into a form of the Turing test. So how can you tell if you’re dealing with a bot pretending to be a human or a human pretending to be a bot?

One of the ways you can’t tell anymore is by looking for human imperfections like grammar mistakes or hesitations. In the past, chatbots had prewritten bits of dialogue that they could mix and match according to built-in rules. Bot speech was synonymous with precise formality. In early Turing tests, spelling mistakes were often a giveaway that the hidden speaker was a human. Today, however, many chatbots are powered by machine learning. Instead of using a programmer’s rules, these algorithms learn by example. And many training data sets come from services like Amazon’s Mechanical Turk, which lets programmers hire humans from around the world to generate examples of tasks like asking and answering questions. These data sets are usually full of casual speech, regionalisms, or other irregularities, so that’s what the algorithms learn. It’s not uncommon these days to get algorithmically generated image captions that read like text messages. And sometimes programmers deliberately add these things in, since most people don’t expect imperfections of an algorithm. In May, Google’s A.I. assistant made headlines for its ability to convincingly imitate the “ums” and “uhs” of a human speaker.

Limited computing power is the main reason that bots are usually good at just one thing at a time. Whenever programmers try to train machine learning algorithms to handle additional tasks, they usually get algorithms that can do many tasks rather badly. In other words, today’s algorithms are artificial narrow intelligence, or A.N.I., rather than artificial general intelligence, or A.G.I. For now, and for many years in the future, any algorithm or chatbot that claims A.G.I.-level performance—the ability to deal sensibly with a wide range of topics—is likely to have humans behind the curtain.

Another bot giveaway is a very poor memory. …

Bringing AI to life: ghosts

Sidney Fussell’s April 15, 2019 article for The Atlantic provides more detail about the human/AI interface as found in some Amazon products such as Alexa (a voice-control system),

… Alexa-enabled speakers can and do interpret speech, but Amazon relies on human guidance to make Alexa, well, more human—to help the software understand different accents, recognize celebrity names, and respond to more complex commands. This is true of many artificial intelligence–enabled products. They’re prototypes. They can only approximate their promised functions while humans help with what Harvard researchers have called “the paradox of automation’s last mile.” Advancements in AI, the researchers write, create temporary jobs such as tagging images or annotating clips, even as the technology is meant to supplant human labor. In the case of the Echo, gig workers are paid to improve its voice-recognition software—but then, when it’s advanced enough, it will be used to replace the hostess in a hotel lobby.

A 2016 paper by researchers at Stanford University used a computer vision system to infer, with 88 percent accuracy, the political affiliation of 22 million people based on what car they drive and where they live. Traditional polling would require a full staff, a hefty budget, and months of work. The system completed the task in two weeks. But first, it had to know what a car was. The researchers paid workers through Amazon’s Mechanical Turk [emphasis mine] platform to manually tag thousands of images of cars, so the system would learn to differentiate between shapes, styles, and colors.

It may be a rude awakening for Amazon Echo owners, but AI systems require enormous amounts of categorized data, before, during, and after product launch. …

Isn’t it interesting that Amazon also has a crowdsourcing marketplace for its own products? Calling it ‘Mechanical Turk’ after a famous 18th century hoax suggests a dark sense of humour somewhere in the corporation. (You can find out more about the Amazon Mechanical Turk on this Amazon website and in its Wikipedia entry.)

Anthropologist Mary L. Gray has coined the phrase ‘ghost work’ for the work that humans perform but for which AI gets the credit. Angela Chan’s May 13, 2019 article for The Verge features Gray as she promotes her latest book, written with Siddharth Suri, ‘Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass’ (Note: A link has been removed),

“Ghost work” is anthropologist Mary L. Gray’s term for the invisible labor that powers our technology platforms. When Gray, a senior researcher at Microsoft Research, first arrived at the company, she learned that building artificial intelligence requires people to manage and clean up data to feed to the training algorithms. “I basically started asking the engineers and computer scientists around me, ‘Who are the people you pay to do this task work of labeling images and classification tasks and cleaning up databases?’” says Gray. Some people said they didn’t know. Others said they didn’t want to know and were concerned that if they looked too closely they might find unsavory working conditions.

So Gray decided to find out for herself. Who are the people, often invisible, who pick up the tasks necessary for these platforms to run? Why do they do this work, and why do they leave? What are their working conditions?

The interview that follows is interesting, although it doesn’t seem to me that the question about working conditions is answered in any great detail. However, there is this rather interesting policy suggestion,

If companies want to happily use contract work because they need to constantly churn through new ideas and new aptitudes, the only way to make that a good thing for both sides of that enterprise is for people to be able to jump into that pool. And people do that when they have health care and other provisions. This is the business case for universal health care, for universal education as a public good. It’s going to benefit all enterprise.

I want to get across to people that, in a lot of ways, we’re describing work conditions. We’re not describing a particular type of work. We’re describing today’s conditions for project-based task-driven work. This can happen to everybody’s jobs, and I hate that that might be the motivation because we should have cared all along, as this has been happening to plenty of people. For me, the message of this book is: let’s make this not just manageable, but sustainable and enjoyable. Stop making our lives wrap around work, and start making work serve our lives.

Puts a different spin on AI and work, doesn’t it?

Machine learning software and quantum computers that think

A Sept. 14, 2017 news item on phys.org sets the stage for quantum machine learning by explaining a few basics first,

Language acquisition in young children is apparently connected with their ability to detect patterns. In their learning process, they search for patterns in the data set that help them identify and optimize grammar structures in order to properly acquire the language. Likewise, online translators use algorithms through machine learning techniques to optimize their translation engines to produce well-rounded and understandable outcomes. Even though many translations did not make much sense at all at the beginning, in these past years we have been able to see major improvements thanks to machine learning.

Machine learning techniques use mathematical algorithms and tools to search for patterns in data. These techniques have become powerful tools for many different applications, which can range from biomedical uses such as in cancer reconnaissance, in genetics and genomics, in autism monitoring and diagnosis and even plastic surgery, to pure applied physics, for studying the nature of materials, matter or even complex quantum systems.

Capable of adapting and changing when exposed to a new set of data, machine learning can identify patterns, often outperforming humans in accuracy. Although machine learning is a powerful tool, certain application domains remain out of reach due to complexity or other aspects that rule out the use of the predictions that learning algorithms provide.

Thus, in recent years, quantum machine learning has become a matter of interest because of its vast potential as a possible solution to these unresolvable challenges, and quantum computers appear to be the right tool for the job.

A Sept. 14, 2017 Institute of Photonic Sciences (Catalan: Institut de Ciències Fotòniques; ICFO) press release, which originated the news item, goes on to detail a recently published overview of the state of quantum machine learning,

In a recent study, published in Nature, an international team of researchers made up of Jacob Biamonte from Skoltech/IQC, Peter Wittek from ICFO, Nicola Pancotti from MPQ, Patrick Rebentrost from MIT, Nathan Wiebe from Microsoft Research, and Seth Lloyd from MIT reviewed the current state of classical machine learning and quantum machine learning. In their review, they thoroughly address the different possible combinations: the conventional method of using classical machine learning to analyse classical data, using quantum machine learning to analyse both classical and quantum data, and finally, using classical machine learning to analyse quantum data.

Firstly, they set out to give an in-depth view of the status of current supervised and unsupervised learning protocols in classical machine learning by stating all applied methods. They introduce quantum machine learning and provide an extensive approach on how this technique could be used to analyse both classical and quantum data, emphasizing that quantum machines could accelerate processing timescales thanks to the use of quantum annealers and universal quantum computers. Quantum annealing technology has better scalability, but more limited use cases. For instance, the latest iteration of D-Wave’s [emphasis mine] superconducting chip integrates two thousand qubits, and it is used for solving certain hard optimization problems and for efficient sampling. On the other hand, universal (also called gate-based) quantum computers are harder to scale up, but they are able to perform arbitrary unitary operations on qubits by sequences of quantum logic gates. This resembles how digital computers can perform arbitrary logical operations on classical bits.

However, they address the fact that controlling a quantum system is very complex and analyzing classical data with quantum resources is not as straightforward as one may think, mainly due to the challenge of building quantum interface devices that allow classical information to be encoded into a quantum mechanical form. Difficulties, such as the “input” or “output” problems appear to be the major technical challenge that needs to be overcome.

The ultimate goal is to find the most optimized method able to read, comprehend and obtain the best outcomes from a data set, be it classical or quantum. Quantum machine learning is aimed at revolutionizing the field of computer science, not only because it will be able to control quantum computers and speed up information processing far beyond current classical rates, but also because it is capable of carrying out innovative functions, such as quantum deep learning, that could not only recognize counter-intuitive patterns in data, invisible to both classical machine learning and the human eye, but also reproduce them.

As Peter Wittek [emphasis mine] finally states, “Writing this paper was quite a challenge: we had a committee of six co-authors with different ideas about what the field is, where it is now, and where it is going. We rewrote the paper from scratch three times. The final version could not have been completed without the dedication of our editor, to whom we are indebted.”

It was a bit of a surprise to see local (Vancouver, Canada) company D-Wave Systems mentioned, but I notice that one of the paper’s authors (Peter Wittek) is mentioned in a May 22, 2017 D-Wave news release announcing a new partnership to foster quantum machine learning,

Today [May 22, 2017] D-Wave Systems Inc., the leader in quantum computing systems and software, announced a new initiative with the Creative Destruction Lab (CDL) at the University of Toronto’s Rotman School of Management. D-Wave will work with CDL, as a CDL Partner, to create a new track to foster startups focused on quantum machine learning. The new track will complement CDL’s successful existing track in machine learning. Applicants selected for the intensive one-year program will go through an introductory boot camp led by Dr. Peter Wittek [emphasis mine], author of Quantum Machine Learning: What Quantum Computing means to Data Mining, with instruction and technical support from D-Wave experts, access to a D-Wave 2000Q™ quantum computer, and the opportunity to use a D-Wave sampling service to enable machine learning computations and applications. D-Wave staff will be a part of the committee selecting up to 40 individuals for the program, which begins in September 2017.

For anyone interested in the paper, here’s a link to and a citation,

Quantum machine learning by Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, & Seth Lloyd. Nature 549, 195–202 (14 September 2017) doi:10.1038/nature23474 Published online 13 September 2017

This paper is behind a paywall.

World heritage music stored in DNA

It seems a Swiss team from the École Polytechnique Fédérale de Lausanne (EPFL) has collaborated with American companies Twist Bioscience and Microsoft, as well as the University of Washington (state), to preserve two iconic jazz pieces on DNA (deoxyribonucleic acid), according to a Sept. 29, 2017 news item on phys.org,

Thanks to an innovative technology for encoding data in DNA strands, two items of world heritage – songs recorded at the Montreux Jazz Festival [held in Switzerland] and digitized by EPFL – have been safeguarded for eternity. This marks the first time that cultural artifacts granted UNESCO heritage status have been saved in such a manner, ensuring they are preserved for thousands of years. The method was developed by US company Twist Bioscience and is being unveiled today in a demonstrator created at the EPFL+ECAL Lab.

“Tutu” by Miles Davis and “Smoke on the Water” by Deep Purple have already made their mark on music history. Now they have entered the annals of science, for eternity. Recordings of these two legendary songs were digitized by the Ecole Polytechnique Fédérale de Lausanne (EPFL) as part of the Montreux Jazz Digital Project, and they are the first to be stored in the form of a DNA sequence that can be subsequently decoded and listened to without any reduction in quality.

A Sept. 29, 2017 EPFL press release by Emmanuel Barraud, which originated the news item, provides more details,

This feat was achieved by US company Twist Bioscience working in association with Microsoft Research and the University of Washington. The pioneering technology is actually based on a mechanism that has been at work on Earth for billions of years: storing information in the form of DNA strands. This fundamental process is what has allowed all living species, plants and animals alike, to live on from generation to generation.

The entire world wide web in a shoe box

All electronic data storage involves encoding data in binary format – a series of zeros and ones – and then recording it on a physical medium. DNA works in a similar way, but is composed of long strands of series of four nucleotides (A, T, C and G) that make up a “code.” While the basic principle may be the same, the two methods differ greatly in terms of efficiency: if all the information currently on the internet was stored in the form of DNA, it would fit in a shoe box!
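That shoe-box claim is easy to sanity-check. Here is a minimal back-of-the-envelope sketch in Python; the figures are rough assumptions of mine (average nucleotide mass of ~330 g/mol, 2 bits per base, ~40 zettabytes for the internet circa 2017), not numbers from EPFL or Twist Bioscience:

```python
# Rough estimate of DNA storage density and how much DNA the internet would need.
AVOGADRO = 6.022e23          # molecules per mole
NUCLEOTIDE_MASS = 330.0      # assumed average mass of one nucleotide, g/mol
BITS_PER_BASE = 2            # A, C, G, T each encode two bits

nucleotides_per_gram = AVOGADRO / NUCLEOTIDE_MASS          # ~1.8e21 bases/g
bytes_per_gram = nucleotides_per_gram * BITS_PER_BASE / 8  # ~4.6e20 B/g (~460 exabytes)

internet_bytes = 40e21                                     # assumed ~40 zettabytes
grams_needed = internet_bytes / bytes_per_gram             # ~90 g of raw DNA

print(f"{bytes_per_gram:.1e} bytes per gram; ~{grams_needed:.0f} g for the internet")
```

On these assumptions the raw DNA comes to well under 100 grams; even with heavy error-correction and encapsulation overhead, that is comfortably shoe-box scale.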

Recent advances in biotechnology now make it possible for humans to do what Mother Nature has always done. Today’s scientists can create artificial DNA strands, “record” any kind of genetic code on them and then analyze them using a sequencer to reconstruct the original data. What’s more, DNA is extraordinarily stable, as evidenced by prehistoric fragments that have been preserved in amber. Artificial strands created by scientists and carefully encapsulated should likewise last for millennia.

To help demonstrate the feasibility of this new method, EPFL’s Metamedia Center provided recordings of two famous songs played at the Montreux Jazz Festival: “Tutu” by Miles Davis, and “Smoke on the Water” by Deep Purple. Twist Bioscience and its research partners encoded the recordings, transformed them into DNA strands and then sequenced and decoded them and played them again – without any reduction in quality.

The amount of artificial DNA strands needed to record the two songs is invisible to the naked eye, and the amount needed to record all 50 years of the Festival’s archives, which have been included in UNESCO’s [United Nations Educational, Scientific and Cultural Organization] Memory of the World Register, would be equal in size to a grain of sand. “Our partnership with EPFL in digitizing our archives aims not only at their positive exploration, but also at their preservation for the next generations,” says Thierry Amsallem, president of the Claude Nobs Foundation. “By taking part in this pioneering experiment which writes the songs into DNA strands, we can be certain that they will be saved on a medium that will never become obsolete!”

A new concept of time

At EPFL’s first-ever ArtTech forum, attendees got to hear the two songs played after being stored in DNA, using a demonstrator developed at the EPFL+ECAL Lab. The system shows that being able to store data for thousands of years is a revolutionary breakthrough that can completely change our relationship with data, memory and time. “For us, it means looking into radically new ways of interacting with cultural heritage that can potentially cut across civilizations,” says Nicolas Henchoz, head of the EPFL+ECAL Lab.

Quincy Jones, a longstanding Festival supporter, is particularly enthusiastic about this technological breakthrough: “With advancements in nanotechnology, I believe we can expect to see people living prolonged lives, and with that, we can also expect to see more developments in the enhancement of how we live. For me, life is all about learning where you came from in order to get where you want to go, but in order to do so, you need access to history! And with the unreliability of how archives are often stored, I sometimes worry that our future generations will be left without such access… So, it absolutely makes my soul smile to know that EPFL, Twist Bioscience and their partners are coming together to preserve the beauty and history of the Montreux Jazz Festival for our future generations, on DNA! I’ve been a part of this festival for decades and it truly is a magnificent representation of what happens when different cultures unite for the sake of music. Absolute magic. And I’m proud to know that the memory of this special place will never be lost.”

A Sept. 29, 2017 Twist Bioscience news release is repetitive in some ways but interesting nonetheless,

Twist Bioscience, a company accelerating science and innovation through rapid, high-quality DNA synthesis, today announced that, working with Microsoft and University of Washington researchers, it has successfully stored archival-quality audio recordings of two important music performances from the archives of the world-renowned Montreux Jazz Festival.

These selections are encoded and stored in nature’s preferred storage medium, DNA, for the first time. These tiny specks of DNA will preserve a part of UNESCO’s Memory of the World Archive, where valuable cultural heritage collections are recorded. This is the first time DNA has been used as a long-term archival-quality storage medium.

Quincy Jones, world-renowned entertainment executive, music composer and arranger, musician and music producer, said, “With advancements in nanotechnology, I believe we can expect to see people living prolonged lives, and with that, we can also expect to see more developments in the enhancement of how we live. For me, life is all about learning where you came from in order to get where you want to go, but in order to do so, you need access to history! And with the unreliability of how archives are often stored, I sometimes worry that our future generations will be left without such access… So, it absolutely makes my soul smile to know that EPFL, Twist Bioscience and others are coming together to preserve the beauty and history of the Montreux Jazz Festival for our future generations, on DNA! … I’ve been a part of this festival for decades and it truly is a magnificent representation of what happens when different cultures unite for the sake of music. Absolute magic. And I’m proud to know that the memory of this special place will never be lost.”

“Our partnership with EPFL in digitizing our archives aims not only at their positive exploration, but also at their preservation for the next generations,” says Thierry Amsallem, president of the Claude Nobs Foundation. “By taking part in this pioneering experiment which writes the songs into DNA strands, we can be certain that they will be saved on a medium that will never become obsolete!”

The Montreux Jazz Digital Project is a collaboration between the Claude Nobs Foundation, curator of the Montreux Jazz Festival audio-visual collection, and the École Polytechnique Fédérale de Lausanne (EPFL) to digitize, enrich, store, show, and preserve this notable legacy created by Claude Nobs, the Festival’s founder.

In this proof-of-principle project, two quintessential music performances from the Montreux Jazz Festival – Smoke on the Water, performed by Deep Purple, and Tutu, performed by Miles Davis – have been encoded onto DNA and read back with 100 percent accuracy. After being decoded, the songs were played on September 29th [2017] at the ArtTech Forum (see below) in Lausanne, Switzerland. Smoke on the Water was selected as a tribute to Claude Nobs, the Montreux Jazz Festival’s founder. The song memorializes a fire and Funky Claude’s rescue efforts at the Casino Barrière de Montreux during a Frank Zappa concert promoted by Claude Nobs. Miles Davis’ Tutu was selected for the role he played in music history and the Montreux Jazz Festival’s success. Miles Davis died in 1991.

“We archived two magical musical pieces from this historic collection on DNA, equating to 140MB of stored data in DNA,” said Karin Strauss, Ph.D., a senior researcher at Microsoft and one of the project’s leaders. “The amount of DNA used to store these songs is much smaller than one grain of sand. Amazingly, storing the entire six-petabyte Montreux Jazz Festival collection would result in DNA smaller than one grain of rice.”

Luis Ceze, Ph.D., a professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, said, “DNA, nature’s preferred information storage medium, is an ideal fit for digital archives because of its durability, density and eternal relevance. Storing items from the Montreux Jazz Festival is a perfect way to show how fast DNA digital data storage is becoming real.”
Nature’s Preferred Storage Medium

Nature selected DNA as its hard drive billions of years ago to encode all the genetic instructions necessary for life. These instructions include all the information necessary for survival. DNA molecules encode information with sequences of discrete units. In computers, these discrete units are the 0s and 1s of “binary code,” whereas in DNA molecules, the units are the four distinct nucleotide bases: adenine (A), cytosine (C), guanine (G) and thymine (T).

“DNA is a remarkably efficient molecule that can remain stable for millennia,” said Bill Peck, Ph.D., chief technology officer of Twist Bioscience. “This is a very exciting project: we are now in an age where we can use the remarkable efficiencies of nature to archive master copies of our cultural heritage in DNA. As we develop the economies of this process, new performances can be added at any time. Unlike current storage technologies, nature’s media will not change and will remain readable through time. There will be no new technology to replace DNA; nature has already optimized the format.”

DNA: Far More Efficient Than a Computer

Each cell within the human body contains approximately three billion base pairs of DNA. With 75 trillion cells in the human body, this equates to the storage of 150 zettabytes (one zettabyte is 10²¹ bytes) of information within each body. By comparison, the largest data centers can span hundreds of thousands to even millions of square feet to hold a comparable amount of stored data.

The Elegance of DNA as a Storage Medium

Like music, which can be widely varied with a finite number of notes, DNA encodes individuality with only four different letters in varied combinations. When using DNA as a storage medium, there are several advantages in addition to the universality of the format and incredible storage density. DNA can be stable for thousands of years when stored in a cool dry place and is easy to copy using polymerase chain reaction (PCR) to create back-up copies of archived material. In addition, because of PCR, small data sets can be targeted and recovered quickly from a large dataset without needing to read the entire file.
How to Store Digital Data in DNA

To encode the music performances into archival storage copies in DNA, Twist Bioscience worked with Microsoft and University of Washington researchers to complete four steps: coding, synthesis/storage, retrieval and decoding. First, the digital files were converted from binary code (0s and 1s) into sequences of A, C, T and G. For purposes of the example, 00 represents A, 10 represents C, 01 represents G and 11 represents T. Twist Bioscience then synthesized the DNA in short segments in the sequence order provided. The short DNA segments each contain about 12 bytes of data as well as a sequence number to indicate their place within the overall sequence. This is the storage step. Finally, to ensure that the file was stored accurately, the sequence was read back, verified for 100 percent accuracy, and decoded from A, C, T or G back into its two-digit binary representation.
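The coding and decoding steps described above are easy to sketch. Here is a minimal Python illustration of the stated mapping (00→A, 10→C, 01→G, 11→T) and the ~12-byte indexed segments; the exact segment and index format is my invention for illustration, not Twist Bioscience’s actual pipeline:

```python
# Two-bit-per-base mapping from the press release's example.
BITS_TO_BASE = {"00": "A", "10": "C", "01": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Convert bytes to a DNA string, two bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(dna: str) -> bytes:
    """Reverse the mapping: nucleotides back to bit pairs, then to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

def segment(dna: str, payload_bases: int = 48):
    """Split a strand into (index, payload) pieces; 48 bases = 12 bytes each."""
    return [(i // payload_bases, dna[i:i + payload_bases])
            for i in range(0, len(dna), payload_bases)]

# Round-trip check, mirroring the "read back for 100 percent accuracy" step.
original = b"Smoke on the Water"
strand = encode(original)
assert decode(strand) == original
```

In the real system each synthesized segment carries its sequence number physically within the DNA itself, so the file can be reassembled in order after sequencing.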
Importantly, to encapsulate and preserve encoded DNA, the collaborators are working with Professor Dr. Robert Grass of ETH Zurich. Grass has developed an innovative technology inspired by preservation of DNA within prehistoric fossils.  With this technology, digital data encoded in DNA remains preserved for millennia.
About UNESCO’s Memory of the World Register
UNESCO established the Memory of the World Register in 1992 in response to a growing awareness of the perilous state of preservation of, and access to, documentary heritage in various parts of the world.  Through its National Commissions, UNESCO prepared a list of endangered library and archive holdings and a world list of national cinematic heritage.
A range of pilot projects employing contemporary technology to reproduce original documentary heritage on other media began. These included, for example, a CD-ROM of the 13th Century Radzivill Chronicle, tracing the origins of the peoples of Europe, and Memoria de Iberoamerica, a joint newspaper microfilming project involving seven Latin American countries. These projects enhanced access to this documentary heritage and contributed to its preservation.
“We are incredibly proud to be a part of this momentous event, with the first archived songs placed into the UNESCO Memory of the World Register,” said Emily Leproust, Ph.D., CEO of Twist Bioscience.
About ArtTech
The ArtTech Foundation, created by renowned scientists and dignitaries from Crans-Montana, Switzerland, wishes to stimulate reflection and support pioneering and innovative projects beyond the known boundaries of culture and science.
Benefitting from the establishment of a favorable environment for the creation of technology companies, the Foundation aims to position itself as key promoter of ideas and innovative endeavors within a landscape of “Culture and Science” that is still being shaped.
Several initiatives, including our annual global platform launched in the spring of 2017, are helping to create a community that brings together researchers, celebrities in the world of culture and the arts, as well as investors and entrepreneurs from Switzerland and across the globe.
About EPFL
EPFL, one of the two Swiss Federal Institutes of Technology, based in Lausanne, is Europe’s most cosmopolitan technical university with students, professors and staff from over 120 nations. A dynamic environment, open to Switzerland and the world, EPFL is centered on its three missions: teaching, research and technology transfer. EPFL works together with an extensive network of partners including other universities and institutes of technology, developing and emerging countries, secondary schools and colleges, industry and economy, political circles and the general public, to bring about real impact for society.
About Twist Bioscience
At Twist Bioscience, our expertise is accelerating science and innovation by leveraging the power of scale. We have developed a proprietary semiconductor-based synthetic DNA manufacturing process featuring a high throughput silicon platform capable of producing synthetic biology tools, including genes, oligonucleotide pools and variant libraries. By synthesizing DNA on silicon instead of on traditional 96-well plastic plates, our platform overcomes the current inefficiencies of synthetic DNA production, and enables cost-effective, rapid, high-quality and high throughput synthetic gene production, which in turn, expedites the design, build and test cycle to enable personalized medicines, pharmaceuticals, sustainable chemical production, improved agriculture production, diagnostics and biodetection. We are also developing new technologies to address large scale data storage. For more information, please visit www.twistbioscience.com. Twist Bioscience is on Twitter. Sign up to follow our Twitter feed @TwistBioscience at https://twitter.com/TwistBioscience.

If you hadn’t read the EPFL press release first, it might have taken a minute to figure out why EPFL is being mentioned in the Twist Bioscience news release. Presumably someone was rushing to make a deadline. Ah well, I’ve seen and written worse.

I haven’t been able to find any video or audio recordings of the DNA-preserved performances but there is an informational video (originally published July 7, 2016) from Microsoft and the University of Washington describing the DNA-based technology,

I also found this description of listening to the DNA-preserved music in an Oct. 6, 2017 blog posting for the Canadian Broadcasting Corporation’s (CBC) Day 6 radio programme,

To listen to them, one must first suspend the DNA holding the songs in a solution. Next, one can use a DNA sequencer to read the letters of the bases forming the molecules. Then, algorithms can determine the digital code those letters form. From that code, comes the music.

It’s complicated but Ceze says his team performed this process without error.

You can find out more about UNESCO’s Memory of the World and its register here, more about the EPFL+ECAL Lab here, and more about Twist Bioscience here.

Machine learning programs learn bias

The notion of bias in artificial intelligence (AI)/algorithms/robots is gaining prominence (links to other posts featuring algorithms and bias are at the end of this post). The latest research concerns machine learning where an artificial intelligence system trains itself with ordinary human language from the internet. From an April 13, 2017 American Association for the Advancement of Science (AAAS) news release on EurekAlert,

As artificial intelligence systems “learn” language from existing texts, they exhibit the same biases that humans do, a new study reveals. The results not only provide a tool for studying prejudicial attitudes and behavior in humans, but also emphasize how language is intimately intertwined with historical biases and cultural stereotypes. A common way to measure biases in humans is the Implicit Association Test (IAT), where subjects are asked to pair two concepts they find similar, in contrast to two concepts they find different; their response times can vary greatly, indicating how well they associated one word with another (for example, people are more likely to associate “flowers” with “pleasant,” and “insects” with “unpleasant”). Here, Aylin Caliskan and colleagues developed a similar way to measure biases in AI systems that acquire language from human texts; rather than measuring lag time, however, they used the statistical number of associations between words, analyzing roughly 2.2 million words in total. Their results demonstrate that AI systems retain biases seen in humans. For example, studies of human behavior show that the exact same resume is 50% more likely to result in an opportunity for an interview if the candidate’s name is European American rather than African-American. Indeed, the AI system was more likely to associate European American names with “pleasant” stimuli (e.g. “gift,” or “happy”). In terms of gender, the AI system also reflected human biases, where female words (e.g., “woman” and “girl”) were more associated than male words with the arts, compared to mathematics. In a related Perspective, Anthony G. Greenwald discusses these findings and how they could be used to further analyze biases in the real world.

There are more details about the research in this April 13, 2017 Princeton University news release on EurekAlert (also on ScienceDaily),

In debates over the future of artificial intelligence, many experts think of the new systems as coldly logical and objectively rational. But in a new study, researchers have demonstrated how machines can be reflections of us, their creators, in potentially problematic ways. Common machine learning programs, when trained with ordinary human language available online, can acquire cultural biases embedded in the patterns of wording, the researchers found. These biases range from the morally neutral, like a preference for flowers over insects, to the objectionable views of race and gender.

Identifying and addressing possible bias in machine learning will be critically important as we increasingly turn to computers for processing the natural language humans use to communicate, for instance in doing online text searches, image categorization and automated translations.

“Questions about fairness and bias in machine learning are tremendously important for our society,” said researcher Arvind Narayanan, an assistant professor of computer science and an affiliated faculty member at the Center for Information Technology Policy (CITP) at Princeton University, as well as an affiliate scholar at Stanford Law School’s Center for Internet and Society. “We have a situation where these artificial intelligence systems may be perpetuating historical patterns of bias that we might find socially unacceptable and which we might be trying to move away from.”

The paper, “Semantics derived automatically from language corpora contain human-like biases,” was published April 14 [2017] in Science. Its lead author is Aylin Caliskan, a postdoctoral research associate and a CITP fellow at Princeton; Joanna Bryson, a reader at the University of Bath and a CITP affiliate, is a coauthor.

As a touchstone for documented human biases, the study turned to the Implicit Association Test, used in numerous social psychology studies since its development at the University of Washington in the late 1990s. The test measures response times (in milliseconds) by human subjects asked to pair word concepts displayed on a computer screen. Response times are far shorter, the Implicit Association Test has repeatedly shown, when subjects are asked to pair two concepts they find similar, versus two concepts they find dissimilar.

Take flower types, like “rose” and “daisy,” and insects like “ant” and “moth.” These words can be paired with pleasant concepts, like “caress” and “love,” or unpleasant notions, like “filth” and “ugly.” People more quickly associate the flower words with pleasant concepts, and the insect terms with unpleasant ideas.

The Princeton team devised an experiment using a program that essentially functioned as a machine learning version of the Implicit Association Test. Called GloVe, and developed by Stanford University researchers, the popular, open-source program is of the sort that a startup machine learning company might use at the heart of its product. The GloVe algorithm can represent the co-occurrence statistics of words in, say, a 10-word window of text. Words that often appear near one another have a stronger association than words that seldom do.

The Stanford researchers turned GloVe loose on a huge trawl of contents from the World Wide Web, containing 840 billion words. Within this large sample of written human culture, Narayanan and colleagues then examined sets of so-called target words, like “programmer, engineer, scientist” and “nurse, teacher, librarian” alongside two sets of attribute words, such as “man, male” and “woman, female,” looking for evidence of the kinds of biases humans can unwittingly possess.
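The association measure described above, comparing how close target words sit to two sets of attribute words in the embedding space, can be sketched with cosine similarity. The tiny hand-made 3-dimensional “embeddings” below are purely illustrative stand-ins for the 300-dimensional GloVe vectors the study actually used, and the function names are invented for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word_vec, pleasant, unpleasant):
    """Mean similarity to the 'pleasant' set minus mean similarity to the 'unpleasant' set."""
    return (sum(cosine(word_vec, p) for p in pleasant) / len(pleasant)
            - sum(cosine(word_vec, u) for u in unpleasant) / len(unpleasant))

# Hand-made toy vectors: flowers cluster near 'pleasant', insects near 'unpleasant'.
flowers = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
insects = [[0.1, 0.9, 0.0], [0.2, 0.8, 0.1]]
pleasant = [[1.0, 0.0, 0.0]]
unpleasant = [[0.0, 1.0, 0.0]]

flower_score = sum(association(w, pleasant, unpleasant) for w in flowers) / len(flowers)
insect_score = sum(association(w, pleasant, unpleasant) for w in insects) / len(insects)
assert flower_score > insect_score  # the embedding 'prefers' flowers, mirroring the IAT result
```

The real study applies the same idea at scale: instead of human response times, the bias score is computed from distances between word vectors learned from the 840-billion-word web corpus.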

In the results, innocent, inoffensive biases, like for flowers over bugs, showed up, but so did examples along lines of gender and race. As it turned out, the Princeton machine learning experiment managed to replicate the broad substantiations of bias found in select Implicit Association Test studies over the years that have relied on live, human subjects.

For instance, the machine learning program associated female names more with familial attribute words, like “parents” and “wedding,” than male names. In turn, male names had stronger associations with career attributes, like “professional” and “salary.” Of course, results such as these are often just objective reflections of the true, unequal distributions of occupation types with respect to gender–like how 77 percent of computer programmers are male, according to the U.S. Bureau of Labor Statistics.

Yet this correctly distinguished bias about occupations can end up having pernicious, sexist effects. One example arises when foreign languages are naively processed by machine learning programs, producing gender-stereotyped sentences. The Turkish language uses a gender-neutral, third person pronoun, “o.” Plugged into the well-known, online translation service Google Translate, however, the Turkish sentences “o bir doktor” and “o bir hemşire” with this gender-neutral pronoun are translated into English as “he is a doctor” and “she is a nurse.”

“This paper reiterates the important point that machine learning methods are not ‘objective’ or ‘unbiased’ just because they rely on mathematics and algorithms,” said Hanna Wallach, a senior researcher at Microsoft Research New York City, who was not involved in the study. “Rather, as long as they are trained using data from society and as long as society exhibits biases, these methods will likely reproduce these biases.”

Another objectionable example harkens back to a well-known 2004 paper by Marianne Bertrand of the University of Chicago Booth School of Business and Sendhil Mullainathan of Harvard University. The economists sent out close to 5,000 identical resumes to 1,300 job advertisements, changing only the applicants’ names to be either traditionally European American or African American. The former group was 50 percent more likely to be offered an interview than the latter. In an apparent corroboration of this bias, the new Princeton study demonstrated that a set of African American names had more unpleasantness associations than a European American set.

Computer programmers might hope to prevent cultural stereotype perpetuation through the development of explicit, mathematics-based instructions for the machine learning programs underlying AI systems. Not unlike how parents and mentors try to instill concepts of fairness and equality in children and students, coders could endeavor to make machines reflect the better angels of human nature.

“The biases that we studied in the paper are easy to overlook when designers are creating systems,” said Narayanan. “The biases and stereotypes in our society reflected in our language are complex and longstanding. Rather than trying to sanitize or eliminate them, we should treat biases as part of the language and establish an explicit way in machine learning of determining what we consider acceptable and unacceptable.”

Here’s a link to and a citation for the Princeton paper,

Semantics derived automatically from language corpora contain human-like biases by Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. Science 14 Apr 2017: Vol. 356, Issue 6334, pp. 183-186. DOI: 10.1126/science.aal4230

This paper appears to be open access.

Links to more cautionary posts about AI,

Aug 5, 2009: Autonomous algorithms; intelligent windows; pretty nano pictures

June 14, 2016:  Accountability for artificial intelligence decision-making

Oct. 25, 2016 Removing gender-based stereotypes from algorithms

March 1, 2017: Algorithms in decision-making: a government inquiry in the UK

There’s also a book which makes some of the current use of AI programmes and big data quite accessible reading: Cathy O’Neil’s ‘Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy’.

Simon Fraser University (Vancouver, Canada) and its president’s (Andrew Petter) dream colloquium: big data

They have a ‘big data’ start to 2016 planned for the President’s (Andrew Petter at Simon Fraser University [SFU] in Vancouver, Canada) Dream Colloquium according to a Jan. 5, 2016 news release,

Big data explained: SFU launches spring 2016 President’s Dream Colloquium

Speaker series tackles history, use and implications of collecting data


Canadians experience and interact with big data on a daily basis. Some interactions are as simple as buying coffee or as complex as filling out the Canadian government’s mandatory long-form census. But while big data may be one of the most important technological and social shifts in the past five years, many experts are still grappling with what to do with the massive amounts of information being gathered every day.


To help understand the implications of collecting, analyzing and using big data, Simon Fraser University is launching the President’s Dream Colloquium on Engaging Big Data on Tuesday, January 5.


“Big data affects all sectors of society from governments to businesses to institutions to everyday people,” says Peter Chow-White, SFU Associate Professor of Communication. “This colloquium brings together people from industry and scholars in computing and social sciences in a dialogue around one of the most important innovations of our time next to the Internet.”


This spring marks the first President’s Dream Colloquium where all faculty and guest lectures will be available to the public. The speaker series will give a historical overview of big data, specific case studies in how big data is used today and discuss what the implications are for this information’s usage in business, health and government in the future.


The series includes notable guest speakers such as managing director of Microsoft Research, Surajit Chaudhuri, and Tableau co-founder Pat Hanrahan.  


“Pat Hanrahan is a leader in a number of sectors and Tableau is a leader in accessing big data through visual analytics,” says Chow-White. “Rather than big data being available to only a small amount of professionals, Tableau makes it easier for everyday people to access and understand it in a visual way.”


The speaker series is free to attend with registration. Lectures will be webcast live and available on the President’s Dream Colloquium website.

  • By 2020, over 1/3 of all data will live in or pass through the cloud.
  • Data production will be 44 times greater in 2020 than it was in 2009.
  • More than 70 percent of the digital universe is generated by individuals. But enterprises have responsibility for the storage, protection and management of 80 percent of that.

(Statistics provided by CSC)

The course features lectures from notable guest speakers including:

  • Sasha Issenberg, Author and Journalist
    Tuesday, January 12, 2016
  • Surajit Chaudhuri, Scientist and Managing Director of XCG (Microsoft Research)
    Tuesday, January 19, 2016
  • Pat Hanrahan, Professor at the Stanford Computer Graphics Laboratory, Cofounder and Chief Scientist of Tableau, Founding member of Pixar
    Wednesday, February 3, 2016
  • Sheelagh Carpendale, Professor of Computing Science University of Calgary, Canada Research Chair in Information Visualization
    Tuesday, February 23, 2016, 3:30pm
  • Colin Hill, CEO of GNS Healthcare
    Tuesday, March 8, 2016
  • Chad Skelton, Award-winning Data Journalist and Consultant
    Tuesday, March 22, 2016

Not to worry, even though the first talk with Sasha Issenberg and Mark Pickup (strangely, he’s [Pickup is an SFU professor of political science] not mentioned in the news release or on the event page) has taken place, a webcast is being posted to the event page here.

I watched the first event live (via a livestream webcast which I accessed by clicking on the link found on the Event’s Speaker’s page) and found it quite interesting although I’m not sure about asking Issenberg to speak extemporaneously. He rambled and offered more detail about things that don’t matter much to a Canadian audience. I couldn’t tell if part of the problem might lie with the fact that his ‘big data’ book (The Victory Lab: The Secret Science of Winning Campaigns) was published a while back and he’s since published one on medical tourism and is about to publish one on same sex marriages and the LGBTQ communities in the US. As someone else who moves from topic to topic, I know it’s an effort to ‘go back in time’ and to remember the details and to recapture the enthusiasm that made the piece interesting.  Also, he has yet to get the latest scoop on big data and politics in the US as embarking on the 2016 campaign trail won’t take place until sometime later in January.

So, thanks to Issenberg for managing to dredge up as much as he did. Happily, he did recognize that there are differences between Canada and the US and the type of election data that is gathered and other data that can accessed. He provided a capsule version of the data situation in the US where they can identify individuals and predict how they might vote, while Pickup focused on the Canadian scene. As one expects from Canadian political parties and Canadian agencies in general, no one really wants to share how much information they can actually access (yes, that’s true of the Liberals and the NDP [New Democrats] too). By contrast, political parties and strategists in the US quite openly shared information with Issenberg about where and how they get data.

Pickup made some interesting points about data and how more data does not lead to better predictions. There was one study done on psychologists, which Pickup replicated with undergraduate political science students. The psychologists and the political science students in the two separate studies were given data and asked to predict behaviour. They were then given more data about the same individuals and asked again to predict behaviour. In all, there were four sessions where the subjects were given successively more data and asked to predict behaviour based on that data. You may have already guessed: prediction accuracy decreased each time more information was added. Conversely, the people making the predictions became more confident as their predictive accuracy declined. A little disconcerting, non?

Pickup made another point noting that it may be easier to use big data to predict voting behaviour in a two-party system such as they have in the US but a multi-party system such as we have in Canada offers more challenges.

So, it was a good beginning and I look forward to more in the coming weeks (President’s Dream Colloquium on Engaging Big Data). Remember if you can’t listen to the live session, just click through to the event’s speaker’s page where they have hopefully posted the webcast.

The next dream colloquium takes place Tuesday, Jan. 19, 2016,

Big Data since 1854

Dr. Surajit Chaudhuri, Scientist and Managing Director of XCG (Microsoft Research)
Stanford University, PhD
Tuesday, January 19, 2016, 3:30–5 pm
IRMACS Theatre, ASB 10900, Burnaby campus [or by webcast]


Digital life in Estonia and the National Film Board of Canada’s ‘reclaim control of your online identity’ series

Internet access is considered a human right in Estonia (according to a July 1, 2008 story by Colin Woodard for the Christian Science Monitor). That commitment has led to some very interesting developments in Estonia which are being noticed internationally. The Woodrow Wilson International Center for Scholars (Wilson Center) is hosting the president of Estonia, Toomas Hendrik Ilves at an April 21, 2015 event (from the April 15, 2015 event invitation),

The Estonia Model: Why a Free and Secure Internet Matters
After regaining independence in 1991, the Republic of Estonia built a new government from the ground up. The result was the world’s most comprehensive and efficient ‘e-government’: a digital administration with online IDs for every citizen, empowered by a free nationwide Wi-Fi network and a successful school program–called Tiger Leap–that boosts tech competence at every age level. While most nations still struggle to provide comprehensive Internet access, Estonia has made major progress towards a strong digital economy, along with robust protections for citizen rights. E-government services have made Estonia one of the world’s most attractive environments for tech firms and start-ups, incubating online powerhouses like Skype and Transferwise.

An early adopter of information technology, Estonia was also one of the first victims of a cyber attack. In 2007, large-scale Distributed Denial of Service attacks took place, mostly against government websites and financial services. The damages of these attacks were not remarkable, but they did give the country’s security experts  valuable experience and information in dealing with such incidents. Eight years on, the Wilson Center is pleased to welcome Estonia’s President Toomas Hendrik Ilves for a keynote address on the state of cybersecurity, privacy, and the digital economy. [emphasis mine]

The Honorable Jane Harman
Director, President and CEO, The Wilson Center

His Excellency Toomas Hendrik Ilves
President of the Republic of Estonia

The event is being held in Washington, DC from 1 – 2 pm EST on April 21, 2015. There does not seem to be a webcast option for viewing the presentation online (a little ironic, non?). You can register here, should you be able to attend.

I did find a little more information about Estonia and its digital adventures, much of it focused on digital economy, in an Oct. 8, 2014 article by Lily Hay Newman for Slate,

Estonia is planning to be the first country to offer a status called e-residency. The program’s website says, “You can become an e-Estonian!” …

The website says that anyone can apply to become an e-resident and receive an e-Estonian online identity “in order to get secure access to world-leading digital services from wherever you might be.” …

You can’t deny that the program has a compelling marketing pitch, though. It’s “for anybody who wants to run their business and life in the most convenient aka digital way!”

You can find the Estonian e-residency website here. There’s also a brochure describing the benefits,

It is especially useful for entrepreneurs and others who already have some relationship to Estonia: who do business, work, study or visit here but have not become a resident. However, e-residency is also launched as a platform to offer digital services to a global audience with no prior Estonian affiliation – for anybody who wants to run their business and life in the most convenient aka digital way! We plan to keep adding new useful services from early 2015 onwards.

I also found an Oct. 31, 2013 blog post by Peter Herlihy on the gov.uk website for the UK’s Government Digital Service (GDS). Herlihy offers the perspective of a government bureaucrat (Note: A link has been removed),

I’ve just got back from a few days in the Republic of Estonia, looking at how they deliver their digital services and sharing stories of some of the work we are up to here in the UK. We have an ongoing agreement with the Estonian government to work together and share knowledge and expertise, and that is what brought me to the beautiful city of Tallinn.

I knew they were digitally sophisticated. But even so, I wasn’t remotely prepared for what I learned.

Estonia has probably the most joined up digital government in the world. Its citizens can complete just about every municipal or state service online and in minutes. You can formally register a company and start trading within 18 minutes, all of it from a coffee shop in the town square. You can view your educational record, medical record, address, employment history and traffic offences online – and even change things that are wrong (or at least directly request changes). The citizen is in control of their data.

So we should do whatever they’re doing then, right? Well, maybe. …

National Film Board of Canada

There’s a new series being debuted this week about reclaiming control of your life online and titled: Do Not Track according to an April 14, 2015 post on the National Film Board of Canada (NFB) blog (Note: Links have been removed),

An eye-opening personalized look at how online data is being tracked and sold.

Starting April 14 [2015], the online interactive documentary series Do Not Track will show you just how much the web knows about you―and the results may astonish you.

Conceived and directed by acclaimed Canadian documentary filmmaker and web producer Brett Gaylor, the 7-part series Do Not Track is an eye-opening look at how online behaviour is being tracked, analyzed and sold―an issue affecting each of us, and billions of web users around the world.

Created with the goal of helping users learn how to take back control of their digital identity, Do Not Track goes beyond a traditional documentary film experience: viewers who agree to share their personal data are offered an astounding real-time look at how their online ID is being tracked.

Do Not Track is a collective investigation, bringing together public media broadcasters, writers, developers, thinkers and independent media makers, including Gaylor, Vincent Glad, Zineb Dryef, Richard Gutjahr, Sandra Rodriguez, Virginie Raisson and the digital studio Akufen.

Do Not Track episodes launch every 2 weeks, from April 14 to June 9, 2015, in English, French and German. Roughly 7 minutes in length, each episode has a different focus―from our mobile phones to social networks, targeted advertising to big data with a different voice and a different look, all coupled with sharp and varied humour. Episodes are designed to be clear and accessible to all.

You can find Do Not Track here, episode descriptions from the April 14, 2015 posting,

April 14 | Episode 1: Morning Rituals
This episode introduces viewers to Brett Gaylor and offers a call to action: let’s track the trackers together.

Written and directed by Brett Gaylor

Interviews: danah boyd, principal researcher, Microsoft Research; Nathan Freitas, founder, and Harlo Holmes, software developer, The Guardian Project; Ethan Zuckerman, director, MIT Center for Civic Media*

April 14 | Episode 2: Breaking Ad
We meet the man who invented the Internet pop-up ad―and a woman who’s spent nearly a decade reporting on the web’s original sin: advertising.

Directed by Brett Gaylor | Written by Vincent Glad

Interviews: Ethan Zuckerman; Julia Angwin, journalist and author of Dragnet Nation: A Quest for Privacy, Security, and Freedom in a World of Relentless Surveillance*

April 28 | Episode 3: The Harmless Data We Leave on Social Media
This episode reveals how users can be tracked from Facebook activity and how far-reaching the data trail is.

Directed by Brett Gaylor | Written by Sandra Marsh | Hosted by Richard Gutjahr

Interviews: Constanze Kurz, writer and computer scientist, Chaos Computer Club

May 12 | Episode 4: Your Mobile Phone, the Spy
Your smartphone is spying on you—where does all this data go, what becomes of it, and how is it used?

Directed by Brett Gaylor | Written and hosted by Zineb Dryef

Interviews: Harlo Holmes; Rand Hindi, data scientist and founder of Snips*

May 26 | Episode 5: Big Data and Its Algorithms
There’s an astronomical quantity of data that may or may not be used against us. Based on the information collected since the start of this documentary, users discover the algorithmic interpretation game and its absurdity.

Directed by Sandra Rodriguez and Akufen | Written by Sandra Rodriguez

Interviews: Kate Crawford, principal researcher, Microsoft Research New York City; Matthieu Dejardins, e-commerce entrepreneur and CEO, NextUser; Tyler Vigen, founder, Spurious Correlations, and Joint Degree Candidate, Harvard Law School; Cory Doctorow, science fiction novelist, blogger and technology activist; Alicia Garza, community organizer and co-founder, #BlackLivesMatter; Yves-Alexandre De Montjoye, computational privacy researcher, Massachusetts Institute of Technology Media Lab*

June 9 | Episode 6: Filter Bubble
The Internet uses filters based on your browsing history, narrowing down the information you get―until you’re painted into a digital corner.

Written and directed by Brett Gaylor*

June 9 | Episode 7:  The Future of Tracking
Choosing to protect our privacy online today will dramatically shape our digital future. What are our options?

Directed by Brett Gaylor | Written by Virginie Raisson

Interviews: Cory Doctorow