Book Review: AIQ – How People and Machines Are Smarter Together

The book is written by Nick Polson and James Scott, professors at the Chicago Booth School of Business and the University of Texas at Austin, respectively. Overall, the authors are optimistic about the use of AI, and they believe that these technologies will bring immense benefits. But these technologies will also, inevitably, reflect our weak spots as a civilization. As a result, there will be dangers to watch out for, whether to privacy, to equality, to existing institutions, or to something nobody has even thought of yet. As the authors state, we must meet these dangers with smart policy.

The twenty main takeaways I got out of this book are:

When you hear ‘AI’, don’t think of a droid. Think of an algorithm. There are two distinctive features of the algorithms used in AI. First, these algorithms typically deal with probabilities rather than certainties. Second, there is the question of how these algorithms ‘know’ what instructions to follow. In AI, the role of the programmer isn’t to tell the algorithm what to do. It is to tell the algorithm how to train itself what to do, using data and the rules of probability.
Most of the big ideas in AI are actually old – in many cases, centuries old – and our ancestors have been using them to solve problems for generations. So, in many ways, the big historical puzzle is not why AI is happening now, but why it did not happen a long time ago. There are three technological forces that explain this: 1) the decades-long exponential growth in the speed of computers, usually known as Moore’s law; 2) the new Moore’s law associated with the explosive growth in the amount of data available (more data means smarter algorithms); and 3) cloud computing, which is nearly invisible to consumers, but which has had an enormous democratizing effect on AI. When you put these three trends together with good ideas, you get a supernova-like explosion in both demand and capacity for using AI to solve real problems.
The availability heuristic is the mental shortcut in which people evaluate the plausibility of a claim by relying on whatever immediate examples happen to pop into their minds. In the case of AI, those examples are mostly from science fiction, and they are mostly evil.
The key algorithms of the future are about suggestions, not search. To a learning machine, ‘personalization’ means ‘conditional probability’. And in math, a conditional probability is the chance that one thing happens, given that some other thing has already happened. Personalization runs on conditional probabilities, all of which must be estimated from massive data sets in which you are the conditioning event. If the future of digital life is about suggestions rather than search, then the future is also, inevitably, about conditional probability.
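To make the conditional-probability idea concrete, here is a minimal Python sketch of my own (it is not taken from the book). It estimates the chance that a subscriber likes one film given that they liked another, simply by counting. The film titles and viewing histories are invented for illustration, and real recommender systems work with far larger and sparser data.

```python
# Hypothetical viewing data: each set holds the films one subscriber liked.
# Titles and histories are invented for illustration only.
liked = [
    {"Alien", "Blade Runner", "Arrival"},
    {"Alien", "Arrival"},
    {"Alien", "Notting Hill"},
    {"Notting Hill", "Love Actually"},
    {"Blade Runner", "Arrival"},
]


def conditional_probability(histories, target, given):
    """Estimate P(likes target | likes given) by counting subscribers."""
    with_given = [h for h in histories if given in h]
    if not with_given:
        return None  # the conditioning event never occurs in the data
    both = sum(1 for h in with_given if target in h)
    return both / len(with_given)


# Unconditional share of subscribers who liked "Arrival".
p_arrival = sum(1 for h in liked if "Arrival" in h) / len(liked)

# Same question, restricted to subscribers who liked "Alien".
p_arrival_given_alien = conditional_probability(liked, "Arrival", "Alien")

print(f"P(likes Arrival)               = {p_arrival:.2f}")
print(f"P(likes Arrival | likes Alien) = {p_arrival_given_alien:.2f}")
```

In this toy data, knowing that someone liked "Alien" raises the estimated chance that they like "Arrival", which is the sense in which the subscriber's own history acts as the conditioning event.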
The real problem that Netflix faces nowadays comes down to three main issues: 1) scale (its ratings matrix has more than a trillion possible entries); 2) missingness (most subscribers have not watched most films); and 3) combinatorial explosion (the varieties of film-liking experience are, for all practical purposes, infinite). The solution to all three issues is careful modeling.
Every citizen in the twenty-first century must understand some basic facts about AI and data science. If education, as Thomas Jefferson said, is the cornerstone of democracy, then when it comes to digital technology, our democratic walls are falling down.
Patient-centered social networks like Crohnology, Tiatros, or PatientsLikeMe run on personalization algorithms, too, just like Netflix and Facebook. Patients see them as an important resource for suggestions about treatments and lifestyle changes, while researchers see them as a valuable repository of real-world medical data that can be used to make those suggestions even better. Perhaps the most exciting work of all is happening in cancer research, specifically something called ‘targeted therapy’: it is now common for doctors to test a sample of a patient’s tumor for specific genes and proteins and to choose a cancer drug accordingly.
Whatever the field of knowledge, being smart means knowing lots of patterns – knowing how to match an input with the appropriate output. In AI, a pattern is a prediction rule that maps an input to an expected output, and ‘learning a pattern’ means fitting a good prediction rule to a data set.
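As a small illustration of fitting a prediction rule to a data set, the Python sketch below fits a straight line to four input-output pairs using ordinary least squares. This is my own example rather than the authors', the data points are made up, and a straight line is only the simplest possible kind of prediction rule.

```python
# Hypothetical data set of (input, output) pairs; the numbers are made up.
data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]


def fit_line(pairs):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(data)


def predict(x):
    """The fitted prediction rule: map an input to an expected output."""
    return slope * x + intercept


print(f"rule: y = {slope:.2f} * x + {intercept:.2f}")
print(f"prediction for x = 5: {predict(5.0):.2f}")
```

Once the rule is fitted, `predict` can be applied to inputs that never appeared in the data, which is the whole point of learning a pattern.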
A ‘neural network’ is nothing more than a brilliant piece of marketing. In reality, it is just a very complicated equation that is capable of describing very complicated patterns in data – that is, very complicated mappings from inputs to outputs. There are four factors driving the recent breakthroughs in neural networks: 1) massive models; 2) massive data (which means lots of experience); 3) trial and error, a million times per second (this model-fitting strategy is used everywhere – it allows large retailers, for example, to predict what you want to buy online before you even want to buy it); and 4) deep learning (a ‘deep neural network’ is just a complicated equation with lots of parameters, structured in a way that extracts as much information as possible from a specific kind of input).
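To show what "just a complicated equation" can look like, here is a toy network written out in Python. It is a sketch of my own, not code from the book: two inputs, two hidden units, one output, and nine parameters whose values I picked by hand. In a real system those parameters would be found by the trial-and-error fitting described above, and there would be millions of them.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tiny_network(x1, x2, params):
    """A two-input network with two hidden units, written as one equation."""
    h1 = sigmoid(params["w11"] * x1 + params["w12"] * x2 + params["b1"])
    h2 = sigmoid(params["w21"] * x1 + params["w22"] * x2 + params["b2"])
    return sigmoid(params["v1"] * h1 + params["v2"] * h2 + params["c"])


# Hypothetical parameter values, picked by hand rather than learned from data.
params = {
    "w11": 1.0, "w12": -1.0, "b1": 0.0,
    "w21": 0.5, "w22": 0.5, "b2": -0.5,
    "v1": 2.0, "v2": -1.0, "c": 0.0,
}

output = tiny_network(1.0, 0.0, params)
print(f"output = {output:.3f}")
```

Changing any parameter changes the mapping from inputs to output, and "training" means searching for the parameter values that make that mapping match the data.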
In AI, SLAM stands for ‘simultaneous localization and mapping’. This is a crucial concept for understanding self-driving cars, which use Bayes’s rule to operate. All probabilities are really conditional probabilities. In other words, all probabilities are contingent upon what we know. When our knowledge changes, our probabilities must change, too – and Bayes’s rule tells us how to change them. SLAM is an inherently Bayesian problem – new sensor data arrives and a robot car must update its ‘mental map’ of the surrounding environment (lane markers, intersections, traffic lights, stop signs, and all other vehicles on the road). Has the last Californian to hold a driver’s license already been born?
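The updating step can be sketched in a few lines of Python. This is a deliberately simplified example of my own: it covers only the localization half of SLAM, on an imaginary road of five cells, and the sensor accuracy figures are assumptions rather than numbers from the book.

```python
def bayes_update(prior, likelihood):
    """Apply Bayes's rule over a discrete set of positions.

    posterior[i] is proportional to prior[i] * likelihood[i].
    """
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical one-dimensional road with five cells; cells 1 and 3 have a stop sign.
has_stop_sign = [False, True, False, True, False]

# Before any sensor reading, the car is equally unsure about every cell.
prior = [0.2, 0.2, 0.2, 0.2, 0.2]

# Assumed sensor model: it reports "stop sign" with probability 0.9 when one
# is present and with probability 0.1 when none is.
likelihood = [0.9 if sign else 0.1 for sign in has_stop_sign]

# The sensor has just reported a stop sign, so update the beliefs.
posterior = bayes_update(prior, likelihood)
print([round(p, 3) for p in posterior])
```

After the reading, most of the probability sits on the two cells with stop signs. A real car repeats this kind of update many times per second, over far richer maps and sensor models.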
Bayes’s rule is far bigger than self-driving cars – in fact, in terms of its applicability to everyday life, it is one of the most powerful equations ever discovered. It is a perfect mathematical dose of antidogmatism that tells us when to be skeptical and when to be open-minded… You might never in your life actually sit down with pencil and paper to work through the math of Bayes’s rule, and that is totally fine. The point is that, even if you don’t, learning to think about the world a bit like a Bayesian car – in terms of priors, data, and how to combine them – can help you be a wiser person (getting a second opinion on a personal medical diagnosis is a good example of that). Biologists use it to help understand the roles of our genes in explaining cancer; astronomers use it to find planets orbiting other stars in the outer fringes of our galaxy; it is used to detect doping at the Olympics; to filter spam from your inbox; and to help quadriplegics control robot arms directly with their minds – just like Luke Skywalker.
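For readers who do want to work through the math once, here is a short Python sketch of the second-opinion example. The prevalence and test accuracy figures are hypothetical numbers I chose for illustration; they do not come from the book or from any real test. The sketch also assumes that the two test results are independent given the patient's true condition.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes's rule."""
    true_positives = prior * sensitivity
    false_positives = (1.0 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical numbers: 1% of patients have the disease, the test catches
# 90% of true cases, and 5% of healthy patients test positive anyway.
prior = 0.01
after_first_test = posterior_probability(prior, 0.90, 0.05)

# A second, independent positive result uses the first posterior as its prior.
after_second_test = posterior_probability(after_first_test, 0.90, 0.05)

print(f"after one positive test:  {after_first_test:.3f}")
print(f"after two positive tests: {after_second_test:.3f}")
```

With these assumed numbers, a single positive result leaves the disease fairly unlikely because the condition is rare to begin with, while a second positive result shifts the odds considerably. That is the prior-plus-data reasoning the authors recommend.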
Experts in AI use the term ‘natural language processing’, or NLP, to describe how we get computers to work with language. So, if you want to understand a future with language-aware AI systems, then the interesting question is not about the sometimes laughable mistakes these machines make. Rather, it is how they have learned to listen, to speak, and even to write so effectively… As a field, NLP shifted its focus from understanding to mimicry – from knowing how, to knowing what. Language became a prediction-rule problem based on input-output pairs.
Jorge Luis Borges once wrote a story called ‘The Library of Babel’, about a library whose books contained all possible works of prose (all possible orderings of the letters of the alphabet and the basic punctuation marks). Our real-life Library of Babel is called the internet, and while we are not in Borges territory yet, we are getting closer… This is not because your phone somehow understands the meaning of words. There is no meaning involved, just a rich set of context-specific probabilities for basically every English word and phrase ever uttered on the internet… Google Ngram Viewer lets you track the popularity of any word or short phrase across all published books in English.
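A bare-bones version of "context-specific probabilities" can be sketched in Python by counting which word follows which. This is my own toy example with a made-up three-sentence corpus. It is a simple bigram count, not the method any particular phone or search engine uses.

```python
from collections import Counter, defaultdict

# Hypothetical miniature corpus; real systems count over billions of words.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
words = corpus.split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    following[current][nxt] += 1


def next_word_probabilities(word):
    """Return P(next word | current word) estimated from the counts."""
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


print(next_word_probabilities("the"))
print(next_word_probabilities("sat"))
```

Nothing in this code knows what a cat is. It only knows that, in this corpus, "cat" follows "the" more often than any other word does.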
Google’s word2vec model provides a numerical ‘vector’ description of every word in English. If you understand word2vec, then you understand one of the most fundamental AI ideas of the last decade. Even systems that don’t use this algorithm directly still use the same underlying approach. It answers a simple question: How do we turn words into numbers, so that words with a similar meaning have similar numbers? Once you have turned words into vectors, you can do math with them. This is essential to building AI systems for language. Computers do not understand words, but they understand math.
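Here is a minimal Python sketch of what "doing math with words" can mean. The three-dimensional vectors below are hand-made by me purely for illustration. They are not real word2vec output, which is learned from large amounts of text and typically has hundreds of dimensions. The sketch only shows the arithmetic and the similarity comparison.

```python
import math

# Hand-made 3-dimensional vectors, invented for illustration. Real word2vec
# vectors are learned from text and typically have hundreds of dimensions.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(target, exclude):
    """Return the vocabulary word whose vector is most similar to target."""
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine_similarity(candidates[w], target))


# "king" - "man" + "woman", computed component by component.
analogy = [
    k - m + w
    for k, m, w in zip(vectors["king"], vectors["man"], vectors["woman"])
]

print(nearest(analogy, exclude={"king", "man", "woman"}))
```

With these toy vectors the nearest remaining word is "queen". The same subtract-and-add arithmetic is what people usually mean when they describe analogies in a learned word-vector space.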
Talking machines do not give us a window into some new kind of linguistic genius. They hold a mirror up to our own. Language models will become personalized – the machines around you will adapt to the way you speak, just as they adapt to your movie-watching preferences.
Smart cities: Big N, Big D – The Mayor’s Office of Data Analytics (MODA) in New York City presents an important fact about the intersection of big data with AI. Big data sets do not get that way merely because of the number of data points they have, but also because of a ‘big D’, associated with the number of details recorded about each data point.
We are likely still years away from seeing the most advanced AI technologies help real patients in substantial numbers, and the reasons have nothing to do with science or computing power and everything to do with culture, incentives and bureaucracy… Cancer and kidney disease have no nationality; but there is a word for bureaucracy in every language.
The entire system of medical data science was designed only to address questions at the level of a population, but is nearly silent in response to basic statistical questions at the level of an individual patient. For some image-driven diagnoses, you soon may not even need to visit a doctor’s office.
Remote medicine – think of hooking up a cheap stethoscope to your phone, so that a neural network can listen to your heartbeat. Or of staring into the camera to allow an algorithm in the cloud to scan your eyes for symptoms of eye disease… A new generation of wearable sensors could boost the effectiveness of AI-based remote medicine even further… These technologies will not replace sophisticated laboratory diagnostics, and they certainly will not replace in-person care by a highly trained doctor. But for a nontrivial range of conditions, they could recommend simple treatments and funnel you to a doctor if and when you really need one.
Besides technological barriers, there are many cultural barriers that must be addressed for AI in healthcare to fully happen, including incentives and data sharing. The best data scientists of our generation could have been working on healthcare for many years now. Many would love to, and to give away the wonders they create. Instead, they have been thinking up better ways to make you click on ads – because that is where the data is. Privacy is also a critical issue.

When it comes to the important decisions in life, we can and should combine artificial intelligence with human insight and human values. All it takes is for people and machines to work together.
