Natural Language Processing with Deep Learning

Communicating and understanding are usually taken as signs of intelligence and are part of the Turing test. Indeed, the machine needs to communicate and appear to understand the interrogator’s questions in order to appear human and pass the test. Natural Language Processing (NLP) has made great progress in the past 20 years. Stanford has an excellent course (CS224n) on Natural Language Processing with Deep Learning taught by Chris Manning and Richard Socher (now at Salesforce). It is available here:

The course material is also online here.

An interesting criticism of machine translation engines such as Google Translate (which use techniques taught in these NLP lectures) appears in the article The Shallowness of Google Translate in The Atlantic.

Machine Learning: an Applied Econometric Approach

Susan Athey’s article, discussed below, covers machine learning and causal inference. The article Machine Learning: An Applied Econometric Approach by Harvard Professor Sendhil Mullainathan and Jann Spiess focuses on machine learning as an econometric tool.

Abstract:

Machines are increasingly doing “intelligent” things. Face recognition algorithms use a large dataset of photos labeled as having a face or not to estimate a function that predicts the presence y of a face from pixels x. This similarity to econometrics raises questions: How do these new empirical tools fit with what we know? As empirical economists, how can we use them? We present a way of thinking about machine learning that gives it its own place in the econometric toolbox. Machine learning not only provides new tools, it solves a different problem. Specifically, machine learning revolves around the problem of prediction, while many economic applications revolve around parameter estimation. So applying machine learning to economics requires finding relevant tasks. Machine learning algorithms are now technically easy to use: you can download convenient packages in R or Python. This also raises the risk that the algorithms are applied naively or their output is misinterpreted. We hope to make them conceptually easier to use by providing a crisper understanding of how these algorithms work, where they excel, and where they can stumble—and thus where they can be most usefully applied.

Sendhil also recently gave an interesting talk at the Stanford Center on Global Poverty and Development on applying machine learning to poverty alleviation:

The Impact of Machine Learning on Economics

Stanford Professor Susan Athey wrote a very detailed survey on the impact of machine learning on economics.

Abstract:

This paper provides an assessment of the early contributions of machine learning to economics, as well as predictions about its future contributions. It begins by briefly overviewing some themes from the literature on machine learning, and then draws some contrasts with traditional approaches to estimating the impact of counterfactual policies in economics. Next, we review some of the initial “off-the-shelf” applications of machine learning to economics, including applications in analyzing text and images. We then describe new types of questions that have been posed surrounding the application of machine learning to policy problems, including “prediction policy problems,” as well as considerations of fairness and manipulability. We present some highlights from the emerging econometric literature combining machine learning and causal inference. Finally, we overview a set of broader predictions about the future impact of machine learning on economics, including its impacts on the nature of collaboration, funding, research tools, and research questions.

She gave a talk at a 2017 NBER conference on the economics of AI:

Machina Economicus

This interesting 2015 paper by Professors Parkes (Harvard) and Wellman (University of Michigan) discusses a possible synthesis of economic reasoning and artificial intelligence. The abstract:

The field of artificial intelligence (AI) strives to build rational agents, capable of perceiving the world around them and taking actions to advance specified goals. Put another way, AI researchers aim to construct a synthetic homo economicus, the mythical perfectly rational agent of neoclassical economics.
We review progress towards creating this new species of machine, machina economicus, and discuss some challenges in designing AIs that can reason effectively in economic contexts. Supposing that AI succeeds in this quest, or at least comes close enough that it is useful to think about AIs in rationalistic terms, we ask how to design the rules of interaction in multi-agent systems that come to represent an economy of AIs. Theories of normative design from economics may prove more relevant for artificial agents than human agents, with AIs that better respect idealized assumptions of rationality than people, interacting through novel rules and incentive systems quite distinct from those tailored for people.

Indeed, economics often assumes rational economic reasoning (e.g., maximizing some utility function under an income constraint) as a good first approximation of human behavior, and AI agents could come closest to these idealized rational agents. In AI, however, agents take inputs and produce outputs without optimizing a utility function. They do minimize a loss function when a machine learning model is fit, but once in production they mechanically apply the trained model to produce the outcomes that the agent designer (a human) has specified. The utility function the agent inherits from its designer could be as irrational as that of a human being. For instance, AI agents could easily reproduce the tragedy of the commons if the designer optimizes only the individual agent’s strategy and ignores the negative externalities.
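
To make the contrast concrete, compare the textbook consumer problem with the training objective a supervised learner actually optimizes (standard notation, not taken from the paper):

```latex
% Neoclassical rational agent: choose a consumption bundle x to
% maximize utility u subject to prices p and income m.
\max_{x \ge 0} \; u(x) \quad \text{subject to} \quad p^\top x \le m

% ML agent at training time: choose parameters \theta that minimize
% the average loss over the data; once deployed, f_\theta is fixed
% and applied mechanically.
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \ell\big(f_\theta(x_i),\, y_i\big)
```

The designer’s choice of the loss and of the training data plays the role of the utility function, which is why the deployed agent is only as “rational” as that design.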

History of Deep Learning

This paper presents a history of deep learning from Aristotle to the present time. The different milestones are summarized in this table:

So it is clear that many ideas date back several decades. In the article, the authors conclude:

This paper could serve two goals: 1) First, it documents the major milestones in the history of science that have impacted the current development of deep learning. These milestones are not limited to developments in the computer science fields. 2) More importantly, by revisiting the evolutionary path of the major milestones, this paper suggests to readers how these remarkable works were developed among thousands of other contemporaneous publications. Here we briefly summarize three directions that many of these milestones pursue:

  • Occam’s razor: While it seems that part of the community tends to favor more complex models, layering one architecture onto another and hoping backpropagation can find the optimal parameters, history says that masterminds tend to think simple: Dropout is widely recognized not only because of its performance, but even more because of its simplicity of implementation and intuitive (tentative) reasoning. From the Hopfield Network to the Restricted Boltzmann Machine, models were simplified over the iterations until the RBM was ready to be piled up.
  • Be ambitious: If a model is proposed with substantially more parameters than contemporaneous ones, it must solve a problem that no others can solve nicely in order to be remarkable. The LSTM is much more complex than the traditional RNN, but it nicely bypasses the vanishing gradient problem. The Deep Belief Network is famous not because it was the first to stack one RBM onto another, but because it came with an algorithm that allows deep architectures to be trained effectively.
  • Widely read: Many models are inspired by domain knowledge outside the fields of machine learning and statistics. The human visual cortex has greatly inspired the development of convolutional neural networks. Even the recently popular Residual Networks have a corresponding mechanism in the human visual cortex. The Generative Adversarial Network also connects to game theory, which was developed fifty years ago.

Coming from the field of economics and game theory, we cannot agree more, especially when reading the literature on reinforcement learning (RL) or generative adversarial networks. Once we talk about strategic agents with objectives and payoffs to maximize, the setting is very similar to economics. There are differences between economics and machine learning in how they approach these problems, and we will discuss some research that studies them.

Why Should I Trust You?

A challenge with complex machine learning models is developing trust in them. If a model is a black box, some users might not feel comfortable using it. Models need to be interpretable, meaning that users should be able to understand how the outputs (predictions) are generated from the inputs (features).

Different approaches have been suggested. A recent one is a technique called Local Interpretable Model-agnostic Explanations (LIME). LIME approximates the model locally with an interpretable model, such as a linear model with a limited number of features.
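
As an illustration, here is a minimal sketch of that local-surrogate idea in Python with scikit-learn (our own toy rendition under simplifying assumptions, not the authors’ lime package): perturb the instance to be explained, query the black box on the perturbations, and fit a proximity-weighted linear model whose coefficients act as the local explanation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# A black-box model we would like to explain.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def local_surrogate(instance, predict_fn, n_samples=1000, kernel_width=0.75):
    """LIME-style sketch: fit a weighted linear model around one instance."""
    rng = np.random.default_rng(0)
    # Sample the neighborhood of the instance with Gaussian perturbations.
    neighborhood = instance + rng.normal(scale=0.5, size=(n_samples, instance.size))
    # Query the black box for the probability of class 1 at each perturbed point.
    preds = predict_fn(neighborhood)[:, 1]
    # Weight perturbed points by proximity to the instance (exponential kernel).
    distances = np.linalg.norm(neighborhood - instance, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # The interpretable surrogate: a linear model read off via its coefficients.
    surrogate = Ridge(alpha=1.0).fit(neighborhood, preds, sample_weight=weights)
    return surrogate.coef_

print("local feature weights:", np.round(local_surrogate(X[0], black_box.predict_proba), 3))
```

The coefficients are only meant to be faithful near the chosen instance; the real LIME adds sparsity and interpretable feature representations on top of this basic idea.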

A short video introduces the approach.

You can read the paper here.


Gradient Boosting Machine Learning

Machine learning has a long list of methods for learning from data. Among them is gradient boosting, as taught here by Professor Trevor Hastie of Stanford University. In this video, he introduces and compares decision trees, bagging, random forests, and boosting.
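
To give the flavor of the boosting part, here is a bare-bones sketch of a gradient boosting machine for squared-error regression (our own illustration, not code from the lecture): each round fits a shallow tree to the current residuals, which are the negative gradient of the squared loss, and adds it with a small learning rate.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbm_fit(X, y, n_rounds=100, learning_rate=0.1, max_depth=2):
    """Minimal gradient boosting for squared-error regression."""
    f0 = y.mean()              # the best constant model under squared error
    residual = y - f0
    trees = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        trees.append(tree)
        # Shrinkage: take only a small step toward the new tree's fit.
        residual = residual - learning_rate * tree.predict(X)
    return f0, trees

def gbm_predict(X, f0, trees, learning_rate=0.1):
    return f0 + learning_rate * sum(tree.predict(X) for tree in trees)

# Toy example: learn a noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)
f0, trees = gbm_fit(X, y)
print("train MSE:", np.mean((gbm_predict(X, f0, trees) - y) ** 2))
```

Bagging and random forests, by contrast, fit trees independently on bootstrap samples and average them, rather than fitting each new tree to the errors of the current ensemble.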

He has co-authored an excellent book, The Elements of Statistical Learning, which you can download here.

AI Code

The UK House of Lords recently published a report on “the economic, ethical and social implications of advances in artificial intelligence.” It suggested an AI Code to reassure the public that AI will not undermine them. The principles are:

(1) Artificial intelligence should be developed for the common good and benefit of humanity.

(2) Artificial intelligence should operate on principles of intelligibility and fairness.

(3) Artificial intelligence should not be used to diminish the data rights or privacy of individuals, families or communities.

(4) All citizens have the right to be educated to enable them to flourish mentally, emotionally and economically alongside artificial intelligence.

(5) The autonomous power to hurt, destroy or deceive human beings should never be vested in artificial intelligence.

This of course reminds us of Asimov’s Three Laws of Robotics:

(1) A robot may not injure a human being or, through inaction, allow a human being to come to harm.

(2) A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.

(3) A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

We believe that this is only the beginning of our reflections on how to regulate AI. There is already some work on the legal liability of AI. You can read this interesting paper on Artificial Intelligence and Legal Liability.