SHAKIR MOHAMED | ML/AI

TALKS AND TUTORIALS

Projects: 7
TALK

Observations & Inspirations: The Mutual Inspirations between Cognitive and Statistical Sciences

Where do we obtain our inspiration in cognitive science? And in machine learning? These questions invite us to look at the parallels between the two fields. Fortunately, seeking out the parallels between minds and machines is one of our long-established scientific traditions, and this talk will explore the exchange of ideas between the two fields. The parallels between the cognitive and statistical sciences appear in all aspects of our practice, from how we conceptualise our problems, to the ways in which we test them, and the language we use in communication. One of these mutually useful tools is the set of conceptual frameworks used in the two fields. In cognitive science the most established frameworks are the classical cognitive architecture and Marr’s levels of analysis; in machine learning, they are Box’s loop and the model-inference-algorithm paradigm; these will be our starting point. The parallels between our fields appear in other more obvious forms, from cognitive revolutions and dogmas of information processing, to neural networks and embodied robotics. Recurring principles appear: prediction, sparsity, uncertainty, modularity, abduction, complementarity; and we’ll explore several examples of these principles. From my own experience, we’ll explore the probabilistic tools that connect to one-shot generalisation, grounded cognition, intrinsic motivation, and memory. Ultimately, these connections allow us to go from observation to inspiration: to make observations of cognitive and statistical phenomena, and, inspired by them, to strive towards a deeper understanding of the principles of intelligence and plausible reasoning in brains and machines.

TALK

Bayesian Agents: Bayesian Reasoning and Deep Learning in Agent-based Systems

  • Invited talk at the NIPS 2016 Workshop on Bayesian Deep Learning.
  • Link to slides

Bayesian deep learning allows us to combine two components needed for building intelligent and autonomous systems: deep learning, which provides a powerful framework for model building, and Bayesian analysis, which provides tools for optimal inference in these models. The outcome of this convergent thinking is our ability to develop and train a broad set of tools that are important components of systems that can reason and act in the real world. In this talk, we shall explore some of the ways in which Bayesian deep learning can be used in the tasks we expect from intelligent systems, such as scene understanding, concept formation, future-thinking, planning, and acting. These approaches remain far from perfect, and they allow us to unpack some of the challenges that remain for even wider application of Bayesian deep learning, and of Bayesian reasoning more generally.
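
To make the combination concrete, here is a minimal sketch, in Python, of the Bayesian ingredient of such a system: predictions are averaged over weights drawn from an approximate posterior, so the model reports uncertainty alongside its prediction. This is our own toy illustration, not code from the talk, and the variational parameters are assumed to have been learned already.

    # A minimal sketch (our own illustration, not code from the talk): a Monte Carlo
    # predictive distribution for a linear model with a factorised Gaussian
    # approximate posterior q(w) over the weights.
    import numpy as np

    rng = np.random.default_rng(0)

    def predict(x, w_mean, w_std, n_samples=200):
        """Average predictions over weight samples drawn from q(w)."""
        samples = []
        for _ in range(n_samples):
            w = rng.normal(w_mean, w_std)    # draw one set of weights from q(w)
            samples.append(x @ w)            # forward pass with the sampled weights
        samples = np.array(samples)
        return samples.mean(axis=0), samples.std(axis=0)  # predictive mean and spread

    # Toy usage: three inputs with two features each and a hypothetical learned q(w).
    x = np.array([[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]])
    mean, spread = predict(x, np.array([0.3, -0.7]), np.array([0.1, 0.2]))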

TUTORIAL

Variational Inference: Foundations and Modern Methods

NIPS 2016 Tutorial

One of the core problems of modern statistics and machine learning is to approximate difficult-to-compute probability distributions. This problem is especially important in probabilistic modeling, which frames all inference about unknown quantities as a calculation about a conditional distribution. In this tutorial we review and discuss variational inference (VI), a method that approximates probability distributions through optimization. VI has been used in myriad applications in machine learning and tends to be faster than more traditional methods, such as Markov chain Monte Carlo sampling. Brought into machine learning in the 1990s, recent advances and easier implementation have renewed interest in, and application of, this class of methods. This tutorial aims to provide both an introduction to VI with a modern view of the field, and an overview of the role that probabilistic inference plays in many of the central areas of machine learning.
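
To state the optimization view in symbols (a standard formulation in our own notation, not a transcription of the tutorial slides): choose a tractable family of distributions Q and find the member closest to the conditional of interest,

    q^{*}(z) = \arg\min_{q \in Q} \, \mathrm{KL}\big( q(z) \,\|\, p(z \mid x) \big)
             = \arg\max_{q \in Q} \, \mathbb{E}_{q(z)}\big[ \log p(x, z) - \log q(z) \big],

where the maximised quantity is the evidence lower bound (ELBO). The two problems coincide because log p(x) = ELBO(q) + KL(q(z) || p(z | x)) and log p(x) does not depend on q.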

The tutorial has three parts. First, we provide a broad review of variational inference from several perspectives. This part serves as an introduction (or review) of its central concepts. Second, we develop and connect some of the pivotal tools for VI that have been developed in the last few years, tools like Monte Carlo gradient estimation, black box variational inference, stochastic approximation, and variational auto-encoders. These methods have led to a resurgence of research and applications of VI. Finally, we discuss some of the unsolved problems in VI and point to promising research directions.
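
As a small self-contained illustration of these tools, the sketch below maximises the ELBO by stochastic gradient ascent with reparameterised, pathwise Monte Carlo gradients. It is our own toy example, using a conjugate Gaussian model chosen so the answer can be checked, and is not taken from the tutorial material.

    # Reparameterised Monte Carlo gradients for the ELBO of a toy model:
    # p(z) = N(0, 1), p(x | z) = N(z, 1), with q(z) = N(mu, sigma^2).
    import numpy as np

    rng = np.random.default_rng(0)
    x = 2.0                      # a single observation
    mu, log_sigma = 0.0, 0.0     # variational parameters of q(z)

    def elbo_and_grads(mu, log_sigma, n_samples=64):
        sigma = np.exp(log_sigma)
        eps = rng.standard_normal(n_samples)
        z = mu + sigma * eps                          # reparameterised samples z ~ q(z)
        log_joint = -0.5 * z**2 - 0.5 * (x - z)**2    # log p(x, z) up to a constant
        elbo = log_joint.mean() + log_sigma           # entropy of q(z) is log_sigma + const
        dlogjoint_dz = -z + (x - z)                   # pathwise derivative through z
        grad_mu = dlogjoint_dz.mean()
        grad_log_sigma = (dlogjoint_dz * sigma * eps).mean() + 1.0
        return elbo, grad_mu, grad_log_sigma

    for step in range(200):                           # plain stochastic gradient ascent
        elbo, g_mu, g_ls = elbo_and_grads(mu, log_sigma)
        mu += 0.05 * g_mu
        log_sigma += 0.05 * g_ls

    # The exact posterior here is N(x/2, 1/2), so mu should approach 1.0 and
    # exp(2 * log_sigma) should approach 0.5.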

Learning objectives:

  • Gain a well-grounded understanding of modern advances in variational inference.
  • Understand how to implement basic versions for a wide class of models.
  • Understand connections and different names used in other related research areas.
  • Understand important problems in variational inference research.

Target audience:

  • Machine learning researchers across all levels of experience, from first-year graduate students to more experienced researchers
  • Targeted at those who want to understand recent advances in variational inference
  • Basic understanding of probability is sufficient

TUTORIAL

Building Machines that Imagine and Reason: Principles and Applications of Deep Generative Models

Deep generative models provide a solution to the problem of unsupervised learning, in which a machine learning system is required to discover the structure hidden within unlabelled data streams. Because they are generative, such models can form a rich imagery of the world in which they are used: an imagination that can be harnessed to explore variations in data, to reason about the structure and behaviour of the world, and ultimately, for decision-making. This tutorial looks at how we can build machine learning systems with a capacity for imagination using deep generative models, the types of probabilistic reasoning that they make possible, and the ways in which they can be used for decision-making and acting.

Deep generative models have widespread applications, including density estimation, image de-noising and in-painting, data compression, scene understanding, representation learning, 3D scene construction, semi-supervised classification, and hierarchical control, amongst many others. After exploring these applications, we'll sketch a landscape of generative models, drawing out three groups of models: fully-observed models, transformation models, and latent variable models. Different models require different principles for inference, and we'll explore the different options available. Different combinations of model and inference give rise to different algorithms, including auto-regressive distribution estimators, variational auto-encoders, and generative adversarial networks. Although we will emphasise deep generative models, and the latent-variable class in particular, the intention of the tutorial is to explore the general principles, tools and tricks that can be used throughout machine learning. These reusable topics include Bayesian deep learning, variational approximations, memoryless and amortised inference, and stochastic gradient estimation. We'll end by highlighting the topics that were not discussed, and imagine the future of generative models.
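
A compact way to remember the three groups (our own paraphrase in standard notation, not the tutorial's exact formulation):

    Fully-observed (auto-regressive):  p(x) = \prod_{d=1}^{D} p(x_d \mid x_{<d})
    Latent variable:                   p(x) = \int p(x \mid z)\, p(z)\, dz
    Transformation (implicit):         x = f_\theta(z), \quad z \sim p(z)

The first group can be trained by maximum likelihood directly, the second typically requires approximate inference such as variational methods, and the third is usually trained by comparison, as in generative adversarial networks.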

TALK

Memory-based Bayesian Reasoning and Deep Learning

Deep learning and Bayesian machine learning are currently two of the most active areas of machine learning research. Deep learning provides a powerful class of models and an easy framework for learning that now provides state-of-the-art methods for applications ranging from image classification to speech recognition. Bayesian reasoning provides a powerful approach for knowledge integration, inference, and decision making that has established it as the key tool for data-efficient learning, uncertainty quantification and robust model composition, widely used in applications ranging from information retrieval to large-scale ranking. Each of these research areas has shortcomings that can be effectively addressed by the other, pointing towards a needed convergence of these two areas of machine learning, one that enhances our machine learning practice.

One powerful outcome of this convergence is our ability to develop systems for probabilistic inference with memory. Memory-based inference amortises the cost of probabilistic reasoning by cleverly reusing prior computations. To explore this, we shall take a statistical tour of deep learning, re-examine latent variable models and approximate Bayesian inference, and make connections to de-noising auto-encoders and other stochastic encoder-decoder systems. In this way, we will make sense of what memory in inference might mean, and highlight the use of amortised inference in many other parts of machine learning.
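
In symbols (our own summary of the distinction, in standard notation): memoryless inference fits separate variational parameters \lambda_n for every observation x_n,

    \lambda_n^{*} = \arg\max_{\lambda_n} \; \mathbb{E}_{q_{\lambda_n}(z_n)}\big[ \log p(x_n, z_n) - \log q_{\lambda_n}(z_n) \big],

whereas amortised inference learns a single inference network f_\phi shared across the data and sets \lambda_n = f_\phi(x_n): the cost of past inferences is stored in \phi, which acts as the memory and is reused on new data, as in the encoder of a variational auto-encoder.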

TUTORIAL

Tutorial on Variational Inference for Machine Learning

Variational inference is one of the tools that now lie at the heart of the modern data analysis lifecycle. Variational inference is the term used to encompass approximation techniques for intractable integrals and complex distributions; it operates by transforming the hard problem of integration into one of optimisation. As a result, using variational inference we are now able to derive algorithms that allow us to apply increasingly complex probabilistic models to ever larger data sets on ever more powerful computing resources.

This tutorial is meant as a broad introduction to modern approaches for approximate, large-scale inference and reasoning in probabilistic models. It is designed to be of interest to both new and experienced researchers in machine learning, statistics and engineering and is intended to leave everyone with an understanding of an invaluable tool for probabilistic inference and its connections to a broad range of fields, such as Bayesian analysis, deep learning, information theory, and statistical mechanics.

The tutorial will begin by motivating probabilistic data analysis and the problem of inference for statistical applications, such as density estimation, missing data imputation and model selection, and for industrial problems in search and recommendation, text mining and community discovery. We will then examine importance sampling as one widely-used Monte Carlo inference mechanism and from this begin our journey towards the variational approach for inference. The principle of variational inference and basic tools from variational calculus will be introduced, as well as the class of latent Gaussian models that will be used throughout the tutorial as a running example. Using this foundation, we shall discuss different approaches for approximating posterior distributions and the smorgasbord of techniques for optimising the variational objective function, cover implementation and large-scale applications, take a brief look at the available theory for variational methods, and give an overview of other variational problems in machine learning and statistics.
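
To ground the starting point of that journey, here is a minimal sketch of importance sampling for a marginal likelihood, using p(x) = E_{q(z)}[ p(x, z) / q(z) ] for any proposal q with full support. The toy conjugate model below is our own choice, not an example taken from the tutorial.

    # Importance sampling estimate of log p(x) for the toy model
    # p(z) = N(0, 1), p(x | z) = N(z, 1), where the exact answer is N(x; 0, 2).
    import numpy as np

    rng = np.random.default_rng(0)
    x = 2.0   # a single observation

    def log_joint(z, x):
        return -0.5 * z**2 - 0.5 * (x - z)**2 - np.log(2.0 * np.pi)

    def log_marginal_is(x, n_samples=100_000, q_mean=0.0, q_std=2.0):
        z = rng.normal(q_mean, q_std, size=n_samples)                # z ~ q(z)
        log_q = -0.5 * ((z - q_mean) / q_std)**2 - np.log(q_std * np.sqrt(2.0 * np.pi))
        log_w = log_joint(z, x) - log_q                              # log importance weights
        m = log_w.max()
        return m + np.log(np.mean(np.exp(log_w - m)))                # stable log-mean-exp

    exact = -0.25 * x**2 - 0.5 * np.log(4.0 * np.pi)                 # log N(x; 0, 2)
    print(log_marginal_is(x), exact)                                 # the two should agree closely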

Link to slides

TALK

Bayesian Reasoning and Deep Learning

Deep learning and Bayesian machine learning are currently two of the most active areas of machine learning research. Deep learning provides a powerful class of models and an easy framework for learning that now provides state-of-the-art methods for applications ranging from image classification to speech recognition. Bayesian reasoning provides a powerful approach for information integration, inference and decision making that has established it as the key tool for data-efficient learning, uncertainty quantification and robust model composition, and that is widely used in applications ranging from information retrieval to large-scale ranking. Each of these research areas has shortcomings that can be effectively addressed by the other, pointing towards a needed convergence of these two areas of machine learning; the complementary aspects of these two research areas are the focus of this talk. Using the tools of auto-encoders and latent variable models, we shall discuss some of the ways in which our machine learning practice is enhanced by combining deep learning with Bayesian reasoning. This is an essential and ongoing convergence that will only continue to accelerate, and it offers some of the most exciting prospects, some of which we shall discuss, for contemporary machine learning research.

Link to slides