Introduction and Course Overview

What is RL

Deep learning helps us handle unstructured environments.

Reinforcement learning provides a formalism for behavior.
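
To make "a formalism for behavior" concrete, here is a minimal sketch of the agent-environment interaction loop that reinforcement learning is built around. Everything in it is an illustrative assumption and not part of the lecture: the `CorridorEnv` toy environment, its reward of 1 for reaching the last cell, and the hand-written policies.

```python
class CorridorEnv:
    """Hypothetical toy environment: cells 0..length-1, reward 1.0 on reaching the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (move left) or +1 (move right); the position is clamped to the corridor.
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode: observe the state, choose an action, receive a reward, repeat."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


def always_right(state):
    return 1


def always_left(state):
    return -1


if __name__ == "__main__":
    print(run_episode(CorridorEnv(length=5), always_right))
    print(run_episode(CorridorEnv(length=5), always_left, max_steps=10))
```

In this sketch the policy is fixed by hand; a reinforcement learning algorithm would instead adjust the policy to increase the total reward returned by `run_episode`.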

Robotic control pipeline:

Observations -> State Estimation (e.g. vision) -> Modeling & Prediction -> Planning -> Low-level Control -> Controls

Deep models are what allow reinforcement learning algorithms to solve complex problems end to end!
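
The contrast between the staged pipeline and an end-to-end model can be sketched as follows. This is only an illustration under assumed details: a one-dimensional tracking task, hand-written stage functions, and a linear end-to-end controller whose `weights` stand in for parameters that a deep model would learn. None of these names come from the lecture.

```python
def estimate_state(observation):
    # State estimation: average several noisy sensor readings of the target offset.
    return sum(observation) / len(observation)


def predict_next(offset, drift=0.0):
    # Modeling & prediction: assume the offset changes by a known drift per step.
    return offset + drift


def plan(predicted_offset):
    # Planning: move so as to cancel the predicted offset.
    return -predicted_offset


def low_level_control(desired_move, limit=1.0):
    # Low-level control: respect the actuator limits.
    return max(-limit, min(limit, desired_move))


def modular_controller(observation):
    """Hand-designed pipeline: each stage is built and tuned separately."""
    return low_level_control(plan(predict_next(estimate_state(observation))))


def end_to_end_controller(observation, weights, limit=1.0):
    """One parameterized map from observations to controls; the weights would be learned."""
    raw = sum(w * o for w, o in zip(weights, observation))
    return max(-limit, min(limit, raw))


if __name__ == "__main__":
    obs = [0.2, 0.4]
    print(modular_controller(obs))
    print(end_to_end_controller(obs, weights=[-0.5, -0.5]))
```

With the weights chosen here the two controllers agree, which shows that a single learned function can in principle absorb every stage of the pipeline. In the end-to-end case no intermediate quantity such as the state estimate has to be specified by hand.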

Other Forms of Supervision

  • Learning from demonstrations
    • Directly copying observed behavior
    • Inferring rewards from observed behavior (inverse reinforcement learning)
  • Learning from observing the world
    • Learning to predict
    • Unsupervised learning
  • Learning from other tasks
    • Transfer learning
    • Meta-learning: learning to learn
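
As a small illustration of "directly copying observed behavior", the sketch below fits a policy to expert demonstrations by ordinary supervised learning. It assumes a scalar state, a scalar action, and a linear policy fitted by least squares; the `expert` function and the demonstration states are made up for the example.

```python
def fit_linear_policy(demos):
    """Fit action = w * state + b to (state, action) pairs by least squares."""
    n = len(demos)
    mean_s = sum(s for s, _ in demos) / n
    mean_a = sum(a for _, a in demos) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in demos)
    var = sum((s - mean_s) ** 2 for s, _ in demos)
    w = cov / var
    b = mean_a - w * mean_s
    return w, b


def expert(state):
    # Hypothetical expert whose behavior we only observe through demonstrations.
    return -2.0 * state + 1.0


# Record what the expert does in a few states, then imitate it.
demos = [(s, expert(s)) for s in [-1.0, 0.0, 1.0, 2.0]]
w, b = fit_linear_policy(demos)


def cloned_policy(state):
    return w * state + b


if __name__ == "__main__":
    print(w, b)
    print(cloned_policy(3.0), expert(3.0))
```

No reward signal is used anywhere in this sketch: the supervision comes entirely from the expert's actions. Inverse reinforcement learning would instead try to infer the reward the expert is optimizing.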

Learning as the basis of intelligence

  • Some things we can all do (e.g. walking)
  • Some things we can only learn (e.g. driving a car)
  • We can learn a huge variety of things, including very difficult things
  • Therefore our learning mechanism(s) are likely powerful enough to do everything we associate with intelligence
  • But it may still be very convenient to “hard-code” a few really important bits

Why Deep RL

Deep = can process complex sensory input, and also compute really complex functions.

Reinforcement learning = can choose complex actions.

What Can DL & RL Do Well Now

  • Acquire a high degree of proficiency in domains governed by simple, known rules
  • Learn simple skills from raw sensory inputs, given enough experience
  • Learn by imitating enough human-provided expert behavior

What has Proven Challenging So Far

  • Humans can learn incredibly quickly
    • Deep RL methods are usually slow
  • Humans can reuse past knowledge
    • Transfer learning in deep RL is an open problem
  • Not clear what the reward function should be
  • Not clear what the role of prediction should be

Instead of trying to produce a program to simulate the adult mind, why not rather try to produce one which simulates the child’s? If this were then subjected to an appropriate course of education one would obtain the adult brain. —Alan Turing
