Fundamentals of Reinforcement Learning Training Course

Course Code

rl

Duration

21 hours (usually 3 days including breaks)

Requirements

  • Experience with machine learning
  • Programming experience

Audience

  • Data scientists

Overview

Reinforcement Learning (RL) is a machine learning technique in which a computer program (an agent) learns how to behave in an environment by performing actions and receiving feedback on their results. For each good action, the agent receives positive feedback (a reward), and for each bad action, it receives negative feedback (a penalty).
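
The feedback loop described above can be illustrated with a minimal sketch of tabular Q-learning, one of the temporal-difference methods listed in the outline. Everything specific here is an assumption made for illustration and is not taken from the course materials: the five-cell corridor environment, the reward values, the hyperparameters, and the function names step, train, and greedy_policy.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells.
# The agent starts in cell 0 and the goal is cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True   # positive feedback: goal reached
    return next_state, -0.01, False    # small penalty for every other move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values (Q-values) by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit the best known action,
            # occasionally explore a random one.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the preferred move for each non-goal cell."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

No labeled examples are supplied at any point: the agent is never told which move is correct, and the learned policy emerges only from the rewards and penalties returned by step.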

This instructor-led, live training (online or onsite) is aimed at data scientists who wish to go beyond traditional machine learning approaches and teach a computer program to solve problems without relying on labeled data or large data sets.

By the end of this training, participants will be able to:

  • Install and apply the libraries and programming language needed to implement Reinforcement Learning.
  • Create a software agent that is capable of learning through feedback instead of through supervised learning.
  • Program an agent to solve problems where decision making is sequential and finite.
  • Apply knowledge to design software that can learn in a way similar to how humans learn.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request customized training for this course, please contact us.

Course Outline

Introduction

  • Learning through positive reinforcement

Elements of Reinforcement Learning

Important Terms (Actions, States, Rewards, Policy, Value, Q-Value, etc.)

Overview of Tabular Solution Methods

Creating a Software Agent

Understanding Value-based, Policy-based, and Model-based Approaches

Working with the Markov Decision Process (MDP)

How Policies Define an Agent's Way of Behaving

Using Monte Carlo Methods

Temporal-Difference Learning

n-step Bootstrapping

Approximate Solution Methods

On-policy Prediction with Approximation

On-policy Control with Approximation

Off-policy Methods with Approximation

Understanding Eligibility Traces

Using Policy Gradient Methods

Summary and Conclusion
