- Author
- Sebastian B. Thrun
- Title
- Efficient Exploration In Reinforcement Learning
- Date
- January 1992
- Abstract
- Exploration plays a fundamental role in any active learning
system.
This study evaluates the role of exploration in active
learning and describes several local techniques for exploration
in finite, discrete domains, embedded in a reinforcement
learning framework (delayed reinforcement).
This paper distinguishes between two families of schemes:
undirected and directed exploration.
While the former family is closely related to random walk
exploration, directed exploration techniques memorize
exploration-specific knowledge which is used for guiding the
exploration search.
In many finite deterministic domains, any learning technique
based on undirected exploration is inefficient in terms of
learning time, i.e., learning time is expected to scale
exponentially with the size of the state space [Whitehead 1991].
We prove that for all these domains, reinforcement learning
using a directed technique can always be performed in
polynomial time, demonstrating the important role of exploration
in reinforcement learning.
(The proof is given for one specific directed exploration
technique named counter-based exploration.)
Subsequently, several exploration techniques found in recent
reinforcement learning and connectionist adaptive control
literature are described.
In order to trade off efficiently between exploration and
exploitation - a trade-off which characterizes many real-world
active learning tasks - combination methods are described which
explore and avoid costs simultaneously.
This includes a selective attention mechanism, which allows
smooth switching between exploration and exploitation.
All techniques are evaluated and compared on a discrete
reinforcement learning task (robot navigation).
The empirical evaluation is followed by an extensive discussion
of benefits and limitations of this work.
- Category
- CMUTR
Category: CMUTR
Institution: Department of Computer Science, Carnegie
Mellon University
Number: CMU-CS-92-102
Bibtype: TechReport
Month: jan
Author: Sebastian B. Thrun
Title: Efficient Exploration In Reinforcement Learning
Year: 1992
Address: Pittsburgh, PA
Super: @CMUTR
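
The abstract names counter-based exploration as the directed technique used in
the polynomial-time proof, and mentions combination methods that trade off
exploration against exploitation. The short Python sketch below is only an
illustration of that general idea - preferring actions whose successor states
have been visited least often, weighted against the current value estimate.
The names (counter_based_action, the toy chain environment) and the plain
Q-learning update are assumptions for the sake of a runnable example, not the
report's exact algorithm.

    import random
    from collections import defaultdict

    def counter_based_action(state, actions, successor, counts, q_values, weight=1.0):
        # Score each action by its value estimate (exploitation) minus a penalty
        # that grows with the visit count of its successor state (exploration).
        def score(a):
            return q_values[(state, a)] - weight * counts[successor(state, a)]
        best = max(score(a) for a in actions)
        return random.choice([a for a in actions if score(a) == best])

    if __name__ == "__main__":
        # Toy deterministic chain of states 0..9 with reward only at state 9.
        def successor(s, a):
            return max(0, min(9, s + a))

        actions = (-1, +1)
        counts = defaultdict(int)      # state -> visit count (exploration knowledge)
        q_values = defaultdict(float)  # (state, action) -> value estimate
        state = 0
        for _ in range(100):
            a = counter_based_action(state, actions, successor, counts, q_values)
            nxt = successor(state, a)
            reward = 1.0 if nxt == 9 else 0.0
            # Standard one-step Q-learning update; not specific to the report.
            target = reward + 0.9 * max(q_values[(nxt, b)] for b in actions)
            q_values[(state, a)] += 0.5 * (target - q_values[(state, a)])
            counts[nxt] += 1
            state = 0 if nxt == 9 else nxt  # restart an episode at the goal
        print("visit counts per state:", dict(sorted(counts.items())))

Setting weight to 0 recovers purely greedy (exploitation-only) behavior, while a
large weight approximates a purely counter-driven search; this single parameter
stands in for the smoother combination mechanisms discussed in the abstract.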