2001 IntroductiontoSequenceLearning
- (Sun, 2001) ⇒ Ron Sun. (2001). “Introduction to Sequence Learning.” In: Sequence Learning - Paradigms, Algorithms, and Applications. ISBN:978-3-540-44565-4 doi:10.1007/3-540-44565-X_1
Subject Headings: Reinforcement Learning; Recurrent Neural Network; Sequence Learning; Inductive Logic Programming; Time Series Prediction.
Notes
Cited By
- http://scholar.google.com/scholar?q=%222001%22+Introduction+to+Sequence+Learning
- http://dl.acm.org/citation.cfm?id=647073.713884&preflayout=flat#citedby
- https://archive.org/stream/springer_10.1007-3-540-44565-X/10.1007-3-540-44565-X_djvu.txt
- Related to http://www.sts.rpi.edu/~rsun/sun.expert01.pdf
Quotes
Abstract
Sequential behavior is essential to intelligence, and it is a fundamental part of human activities ranging from reasoning to language, and from everyday skills to complex problem solving. In particular, sequence learning is an important component of learning in many task domains - planning, reasoning, robotics, natural language processing, speech recognition, adaptive control, time series prediction, financial engineering, DNA sequencing, and so on.
1 Introduction
Sequential behavior is essential to intelligence, and it is a fundamental part of human activities ranging from reasoning to language, and from everyday skills to complex problem solving. In particular, sequence learning is an important component of learning in many task domains — planning, reasoning, robotics, natural language processing, speech recognition, adaptive control, time series prediction, financial engineering, DNA sequencing, and so on.
Naturally, there are many different approaches towards sequence learning, resulting from different perspectives taken in different task domains. These approaches deal with somewhat differently formulated sequential learning problems (for example, some with actions and some without), and/or different aspects of sequence learning (for example, sequence prediction vs. sequence recognition).
Sequence learning is clearly a difficult task. More powerful algorithms for sequence learning are needed in all of these afore-mentioned domains. It is our view that the right approach to develop better techniques, algorithms, models, and theories is to first better understand the state of the art in different disciplines related to this topic. There seems to be a need to compare, contrast, and combine different existing techniques, approaches, and paradigms, in order to develop better and more powerful algorithms. Currently, existing techniques and algorithms include recurrent neural networks, hidden Markov models, dynamic programming, reinforcement learning, graph theoretical models, search based models, evolutionary computational models, symbolic planning models, production rule based models, and so on.
Especially important to this topic area is the fact that work on sequence learning has been going on in several different disciplines such as artificial intelligence, neural networks, cognitive science (human sequence learning, e.g., in skill acquisition), and engineering. We need to examine the field in a cross-disciplinary way and take into consideration all of these different perspectives on sequence learning. Thus, we need interdisciplinary gatherings that include researchers from all of these orientations and disciplines, beyond narrowly focused meetings on specialized topics such as reinforcement learning or recurrent neural networks.
A workshop on neural, symbolic, and reinforcement methods for sequence learning (co-chaired by Ron Sun and Lee Giles) was held on August 1st, 1999, preceding IJCAI'99, in Stockholm, Sweden. The following issues concerning sequence learning were raised and addressed:
(...)
2 Problem Formulations and Their Relationships
One aim of the present volume is to better understand different formulations of sequence learning problems.
With some necessary simplification, we can categorize various sequence learning problems that have been tackled into the following categories: (1) sequence prediction, in which we want to predict elements of a sequence based on the preceding element(s); (2) sequence generation, in which we want to generate elements of a sequence one by one in their natural order; and (3) sequence recognition, in which we want to determine if a sequence is a legitimate one according to some criteria; in addition, (4) sequential decision making involves selecting a sequence of actions, to accomplish a goal, to follow a trajectory, or to maximize (or minimize) a reinforcement (or cost) function that is normally the (discounted) sum of reinforcements (costs) that are received along the way (see Bellman 1957, Bertsekas and Tsitsiklis 1995).
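To make the last case concrete (the notation here is assumed for exposition rather than quoted from the chapter), the reinforcement-maximizing objective is typically the discounted cumulative reinforcement [math]\displaystyle{ R = \sum_{t=0}^{\infty} \gamma^t r_t, \quad 0 \le \gamma \lt 1, }[/math] where [math]\displaystyle{ r_t }[/math] is the reinforcement received at step [math]\displaystyle{ t }[/math] and [math]\displaystyle{ \gamma }[/math] is the discount factor; undiscounted and average-reinforcement formulations replace this sum accordingly.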
(...)
These different sequence learning problems can be more precisely formulated as follows (assume a deterministic world for now):
- Sequence prediction: [math]\displaystyle{ s_i, s_{i+1}, \cdots, s_j \to s_{j+1} }[/math], where [math]\displaystyle{ 1 \le i \le j \lt \infty }[/math]; that is, given [math]\displaystyle{ s_i, s_{i+1}, \cdots, s_j }[/math], we want to predict [math]\displaystyle{ s_{j+1} }[/math]. When [math]\displaystyle{ i = 1 }[/math], we make predictions based on all of the previously seen elements of the sequence. When [math]\displaystyle{ i = j }[/math], we make predictions based only on the immediately preceding element.
- Sequence generation: [math]\displaystyle{ s_i, s_{i+1}, \cdots, s_j \to s_{j+1} }[/math], where [math]\displaystyle{ 1 \le i \le j \lt \infty }[/math]; that is, given [math]\displaystyle{ s_i, s_{i+1}, \cdots, s_j }[/math], we want to generate [math]\displaystyle{ s_{j+1} }[/math]. (Put in this way, it is clear that sequence prediction and generation are essentially the same task.)
- Sequence recognition: [math]\displaystyle{ s_i, s_{i+1}, \cdots, s_j \to }[/math] yes or no, where [math]\displaystyle{ 1 \le i \le j \lt \infty }[/math]; that is, given [math]\displaystyle{ s_i, s_{i+1}, \cdots, s_j }[/math], we want to determine if this subsequence is legitimate or not. (There are alternative ways of formulating the sequence recognition problem, for example, as a one-shot recognition process, as opposed to an incremental step-by-step recognition process as formulated here.)
With this formulation, sequence recognition can be turned into sequence generation/prediction, by basing recognition on prediction (see the chapter by D. Wang in this volume); that is, [math]\displaystyle{ s_i, s_{i+1}, \cdots, s_j \to }[/math] yes (a recognition problem), if and only if [math]\displaystyle{ s_i, s_{i+1}, \cdots, s_{j-1} \to s_j^p }[/math] (a prediction problem) and [math]\displaystyle{ s_j^p = s_j^a }[/math], where [math]\displaystyle{ s_j^p }[/math] is the prediction and [math]\displaystyle{ s_j^a }[/math] is the actual element.
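As a concrete (if deliberately simplistic) illustration of these formulations, the sketch below builds a first-order predictor from exemplar sequences by counting successor frequencies, and then performs recognition by prediction exactly as described above; the class name, the counting scheme, and the toy data are illustrative assumptions, not material from the chapter.

```python
from collections import defaultdict, Counter

class FirstOrderPredictor:
    """Predicts s_{j+1} from s_j alone (the i = j case above),
    using successor counts gathered from exemplar sequences."""

    def __init__(self):
        self.successors = defaultdict(Counter)

    def train(self, sequences):
        for seq in sequences:
            for prev, nxt in zip(seq, seq[1:]):
                self.successors[prev][nxt] += 1

    def predict(self, element):
        """Sequence prediction: return the most frequent successor, or None."""
        counts = self.successors.get(element)
        return counts.most_common(1)[0][0] if counts else None

    def recognize(self, sequence):
        """Sequence recognition by prediction: accept the sequence only if
        every element matches the prediction made from its predecessor."""
        return all(self.predict(prev) == actual
                   for prev, actual in zip(sequence, sequence[1:]))

# Example usage on toy exemplars:
model = FirstOrderPredictor()
model.train([["a", "b", "c", "d"], ["a", "b", "c", "d"]])
print(model.predict("b"))                 # -> 'c'
print(model.recognize(["a", "b", "c"]))   # -> True
print(model.recognize(["a", "c", "b"]))   # -> False
```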
Sequence learning (either generation, prediction, or recognition) is usually based on models of legitimate sequences, which can be developed through training with exemplars. Models may be in the form of Markov chains, hidden Markov models, recurrent neural networks, and a variety of other forms. Expectation-maximization, gradient descent, or clustering may be used in training (see e.g. the chapters by Oates and Cohen and by Sebastiani et al in this volume). Such training may extract “central tendencies” from a set of exemplars.
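For model-based recognition with a hidden Markov model, one standard scoring device is the forward algorithm, which computes the likelihood that a (previously trained) model assigns to an observed sequence; the tiny sketch below hard-codes a two-state model purely for illustration, so its parameters and any acceptance threshold are assumptions rather than values from this volume, and the training step (e.g., by expectation-maximization) is omitted.

```python
def forward_likelihood(obs, init, trans, emit):
    """Forward algorithm: P(obs | model) for a discrete-output HMM.
    init[s]     - initial probability of hidden state s
    trans[s][t] - probability of moving from state s to state t
    emit[s][o]  - probability of emitting symbol o in state s
    """
    states = list(init)
    alpha = {s: init[s] * emit[s].get(obs[0], 0.0) for s in states}
    for o in obs[1:]:
        alpha = {t: sum(alpha[s] * trans[s][t] for s in states) * emit[t].get(o, 0.0)
                 for t in states}
    return sum(alpha.values())

# A hand-specified two-state model over symbols 'x' and 'y' (illustrative only).
init  = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit  = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}

# Recognition: accept a sequence if its likelihood exceeds a chosen threshold.
print(forward_likelihood(["x", "x", "y"], init, trans, emit))
print(forward_likelihood(["y", "y", "y"], init, trans, emit))
```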
- Sequential decision making (that is, sequence generation through actions): there are several possible variations. In the goal-oriented case, we have [math]\displaystyle{ s_i,s_{i+1}, \cdots, s_j; s_G \to a_j }[/math], where [math]\displaystyle{ 1\le i\le j\lt \infty }[/math]; that is, given [math]\displaystyle{ s_i,s_{i+1}, \cdots, s_j }[/math] and the goal state [math]\displaystyle{ s_G }[/math], we want to choose an action [math]\displaystyle{ a_j }[/math] at time step [math]\displaystyle{ j }[/math] that will likely lead to [math]\displaystyle{ s_G }[/math] in the future. In the trajectory-oriented case, we have [math]\displaystyle{ s_i,s_{i+1}, \cdots, s_j; s_{j+1} \to a_j }[/math], where [math]\displaystyle{ 1\le i\le j\lt \infty }[/math]; that is, given [math]\displaystyle{ s_i,s_{i+1}, \cdots, s_j }[/math] and the desired next state [math]\displaystyle{ s_{j+1} }[/math], we want to choose an action [math]\displaystyle{ a_j }[/math] at time step [math]\displaystyle{ j }[/math] that will likely lead to [math]\displaystyle{ s_{j+1} }[/math] in the next step. In the reinforcement-maximizing case, we have [math]\displaystyle{ s_i,s_{i+1}, \cdots, s_j \to a_j }[/math], where [math]\displaystyle{ 1\le i\le j\lt \infty }[/math]; that is, given [math]\displaystyle{ s_i,s_{i+1}, \cdots, s_j }[/math], we want to choose an action [math]\displaystyle{ a_j }[/math] at time step [math]\displaystyle{ j }[/math] that will likely lead to receiving maximum total reinforcement in the future. The calculation of total reinforcement can be in terms of discounted or undiscounted cumulative reinforcement, in terms of average reinforcement, or in terms of some other functions of reinforcement (Bertsekas and Tsitsiklis 1996, Kaelbling et al 1996, Sutton and Barto 1997).
The action selection can be based on the immediately preceding element (i.e., [math]\displaystyle{ i=j }[/math]), in which case a Markovian action policy is in force; otherwise, a non-Markovian action policy is involved (McCallum 1996, Sun and Sessions 2000). Yet another possibility is that the action decisions are not based on preceding elements at all (i.e., [math]\displaystyle{ s_i,s_{i+1}, \cdots, s_j }[/math] are irrelevant), in which case an open-loop policy is used (Sun and Sessions 1998), while in the previous two cases, closed-loop policies are involved, as will be discussed later.
Note that on this view, sequence generation/prediction can be viewed as a special case of sequential decision making.
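To make the reinforcement-maximizing case concrete, the following sketch learns a Markovian ([math]\displaystyle{ i=j }[/math]) closed-loop action policy with tabular Q-learning on a trivial chain world; the environment, its goal reward, and all parameter values are illustrative assumptions rather than material from the chapter, with Q-learning standing in for the reinforcement learning methods alluded to above.

```python
import random
from collections import defaultdict

def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning on a chain: states 0..n_states-1, actions move
    left (-1) or right (+1); reaching the rightmost state yields reward 1."""
    actions = (-1, +1)
    q = defaultdict(float)                      # q[(state, action)]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:
            # Epsilon-greedy action selection based only on the current state
            # (a Markovian, closed-loop policy).
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            nxt = min(max(state + action, 0), n_states - 1)
            reward = 1.0 if nxt == n_states - 1 else 0.0
            # One-step Q-learning update toward the discounted return.
            best_next = max(q[(nxt, a)] for a in actions)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q

q = q_learning()
policy = {s: max((-1, +1), key=lambda a: q[(s, a)]) for s in range(4)}  # non-goal states
print(policy)   # expected: every non-goal state prefers the +1 (rightward) action
```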
(...)
References
 | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year
---|---|---|---|---|---|---|---|---|---|---
2001 IntroductiontoSequenceLearning | Ron Sun | | 2001 | Introduction to Sequence Learning | | | | 10.1007/3-540-44565-X_1 | | 2001