

Other architectures without hidden state

There are a number of discrete-time neural network architectures that do not have a hidden state (their state is observable because it is simply a combination of past inputs and past outputs) but may still be classified as recurrent. One such example is the NARX (Nonlinear Auto-Regressive with eXogenous inputs) network used by Narendra and Parthasarathy (1990) and later by Lin et al. (1996) and Siegelmann et al. (1996) (see also (Haykin, 1998, 746)), which may be formulated in state-space form by defining a state that is simply a window of the last $n_I$ inputs and a window of the last $n_O$ outputs. The $n_X=n_In_U+n_On_Y$ components of the state vector are distributed as follows: the first ${n_U}n_I$ components hold the window of past inputs and the remaining ${n_Y}n_O$ components hold the window of past outputs.

The next-state function ${\bf f}$ therefore shifts each window one position in time, inserting the new input ${\bf u}[t]$ at the head of the input window, inserting a freshly computed output at the head of the output window, and discarding the oldest entry of each window. The output function is then simply
\begin{displaymath}
h_i({\bf x}[t]) = x_{i+{n_U}n_I}[t],
\end{displaymath} (4.20)

with $1\le i \le {n_Y}$. Note that the output read from the state has been computed, within the next-state function, by a two-layer feedforward neural network. The operation of a NARX network $N$ may then be summarized as follows (see figure 3.3):
\begin{displaymath}
{\bf y}[t]=N({\bf u}[t],{\bf u}[t-1],\ldots,{\bf u}[t-n_I],{\bf y}[t-1],{\bf y}[t-2],\ldots,{\bf y}[t-n_O]).
\end{displaymath} (4.21)

The operation of NARX networks is therefore a nonlinear counterpart of that of an ARMA (Auto-Regressive, Moving Average) model or that of an IIR (Infinite-time Impulse Response) filter.
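As an informal illustration of this state-space formulation, the following sketch (Python with NumPy, not part of the original presentation) performs one NARX time step: a two-layer feedforward net computes ${\bf y}[t]$ from the current input and both windows, and the windows are then shifted one position. The function and weight names (narx_step, W_zv, etc.) and the logistic activation are assumptions made for the example only.

\begin{verbatim}
import numpy as np
from collections import deque

def g(a):
    """Logistic sigmoid (assumed activation function)."""
    return 1.0 / (1.0 + np.exp(-a))

def narx_step(u_win, y_win, u_t, W_zv, b_z, W_yz, b_y):
    """One NARX time step (hypothetical sketch).

    u_win : deque with the last n_I input vectors, most recent first
    y_win : deque with the last n_O output vectors, most recent first
    u_t   : current input vector u[t]
    """
    # Two-layer feedforward net: y[t] from u[t] and both windows (cf. eq. 4.21).
    v = np.concatenate([u_t, *u_win, *y_win])
    z = g(W_zv @ v + b_z)        # hidden layer
    y_t = g(W_yz @ z + b_y)      # output layer: y[t]
    # Next state: shift each window one position, discarding the oldest entry.
    u_win.appendleft(u_t); u_win.pop()
    y_win.appendleft(y_t); y_win.pop()
    return y_t
\end{verbatim}

The deque contents correspond to the state vector described above: the window of past inputs followed by the window of past outputs.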

Figure 3.3: Block diagram of a NARX network (the network is fully connected, but not all arrows have been drawn, for clarity).
\includegraphics{narx}

When the state of a discrete-time neural network is simply a window of past inputs, the network is usually called a time delay neural network (TDNN) (see also (Haykin, 1998, 641)). In state-space formulation, the state is the window of past inputs, and the next-state function incorporates a new input into the window and shifts it one position in time:

\begin{displaymath}
\begin{array}{rlc}
f_i({\bf x}[t-1],{\bf u}[t]) =& x_{i-{n_U}}[t-1], & {n_U} < i \le {n_U}n_I, \\
f_i({\bf x}[t-1],{\bf u}[t]) =& u_i[t], & 1\le i\le {n_U},
\end{array}\end{displaymath} (4.22)

with ${n_X} = {n_U}n_I$; and the output is usually computed by a two-layer perceptron (feedforward net):
\begin{displaymath}
h_{i}({\bf x}[t-1],{\bf u}[t]) =
g\left(\sum_{j=1}^{n_Z} W_{ij}^{yz} z_j[t] + W_i^y \right),\;
1 \le i \le n_Y
\end{displaymath} (4.23)

with
\begin{displaymath}
z_i[t] = g\left(\sum_{j=1}^{n_X} W_{ij}^{zx} x_j[t-1] +
\sum_{j=1}^{n_U} W_{ij}^{zu} u_j[t] + W_i^z \right),\;
1 \le i \le n_Z.
\end{displaymath} (4.24)

The operation of a TDNN $N$ may then be summarized as follows (see figure 3.4):

\begin{displaymath}
{\bf y}[t]=N({\bf u}[t],{\bf u}[t-1],\ldots,{\bf u}[t-n_I]).
\end{displaymath} (4.25)

The operation of TDNNs is therefore a nonlinear counterpart of that of an MA (Moving Average) model or that of a FIR (Finite-time Impulse Response) filter.
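A minimal sketch of one TDNN time step, following equations (4.22)-(4.25), might look as follows (Python with NumPy; the function and weight names and the logistic activation are assumptions made for the example, not prescribed by the text):

\begin{verbatim}
import numpy as np

def g(a):
    """Logistic sigmoid (assumed activation function)."""
    return 1.0 / (1.0 + np.exp(-a))

def tdnn_step(x_prev, u_t, W_zx, W_zu, b_z, W_yz, b_y):
    """One TDNN time step (hypothetical sketch).

    x_prev : state x[t-1], the flattened window of past inputs
             (newest n_U components first), length n_U * n_I
    u_t    : current input u[t], length n_U
    """
    n_U = u_t.shape[0]
    # Hidden layer (eq. 4.24): fed by the input window and the current input.
    z = g(W_zx @ x_prev + W_zu @ u_t + b_z)
    # Output layer (eq. 4.23).
    y_t = g(W_yz @ z + b_y)
    # Next state (eq. 4.22): insert the new input and shift the window.
    x_t = np.concatenate([u_t, x_prev[:-n_U]])
    return y_t, x_t
\end{verbatim}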

Figure 3.4: Block diagram of a TDNN (the network is fully connected, but not all arrows have been drawn, for clarity).
\includegraphics{tdnn}

The weights connecting the window of inputs to the hidden layer may be organized in blocks sharing weight values, so that the components of the hidden layer retain some of the temporal ordering in the input window. TDNNs have been used for tasks such as phonetic transcription (Sejnowski and Rosenberg, 1987), protein secondary structure prediction (Qian and Sejnowski, 1988), and phoneme recognition (Waibel et al., 1989; Lang et al., 1990). Clouse et al. (1997b) have studied the ability of TDNNs to represent and learn a class of finite-state recognizers from examples (see also (Clouse et al., 1997a) and (Clouse et al., 1994)).
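To make the weight-sharing remark concrete, the following hypothetical sketch computes a hidden layer in which the same weight block is applied to every group of $k$ consecutive inputs of the window, which amounts to a one-dimensional convolution over time; the names, the block width $k$, and the hyperbolic tangent activation are assumptions made for the example.

\begin{verbatim}
import numpy as np

def shared_weight_hidden(x_window, W_block, b, k, g=np.tanh):
    """Hidden layer with weights shared across time positions (sketch).

    x_window : array of shape (n_I, n_U), the window of past inputs
    W_block  : array of shape (n_H, k * n_U), applied to every group of
               k consecutive inputs (a 1-D convolution over time)
    """
    n_I = x_window.shape[0]
    cols = []
    for t in range(n_I - k + 1):
        # The same block sees k consecutive inputs at each position,
        # so the hidden layer retains the temporal ordering of the window.
        patch = x_window[t:t + k].reshape(-1)
        cols.append(g(W_block @ patch + b))
    return np.stack(cols)
\end{verbatim}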

