next up previous contents index
Next: Featured papers Up: DTRNN based on sigmoid Previous: DTRNN based on sigmoid   Contents   Index


Some DTRNN architectures cannot implement all FSM

Not all DTRNN architectures are capable of representing all FSMFor example, as discussed in section 4.2.1, Elman (1990) neural nets can emulate any FSM using $n_X=\vert Q\vert\vert\Sigma\vert$ state neurons using a step function (see Kremer (1995)). As will be seen, Elman nets using sigmoids over rational numbers (Alquézar and Sanfeliu, 1995) may also be used to implement FSM (see the paper by Alquézar and Sanfeliu (1995)) ; more recently, Carrasco et al. (2000) have shown that this is also the case for Elman nets using real sigmoids.

On the other hand, first-order DTRNN without an output layer which use as output a projection of the state vector, that is,

\begin{displaymath}
y_i[t]=x_i[t];\;\;i=1,\ldots, n_Y;\;\; n_Y<n_X
\end{displaymath} (5.8)

cannot emulate all FSM, analogously to the work by Goudreau et al. (1994) . These networks can however emulate DFA using a suitable end-of-string symbol and $\vert Q\vert\vert\Sigma\vert+1$ state neurons.

Finally, it is easy to show (using a suitable odd-parity counterexample in the way described by Goudreau et al. (1994)) that the networks by Robinson and Fallside (1991) networks (see section 3.2.1) cannot represent the output function of all Mealy machines unless a two-layer scheme as the following is used:

\begin{displaymath}
h_i({\bf x}[t-1],{\bf u}[t])=g\left(\sum_{j=1}^{n_Z} W^{yz}_{ij} z_j[t]
+ W^y_i \right)
\end{displaymath} (5.9)

with
\begin{displaymath}
z_i[t] =g\left(\sum_{j=1}^{n_X} W_{ij}^{zx} x_j[t-1] +
\sum_{j=1}^{n_U} W_{ij}^{zu} u_j[t] +
W^z_i\right)
\end{displaymath} (5.10)

( $i=1,\ldots,n_Z$), and $n_Z$ the number of units in the layer before the actual output layer (the hidden layer of the output function). This may be called an augmented Robinson-Fallside network. The encoding of arbitrary FSM in augmented Robinson-Fallside networks is described by Carrasco et al. (2000).

Another interesting example is that of Fahlman's recurrent cascade correlation networks, which are constructed by an algorithm having the same name (see section 3.4.3). Kremer (1996b) --see the following section-- has shown these networks cannot represent all FSM by defining suitable classes of nonrepresentable machines.


next up previous contents index
Next: Featured papers Up: DTRNN based on sigmoid Previous: DTRNN based on sigmoid   Contents   Index
Debian User 2002-01-21