next up previous contents index
Next: #pollack91j###: Up: Inference of finite-state machines Previous: Inference of finite-state machines   Contents   Index

Cleeremans et al. (1989):

this paper describes how an Elman (1990) simple recurrent network may be trained to read a string and predict the next symbol. When a network is trained to predict the next symbol, a one-hot, exclusive, or local interpretation (one unit per symbol, high when the symbol is present and low otherwise) is used for the alphabet, and a quadratic error function is used, it is the case that the actual outputs of the DTRNN after having seen a string are a good approximation to the frequencies observed for each of the successors of that string in the learning set. If the learning set is interpreted to be a representative draw from a given probability distribution over strings, then the outputs may actually be interpreted as probabilities, and the DTRNN may be used as a generator having a similar probability distribution.

However, Cleeremans et al. (1989) do not use the trained DTRNN as a generator but rather as an acceptor for the language (they explored two languages). A string is accepted if each of its successors is predicted with a probability higher than a threshold, which the authors set to 0.36.2. For the simplest language, using 70,000 random strings, of which only 210 were grammatical, the network only accepted the grammatical ones. Also, the network accepted all string in a set of 20,000 random grammatical strings. The second language modelled long-term dependencies: the strings started and ended with the same symbol; if the probabilities of intervening strings were independent of that symbol, the network failed to learn the language; however, when the distribution of intervening strings depended even if very slightly on the initial symbol, then the DTRNN learned the language.

The authors also perform a hierarchical clustering of the observed values of the hidden units and observe clusters that correspond to the states in the automata defining the languages.


next up previous contents index
Next: #pollack91j###: Up: Inference of finite-state machines Previous: Inference of finite-state machines   Contents   Index
Debian User 2002-01-21