 
 
 
 
 
 
 
 
 
 
states, but the construction by Kremer (1995) allows for input and output
symbols to be represented by arbitrary patterns of `0' and `1' activations
(instead of being represented by exclusive, or one-hot, patterns having
precisely one `1'). This has the effect of reducing the number of
learnable parameters in the input-state ( ) and the state-output ( ) weight
matrices (see section 3.2.2).
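
The parameter saving can be illustrated with a small count. This is only a sketch under stated assumptions, not Kremer's construction itself: it assumes fully connected layers, ignores bias weights, and assumes the densest possible binary code (about log2 of the number of symbols) as one example of an "arbitrary pattern" encoding. The function name `weight_count` and its arguments are hypothetical.

```python
def weight_count(num_symbols: int, num_state_units: int, one_hot: bool) -> int:
    """Count the weights connecting a symbol layer to the state layer.

    Assumptions (not taken from the source): full connectivity, no biases,
    and, for the non-one-hot case, a minimal-length binary code.
    """
    if one_hot:
        # One unit per symbol, exactly one of which is `1'.
        symbol_units = num_symbols
    else:
        # Smallest number of binary units that can distinguish all symbols,
        # i.e. ceil(log2(num_symbols)), with at least one unit.
        symbol_units = max(1, (num_symbols - 1).bit_length())
    return symbol_units * num_state_units


if __name__ == "__main__":
    symbols, state_units = 8, 10
    print("one-hot weights:", weight_count(symbols, state_units, one_hot=True))
    print("binary weights: ", weight_count(symbols, state_units, one_hot=False))
```

The same count applies to the state-output matrix, with the output symbol layer in place of the input layer. Any encoding that uses fewer units than there are symbols gives a smaller matrix; the minimal binary code is simply the extreme case.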