Departamento de Lenguajes y Sistemas Informáticos

Announcement

Title: Feasible lexical selection for rule-based machine translation
Type: Doctoral thesis defence
By: Francis M. Tyers
Place: Videoconference room 13, Taller de Imagen
Date/time: 16:00, 17/07/2013
Approximate duration: 3 hours
More information: http://cedip.ua.es/es/secretaria/tesis-doctoral/tesis-en-proceso-de-tramitacion/francis-morton-tyers.html
Contact person:

Sánchez Martínez, Felipe (fsanchez[at]dlsi.ua.es)
Abstract:
This thesis addresses the problem of lexical selection in rule-based
machine translation. Lexical selection is the task of choosing, for a given
source-language word, the most adequate translation in the target language
among a known set of alternatives. To address the problem, the thesis
presents a formalism for lexical selection based on fixed-length context.
Rules in this formalism are compiled into a finite-state transducer in which
the input side represents source-language patterns and the output side
encodes rules that select or remove possible translations in the target
language. A best-coverage algorithm is defined which selects the fewest
(and therefore the longest) rules that cover the input sentence. We show
that, with little time investment, hand-written rules can be produced that
improve lexical-selection performance in three out of four translation
systems tested.
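
As a rough illustration of the best-coverage idea (not the thesis
implementation, which compiles rules into a finite-state transducer), the
following Python sketch covers a sentence with the fewest rules by trying
longer matches first; the rule patterns and ids are invented for the example:

    # Minimal sketch: cover the sentence with the fewest (hence longest) rules.
    def best_coverage(sentence, rules):
        """sentence: list of source-language words.
        rules: dict mapping a tuple of words (a fixed-length pattern) to a rule id.
        Returns the rule ids of a coverage using the fewest rules,
        or None if the sentence cannot be fully covered."""
        n = len(sentence)
        # best[i] = (number of rules, rule ids) covering sentence[i:]
        best = [None] * (n + 1)
        best[n] = (0, [])
        for i in range(n - 1, -1, -1):
            for j in range(n, i, -1):              # try longer matches first
                pattern = tuple(sentence[i:j])
                if pattern in rules and best[j] is not None:
                    count = best[j][0] + 1
                    if best[i] is None or count < best[i][0]:
                        best[i] = (count, [rules[pattern]] + best[j][1])
        return best[0][1] if best[0] is not None else None

    rules = {("fishing", "rod"): "r1", ("fishing",): "r2", ("rod",): "r3"}
    print(best_coverage(["fishing", "rod"], rules))    # -> ['r1'], one rule, not two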

Relying on hand-written rules brings us face-to-face with the
knowledge-acquisition bottleneck. To overcome this, we describe a general
method of learning lexical-selection rules in the previously defined
formalism. The method can be trained in a supervised manner, using a
word-aligned parallel corpus, and also in a novel unsupervised manner, using
the other modules of the MT system in which the lexical-selection module
is to be embedded. The unsupervised training works by first generating all
the possible translations of a given source-language sentence as regards
lexical selection, and then scoring them with a target-language model. The
scores from the target-language model are normalised so that each translation
receives a fractional count corresponding to its share of the probability
mass. In training, these fractional counts are used in place of the
word-alignment counts of the supervised method. As preliminary experiments
showed that including the whole rule set had a detrimental effect on
lexical-selection quality, we define a threshold below which rules are
discarded; the threshold is set by tuning on a development corpus. Both
methods are evaluated on the same four systems. For the supervised-learning
method, an improvement in lexical-selection quality is obtained in three out
of four systems compared with the baseline of choosing the most frequently
aligned translation; for the unsupervised-learning method, an improvement is
obtained in two of the four systems.
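
The normalisation step can be sketched in a few lines of Python. Assuming
the target-language model returns a log-probability for each candidate
translation (the candidates and scores below are invented for illustration),
each translation's fractional count is its share of the probability mass:

    # Minimal sketch of fractional-count normalisation over LM scores.
    import math

    def fractional_counts(lm_logprobs):
        """Normalise language-model log-probabilities into fractional counts
        that sum to 1, i.e. each translation's share of the probability mass."""
        # Subtract the maximum for numerical stability before exponentiating.
        m = max(lm_logprobs.values())
        weights = {t: math.exp(lp - m) for t, lp in lm_logprobs.items()}
        total = sum(weights.values())
        return {t: w / total for t, w in weights.items()}

    candidates = {
        "he took the fishing rod": -12.3,    # hypothetical LM log-probabilities
        "he took the fishing bar": -17.9,
    }
    for translation, count in fractional_counts(candidates).items():
        print(f"{count:.3f}  {translation}")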

Examination of the results of the rule-learning methods showed that useful
information could be discarded at two points: first, in the training process,
when we apply the threshold and discard possible rules; second, in the
application of the best-coverage algorithm, where we assume that longer rules
are better rules. To overcome these two restrictions, we present a
well-formed probability model, based on the principle of maximum entropy,
for finding the most probable translation. At training time, each rule is
assigned a weight, which is learnt from the corpus. At runtime, instead of
applying only the best coverage, we apply all of the rules and, for each
translation, sum the weights of the matched rules to find the most probable
translation. Using the maximum-entropy weighted rules with the
supervised-learning method resulted in an improvement over the unweighted
rules in three out of four systems; for the unsupervised-learning method, it
yielded a substantial improvement in one out of four systems.
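
The runtime scoring can be illustrated with a short sketch. The rules,
weights, and translations below are hypothetical; every rule that matches a
translation contributes its learnt weight, and in a maximum-entropy
(log-linear) model the summed weights are exponentiated and normalised into
probabilities, though the highest-scoring translation is the same either way:

    # Minimal sketch of scoring translations by summing matched rule weights.
    import math

    def score_translations(matched_rules, weights):
        """matched_rules: dict translation -> list of rule ids that matched.
        weights: dict rule id -> learnt weight.
        Returns a dict translation -> probability under the log-linear model."""
        totals = {t: sum(weights[r] for r in rs) for t, rs in matched_rules.items()}
        z = sum(math.exp(s) for s in totals.values())    # partition function
        return {t: math.exp(s) / z for t, s in totals.items()}

    weights = {"r1": 1.8, "r2": 0.4, "r3": -0.7}         # hypothetical learnt weights
    matched = {"...fishing rod...": ["r1", "r3"], "...fishing bar...": ["r2"]}
    probs = score_translations(matched, weights)
    print(max(probs, key=probs.get), probs)              # most probable translation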

In conclusion, the thesis presents a formalism based on fixed-length context
rules and an implementation based on finite-state transducers. Methods of
learning these rules from parallel and monolingual corpora are presented,
as is a method of assigning rule weights based on the principle of maximum
entropy. Improvements over the baseline are shown for the majority of the
systems tested.
