Departament de Llenguatges i Sistemes Informàtics


Title: Bilingual Termbank Creation via Log-Likelihood Comparison and Phrase-Based Statistical Machine Translation
By: Andy Way, Dublin City University, Ireland
Venue: *NOTE: NEW ROOM*: Room 16, Politècnica IV
Date/time: 10:30, 12/09/2014
Approximate duration: 1:00 hour
Contact person:

Forcada Zubizarreta, Mikel L.

Bilingual termbanks are important for many natural language processing (NLP)
applications, especially in translation workflows in industrial settings. In
this paper, we apply a log-likelihood comparison method to extract monolingual
terminology from the source and target sides of a parallel corpus. Then, using
a Phrase-Based Statistical Machine Translation model, we create a bilingual
termbank from the extracted monolingual term lists. We manually evaluate our
novel terminology extraction model on English-to-Spanish and English-to-Hindi
data sets, and observe excellent performance for all domains. Furthermore, we
compare the performance of our monolingual terminology extraction model with
a number of state-of-the-art terminology extraction models on the
English-to-Hindi data sets.
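
As a rough illustration of the log-likelihood comparison idea mentioned in the abstract, the sketch below ranks the words of a small domain corpus against a general reference corpus. The abstract does not give the exact statistic or preprocessing used in the talk, so this assumes the widely used corpus-comparison log-likelihood ratio over single-word frequencies. The names log_likelihood and rank_terms, and the toy token lists, are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Corpus-comparison log-likelihood for one word.

    a: frequency of the word in the domain corpus
    b: frequency of the word in the reference corpus
    c: total tokens in the domain corpus
    d: total tokens in the reference corpus
    (Assumed formulation; the talk may use a different variant.)
    """
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2.0 * ll


def rank_terms(domain_tokens, reference_tokens, top_n=10):
    """Return up to top_n (word, score) pairs, highest score first."""
    dom = Counter(domain_tokens)
    ref = Counter(reference_tokens)
    c = sum(dom.values())
    d = sum(ref.values())
    if c == 0 or d == 0:
        return []
    scored = []
    for word, a in dom.items():
        b = ref.get(word, 0)
        # Keep only words that are relatively more frequent in the domain corpus.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    # Sort by descending score, breaking ties alphabetically.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


if __name__ == "__main__":
    # Toy data for illustration only.
    domain = ["engine", "valve", "the", "engine", "the"]
    reference = ["the", "the", "the", "a", "cat"]
    for word, score in rank_terms(domain, reference):
        print(word, round(score, 3))
```

In this sketch, words with a high score are candidate monolingual terms. Pairing source and target candidates through a phrase-based translation model, as the abstract describes, is a separate step that is not shown here.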
