Software

November 11, 2011

DocTrans: document translation retrieval based on SMT techniques

Free/open-source software implementing a method, based on statistical machine translation (SMT) techniques, for retrieving documents that are plausible translations of a given source text. Given a source document as input, the method produces the terms to use in a query that retrieves its translation. It relies on the free/open-source SMT system Moses and was last tested with revision 2281.
Download : Read paper

Apertium-tagger-training-tools: target-language-driven POS tagger trainer

Free/open-source package for the unsupervised training of hidden-Markov-model-based part-of-speech (POS) taggers used in machine translation (MT). It uses information not only from the source language but also from the target language; to this end, the Apertium MT platform is used. Training produces a file containing the hidden-Markov-model parameters, which can be used directly within the Apertium MT platform.
Download : Read paper
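
To illustrate how a set of hidden-Markov-model parameters is used for tagging once training is done, here is a minimal Python sketch of Viterbi decoding for a first-order HMM. It is not the package's code and does not read Apertium's parameter file format; the tag set, the probability tables and the function name `viterbi` are all toy assumptions chosen for illustration.

```python
import math

# Toy, hand-written parameters (assumptions for illustration only; a real
# tagger would load trained values instead).
TAGS = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 0.9},
    "NOUN": {"book": 0.5, "flight": 0.5},
    "VERB": {"book": 0.8},
}


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely tag sequence for `words` under a first-order HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # Each column maps tag -> (best log-score of a path ending here, previous tag).
    columns = [{
        t: (lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0)), None)
        for t in tags
    }]
    for word in words[1:]:
        column = {}
        for t in tags:
            prev, score = max(
                ((p, columns[-1][p][0] + lp(trans_p[p].get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            column[t] = (score + lp(emit_p[t].get(word, 0.0)), prev)
        columns.append(column)

    # Backtrack from the best final tag to recover the full path.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi(["book", "the", "flight"], TAGS, START_P, TRANS_P, EMIT_P))
```

In this toy setting the ambiguous word "book" is resolved by its context: the transition probabilities, not the word alone, decide between the noun and verb readings.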

Apertium-transfer-tools: automatic inference of shallow-transfer rules

Free/open-source package providing a set of tools for the automatic inference of Apertium (level 1) transfer rules from parallel corpora. The generated transfer rules (in XML format) can be used directly by the Apertium MT platform. The package depends on the GIZA++ toolkit to compute word alignments.
Download : Read paper

Apertium-morph: using morphological information with Apache Lucene

Free/open-source package providing a set of tools and Java classes that allow the Apache Lucene text search engine to use morphological information when indexing and searching. To that end, the linguistic resources developed for the Apertium MT platform are used to extract morphological information during indexing.
Download : Read paper
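
The general idea of indexing with morphological information can be sketched in a few lines of Python. This is a toy inverted index keyed by lemma, not the package's Java classes and not the Lucene API; the lookup table `TOY_LEMMAS` is a hypothetical stand-in for the morphological analysis that the Apertium linguistic resources would provide, and all function names are assumptions.

```python
from collections import defaultdict

# Hypothetical surface-form -> lemma table standing in for a real
# morphological analyser.
TOY_LEMMAS = {
    "houses": "house",
    "ran": "run",
    "running": "run",
    "runs": "run",
}


def lemmatize(token):
    """Map a token to its lemma, falling back to the lowercased token."""
    token = token.lower()
    return TOY_LEMMAS.get(token, token)


def build_index(docs):
    """Build an inverted index from lemma to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[lemmatize(token)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing the lemma of every query token."""
    postings = [index.get(lemmatize(token), set()) for token in query.split()]
    return set.intersection(*postings) if postings else set()


if __name__ == "__main__":
    docs = {1: "She ran home", 2: "Two houses", 3: "He runs daily"}
    index = build_index(docs)
    print(search(index, "running"))
```

Because both documents and queries are reduced to lemmas, a query for one inflected form can match documents that contain a different form of the same word.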