Department of Software and Computing Systems

Research Project

Construcción de Analizadores Híbridos de Lenguajes Naturales (Construction of Hybrid Analyzers for Natural Languages Defined in Constrained Semantic Domains)
Reference: TIC97-0671-C02-02
Funding: Comisión Interministerial de Ciencia y Tecnología (CICYT)
Period: from 1997-08-01 to 2000-07-31
Project Leader: Palomar Sanz, Manuel
Members: Ferrández Rodríguez, Antonio
Gómez Ortega, Jaime
Martínez Barco, Patricio Manuel
Muñoz Guillena, Rafael
Peral Cortés, Jesús
Saiz Noeda, Maximiliano
Suárez Cueto, Armando


Summary of the project:
Developing Natural Language processing systems poses specific difficulties when the goal is wide coverage of the language. Some well-characterised linguistic phenomena (anaphora, ellipsis, etc.) must be considered, as must the phenomena characteristic of spontaneous speech, which can produce syntactically incorrect sentences. These problems can be approached deductively, by designing a language model from linguistic knowledge, or inductively, by applying learning techniques to data. As for the output the system must supply, one of the most interesting possibilities is for the system to give the meaning of the input sentences in terms of a semantic language. This kind of semantic output could be used in later processes (translation, execution of actions, etc.).

The main objective of this project is the development of language understanding systems for semantically constrained tasks. We will develop different deductive analysis methodologies, as is customary in the Natural Language area, together with some inductive learning techniques that are used successfully in the Automatic Speech Recognition area. The co-operation of these two approaches will allow us to develop increasingly complex Natural Language processing systems. The main aspects to be studied are the estimation of statistical regular models and the construction of context-free grammars to represent the characteristics of the language of the task, and the development of analyzers that supply the semantic messages of the input sentences.