The CLUVI (Linguistic Corpus of the University of Vigo) is an open set of parallel text corpora covering specialized registers of contemporary Galician, developed by the SLI (Computational Linguistics Group of the University of Vigo) and publicly available on its website since September 2003. The CLUVI Corpus contains over 23 million words. Its main components are the TECTRA Corpus of English-Galician literary texts, the FEGA Corpus of French-Galician literary texts, the LEGA Corpus of Galician-Spanish legal texts, the UNESCO Corpus of English-Galician-French-Spanish popular science and technology texts, the LOGALIZA Corpus of English-Galician software localization, and the CONSUMER Corpus of Spanish-Galician-Catalan-Basque consumer information.
The public search and browsing tool designed by the SLI is available at http://sli.uvigo.es/CLUVI/. This web application supports both simple and very complex searches for single words or sequences of words, and shows the multilingual equivalents of the terms in context, as found in real, referenced translations. The search terms can belong to either language of the translation, but it is also possible to run true multilingual searches, that is, to search simultaneously for one term in each of the languages of the translation.
The number of aligned works and language pairs available on the website increases regularly, since the CLUVI is an active, ongoing academic research project. At the moment, the CLUVI Parallel Corpus web page allows searching five major corpora (TECTRA, FEGA, LEGA, UNESCO and LOGALIZA), as well as other minor parallel corpora now in progress. The CLUVI interface also allows browsing the TURIGAL Corpus of Portuguese-English tourism texts and the Legebiduna Corpus of Basque-Spanish administrative texts, the latter developed by the DELi group at the University of Deusto.
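The multilingual search described above, where one term per language must match within the same aligned translation, can be illustrated with a minimal Python sketch. This is not the CLUVI's actual implementation or data format: the `TranslationUnit` class, the `search` function, the language codes and the sample sentences are all hypothetical, invented only to show the idea of querying aligned segments.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TranslationUnit:
    """One aligned segment (hypothetical structure, not the CLUVI format)."""
    source: str                # label for the translated work it comes from
    segments: Dict[str, str]   # language code -> text of the segment


def contains_term(text: str, term: str) -> bool:
    """Case-insensitive match of a whole word or whole word sequence."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def search(units: List[TranslationUnit], query: Dict[str, str]) -> List[TranslationUnit]:
    """Return the units in which every (language, term) pair of `query` matches.

    A query with a single entry is a search in one language; a query with
    one term per language is a simultaneous multilingual search.
    """
    return [
        unit for unit in units
        if all(
            lang in unit.segments and contains_term(unit.segments[lang], term)
            for lang, term in query.items()
        )
    ]


# Toy data invented for illustration; not taken from the CLUVI.
UNITS = [
    TranslationUnit("demo-1", {"en": "The house is near the sea.",
                               "gl": "A casa está preto do mar."}),
    TranslationUnit("demo-2", {"en": "She bought a new house.",
                               "gl": "Mercou un piso novo."}),
    TranslationUnit("demo-3", {"en": "The sea was calm.",
                               "gl": "O mar estaba calmo."}),
]

if __name__ == "__main__":
    # Units where English has "house" AND Galician has "casa".
    for unit in search(UNITS, {"en": "house", "gl": "casa"}):
        print(unit.source, unit.segments)
```

Searching for "house" alone returns every unit whose English segment contains the word, each shown with its Galician equivalent in context. Adding the Galician term "casa" narrows the results to the translations that actually use that equivalent.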
On-line Documents
Gómez Guinovart, Xavier and Alberto Simões (2010): Translation Dictionaries Triangulation. In Proceedings of FALA2010: VI Jornadas en Tecnología del Habla & II Iberian SLTech, Universidade de Vigo, Vigo.
Gómez Guinovart, Xavier and Alberto Simões (2009): Parallel corpus-based bilingual terminology extraction. In Proceedings of the 8th International Conference on Terminology and Artificial Intelligence, IRIT (Institut de recherche en Informatique de Toulouse), Université Paul Sabatier, Toulouse.
Crespo Bastos, Ana, Xosé María Gómez Clemente, Xavier Gómez Guinovart and Susana López Fernández (2008): XML-based Extraction of Terminological Information from Corpora. In José Carlos Ramalho, João Correia Lopes and Salvador Abreu (eds.), Actas da 6ª Conferência Nacional XATA2008.XML, Aplicações e Tecnologias Associadas. 14-15 February 2008, Universidade de Évora (Portugal), pp. 28-39.
Simões, Alberto; Almeida, José João; and Gómez Guinovart, Xavier (2004): Memórias de Tradução Distribuídas. In Ramalho, José Carlos and Simões, Alberto (eds.), XATA2004 - XML, Aplicações e Tecnologias Associadas, Universidade do Porto, Porto (Portugal), pp. 59-68.
Aguirre Moreno, José Luis; Alberto Álvarez Lugrís; Iago Bragado Trigo; Luz Castro Pena; Xavier Gómez Guinovart; Santiago González Lopo; Angel López López; José Ramom Pichel Campos; Elena Sacau Fontenla and Lara Santos Suárez (2003): Alinhamento e etiquetagem de corpora paralelos no CLUVI (Corpus Linguístico da Universidade de Vigo). In Almeida, J.J. (ed.), Actas do Workshop CP3A 2003, Corpora Paralelos: Aplicações e Algoritmos Associados, pp. 33-47. Universidade do Minho, Braga (Portugal).
Project: Desenvolvemento e explotación de recursos integrados da lingua galega. With the Instituto da Lingua Galega. Financed by: Consellería de Innovación e Industria, Xunta de Galicia, Programa de promoción xeral de investigación do Plan galego de investigación, desenvolvemento e innovación tecnolóxica (Incite), 2008-2011 (ref. INCITE08PXIB302185PR).
Project: Deseño e implementación dun servidor de recursos integrados para o desenvolvemento de tecnoloxías da lingua galega (RILG). With the Instituto da Lingua Galega. Financed by: Ministerio de Educación y Ciencia, Plan Nacional de I+D+I, 2006-2009 (ref. HUM2006-11125-C02-01/FILO).
Project: Procesamento lingüístico-computacional do Corpus Lingüístico da Universidade de Vigo (CLUVI). Financed by: Ministerio de Ciencia y Tecnología, Plan Nacional de I+D+I, 2002-2005 (ref. BFF2002-01385). Co-financed by: Dirección Xeral de I+D da Xunta de Galicia and Universidade de Vigo.
Project: Adquisición de recursos básicos de lingüística computacional do galego para aplicacións informáticas de tecnoloxía lingüística. Financed by: Imaxin Software, Proxecto de I+D (Universidade - Empresa), 2002-2003.
Project: Estudio e adquisición de recursos básicos de lingüística computacional do galego para a elaboración e mellora de aplicacións informáticas de tecnoloxía lingüística. With Imaxin Software. Financed by: Secretaría Xeral de Investigación e Desenvolvemento, Xunta de Galicia, 2001-2004 (ref. PGIDT01TICC06E).
Project: Desenvolvemento e aplicación de técnicas de análise lingüístico-computacional de corpus orais e escritos para o procesamento do CLUVI (Corpus Lingüístico da Universidade de Vigo). Financed by: Secretaría Xeral de Investigación e Desenvolvemento, Xunta de Galicia, 2001-2003 (ref. PGIDT01PXI30203PR).
Project: Desenvolvemento de ferramentas informáticas de revisión lingüística para a lingua galega. Financed by: Imaxin Software, Proxecto de I+D (Universidade - Empresa), 2001-2002.