Language models

How bank and uni preserve Luxembourgish through AI

The SnT and BGL BNP Paribas created an open source Luxembourgish language model. Matic Zorman / Maison Moderne

On 9 November, BGL BNP Paribas and the University of Luxembourg’s SnT department revealed details of their collaboration on the first Luxembourgish AI model. Delano talked with Cedric Lothritz, a PhD student in the TruX research team and LuxemBERT’s main developer.

LuxemBERT, the grand duchy’s very own language model, has recently been made available online to the wider public and is open source. Using what the bank describes as “state-of-the-art BERT technology”, the model should help develop multilingual virtual assistants or chatbots in Luxembourgish.

Professor Jacques Klein of the University of Luxembourg’s Interdisciplinary Centre for Security, Reliability and Trust (SnT) led the project, with Lothritz working as the main developer for the language model.

Why is it important to have Luxembourgish AI models?

Cedric Lothritz: As the world becomes more and more automated and reliant on AI to complete tasks, we need to put more research into building reliable and well-performing AI models. In Natural Language Processing, relevant tasks include question answering systems, chatbots and digital assistants, and spellcheckers. Lots of research to create such models is done for widespread languages such as English, Chinese, Spanish, or German, but not for so-called low-resource languages - such as Luxembourgish - that are spoken by only a few thousand people. As such, it is important to make an effort to preserve those low-resource and endangered languages.

Is there a demand for it? Where did this demand come from?

In general, any company that processes a lot of textual data benefits from the existence of such models. BGL [BNP Paribas] handles a vast number of documents on a daily basis, and puts in a big effort to investigate AI models for numerous use cases. One such use case is a chatbot that clients can use to handle simple requests without needing to contact an employee. We recognised the lack of and the need for a Luxembourgish language model that would enable the creation of such a chatbot, marking the beginning of LuxemBERT.

Who contacted whom to begin this collaboration?

The collaboration between our research group, BGL and the [Alphonse] Weicker Foundation already existed before our work on LuxemBERT began. Several researchers in the TruX group are working on research projects funded by the Fonds National de la Recherche in collaboration with BGL. In the context of my own PhD programme, I have been in a partnership with BGL since March 2019. Their support has helped me a lot in my work, leading to the publication of multiple research papers.