MULTILINGUAL ADAPTIVE FINE-TUNING (MAFT)
We introduce MAFT as an approach to adapt a multilingual PLM to a new set of languages. Adapting PLMs has been shown to be effective for a new domain (Gururangan et al., 2020)
or language (Pfeiffer et al., 2020; Alabi et al., 2020; Adelani et al., 2021). While previous work on
multilingual adaptation has mostly focused on autoregressive sequence-to-sequence models such as
mBART (Tang et al., 2020), in this work, we adapt non-autoregressive masked PLMs on monolingual corpora covering 20 languages. Crucially, during adaptation we use the same training objective that was used during pre-training, i.e., masked language modeling (MLM). The models resulting from MAFT are then fine-tuned on supervised downstream NLP tasks. We apply MAFT only to smaller models (XLM-R-base, AfriBERTa, and XLM-R-miniLM), since one of our goals is to reduce model size, whereas adapting XLM-R-large would require considerably more compute and train more slowly. We refer to the model obtained by applying MAFT to XLM-R-base as AfroXLMR-base, and to the one obtained from XLM-R-miniLM as AfroXLMR-mini.
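To make the MAFT step concrete, the following is a minimal sketch of continued MLM training with the Hugging Face Transformers library, starting from XLM-R-base. The corpus path, hyperparameters, and output directory name are illustrative assumptions rather than the exact settings used in our experiments; the same procedure applies to AfriBERTa and XLM-R-miniLM by changing the model name.

    # Sketch of MAFT: continued masked language modeling on monolingual text
    # in the target languages, using the same MLM objective as pre-training.
    # Paths and hyperparameters below are illustrative, not the paper's settings.
    from datasets import load_dataset
    from transformers import (
        AutoModelForMaskedLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    model_name = "xlm-roberta-base"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)

    # Monolingual corpora for the target languages, stored as plain-text files
    # (one sentence or paragraph per line); the path is hypothetical.
    raw = load_dataset("text", data_files={"train": "maft_corpus/*.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

    # Dynamic masking with the standard 15% masking rate used for MLM.
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

    args = TrainingArguments(
        output_dir="afro-xlmr-base",        # illustrative output name
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,
        learning_rate=5e-5,
        num_train_epochs=3,
        save_steps=10_000,
        fp16=True,
    )

    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"],
        data_collator=collator,
    )
    trainer.train()
    trainer.save_model("afro-xlmr-base")
    tokenizer.save_pretrained("afro-xlmr-base")

The saved checkpoint can then be loaded like any other masked PLM (e.g., with AutoModelForSequenceClassification or AutoModelForTokenClassification) and fine-tuned on the supervised downstream tasks.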