Research into machine translation for African languages is very limited and low-resourced in terms of datasets and model evaluations. This work aims to add to the field of neural machine translation research for four low-resourced Southern African languages. The effect of two byte pair encoding tokenisation algorithms (subword-nmt and SentencePiece), with various parameters, is evaluated. The paper builds upon previous research in the field for comparison, using an optimised transformer architecture and pre-cleaned data to translate English to Northern Sotho, Setswana, Xitsonga and isiZulu. The results obtained show improvements over the previous BLEU scores for Setswana and isiZulu.
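
For readers unfamiliar with byte pair encoding, the following is a minimal toy sketch of how merge rules can be learned from word frequencies and then applied to segment a new word. It is an illustration only, not the subword-nmt or SentencePiece implementation and not the paper's code; the function names `learn_bpe` and `apply_bpe`, the `</w>` end-of-word marker, the tie-breaking rule, and the tiny word list are all assumptions made for this example. The `num_merges` argument stands in for the kind of parameter such tokenisers expose.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules from a {word: count} mapping (toy version)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself (an assumption).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = {}
        for symbols, freq in vocab.items():
            key = tuple(merge_pair(list(symbols), best))
            new_vocab[key] = new_vocab.get(key, 0) + freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment `word` by replaying the learned merges in the order they were learned."""
    symbols = list(word) + ["</w>"]
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


if __name__ == "__main__":
    # Hypothetical toy corpus, not data from the paper.
    toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    rules = learn_bpe(toy_counts, num_merges=10)
    print(rules)
    print(apply_bpe("lowest", rules))
```

A larger number of merges yields longer subword units and a larger vocabulary, while fewer merges keep the segmentation closer to individual characters.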