The African Storybook (ASb) is a multilingual literacy initiative that works with educators and children to publish openly licensed picture storybooks for early reading in the languages of Africa. An initiative of Saide, the ASb has an interactive website that enab...

This website contains information about African languages and related resources. Currently, mostly South African languages are covered, along with Kiswahili and Cilubà. It is estimated that there are between 2000 and 3000 languages ...

In promoting a multilingual South Africa, the government is encouraging people to speak more than one language. In line with this initiative, people choose to learn languages that they do not speak as a home language. The African languages are mostly ...

AfroLID is a neural toolkit for African language identification that covers 517 African languages....

Language identification (LID) is a crucial precursor for NLP, especially for mining web data. Problematically, most of the world's 7000+ languages today are not covered by LID technologies. We address this pressing issue for Africa by introducing AfroLID, a neural ...

This version of the Bloom Library data is developed specifically for the language modeling task. It includes data from nearly 400 languages across 35 language families, many of which are extremely low-resourced. Note: If you sp...

CCAligned consists of parallel or comparable web-document pairs in 137 languages aligned with English. These web-document pairs were constructed by performing language identification on raw web-documents, and ensuring corresponding language codes were corresponding...

Cross-lingual document alignment aims to identify pairs of documents in two distinct languages that are of comparable content or translations of each other. In this paper, we exploit the signals embedded in URLs to label web documents at scale with an average preci...

The development of linguistic resources for use in natural language processing is of utmost importance for the continued growth of research and development in the field, especially for resource-scarce languages. In this paper we describe the process and challenges ...

The Open Resource Term Bank (OERTB) project aims to support the collaborative development and dissemination of terminological resources, thereby promoting the use of African languages in teaching and learning at higher education institutions....

Founded in 1988, the Folio Group has grown from a tiny start-up into the major-league language service provider that it is today. This is largely driven by our reputation for reliability, technical expertise, fast turnaround and meticulous accuracy. Folio is recogn...

Gboard is a virtual keyboard app developed by Google for Android and iOS devices.

This paper describes the named entity language resources developed as part of a development project for the South African languages. The development efforts focused on creating protocols and annotated data sets with at least 15,000 annotated named entity tokens for...

Sufficient target language data remains an important factor in the development of automatic speech recognition (ASR) systems. For instance, the substantial improvement in acoustic modelling that deep architectures have recently achieved for well-resourced languages...

Almost none of the 2,000+ languages spoken in Africa have widely available automatic speech recognition systems, and the required data is also only available for a few languages. We have experimented with two techniques which may provide pathways to large vocabular...
