BOLT Egyptian Arabic Treebank - Conversational Telephone Speech was developed by the Linguistic Data Consortium (LDC) and consists of Egyptian Arabic conversational telephone speech data with part-of-speech annotation, morphology, gloss and syntactic ...


BOLT Egyptian Arabic Treebank -- Discussion Forum was developed by the Linguistic Data Consortium (LDC) and consists of Egyptian Arabic web discussion forum data with part-of-speech annotation, morphology, gloss and syntactic tree annotation. The DAR...


BOLT Egyptian Arabic Treebank - SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of Egyptian Arabic SMS/Chat data with part-of-speech annotation, morphology, and syntactic tree annotation. The DARPA BOLT (Broad Operational ...


BOLT Egyptian Arabic-English Word Alignment -- Conversational Telephone Speech Training was developed by the Linguistic Data Consortium (LDC) and consists of 153,171 words of Egyptian Arabic and English parallel text enhanced with linguistic tags to i...


BOLT Egyptian Arabic-English Word Alignment -- SMS/Chat Training was developed by the Linguistic Data Consortium (LDC) and consists of 349,414 words of Egyptian Arabic and English parallel text enhanced with linguistic tags to indicate word relations....


Neural machine translation (NMT) has achieved great successes with large datasets, so NMT is more premised on high-resource languages. This continuously underpins the low resource languages such as Luganda due to the lack of high-quality parallel corpora, so even ‘...


The Carpentries build global capacity in essential data and computational skills for conducting open research. Last year, supported by an Event Fund grant, the Carpentries hosted their first virtual CarpentryConnect from South Africa. The event included 2 Carpentr...


The contrast between the need for large amounts of data for current Natural Language Processing (NLP) techniques, and the lack thereof, is accentuated in the case of African languages, most of which are considered low-resource. To help circumvent this issue, we exp...


OpenNMT checkpoints of the models (Procedure D) from the paper "Congolese Swahili Machine Translation for Humanitarian Response" published in Africa NLP workshop organized within the 16th Conference of the European Chapter of the Association for Computational Lingu...


In this paper we describe our efforts to make a bidirectional Congolese Swahili (SWC) to French (FRA) neural machine translation system with the motivation of improving humanitarian translation workflows. For training, we created a 25,302-sentence general domain pa...


Building effective neural machine translation (NMT) models for very low-resourced and morphologically rich African indigenous languages is an open challenge. Besides the issue of finding available resources for them, a lot of work is put into preprocessing and toke...


Even though Afaan Oromo is the most widely spoken language in the Cushitic family by more than fifty million people in the Horn and East Africa, it is surprisingly resource-scarce from a technological point of view. The increasing amount of various useful documents...


When it comes to scientific communication and education, language matters. The ability for science to be discussed in local indigenous languages can not only help expand knowledge to those who do not speak English or French as a first language but also can integrat...


Research into machine translation for African languages is very limited and low-resourced in terms of datasets and model evaluations. This work aims to add to the field of neural machine translation research for four low-resourced Southern African languages. The ...


Recent research in natural language processing (NLP) has achieved impressive performance in tasks such as machine translation (MT), news classification, and question-answering in high-resource languages. However, the performance of MT leaves much to be desired for ...
