In the tenth issue of ActuIA magazine, founder Chris Emezue discusses Lanfrica and our data-centric and community-driven efforts to improve AI in low-resource African contexts and enable AI-driven businesses to thrive in Africa. We are honored to have our work featured in such a prestigious publication.
ActuIA is a French-language online magazine devoted to artificial intelligence (AI) news, research, and industry insights, and bills itself as the first source of French-language information on AI. The magazine covers a range of AI-related topics, including machine learning, natural language processing, and computer vision. With its comprehensive coverage and expert analysis, ActuIA is a valuable resource for anyone interested in the latest developments in AI. ActuIA is a member of the European AI Alliance and a partner of Paris Machine Learning.
While the magazine is aimed at a French-speaking audience, we briefly summarize some of what was discussed below.
On the problems Lanfrica is attempting to solve
Lanfrica aims to solve the problem of the limited availability of language datasets and technologies for African languages. The scarcity of language datasets and the inadequacy of machine translation and speech recognition systems restrict access to digital content and services, which hampers cross-cultural communication. For example, none of the main players in the global voice assistant market (Amazon’s Alexa, Apple’s Siri, and Google’s Home) currently supports a single native African language.
On the major bottleneck to AI development in Africa
According to Chris Emezue, the major bottleneck to AI development in Africa is the lack of large, high-quality datasets.
If somebody wants to create a voice-assistant for his or her native African language, the person cannot. When you look deeply at the reason, you find it is because high-quality datasets do not exist. This is the narrative we want to change.
Chris Emezue
Lanfrica is looking to create large, high-quality, Afrocentric training datasets for African AI. With over 1,000 African resources already linked, we are actively expanding our catalogue to cover more African contexts across a wide range of fields: artificial intelligence (moving beyond natural language processing to areas such as computer vision and forecasting), linguistic resources (books, research papers, and linguistic tools for African languages), domain applications of AI such as agriculture and medicine, and education.
It can be challenging to create large, high-quality training datasets for African languages. Simply scraping the web is not an effective solution, as it often yields low-quality content that does not accurately represent African perspectives.
Nonetheless, there are success stories, such as Digital Umuganda's work on Kinyarwanda, that demonstrate the benefits of focused efforts to develop high-quality datasets for an African language. The resulting datasets have enabled the development of voice-enabled applications in Kinyarwanda, in conjunction with the Rwanda Culture and Heritage Academy (RCHA).