In 2022, following its launch, Lanfrica set a goal to find and link at least 1000 African language resources. We are pleased to announce that on the last day of 2022 we achieved this milestone, and we now provide access to more than 1000 African language resources on our platform. This achievement is a testament to the hard work and dedication of the Lanfrica team, and will have a lasting impact on the African AI and linguistic communities.
Information retrieval is a crucial aspect of the advancement of artificial intelligence (AI). It involves finding, organizing, and accessing relevant data and information, which is necessary for training and evaluating AI models. Without access to high-quality data and information, it is difficult for researchers and practitioners to make progress in the field.
Lanfrica’s work in linking African language and AI resources plays a vital role in supporting the advancement of AI on the continent. By providing access to a wealth of resources, Lanfrica is helping researchers, linguists, students, and practitioners find and use the data and information they need to advance AI in the African context. This matters all the more because access to resources on the continent can be limited by linguistic, cultural, or technological barriers. By overcoming these barriers and linking resources, Lanfrica is helping to facilitate the growth and development of the African AI community.
Linking resources is a time-consuming and challenging process. It involves finding resources in mostly hidden places on the web, cleaning and preprocessing their metadata, and accurately categorizing their task, African language, and domain. In many cases, even with sophisticated algorithms, this requires manual review to reduce the number of false positives. At Lanfrica, we have worked tirelessly to find and link these resources, and are proud to offer such a comprehensive collection to the African AI community.
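To give a feel for the cleaning and categorization steps described above, here is a minimal sketch in Python. It is not Lanfrica's actual pipeline: the keyword lists, field names, and the `categorize` helper are all hypothetical, and domain tagging is left out for brevity. It normalizes a resource's metadata, matches whole-word keywords to assign languages and tasks, and flags anything it cannot confidently label for manual review.

```python
import re

# Hypothetical keyword lists for illustration only; a real catalogue
# would need far richer vocabularies.
LANGUAGE_KEYWORDS = {
    "Yoruba": ["yoruba", "yorùbá"],
    "Swahili": ["swahili", "kiswahili"],
    "Hausa": ["hausa"],
}
TASK_KEYWORDS = {
    "machine translation": ["translation", "parallel corpus"],
    "speech recognition": ["speech recognition", "asr"],
}


def clean_text(value):
    """Collapse whitespace and lowercase a raw metadata field (None becomes '')."""
    return re.sub(r"\s+", " ", value or "").strip().lower()


def match_labels(text, keyword_map):
    """Return labels whose keywords appear in text as whole words or phrases."""
    labels = []
    for label, keywords in keyword_map.items():
        # Word boundaries avoid matching "asr" inside an unrelated word.
        if any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords):
            labels.append(label)
    return labels


def categorize(record):
    """Assign languages and tasks; flag the record for manual review if either is missing."""
    text = " ".join(clean_text(record.get(f)) for f in ("title", "description"))
    languages = match_labels(text, LANGUAGE_KEYWORDS)
    tasks = match_labels(text, TASK_KEYWORDS)
    return {
        "languages": languages,
        "tasks": tasks,
        "needs_review": not languages or not tasks,
    }
```

Whole-word matching is a deliberately conservative choice here: it trades some recall for fewer false positives, and whatever the rules cannot label is routed to a person rather than guessed.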
But our efforts go beyond linking resources for African languages. We have been expanding our collection to cover more of the African context across a wide range of fields: artificial intelligence (moving beyond natural language processing to include areas such as computer vision and forecasting); linguistics (books, research papers, and linguistic tools for African languages); domain applications of AI such as agriculture and medicine; education; and even language learning services, keyboards, and dictionaries. We believe that advancing AI in Africa requires cooperation and collaboration across many fields, tools, and disciplines. It is a complex process in which multiple components work together towards a common goal. By providing easy access to resources from multiple fields and disciplines, we hope to support the growth and development of AI in Africa.
Lanfrica is also working to showcase resources that could be useful but may not be widely known. For example, we have highlighted the public datasets from the Zindi competitions, which extends the utility of the datasets to users beyond the competitions. Another example is the collection of posters from the 2022 Deep Learning Indaba. We believe that by bringing these resources to light and gathering them in one place, we can help support the growth and development of the African AI community.
Overall, we are thrilled to have reached this milestone of linking more than 1000 African language resources on Lanfrica. We are grateful to everyone who has supported us along the way (special shoutout to our partners AfricArxiv, Masakhane and KANAC), and we look forward to continuing our work to preserve and promote African languages, linguistic diversity, and AI in Africa.
Looking ahead to 2023, we have exciting plans for Lanfrica. We will continue to expand our resource collection, adding new materials as we find them. We are also working hard on creating large, high-quality, African-centric datasets, such as Naija Voices, which will provide valuable data for researchers and practitioners in Africa. Additionally, we are adding support for resources whose owners do not wish them to be fully open source (like the Sign-to-Speech for Sign Language Understanding dataset), so that we can provide even more comprehensive coverage of African languages and AI resources.
For more updates as we move forward:
- Be part of the Lanfrica community by joining our Slack.
- Follow us on Twitter.
- Follow us on LinkedIn.