Building language technologies for our under-represented languages

How do we ensure AI understands all and not only some of us?

Event Date: 7th February, 2023 | 18:00 CET

This webinar discusses how we can build language technologies for minority, indigenous, and endangered languages and dialects. Language is an important part of culture, so how do we ensure that artificial intelligence represents the culture of everyone in the world and not just the majority?

Watch the recording


Language technologies, including natural language processing (NLP) and machine translation, are a key aspect of artificial intelligence (AI) that allows computers to understand, interpret, and generate human language. These language technologies have a wide range of applications, such as language translation, text classification, and sentiment analysis. They can be used to improve communication between people who speak different languages, as well as to analyze and understand large amounts of text data. Overall, the development and advancement of AI and language technologies have the potential to greatly improve communication and understanding between people of different cultures and languages.

However, not all languages are equally represented in the development of language technologies. Building language technologies for under-represented languages is a crucial but often overlooked aspect of AI development: it can help preserve and promote linguistic diversity and give speakers of these languages access to information and communication.

Many low-resource languages, such as minority languages, indigenous languages and dialects, and endangered languages, are in a perilous state, and some are on the verge of extinction. Language revitalization efforts are therefore a high priority for preserving and sustaining the knowledge and assets held by communities within specific locales.

This webinar features two experts in creating language technologies for under-represented languages. It will address questions like “What is the current state of under-represented languages in artificial intelligence?”, “What are the challenges facing the inclusion of these languages?” and “What work has been done, in both industry and academia, to include minority languages in technologies?”. The webinar is designed to include perspectives from both the research sector and the industrial sector.

Meet the Speakers

Sebastian Ruder is a research scientist at Google based in Berlin, Germany, working on natural language processing (NLP) for under-represented languages. Before that he was a research scientist at DeepMind. He completed his PhD in Natural Language Processing and Deep Learning at the Insight Research Centre for Data Analytics, while working as a research scientist at Dublin-based text analytics startup AYLIEN. Previously, he studied Computational Linguistics at the University of Heidelberg, Germany, and at Trinity College, Dublin. During his studies, he worked with Microsoft, IBM’s Extreme Blue, Google Summer of Code, and SAP, among others. He’s interested in transfer learning for NLP and making ML and NLP more accessible.

Felix Laumann is the CEO of NeuralSpace, a Software as a Service (SaaS) platform that offers developers a suite of APIs for Natural Language Processing (NLP) that can be used without any Machine Learning (ML) or Data Science knowledge. Its primary goal is to democratize NLP and make sure any developer can create apps with advanced language processing in any language, not just English.

Meet the Hosts

Daria Yasafova is a master’s student in Mathematics in Data Science at the Technical University of Munich and a researcher in Artificial Intelligence (AI) and Machine Learning. She is interested in life extension and the intersection of AI and healthcare.

Chris Emezue is a master’s student in Mathematics in Data Science at the Technical University of Munich and a researcher in Artificial Intelligence and Machine Learning, with a focus on Natural Language Processing (NLP), particularly for low-resource African languages. He is also a researcher at Mila – Quebec AI Institute and the founder of Lanfrica, where he works on accelerating the development of AI applications in African and underrepresented regions. He is an active member of Masakhane and SisonkeBiotik, where he focuses on natural language processing for low-resource African languages and machine learning for healthcare, respectively.