Half of the world’s languages are expected to disappear by the end of the century. This is a huge cultural loss to humanity. When we think about endangered languages, we usually consider them as part of traditions that link us to the past. From a forward-looking perspective, they mean more than cultural heritage. When a language dies, a unique vision of the world is gone forever.
Does the language we speak online matter? Studies show that it deeply affects people’s experience of the Internet. It determines how much information we can access, who we choose to connect with and how we behave in our community. Keeping languages alive is essential to shaping our future. The Internet offers the greatest chance to have a public voice in response to cultural globalization: a renaissance of languages.
UNESCO is convinced that multilingualism on the Internet has a key role to play in fostering pluralistic, open and inclusive knowledge societies.
A project called Siminchikkunarayku, supported by the Internet Society Peru Chapter and the Beyond the Net Funding Programme, aims to build the linguistic corpus of the southern Quechua language by collecting and digitizing 10,000 hours of speech. Quechua is a family of languages native to the Americas, spoken by people living in and around the highlands of South America.
A linguistic corpus (Latin for “body”) is a collection of texts and recorded speech that have been selected and brought together so that a language can be studied on a computer. Corpus building divides into two stages: first the texts are collected and organized; then the corpus design and digitization can start. This brief interview with Luis Camacho, telecommunications engineer and project coordinator, gives us deeper insight into the project.
What inspired you to take action?
Languages around the world are dying, and dying fast. Although this may not seem important in the day-to-day life of an English speaker with no personal ties to the cultures in which they are spoken, languages matter. The effects of their loss could be culturally devastating. Each language is a key that can unlock local knowledge about medicinal secrets, ecological wisdom, weather and climate patterns, spiritual attitudes, and mythological histories. In South America, there are 400 native languages that are still alive, but all of them are endangered. Interacting with computers in native languages is possible today and should be a right of all South American citizens. Our desire is to facilitate a horizontal form of intercultural exchange and dialogue between South American native peoples and the rest of the world.
How will you achieve the project’s purpose?
Our initiative introduces a holistic vision: the use of artificial intelligence and multimedia to preserve and foster South American native languages. Artificial intelligence makes the computational portability of languages possible, which involves creating natural language processing (NLP) systems. Currently, the level of computerization of South American languages is extremely low for the following reasons:
- Lack of a unique writing system or an established spelling
- Lack of massive presence on the Internet
- Lack of a critical mass of experts and linguists
- Lack of electronic resources: monolingual corpora, bilingual electronic dictionaries, databases of transcribed speech, pronunciation dictionaries, or specific vocabularies.
Clearly, reducing this lack of resources is the first step. Thanks to the ISOC grant, we are going to collect 10,000 hours of Quechua-language speech from radio programs. This collection will be a milestone in building the Quechua language corpus.
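To make the 10,000-hour goal concrete, here is a minimal sketch of how a collection manifest might track recorded hours against that target. The `Recording` fields, the station names, and the helper functions are hypothetical and serve only as illustration; the interview does not describe the project's actual tooling.

```python
from dataclasses import dataclass

TARGET_HOURS = 10_000  # collection goal stated in the interview


@dataclass(frozen=True)
class Recording:
    """One radio recording in a hypothetical collection manifest."""
    station: str             # illustrative field, not taken from the project
    duration_seconds: float  # length of the audio
    transcribed: bool = False


def total_hours(recordings):
    """Sum the durations of all recordings, in hours."""
    return sum(r.duration_seconds for r in recordings) / 3600


def progress(recordings, target_hours=TARGET_HOURS):
    """Return the fraction of the target collected, capped at 1.0."""
    return min(total_hours(recordings) / target_hours, 1.0)


if __name__ == "__main__":
    # Made-up entries, purely for demonstration.
    manifest = [
        Recording("station-a", 5400),
        Recording("station-b", 1800, transcribed=True),
    ]
    print(f"{total_hours(manifest):.1f} h collected")
    print(f"{progress(manifest):.4%} of target")
```

A manifest like this keeps the first stage of corpus building (collecting and organizing) separate from later stages such as transcription, which the `transcribed` flag merely hints at.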
What is the Chapter’s role in this initiative?
The ISOC Peru Chapter believes that this initiative will play an important role in strengthening native languages. ISOC Peru will follow the project’s progress closely and will update the Peruvian community every month with graphics, visual data, and valuable information.
Do you have a great idea to make your community better via the Internet? Applications for Medium and Large Scale Projects are now open.