In a recent post, Honey, The AI Shrunk the Language Divide, I talked about how AI has the capacity to be the game-changer in shrinking the language divide and bringing many languages back from the brink of extinction. And from this, I was asked the question:
Will the divide be closed through development alone, rather than being something that has to be actively researched and worked on, given that, as far as anyone could see, GPT-5 would be much better, GPT-6 would blow your proverbial socks off, and GPT-10 would be so powerful that it would make the last year of AI hype look like being excited by a bicycle when you now have a motorcycle...?
Honestly, with my coder hat on, that would be ideal: we would simply be on the train tracks towards accessible technology for all, getting there one line of Python at a time. But that isn't the case. AI has the technical capacity, nothing more, and there is so much more at play, and so much more for all of us and for every individual to do, to get to accessible technology and AI for languages.
At a recent Network to Promote Linguistic Diversity (NPLD) event, issues like technology accessibility in minority languages for those with disabilities, and accessibility in technology generally, were discussed. What was palpable from the talks was the need to include all needs and interact with all stakeholders, regardless of their share of the population, in order to build truly accessible technology.
We can't rely on big tech alone. The job of profit-driven companies is to make profits, and that generally comes from working with the largest markets, even if smaller markets are in greater need of the service. This leads to things like cutting-edge text-to-speech for the major languages, as we have from OpenAI, and shoddy performance or simply no availability for others. We could just wait for the technology to be built, but what priority would it be given when these companies have many other products and services, with potentially greater income, that they could provide instead? The chances are high that it would never happen...
If you want something done right, you often have to do it yourself. We are seeing this a lot lately as the fight against bland centralisation plays out. Companies are constantly asking you: "Why use a product that is so-so for most things when you can use this one that is perfect for exactly the job you want done?" Recent success stories here are Arc for browsing and Superhuman for email. The same is true for language technology: there is a need to build technology that is exactly what the users need, technology that is sensitive to the sociolinguistic dimensions of the language. For example, this means supporting the multiple dialects of Irish. For a language learning application, it means developing automatic speech recognition that recognises learners, alongside text-to-speech for their aural lessons that is trained on a native speaker's voice rather than a learner's, which may teach them the wrong sounds and prevent them from ever being mistaken for a native speaker! We have already seen with Duolingo's AI voices for Irish how a misalignment between the users' needs and the outlook of large companies can lead to frustration and a lack of accessibility.
These products, services and vital research need support, and that support needs to come from the top as well as from the bottom. When you can't rely on large profits, much of the investment world is out. Without government-led investment and support, many of these technologies will never be available to serve the users most in need.
The future of accessible technology needs you. In the case of Irish, we particularly need the native speakers to give their time and their voices for text-to-speech and speech recognition. We need the language community at large to shape the products and services by communicating their needs to those who are developing the technology. This is true for all minority languages all around the world.
Building the technology is not enough, though; it also has to be made available and accessible where it is needed most. A closer and more active community will mean easier distribution of the technology, because the developers and researchers will know where the needs are and how they have to be met. Additionally, stronger ties with the larger players will be needed so that their suites of services can be rounded off and made truly accessible. Frictionless end-to-end delivery from developer to consumer won't be achieved by the technology alone; it comes down to people power in engaging all the stakeholders and intermediary services.
It is true that much of the development of these technologies will be accelerated by AI development. But for many languages out there right now, the vehicle that is their products and services hasn't even gotten to the start line, because it simply doesn't exist. So each and every language needs each and every member of its language community to come together, build that vehicle and push it to the line, so that it too can be accessible to all.