Thousands of years ago, from the sprawling savannas of Africa came the origins of early spoken language. Spoken language served a common purpose. It was a beacon of social solidarity, creating feelings of cultural familiarity and kinship.

Thousands of years later came a way to record language: the written word. And today, six decades into the computer revolution, humans still bear the burden of issuing written commands rather than using the innate spoken word.

But that’s changing.

Since its inception in the 1950s, speech recognition has been silenced by its inability to accurately understand the human voice. Thanks to recent technological advancements, speech recognition is approaching the 95 percent accuracy mark, the threshold at which early adoption gives way to the mainstream majority. Speech recognition will become more influential, especially in healthcare, as older and less technologically savvy patients choose to interact with their mobile devices in their own language, using their voice.

Lost in translation

As the focus moves from treating patients in the hospital to treating them in their homes, a patient’s limited ability to comprehend their clinician’s instructions or directives can have devastating consequences for their health. Patients who have limited health literacy tend to suffer from worse health, are less likely to undergo regular preventative screenings, and are more frequently hospitalized. Limited health literacy is also associated with an almost two-fold increase in mortality in the elderly. Given that inferior health outcomes can be attributed, in part, to limited health literacy, a common means of communication that transcends language and cultural barriers is required.

The rise of contextual conversational interfaces, where the clinician can communicate with the patient regardless of language or educational attainment, will create a common ground of understanding. In 2015, Skype became one of the first services to offer real-time voice-to-voice translation, but the future of natural language processing will go far beyond this point, enabling computers to comprehend not just the literal definition of our words, but the connotations, context, and intent behind them. The use of voice-to-voice communication will foster a mutual understanding across culture, language, and education, ensuring that the patient truly comprehends their clinician’s diagnosis, treatment, and care plan.

To search and protect

We turn to the internet first and foremost when we seek answers or confirmation of self-diagnosis, with approximately 72 percent of internet users searching for their symptoms online. But in the hands of patients, the vast surplus of sometimes false medical information available can prove to be hazardous. Voice-activated, artificially intelligent platforms, including Amazon Echo, Apple Siri, and Microsoft Cortana, can take into account other factors, including location, historical behavior, and quantitative health data measurements, to serve accurate and summarized information via voice.

But the future of voice search will be predictive and query-less, serving the user relevant summarized information before they even ask for it. This principle is based on the semantic prediction of needs: a user’s habits or behavior could indicate signs of an illness, and the user could then be notified in real time by their artificially intelligent conversational sidekick.

Remote disease detection and diagnosis

A proactive and preventative approach to healthcare involves the preemptive diagnosis of patients from afar. Speech recognition will enable remote disease identification and diagnosis to be conducted in an effective, efficient, and economical way. Already, by analyzing voice parameters, researchers have been able to detect the presence of Parkinson’s disease with 98.6 percent accuracy. While this process currently requires a researcher to analyze the results, it clears the way for an automated process to remotely diagnose disease. In the not-too-distant future, speech recognition technology will be able to accurately identify a user’s psychological state of mind, and even predict the onset of Huntington’s and Parkinson’s disease and of oral and laryngeal cancer, to name just a few.

While speech recognition will play a decisive role in the race to end healthcare inequality, we need to be cautious that it does not exacerbate existing healthcare disparities. Health insurers will be able to remotely identify and diagnose their potential customers as a means of pre-screening, while the creators of the faceless voices will be able to curate and influence the summarized information that we are served. The addition of voice will fundamentally reimagine healthcare, but we need to be careful to listen to our own voice, and ensure that our moral compass points toward true north as we embrace Siri and all her friends.