personal background

Product images of the rita1 prototype transcription device
If you have seen Sound of Metal, you have probably been moved to tears by witnessing the tragic decline of a person’s hearing (and mental health) in their adult life. It’s not something I think many people have a contingency plan for, nor something for which society has developed a neat solution. You’ve spent your whole life with a certain set of tools, and then late in the game a critical one is taken away when you need it the most.

We experienced our own version of this when my wife’s grandparent lost their hearing well into their golden years, with little motivation or ability to learn to adapt to the loss. The film moves on, as we did in our situation, to the scramble to adapt and maintain communication with the affected person.

pen & paper

The first iteration for us, for the movie, and I imagine for most people, was to adopt a pen and notebook. We’d write down questions and concerns and Nanna would respond (usually vocally). This has the advantage of being very cheap and simple, but it falls apart for social communication.

While this solution satisfies many of the requirements, and is possibly as far as most affected people get in this process, it fails to maintain conversation flow. This puts the burden of setting the velocity and spontaneity of group conversations on the affected person.

Any quick remark deserving of a laugh had to be considered remarkable enough for someone to pause and jot it down for Nanna’s consideration. Given the availability and addiction-like qualities of tablet computers, I thought there might be an app-based speech-to-text solution that could at the very least automate that process and give Nanna some more agency.

RITA1 - Rapid Interactive Transcription Aid

I drew up a quick list of important design considerations to ensure the success of this concept, so as not to waste Nanna’s important and limited time:

  • Simple to use (has to be casually operable by a variety of users)
  • Durable (in a care home setting expect things to be knocked, spilled on, connected the wrong way and played with)
  • Reliable (needs to be always available and dead simple to troubleshoot if something goes wrong)
  • Low cost
  • Useful (above all else it should solve a problem)

A quick look online for a cheap and available tablet computer came up with a Samsung Galaxy A9 on top. It’s pretty unremarkable in most respects, but I recalled from a previous life setting up tablets for retail displays that the Samsung ones have some extra capabilities to lock them down to a single function with Knox.

With a new A9 in hand I went about finding an Automatic Speech Recognition (ASR) application and a way to make it this device’s primary function. Of the few applications I found and tested, the Live Transcribe & Notification application by Google Research proved to be head and shoulders above the rest in terms of fitting my design criteria.

The app is a collaboration between Google and Gallaudet University that can perform a number of listening tasks using the device microphone and provide real-time visual or physical feedback.

Another attractive attribute of this application is that the current version (as of 2025) requires Android 12 at a minimum. Android 12 was released in 2021, so a large number of devices nearing or at end of life (EOL) may still be supported.
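For anyone pulling an old tablet out of a drawer, the version requirement can be checked before installing anything. The sketch below is only an illustration: it assumes the Android 12 minimum stated above (Android 12 is API level 31) and that you have read the device's API level yourself, for example with `adb shell getprop ro.build.version.sdk`. The function names and the constant are my own and are not part of any app or tool.

```python
# Minimal sketch: decide whether a tablet is new enough for the app.
# Assumption: the minimum is Android 12 (API level 31), as stated in this
# post; it has not been read from the app's own manifest.
MIN_SDK_FOR_APP = 31


def parse_sdk_level(getprop_output: str) -> int:
    """Parse the text printed by `adb shell getprop ro.build.version.sdk`."""
    return int(getprop_output.strip())


def meets_minimum(sdk_level: int, minimum: int = MIN_SDK_FOR_APP) -> bool:
    """Return True when the device's API level is at or above the minimum."""
    return sdk_level >= minimum


if __name__ == "__main__":
    # Hypothetical readings from two devices; adb output ends with a newline.
    for label, raw in [("older tablet", "30\n"), ("Android 12 tablet", "31\n")]:
        level = parse_sdk_level(raw)
        verdict = "supported" if meets_minimum(level) else "too old"
        print(f"{label}: API level {level} is {verdict}")
```

Keeping the check as a plain function means it can be run against a list of donated or second-hand devices without needing each one plugged in at the time.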

With the application loaded up and running beautifully I set about protecting this investment by procuring a foam case from Amazon to protect it from falls and hopefully some spills.

There are a number of ways to lock the device down to just the transcription application, but I landed on the excellently named Fully Single App Kiosk.

This app lets you set an application to launch immediately when the device powers up and restrict users from accessing other functions of the device (kiosk mode). Thankfully the application also allows for a handy secret handshake to unlock the device from kiosk mode and turn it back into a normal tablet. I believe this is essential if a device is going to be troubleshot or maintained in the field.

With most of the design goals achieved it was time to test the most important ones: reliability & durability. I drew up some basic documentation and sent the device and charger off to Nanna for testing.

A picture of the transcriber mishearing someone as saying GET LOST
The results exceeded all of my expectations. Nanna adopted the transcriber, it seemed to help with keeping connected, and it eventually became an invaluable tool for navigating the day.

From my own perspective, our family visits seem to have a more natural flow, with Nanna being able to follow along at the group’s pace and without anyone expending the effort to write it all out. We also had positive feedback and interest from the caregivers at the facility, who found it to be an improvement on what they usually see.

A screenshot of a text message.

future work

There are a number of quality-of-life features and open issues that would add value to this project, including:

  • Identify different speakers
  • Screen size (a larger screen)
  • Charging/obsolescence/waste
  • Remote support
  • Eye tracking to follow the reader’s pace
  • Ignore a given speaker’s words

I’ve posted a project repo on Codeberg to continue documentation and development on this and any other products/devices under the same theme.

As an added insurance policy and fun couples activity, my wife and I are planning on learning American Sign Language this year.