Olivia Daub’s two-year-old son is obsessed with “doodidees.” He talks about them and screams for them at 5 a.m. every day.
Daub said most people don’t have a clue what her son is saying, but she knows to bring him the tiny, dark blue fruit that he actually wants: blueberries.
“We’ve all been children and we’ve all had the experience of not being understood by adults. Inversely, we [adults] have all had a really hard time understanding children because they produce speech and language in ways that are different from adults.”
Daub, an assistant professor in Western University’s school of communication sciences and disorders in London, Ont., said understanding toddler-speak is even trickier for artificial intelligence (AI). That’s why she is leading new research on how AI can better understand the way toddlers talk.
Daub said that while automatic speech recognition software, such as automatic closed captions on Zoom meetings and Amazon’s Alexa virtual assistant, has become good at recognizing adult speech, it still struggles to accurately pick up what young children are saying.
“I think we’ve all seen YouTube clips of a toddler asking Alexa to play a song, and getting something completely different and really inappropriate,” she said. “This study is trying to understand how we can leverage AI and machine-learning principles to improve recognition for toddlers and preschoolers.”
To do that, she’s working with Western electrical and computer engineering assistant professor Soodeh Nikan to train an AI model on toddlers’ common speech patterns and shortcuts.
“Most of the speech models that we have are trained with adult speech, so that’s why most of these models are not very successful in recognizing that toddler speech, especially the mistakes that they make,” Nikan said.
“You have to provide examples to AI [for it] to be able to understand and distinguish normal mistakes and speech disorder problems.”
How the study will work
Daub plans to bring in a sample of 30 children to play, tell stories and speak to research assistants. Each session will be recorded and transcribed by humans, who will also collect data on children’s speaking patterns.
One common pattern, Daub said, is that many English-speaking toddlers struggle to pronounce the “r” sound and instead use a “w” sound.
Data like this will be handed over to Nikan, who will feed the information into a private AI model in order to train it.
“We can fine-tune these models using the data that is specifically annotated for this purpose,” said Nikan, adding that the model will also be trained with some already-existing OpenAI data available online.
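Nikan's description maps onto a standard fine-tuning workflow. The sketch below is a rough illustration only, not the team's actual pipeline: it shows how a pretrained speech model such as OpenAI's Whisper, loaded through the Hugging Face transformers library, could take a single training step on an annotated child-speech sample. The checkpoint name, audio and transcript are placeholders.

```python
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# Placeholder checkpoint; the study's actual model is private.
processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Hypothetical annotated sample: 16 kHz audio plus a human transcript that
# keeps the child's actual production ("wabbit"), not the adult target
# ("rabbit"), so the model learns common substitutions like r -> w.
waveform = torch.randn(16000 * 3)   # stand-in for 3 seconds of real audio
transcript = "i saw a wabbit"       # stand-in for a human annotation

inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
labels = processor.tokenizer(transcript, return_tensors="pt").input_ids

# One gradient step: the loss compares the model's predicted text
# against the annotated child transcript.
outputs = model(input_features=inputs.input_features, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In a setup like this, the value of the human-annotated transcripts is that they preserve exactly the "mistakes" Nikan mentions, which adult-trained models would otherwise autocorrect or misrecognize.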
Daub and her team have so far met with nine toddlers and are still seeking more study participants.
Clinical and everyday uses for AI model
While the research is in its early stages, Daub and Nikan said their goal is to train an AI model that can be used in a clinical setting, helping speech-language pathologists analyze and transcribe what kids are saying.
“I don’t think we’ll ever be 100 per cent accurate unless we’re following children around for 24 hours a day … but I think we can get a lot closer than where we are at,” she said.
Down the line, Daub said, if AI can better understand preschoolers, it could improve tools such as closed captioning and voice-activated accessibility software, while also allowing kids more space to play with technology.
“We can think really creatively about what these little people can contribute to society. They’re not just consumers of the world around them. Giving them access to technology is an important consideration too,” Daub said.