You say eether and I say eyether
You say neether and I say nyther
Eether, eyether, neether, nyther
Let’s call the whole thing off!
You like potato and I like potahto
You like tomato and I like tomahto
Potato, potahto, tomato, tomahto!
Let’s call the whole thing off!

— Ella Fitzgerald and Louis Armstrong, “Let’s Call the Whole Thing Off”

The number of ways to pronounce “water” may be as ubiquitous as the substance itself. In 2003, a team of Harvard University researchers concluded a survey and mapped out the idiosyncrasies in the pronunciation of common English words across the globe. (How do you say “pecan?”)

But just because the same word may sound funny, perhaps even unintelligible, in different parts of the world doesn’t mean the speaker is wrong. Yet some of today’s voice recognition systems were neither designed nor trained to pick up different dialects, and that can be a problem when they’re used to assess oral fluency, says Patricia Scanlon.

“I can drive three counties away, and probably won’t understand some of the words people say,” quips Scanlon, the founder and CEO of SoapBox Labs, who is based in Dublin, Ireland. “But that doesn’t mean they’re wrong.”

Scanlon has a lot to say about speech. She is an engineer who holds a doctorate in speech-recognition technology, and has spent the past 20 years researching this field at Columbia University, IBM and Bell Laboratories.

In 2013, after watching her three-year-old daughter work through phonics exercises on an iPad, it struck Scanlon that the apps of the time generally did a poor job of measuring oral fluency. Some offered multiple-choice questions, for instance, that merely asked kids to pick the correct pronunciation, hardly a useful exercise.

She noticed another problem: Many of the apps used speech recognition technologies that were trained with adult data samples. “The vocal cords and their development are vastly different between children and adults. Some kids tend to over-enunciate or elongate words, and say things in ways that adults just don’t,” says Scanlon. “This messes with the speech recognition systems that were built for adults.”

In other words, kids talk like, well, kids. And that can cause snafus when children interact with voice assistants. A few years ago, a video went viral of a child asking Alexa to play a song and getting raunchy results in return. (The company said it has fixed the glitch.)

Even worse, Scanlon says, some voice recognition tools would modify adult voice samples to make them sound more high-pitched, even squeaky, to mimic how children might sound.

Standing on a Soapbox

Those shortcomings spurred her to start SoapBox Labs in 2013 and develop voice recognition technology built specifically for children aged three to 12.

That effort involved starting from scratch and collecting copious data from kids around the world. Scanlon says her team, which now numbers 22, has captured hundreds of hours of audio samples from children across 170 countries. These recordings generally take place in living rooms, kitchens or outdoors, she adds, in an effort to capture how children speak in environments that are more natural than the controlled laboratories in which adult audio samples are usually recorded.

The goal is to build a robust data set that reflects the range of dialects and pronunciations in which English is spoken around the world, and which can be used to build tools that assess speech and oral fluency. Scanlon says it should be possible, for example, for users to tune a system to recognize that different pronunciations of words are acceptable.

Collecting that amount of data from kids may well give privacy advocates pause. Scanlon says she’s aware of the concerns, and notes that “we unequivocally say that the data we have is used to improve our system, and not to market and resell.” This summer, the company submitted a paper to the United Nations Office of the High Commissioner for Human Rights that acknowledges the concerns and risks involved in collecting children’s data for speech-recognition systems, and outlines the “privacy by design” principles that it follows.

Today, the company claims its technology can power assessments for phonics, sentences and other fluency measurements. Its work has been financially supported by $5.9 million in outside funding.

SoapBox doesn’t develop user-facing products, but licenses its speech recognition tools to 25 other organizations that integrate them into their products. They include the MIT Media Lab, which is using the system for an educational robotics project, and Lingumi, a developer of an English learning app for young children. The company has also partnered with Microsoft to bring its speech-recognition tools onto Microsoft’s cloud computing platform, Azure. (The software giant is piloting the technology in 20 schools in Ireland and Britain.)

This week, SoapBox announced its newest research partner: the Florida Center for Reading Research at Florida State University, which will study how the speech-recognition technology fares in pilots with 1,000 students in grades K-2 across Florida, Georgia, Oregon and South Carolina.

For Yaacov Petscher, an associate professor at Florida State and associate director of the reading research center there, a key aspect of this partnership is to put the technology through field tests in a variety of real-world classroom environments. “It’s critically important that we have a speech-recognition system that is able to take into account non-mainstream English, and that kids are not going to be misidentified as having a problem like dyslexia just because they speak differently or in a different dialect,” he says.

The plan is to test another 1,000 students in 2020, and another 600 the following year. “The major pieces are looking at how speech recognition can accurately judge a child’s response compared to a traditional assessor or teacher scoring of child accuracy on reading and language items,” he adds.

This partnership is part of the Reach Every Reader initiative, a five-year early literacy project that also includes researchers from Harvard, MIT and the Charlotte-Mecklenburg School District in North Carolina. This effort is funded by a $30 million grant from the Chan Zuckerberg Initiative.

If all goes well, Scanlon is bullish on the possibility that this technology can help scale how oral assessments are delivered, with precision and privacy as the foremost priorities.

“It’s hard to imagine a future,” she says, “where speech recognition technology won’t be used to help children with literacy and fluency.”

To Build Speech Recognition for Early Literacy, SoapBox Labs Gives Kids a Voice

