Cosmos Magazine: Speaking of robots
Isaac Asimov predicted humanoid robots by 2010, but even getting robots to speak like humans do is a major challenge, says Jonathan Lowe.
NO DOUBT YOU'VE seen talking or singing robots, mainly from Japan, where research into robotic speech has been under way for more than a decade.
Yet surprisingly, recreating human speech is more complex than most people realise. A robotic voice whose expression is indistinguishable from a human's may still be some time off. What complicates the problem is the variety of vocal qualities we generate when we talk.
It is not merely that the human vocal cords lend vibration to a stream of air from the lungs, which is then shaped within the voice box. The entire throat cavity, including the larynx, mouth and tongue, as well as the nasal cavity and lips, helps to form words and sounds, taking cues from aural feedback processed by the brain.
"Consider how we are able to speak in the first place," says Sayoko Takano, an expert in bio-mechanics who worked in robotic speech research a decade ago in Japan, and is now doing research into magnetically-controlled, tongue-operated wheelchairs for paraplegics at the University of Arizona in Tucson.
"Not only do we have to control respiration, the vibration of our vocal folds, and our tongue, lip and velum motion, but also the tension of the larynx and the shape of the vocal tract itself. No computer voice synthesiser can yet match this complexity without sounding artificial," she says.
ROBOTICIST HIDEYUKI SAWADA of Kagawa University in Takamatsu, Japan, agrees. "Voice quality depends not only on control and learning techniques, but also on the materials, which should be very close to the human anatomy.
"The dampness and viscosity of the organs have influence on the quality of generated sounds, like what you experience when you have a sore throat," he says.
The typical method for generating human-like voices in robots has been to use software algorithms, in the same way that computer speech is simulated. "But we now try to generate human-like voices in the mechanical way, as humans themselves do," says Sawada.
"The goal is to totally reconstruct the human vocal system mechanically, with learning ability for autonomous acquisition of vocalisation skills."
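The software approach Sawada contrasts with his mechanical one is often based on the "source-filter" idea: a buzzing source, standing in for the vocal folds, is passed through resonant filters that mimic the shape of the vocal tract. The sketch below is a minimal, illustrative version of that idea, not Sawada's or anyone else's actual system; the pitch, formant frequencies (roughly an "ah"-like vowel) and bandwidths are assumed values chosen for demonstration.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)

def glottal_source(f0, duration):
    """Crude stand-in for vocal-fold vibration: a sawtooth pulse train at pitch f0 (Hz)."""
    n = int(SAMPLE_RATE * duration)
    period = SAMPLE_RATE / f0
    return [((i % period) / period) * 2.0 - 1.0 for i in range(n)]

def resonator(signal, freq, bandwidth):
    """Two-pole resonant filter approximating one formant of the vocal tract."""
    r = math.exp(-math.pi * bandwidth / SAMPLE_RATE)
    theta = 2.0 * math.pi * freq / SAMPLE_RATE
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    gain = 1.0 - a1 - a2
    out = [0.0, 0.0]  # two samples of filter history
    for x in signal:
        out.append(gain * x + a1 * out[-1] + a2 * out[-2])
    return out[2:]

def synthesise_vowel(f0=120, formants=((800, 80), (1150, 90), (2900, 120)),
                     duration=0.5):
    """Source-filter synthesis: glottal pulses shaped by a cascade of formant filters."""
    samples = glottal_source(f0, duration)
    for freq, bw in formants:
        samples = resonator(samples, freq, bw)
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]  # normalised to the range -1..1

vowel = synthesise_vowel()  # half a second of a rough "ah"-like sound
```

Even this toy version hints at the problem Takano describes: every quality of the voice here is fixed by a handful of numbers, whereas a real speaker continuously reshapes the whole tract as they talk.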
In short, scientists in Japan are creating robots that mimic the way humans actually speak, which is the only way to obtain the qualities that would convince you a human is speaking.
Of course one might think that building a tongue and larynx robot would be relatively easy, given today's engineering technology. But again, speech organs are very different from limbs like the leg or arm.
"The tongue is a bundle of muscle assemblages composed of seven main tongue muscles, and there are also lip and jaw muscles, adding up to more than 30 combinations in controlling the speech organs," says Takano. "Each muscle moves by activity initiated in the brain."
The tongue and lips have both voluntary and involuntary muscles, with both fast-but-weak and slow-but-strong fibres. The complex relationship between speech-related airflow, muscle activation, muscle character and vocal-fold vibration means that a non-sentient computer with replicated, motorised parts is still at a disadvantage to a human, who acquired these abilities through evolution, Takano says.