Alex Waibel, who holds faculty appointments at both the Language Technologies Institute and the Karlsruhe Institute of Technology (KIT) in Germany, reports that his German lab has developed a computer system that for the first time outperforms people in recognizing conversational speech.
It's difficult even for people to accurately transcribe conversations, Waibel said. "When people talk to each other, there are stops, stutters, hesitations such as 'er' or 'hmmm,' laughs, and coughs," he said. "Often, words are pronounced unclearly." With humans capable of no better than a 5% error rate, the feat has remained a major challenge for artificial intelligence.
Now, KIT scientists and the staff of a KIT startup company called KITES have developed a system that boasts a 5% error rate with just a one-second delay, or latency. The measurements are based on an internationally recognized benchmark for the task established by the Defense Advanced Research Projects Agency.
Waibel said fast, high-accuracy speech recognition will enable better voice-based interactions with machines.