AI is already better at lip reading than we are
Date: 2022-11-01

They Shall Not Grow Old, a 2018 documentary from acclaimed Lord of the Rings director Peter Jackson about the lives and aspirations of British and New Zealand soldiers living through World War I, had its hundred-plus-year-old silent footage modernized through both colorization and the recording of new audio for previously non-existent dialog. To get an idea of what the folks featured in the archival footage were saying, Jackson hired a team of forensic lip readers to guesstimate their recorded utterances. Reportedly, “the lip readers were so precise they were even able to determine the dialect and accent of the people speaking.”

 

“These blokes did not live in a black and white, silent world, and this film is not about the war; it’s about the soldier’s experience fighting the war,” Jackson told the Daily Sentinel in 2018. “I wanted the audience to see, as close as possible, what the soldiers saw, and how they saw it, and heard it.”

 

That is quite the linguistic feat, given that a 2009 study found that most people can only read lips with around 20 percent accuracy, and the CDC’s Hearing Loss in Children Parent’s Guide estimates that “a good speech reader might be able to see only 4 to 5 words in a 12-word sentence.” Similarly, a 2011 study out of the University of Oklahoma saw only around 10 percent accuracy in its test subjects.

 

“Any individual who achieved a CUNY lip-reading score of 30 percent correct is considered an outlier, giving them a T-score of nearly 80, three times the standard deviation from the mean. A lip-reading recognition accuracy score of 45 percent correct places an individual 5 standard deviations above the mean,” the 2011 study concluded. “These results quantify the inherent difficulty in visual-only sentence recognition.”
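For readers who want to check the arithmetic in that quote, here is a rough sketch. It assumes the standard T-score convention (mean 50, standard deviation 10) and treats the quoted figures as exact, even though the study itself says "nearly 80." The function names are made up for illustration, and the implied mean and spread are a back-of-the-envelope inference from the two quoted points, not numbers reported by the study.

```python
def t_score(z: float) -> float:
    """Convert a z-score to a T-score (assumed convention: mean 50, SD 10)."""
    return 50.0 + 10.0 * z


def implied_mean_and_sd(score_a: float, z_a: float,
                        score_b: float, z_b: float) -> tuple[float, float]:
    """Solve score = mean + z * sd using two (score, z) points.

    Assumes both points sit on the same linear scale, which is only
    approximately true for the rounded figures in the quote.
    """
    sd = (score_b - score_a) / (z_b - z_a)
    mean = score_a - z_a * sd
    return mean, sd


if __name__ == "__main__":
    # Three standard deviations above the mean maps to a T-score of 80.
    print(t_score(3.0))  # 80.0

    # Quoted points: 30 percent correct at about 3 SD, 45 percent at 5 SD.
    mean, sd = implied_mean_and_sd(30.0, 3.0, 45.0, 5.0)
    print(mean, sd)  # 7.5 7.5
```

Under those assumptions, the typical score works out to roughly 7.5 percent correct with a spread of about 7.5 percentage points, which is in the same ballpark as the "around 10 percent" accuracy the article attributes to the 2011 study.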

 

For humans, lip reading is a lot like batting in the Major Leagues — consistently get it right even just three times out of ten and you’ll be among the best to ever play the game. For modern machine learning systems, lip reading is more like playing Go — just round after round of beating up on the meatsacks that created and enslaved you — with today’s state-of-the-art systems achieving well over 95 percent sentence-level word accuracy. And as they continue to improve, we could soon see a day where tasks from silent-movie processing and silent dictation in public to biometric identification are handled by AI systems.

 


 

Source: https://www.engadget.com/ai-is-already-better-at-lip-reading-that-we-are-183016968.html

 
