Robot learns how to lip sync using AI and YouTube
Meet EMO, the lip-syncing robot
Robots can already dance, play football, and even help out with jobs around the house. But now, engineers in the US have invented one that can lip sync.
The robot is called EMO, and it can learn and recreate the way that humans move their lips when they talk.
Human lip movements happen due to a complicated combination of muscles, bones and skin working together - which scientists say is very hard to reproduce.
Instead of being given step-by-step instructions to follow, EMO used artificial intelligence (AI) and a process called 'observational learning' - which means learning by watching and copying another person's behaviour.
The robot even learned to sing a song from its own AI-generated debut album "hello world".
How did the robot learn to lip sync?

Could a future robot beat you in a lip sync battle?
First, the robot was programmed to keep moving the 26 motors in its face and to watch itself doing that using its own reflection in a mirror.
It made thousands of random facial expressions and lip gestures, learning how to move its motors to achieve particular movements.
Then, scientists used hours of YouTube videos to show how humans move their mouths when they speak and sing.
This helped the computer inside the robot match mouth movements to sounds, too.
In the science journal Science Robotics, researchers at Columbia University in New York explained how they tested EMO by playing it lots of sounds, including different languages and songs, to see if it could keep up.
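For readers who like to code, the mirror step described above can be imitated with a very small toy program. This is only a rough sketch and not the researchers' actual method: the `toy_face` function, the two lip measurements (opening and width) and the target shape are all made up for illustration. Only the number of motors, 26, comes from the article. The program tries lots of random motor settings, remembers what lip shape each one produced, and then picks the remembered setting that comes closest to a shape it wants to make.

```python
import random

NUM_MOTORS = 26  # the article says EMO has 26 motors in its face


def toy_face(command):
    """Made-up stand-in for the real face (an assumption, not EMO's design).

    Turns 26 motor settings (each between 0 and 1) into two invented
    lip measurements: how open the mouth is and how wide it is.
    """
    opening = sum(command[:13]) / 13
    width = sum(command[13:]) / 13
    return (opening, width)


def babble(samples, rng):
    """Try random motor settings and record what the 'mirror' shows."""
    memory = []
    for _ in range(samples):
        command = [rng.random() for _ in range(NUM_MOTORS)]
        memory.append((command, toy_face(command)))
    return memory


def command_for(target_shape, memory):
    """Return the remembered motor setting whose lip shape is closest to the target."""
    def distance(entry):
        _, shape = entry
        return sum((a - b) ** 2 for a, b in zip(shape, target_shape))
    return min(memory, key=distance)[0]


rng = random.Random(0)          # fixed seed so every run behaves the same
memory = babble(5000, rng)      # thousands of random expressions
target = (0.6, 0.4)             # invented lip shape standing in for one sound
best = command_for(target, memory)
achieved = toy_face(best)
print("wanted:", target, "got:", achieved)
```

In the real robot, the target lip shape would come from the second step, where the computer matches mouth movements to sounds. Here it is simply typed in by hand.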
What did the robot scientists say?

Getting a robot face to act and move like a human's is a big challenge
It's thought to be the first time that a robot has ever been able to do this.
But making EMO wasn't easy.
Scientist Hod Lipson said the robot found it tricky to copy sounds like "B" and "W", which need careful lip movements.
He added: "The more it interacts with humans, the better it will get."