Haiper is the latest artificial intelligence startup to come out with its own generative video model, joining the likes of Pika Labs, Runway and OpenAI’s Sora in tackling AI storytelling.
The company is focused on quality and at the moment is producing short clips of between two and four seconds, but it has big ambitions, including setting its sights on Sora-like realism in the future.
Like OpenAI and Anthropic, Haiper is working towards artificial general intelligence (AGI), building what it calls a "powerful perceptual foundation model".
In the clips I generated, it seemed to have a more innate understanding of motion than some rivals, not requiring extensive prompting or motion controls to get the flutter of wings or a dance move right. Haiper is also currently free.
What is Haiper?
Haiper is the brainchild of former Google DeepMind and TikTok engineers, who have brought together cutting-edge machine learning skills with the goal of democratizing creativity.
I spoke to Dr Yishu Miao, co-founder and CEO of Haiper, about the product and got a demonstration of how it works, including a batch generation feature for creators.
I asked him what makes it different from the ever-growing number of AI video platforms, and he said they were "hyper focused on video", creating impressive generative motion without requiring extensive prompt engineering.
The goal, he told me, is to build a simple product that anyone can use, focusing on getting the training data and labeling right so that prompting becomes easier.
How well does Haiper work?
One thing I noticed about Haiper is how good the underlying model is at intuiting the correct motion for a video. The clips are short, and I haven’t been able to try extending a clip, but for four seconds of video it captured what I expected very well.
Dr Miao said it actually works better with shorter prompts and without using the motion control slider, as the AI model is better at anticipating motion than humans are.
In a demonstration of this capability, he created one video of a forest at twilight with motion set to full, then another with motion at 1 (the default). When set to full, it struggled not just with color but also placed stars and trees where they didn’t belong.
After he changed the setting and left motion up to the AI, it captured the sky moving in the background behind the trees and even the rustling of the trees themselves. It felt more natural.
Testing out Haiper with seven clips
I came up with seven video clip ideas designed to test both its realism and motion capabilities from a relatively short prompt. It did a surprisingly good job on most of them but struggled with animals and people.
A hummingbird
First I asked it to show “A hummingbird hovering and feeding on a vibrant, nectar-rich flower in a lush garden.” This is something other models can struggle with, particularly around the wings, but Haiper seemed to do a good job, neither exaggerating nor merging the wings.
Dancing in a spotlight
Next I asked it to make me a video of “A skilled dancer performing a fast-paced, intricate pirouette on a spotlit stage.” The result was slightly slow motion, and while the first two seconds were elegant, there was some blurring and merging in the second half.
Dolphin leap
I then challenged it to show “A majestic dolphin leaping out of the ocean, spinning midair, and splashing back into the water.” It really struggled here. The result looks cool but isn’t realistic, and the dolphin’s fin seemed to appear out of nowhere.
Hot air balloon
When I asked it to create a “colorful hot air balloon rapidly ascending into a clear blue sky dotted with fluffy white clouds,” I had fairly high expectations, as this should be easy. It captured the clouds and the sky and made a beautiful hot air balloon, but it missed the mark on the amount of motion, giving the balloon only a slight lift.
Soccer player
It failed spectacularly to capture “a professional soccer player powerfully kicking a ball, sending it curving into the top corner of the goal.” Haiper merged the player into a beautiful but bizarre mush of motion.
Fireworks in the sky
I think it is harder to get fireworks wrong than to get them right, so the idea here was to see how well it followed the prompt. I asked it for “a mesmerizing display of fireworks exploding in quick succession, illuminating the night sky with brilliant colors.” It didn’t do a bad job: it started well with the colors but took the exploding part too literally.
Pensive robot
Finally, I tested its ability to animate a still image. I gave it a picture generated by Adobe Firefly of a robot looking up at the sky, with the text prompt "pensive robot watches the sky move". The result was exactly what I had in my head: the robot didn't move, but the sky rotated as you'd expect.
Final thoughts
Overall it didn’t do a bad job. What it did particularly well was the initial visual, creating a much more realistic depiction than some models manage. The quality of the imagery was on par with some dedicated AI image generators.
The problems came with the motion of larger objects. It did well with small motions, or when animating an element within a larger scene, but it struggled when the moving object was the dominant feature of the frame.
Haiper is an impressive addition to the AI video market, but it has some way to go before it reaches the clip lengths Sora can produce. For now it also has a way to go before it matches the motion consistency of models from Runway, Stability AI and Pika Labs, but it is very close.
Having spoken to the team behind Haiper, I have no doubt they’ll expand and solve these issues quickly. The main constraints are compute power and training time, so with a recent $13.8 million funding round they’ll likely ramp up to meet demand in no time.