If you work in video, you've probably heard about AI video tool Sora, and the fact that it's just been opened up to the public. And if you haven't, please just accept that this is a pretty big deal.
Why? Firstly because it's made by OpenAI, the same people that brought you ChatGPT, which for all its faults has taken the world by storm in the last year. And secondly, because its carefully selected demo videos look seriously impressive, often indistinguishable from real footage.
So much so, that many videographers and filmmakers are wondering: is this going to put me out of work?
Imagine it. Rather than having to spend a fortune flying a crew out to a far-flung location, a creative agency could generate the kind of video they need—something far too specific to find in a stock library—just by typing in a text prompt. That's the promise of AI video, and Sora is the first serious crack at it.
At the moment, it's pretty early days: you can only generate videos of five seconds. But let's face it: that in itself would have seemed extraordinary just a couple of years ago.
So is it any good? Well, I signed up for the basic package, which costs $24 per month, to find out. And while I haven't spent a lot of time with Sora yet, my first experiences with it have been pretty instructive.
First attempts
Like all good AI products, Sora is very easy to get started with: you pretty much just type and go. So I wrote what I thought was a simple, yet descriptive prompt:
A woman throws a ball in a back garden. Her dog, a one-year-old cocker spaniel, enthusiastically runs to catch it before it hits the fence at the other end of the garden. The woman quickly lifts her Canon EOS DSLR camera to take a photo of her pet.
I then got taken to a storyboard where I could describe four sections of the sequence in more detail. I won't bore you with that; let's just jump ahead to what I got back in return.
Hmmm... not that great, really. In case you're not in a position to play the video right now, it sets up the scene nicely, but then it all goes a bit weird. The dog running looks sort of convincing, but no ball is thrown and rather than take a picture, the woman seems to magically make the camera give birth to a new lens. Or something. It's very odd.
I figured that maybe I was trying to fit too many things into one five-second scene. So for my next go, I tried a simpler prompt: "A penguin walks into a British pub and orders a pint of beer." Here's what Sora gave me.
This video is better. The pub and its atmosphere are convincing, and although Sora gave me two penguins rather than one, I'm not complaining.
On the downside, we don't get to see them order beers. And if you enlarge the video to your whole screen, you notice that the movements of the regulars are a bit disturbing: jerking back and forth like something out of a horror movie, rather than the fun scene I was expecting.
Okay, maybe a penguin in a pub is a bit too surreal for Sora. So next I thought I'd try something true-to-life and simpler than my first attempt: "A woman on a beach watches a butterfly land on a rock and takes a photo with her camera."
On the face of it, this is the best yet. The sea looks very real, the woman is convincing, and while she doesn't actually take the photo as requested, she's clearly about to. The butterflies, however, look very 'CG', and more importantly, they're nowhere near where the camera is pointing; the physics, again, are completely off.
I could go on... I've played around quite a bit, and made a ton more videos. But crucially, I don't think I've improved much on my initial efforts yet. Every video I've made has suffered from weird physics, with objects and parts of people mysteriously transmogrifying or disappearing altogether. Slow motion is used a lot, without you ever asking for it. And overall, there's a general sense that This World is Not Right.
Should you subscribe?
So... should you spend $24 on this? I'll be honest: for sheer fun, it's probably worth it. It's certainly a much better place to throw your money than booze, gambling, or renting Winnie-the-Pooh: Blood and Honey.
And no doubt some people will use it to make some kind of surreal art project or other. Or memes and GIFs for Facebook. Which won't be annoying or quickly get old in the slightest.
But in terms of anything that might help your day job, whether you're a photographer, filmmaker or someone who hires them, I'd say that right now, it's probably a no.
While the results of Sora are pretty amazing, I don't think they're really worth paying for, because as yet they're just not good enough.
And by that, I don't just mean they're not good enough for prime time. They're not good enough even for storyboarding, brainstorming and idea generation. Because they're so off-putting, you can't even use them for inspiration, like you can with AI-generated static imagery. The weirdness is just too distracting.
All that might change very quickly, of course. Watch this space.