As artificial intelligence for generating text and images has grown more sophisticated with the emergence of platforms such as ChatGPT (for text) and Stable Diffusion (for images), it was only a matter of time before the technology made its way to video. Now a New York-based startup has introduced a service that generates crude videos from short text descriptions.
The company, Runway AI, is introducing its service to select testers this week as part of the wave of "generative AI" tools that have generated an immense amount of buzz over the past several months. And although AI has been used in film and video production for years, even giving rise to "deepfake" videos, the process of generating media from text commands could have huge implications for the creative community.
In an article in The New York Times, Ian Sansavera, a software architect at the startup, demonstrated how it works: he typed in the short description "A tranquil river in the forest," and within a few minutes the service produced a four-second video.
Early implementations of the technology have produced crude, often weirdly distorted videos; typing in "a bear holding a cellphone," for example, could generate a bear shaped like a cellphone. For now, the service can only generate clips up to four seconds long. However, like most AI-based services, it is expected to improve rapidly as it gathers data from more users. That is one of the reasons Runway wants to get its service out to the public as soon as possible.
"This is one of the single most impressive technologies we have built in the last hundred years," Runway Chief Executive Cristóbal Valenzuela told the Times. "You need to have people actually using it."
As services like Runway gain traction and become more sophisticated, the prospect of creating professional-looking videos at the press of a button could become a serious concern for anyone in the business of creating content.
"In the old days, to do anything remotely like this, you had to have a camera. You had to have props. You had to have a location. You had to have permission. You had to have money," Susan Bonser, a Pennsylvania-based author and publisher who has been dabbling in generative AI, told the Times. "You don't have to have any of that now. You can just sit down and imagine it."
Read more in The New York Times.