
Stability AI Unveils Promising Video Generation Models

Stability AI, a leading name in the tech space, has made its entry into video generation with the launch of Stable Video Diffusion (SVD). With this move, the company has unveiled two advanced AI models, SVD and SVD-XT, designed to generate short video clips from still images.

However, as of now, these state-of-the-art models are available for research purposes only. According to the company, both SVD and SVD-XT deliver high-fidelity output that rivals, or potentially surpasses, the performance of other existing AI video generators.

Stability AI has open-sourced these image-to-video models as part of a research preview and aims to use the resulting user feedback to fine-tune them. The effort signals the company's intent to eventually apply the models commercially.

A company blog post detailed that SVD and SVD-XT employ latent diffusion models that generate 576 x 1024 videos, using a single still image as a conditioning frame. Although the output videos are brief, maxing out at around four seconds, the models can generate content at frame rates ranging from three to 30 frames per second. Specifically, the SVD model is calibrated to derive 14 frames from a still image, while SVD-XT can generate up to 25 frames.
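For readers who want to experiment, the released checkpoints can be run through the Hugging Face diffusers library. The following is a minimal, illustrative sketch, not an official recipe: it assumes the SVD-XT research checkpoint published as stabilityai/stable-video-diffusion-img2vid-xt, a CUDA-capable GPU, and a local input image named input.png; the frame count and playback rate simply mirror the figures quoted above.

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the SVD-XT research checkpoint (assumed Hub ID) in half precision.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

# A single still image serves as the conditioning frame, resized to 1024 x 576.
image = load_image("input.png").resize((1024, 576))

# SVD-XT generates up to 25 frames; decode_chunk_size trades VRAM for speed.
frames = pipe(image, num_frames=25, fps=7, decode_chunk_size=8).frames[0]

# Write the frames out as a short video clip of a few seconds.
export_to_video(frames, "generated.mp4", fps=7)
```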

To create SVD, Stability AI relied on a large, meticulously curated video dataset of roughly 600 million samples. The company used this dataset to train a base model, which was then fine-tuned on a smaller, higher-quality dataset for downstream tasks such as image-to-video and text-to-video generation, enabling it to predict a sequence of frames from a single conditioning image.

A whitepaper released by Stability AI describes SVD's potential as a base for fine-tuning a diffusion model for multi-view synthesis, enabling the generation of several consistent views of an object from a single still image.

This opens up a plethora of opportunities for potential uses in various sectors, such as education, entertainment, and marketing, according to the company's blog post.

A notable point in the company's disclosure is that an external evaluation by human reviewers found SVD's output to surpass the quality of leading closed-source text-to-video models from competitors such as Runway and Pika Labs.

Despite this initial success, Stability AI acknowledges that the current models have several limitations. For instance, they occasionally fall short of photorealism, produce videos with little or no motion, or struggle to render human figures accurately.

But this is only the start of the company's venture into video generation. Data from the current research preview will help evolve these models by identifying existing gaps and introducing new features, such as support for text prompts or text rendering within videos, to make them ready for commercial applications.

With potential applications spanning sectors including, but not limited to, advertising, education, and entertainment, platforms like AppMaster, known for giving users tools to easily create mobile and web applications, might find Stable Video Diffusion a useful integration.

The company expects that open investigation of these models will surface remaining concerns, such as bias, and help enable a safer deployment later on.

Plans are already underway to develop a variety of models that build on and extend the foundation laid by Stable Video Diffusion.

However, it remains unclear when these improvements will be available to users.
