Introduction

  • TL;DR: Krea AI released Krea Realtime 14B in October 2025: a 14B-parameter open-source autoregressive text-to-video model capable of real-time, long-form generation at 11fps on a single NVIDIA B200 GPU. Built with Self-Forcing distillation from Wan 2.1 14B, it marks a major leap for real-time video synthesis and interactive AI creation.
  • Krea Realtime 14B redefines open-source video generation with its ability to stream frames as it generates them, supporting live prompt changes and restyling in real-time.

Architecture and Techniques

Krea Realtime 14B uses the Self-Forcing method to distill Wan 2.1 14B, a bidirectional diffusion model, into an autoregressive model that generates video frame by frame. Techniques such as KV Cache Re-computation and Attention Biasing reduce the error accumulation that otherwise degrades long autoregressive rollouts.
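The autoregressive loop with periodic cache refresh can be sketched as follows. This is a minimal illustration, not Krea's implementation: the cache sizes are invented, and strings stand in for latent frames and for the actual attention keys/values.

```python
from collections import deque

# Illustrative constants; the real model's cache length and refresh
# interval are not specified in this article.
CACHE_LEN = 8          # hypothetical: frames kept in the KV cache
RECOMPUTE_EVERY = 16   # hypothetical: cache refresh interval, in frames

def generate_frame(prompt: str, cache: deque, step: int) -> str:
    """Stand-in for one few-step denoising pass conditioned on the cache."""
    return f"{prompt}/frame{step}/ctx{len(cache)}"

def recompute_cache(cache: deque) -> deque:
    """Stand-in for re-encoding keys/values from the most recent frames,
    discarding stale activations that would otherwise compound errors."""
    return deque(cache, maxlen=CACHE_LEN)

def rollout(prompt: str, num_frames: int) -> list[str]:
    """Generate frames one at a time, attending only to the sliding cache."""
    cache: deque = deque(maxlen=CACHE_LEN)
    frames = []
    for step in range(num_frames):
        frame = generate_frame(prompt, cache, step)
        frames.append(frame)
        cache.append(frame)
        if step and step % RECOMPUTE_EVERY == 0:
            cache = recompute_cache(cache)  # periodic KV re-computation
    return frames
```

The point of the sketch is the shape of the loop: each new frame conditions only on a bounded window of past state, and that state is periodically rebuilt rather than trusted indefinitely.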

Why it matters: This is the first open-source model to combine 14B-scale performance with interactive, low-latency text-to-video generation.


Real-Time Capabilities

Running at 11fps with 4 inference steps on a single B200 GPU, it streams the first frame in about one second, enabling users to adjust prompts or styles mid-generation.
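Streaming with mid-generation prompt changes amounts to checking for a new prompt between frames. A minimal sketch, assuming a queue-based interface (an illustration, not Krea's actual API):

```python
from queue import Queue, Empty

def stream_frames(initial_prompt: str, prompt_updates: Queue, num_frames: int):
    """Yield frames one at a time, picking up live prompt changes between
    frames. Frame synthesis is replaced by a stand-in string here."""
    prompt = initial_prompt
    for step in range(num_frames):
        try:
            prompt = prompt_updates.get_nowait()  # restyle mid-generation
        except Empty:
            pass
        yield f"{prompt}#{step}"  # in reality: stream an encoded frame

# Usage: the viewer swaps the prompt after seeing the first three frames.
updates: Queue = Queue()
frames = []
for i, frame in enumerate(stream_frames("a foggy forest", updates, 6)):
    frames.append(frame)
    if i == 2:
        updates.put("a neon city")  # takes effect from the next frame
```

Because frames are yielded as they are produced, the consumer sees output almost immediately and every prompt change lands within one frame of latency.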

Why it matters: This transforms generative video AI from batch rendering into an interactive creative medium.


Deployment

Developers can clone the model from Hugging Face (krea/krea-realtime-video) and launch a local server with a few lines of code. Integration with the diffusers library enables modular, composable workflows.
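As a minimal sketch of the download step (Hugging Face model repos are ordinary git repositories, so a plain clone works; large weight files additionally require git-lfs). The helper below is hypothetical, and the server launch is omitted because its exact entry point is documented in the model card:

```python
import shutil
import subprocess

REPO_ID = "krea/krea-realtime-video"  # repo id from the text

def clone_command(repo_id: str) -> list[str]:
    """Build the git command that fetches the model repo (hypothetical helper)."""
    return ["git", "clone", f"https://huggingface.co/{repo_id}"]

def fetch(repo_id: str, dry_run: bool = True) -> str:
    """Return the command as a string; actually run it only when asked."""
    cmd = clone_command(repo_id)
    if not dry_run and shutil.which("git"):
        subprocess.run(cmd, check=True)  # large download; needs git-lfs
    return " ".join(cmd)

print(fetch(REPO_ID))  # dry run: just show the command
```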

Why it matters: Open availability under the Apache 2.0 license empowers both researchers and commercial creators to iterate freely.


Comparison with Prior Models

Model               Parameters   Architecture     FPS (B200)   Open Source
Krea Realtime 14B   14B          Autoregressive   11 fps       Yes
Wan 2.1 T2V         1.3B         Diffusion        <1 fps       Partial
Pika 1.5            Unknown      Diffusion        ~1 fps       No
Runway Gen-3        Unknown      Diffusion        ~1 fps       No

Why it matters: Krea Realtime 14B outpaces prior open models by a full order of magnitude, bridging large-scale fidelity with real-time responsiveness.


Conclusion

The Krea Realtime 14B model represents a significant breakthrough in open-source text-to-video generation. With its 14B-parameter autoregressive architecture, real-time 11fps generation on a single B200 GPU, and full Apache 2.0 licensing, it gives researchers and developers a powerful tool for interactive video creation. The Self-Forcing technique and the accompanying optimizations enable both high-quality output and responsive, real-time performance, marking a major advance in accessible video AI.


Summary

  • Krea Realtime 14B is a 14-billion parameter open-source autoregressive text-to-video model released in October 2025
  • Achieves real-time generation at 11fps on a single NVIDIA B200 GPU
  • Built using Self-Forcing distillation from Wan 2.1 14B with KV Cache Re-computation and Attention Biasing
  • Supports interactive frame streaming with live prompt control and restyling
  • Released under Apache 2.0 license on Hugging Face for unrestricted research and commercial use

#KreaAI #TextToVideo #Autoregressive #OpenSource #VideoGeneration #HuggingFace #SelfForcing #RealTimeAI #MachineLearning

References

  1. “Krea Realtime 14B Official Blog” | Krea.ai | 2025-10-14 | https://www.krea.ai/blog/krea-realtime-14b
  2. “krea/krea-realtime-video” | HuggingFace | 2025-10-07 | https://huggingface.co/krea/krea-realtime-video
  3. “Krea AI Announcement” | X | 2025-10-14 | https://x.com/krea_ai/status/1980358158376988747