Introduction

  • TL;DR: Krea AI released Krea Realtime 14B in October 2025: a 14B-parameter open-source autoregressive text-to-video model capable of real-time, long-form generation at 11fps on a single NVIDIA B200 GPU. Built with Self-Forcing distillation from Wan 2.1 14B, it marks a major leap for real-time video synthesis and interactive AI creation.
  • Krea Realtime 14B redefines open-source video generation with its ability to stream frames as it generates them, supporting live prompt changes and restyling in real-time.

Architecture and Techniques

Krea Realtime 14B uses the Self-Forcing method to distill Wan 2.1 14B, a bidirectional diffusion model, into an autoregressive model that generates video frame by frame. Techniques such as KV Cache Re-computation and Attention Biasing reduce the error accumulation that otherwise degrades long autoregressive rollouts.
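The autoregressive loop with periodic cache refresh can be sketched as follows. This is a minimal illustration, not Krea's implementation: the cache sizes are invented, and strings stand in for latent frames and for the actual attention keys/values.

```python
from collections import deque

# Illustrative constants; the real model's cache length and refresh
# interval are not specified in this article.
CACHE_LEN = 8          # hypothetical: frames kept in the KV cache
RECOMPUTE_EVERY = 16   # hypothetical: cache refresh interval, in frames

def generate_frame(prompt: str, cache: deque, step: int) -> str:
    """Stand-in for one few-step denoising pass conditioned on the cache."""
    return f"{prompt}/frame{step}/ctx{len(cache)}"

def recompute_cache(cache: deque) -> deque:
    """Stand-in for re-encoding keys/values from the most recent frames,
    discarding stale activations that would otherwise compound errors."""
    return deque(cache, maxlen=CACHE_LEN)

def rollout(prompt: str, num_frames: int) -> list[str]:
    """Generate frames one at a time, attending only to the sliding cache."""
    cache: deque = deque(maxlen=CACHE_LEN)
    frames = []
    for step in range(num_frames):
        frame = generate_frame(prompt, cache, step)
        frames.append(frame)
        cache.append(frame)
        if step and step % RECOMPUTE_EVERY == 0:
            cache = recompute_cache(cache)  # periodic KV re-computation
    return frames
```

The point of the sketch is the shape of the loop: each new frame conditions only on a bounded window of past state, and that state is periodically rebuilt rather than trusted indefinitely.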

Why it matters: This is the first open-source model to combine 14B-scale performance with interactive, low-latency text-to-video generation.


Real-Time Capabilities

Running at 11fps with 4 inference steps on a single B200 GPU, it streams the first frame in about one second, enabling users to adjust prompts or styles mid-generation.
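Streaming with mid-generation prompt changes amounts to checking for a new prompt between frames. A minimal sketch, assuming a queue-based interface (an illustration, not Krea's actual API):

```python
from queue import Queue, Empty

def stream_frames(initial_prompt: str, prompt_updates: Queue, num_frames: int):
    """Yield frames one at a time, picking up live prompt changes between
    frames. Frame synthesis is replaced by a stand-in string here."""
    prompt = initial_prompt
    for step in range(num_frames):
        try:
            prompt = prompt_updates.get_nowait()  # restyle mid-generation
        except Empty:
            pass
        yield f"{prompt}#{step}"  # in reality: stream an encoded frame

# Usage: the viewer swaps the prompt after seeing the first three frames.
updates: Queue = Queue()
frames = []
for i, frame in enumerate(stream_frames("a foggy forest", updates, 6)):
    frames.append(frame)
    if i == 2:
        updates.put("a neon city")  # takes effect from the next frame
```

Because frames are yielded as they are produced, the consumer sees output almost immediately and every prompt change lands within one frame of latency.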

Why it matters: This transforms generative video AI from batch rendering into an interactive creative medium.


Deployment

Developers can clone the model from Hugging Face (krea/krea-realtime-video) and launch a local server with a few lines of code. Integration with the diffusers library enables modular, composable workflows.
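As a minimal sketch of the download step (Hugging Face model repos are ordinary git repositories, so a plain clone works; large weight files additionally require git-lfs). The helper below is hypothetical, and the server launch is omitted because its exact entry point is documented in the model card:

```python
import shutil
import subprocess

REPO_ID = "krea/krea-realtime-video"  # repo id from the text

def clone_command(repo_id: str) -> list[str]:
    """Build the git command that fetches the model repo (hypothetical helper)."""
    return ["git", "clone", f"https://huggingface.co/{repo_id}"]

def fetch(repo_id: str, dry_run: bool = True) -> str:
    """Return the command as a string; actually run it only when asked."""
    cmd = clone_command(repo_id)
    if not dry_run and shutil.which("git"):
        subprocess.run(cmd, check=True)  # large download; needs git-lfs
    return " ".join(cmd)

print(fetch(REPO_ID))  # dry run: just show the command
```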

Why it matters: Open availability under the Apache 2.0 license empowers both researchers and commercial creators to iterate freely.


Comparison with Prior Models

Model               Parameters   Architecture     FPS (B200)   Open Source
Krea Realtime 14B   14B          Autoregressive   11 fps       Yes
Wan 2.1 T2V         1.3B         Diffusion        <1 fps       Partial
Pika 1.5            Unknown      Diffusion        ~1 fps       No
Runway Gen-3        Unknown      Diffusion        ~1 fps       No

Why it matters: Krea Realtime 14B outpaces prior open models by a full order of magnitude, bridging large-scale fidelity with real-time responsiveness.


Conclusion

The Krea Realtime 14B model represents a significant breakthrough in open-source text-to-video generation. With its 14B-parameter autoregressive architecture, real-time 11fps generation on a single B200 GPU, and full Apache 2.0 licensing, it gives researchers and developers a powerful tool for interactive video creation. The Self-Forcing technique and the accompanying optimizations enable both high-quality output and responsive, real-time performance, marking a major advance in accessible video AI.


Summary

  • Krea Realtime 14B is a 14-billion parameter open-source autoregressive text-to-video model released in October 2025
  • Achieves real-time generation at 11fps on a single NVIDIA B200 GPU
  • Built using Self-Forcing distillation from Wan 2.1 14B with KV Cache Re-computation and Attention Biasing
  • Supports interactive frame streaming with live prompt control and restyling
  • Released under Apache 2.0 license on Hugging Face for unrestricted research and commercial use

#KreaAI #TextToVideo #Autoregressive #OpenSource #VideoGeneration #HuggingFace #SelfForcing #RealTimeAI #MachineLearning

References

  1. “Krea Realtime 14B Official Blog” | Krea.ai | 2025-10-14 | https://www.krea.ai/blog/krea-realtime-14b
  2. “krea/krea-realtime-video” | HuggingFace | 2025-10-07 | https://huggingface.co/krea/krea-realtime-video
  3. “Krea AI Announcement” | X | 2025-10-14 | https://x.com/krea_ai/status/1980358158376988747