Introduction
Magistral Small (24B), released by Mistral AI in June 2025, marks the company’s first model explicitly focused on complex, domain-specific reasoning capabilities [1.3, 2.1]. Built on the foundation of the Mistral Small 3.1 model, the 24-billion-parameter model utilizes a specialized training regimen combining Supervised Fine-Tuning (SFT) traces from its more powerful sibling, Magistral Medium, with a custom Reinforcement Learning (RL) pipeline [1.4, 1.8]. This hybrid SFT+RL approach elevates its performance in tasks requiring long chains of logic, particularly in mathematics and coding.
- TL;DR: Magistral Small (24B) is a highly efficient, 24-billion-parameter open-source model from Mistral AI, released under the Apache 2.0 License. Its standout feature is superior reasoning performance in math and code, achieved through a unique SFT combined with RL training pipeline. The model’s compact size allows for easy local deployment, potentially running on a single RTX 4090 or a 32GB RAM MacBook once quantized [1.4].
Technical Architecture and Training Methodology
The design of Magistral Small is centered on maximizing the traceability and transparency of its reasoning, a key requirement for enterprise-grade reasoning applications.
SFT and Scalable RL Pipeline
Mistral AI adopted a ground-up approach to training, demonstrating the effectiveness of their custom, scalable RL pipeline.
- SFT from Traces: The model’s initial fine-tuning (SFT) was conducted using reasoning traces specifically derived from the training of the larger, enterprise-focused Magistral Medium model. This technique effectively distilled high-quality reasoning capability into the smaller model.
- RL on Text-Only Data: Following SFT, the model underwent an RL phase trained on text data alone. Mistral's research indicates that this text-only RL maintains or even improves multimodal understanding and function calling, and that the RL stage meaningfully refines the model beyond the SFT baseline.
- Transparent Reasoning: The model is trained to wrap its reasoning in `<think>` and `</think>` tags before giving its final answer, providing a traceable thought process for verification and interpretability (see the parsing sketch after this list).
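As a minimal sketch of how that transparency can be used in practice, the snippet below separates the reasoning trace from the final answer. It assumes the model emits its chain of thought inside `<think>...</think>` followed by the user-facing answer, as described above; the function name and regex are illustrative, not part of Mistral's tooling.

```python
import re

def split_reasoning(response: str) -> tuple[str, str]:
    """Split a Magistral-style response into (reasoning trace, final answer).

    Assumes the chain of thought is wrapped in <think>...</think> and the
    user-facing answer follows the closing tag; illustrative, not official.
    """
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if match is None:
        # No explicit trace found: treat the whole response as the answer.
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = response[match.end():].strip()
    return reasoning, answer


example = "<think>2 + 2 equals 4 because ...</think>The answer is 4."
trace, answer = split_reasoning(example)
print(trace)   # "2 + 2 equals 4 because ..."
print(answer)  # "The answer is 4."
```

Keeping the trace and the answer separate makes it easy to log the reasoning for audit purposes while showing end users only the final answer.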
Deployment and Accessibility
The model is released under the Apache 2.0 License, granting users the freedom to use and modify it for both commercial and non-commercial projects. Its 24B parameter count keeps it small enough for efficient local deployment.
| Characteristic | Specification | Implication |
|---|---|---|
| Parameters | 24 Billion | Efficient scale for reasoning tasks |
| License | Apache 2.0 | Enables unrestricted commercial and non-commercial use |
| Context Window | 128k Tokens | Supports extensive context processing |
| Local Deployment | Quantized versions fit on a single RTX 4090 GPU | High-performance inference is accessible on consumer-grade hardware |
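To give a rough idea of what local use can look like, the sketch below queries the model through an OpenAI-compatible endpoint served on the same machine (for example, by a local inference server hosting `mistralai/Magistral-Small-2506`). The host, port, API key, and sampling parameters are illustrative assumptions, not Mistral-recommended settings.

```python
from openai import OpenAI

# Assumes a locally running OpenAI-compatible server hosting the model;
# the base URL, dummy API key, and sampling values are illustrative only.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="mistralai/Magistral-Small-2506",
    messages=[
        {"role": "user", "content": "Prove that the sum of two even integers is even."},
    ],
    temperature=0.7,
    max_tokens=2048,
)

# The returned text contains the <think> trace followed by the final answer.
print(response.choices[0].message.content)
```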
Why it matters: The SFT+RL hybrid training method is the foundational reason for the model’s high performance in complex tasks, offering a path for smaller models to achieve reasoning quality comparable to much larger or proprietary alternatives. The open-source license and efficient sizing make this high-tier performance widely accessible.
Performance in Logic-Heavy Domains: Math and Code
Magistral Small is specifically designed to excel in structured, multi-step logical challenges, such as those found in advanced mathematics and programming.
Benchmark Highlights
The model demonstrates strong performance on reasoning-focused benchmarks:
| Benchmark | Performance Area | Notes |
|---|---|---|
| AIME2024 | Math Competition | Strong performance on advanced mathematical reasoning |
| HumanEval | Code Generation | Competitive code generation capabilities |
| MATH | Mathematical Reasoning | Solid capability in mathematical problem-solving |
The performance boost observed when combining SFT and RL, particularly on mathematical benchmarks, suggests that the RL stage is effective at refining the model’s ability to execute long, correct reasoning chains.
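To make the RL idea concrete: reinforcement learning on math and code typically relies on a verifiable reward, i.e. a programmatic check of the final answer. The helper below is a hypothetical, heavily simplified illustration of such a check (exact match on a `\boxed{...}` answer); it is not Mistral's actual reward implementation, and the boxed-answer convention is an assumption for this sketch.

```python
import re

def verifiable_math_reward(response: str, ground_truth: str) -> float:
    """Toy reward for RL on math: 1.0 if the final boxed answer matches, else 0.0.

    Hypothetical simplification of a verifiable-reward check; assumes the
    model states its final answer as \\boxed{...}, which may differ from
    Mistral's actual reward formatting rules.
    """
    # Strip the reasoning trace so only the final answer is graded.
    answer_part = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    boxed = re.findall(r"\\boxed\{([^}]*)\}", answer_part)
    if not boxed:
        return 0.0
    return 1.0 if boxed[-1].strip() == ground_truth.strip() else 0.0


print(verifiable_math_reward(r"<think>...</think>The answer is \boxed{42}.", "42"))  # 1.0
print(verifiable_math_reward(r"<think>...</think>\boxed{41}", "42"))                 # 0.0
```

Because the reward is computed automatically from the final answer, the RL stage can scale to large volumes of math and coding problems without human grading, which is what lets it sharpen long reasoning chains beyond the SFT baseline.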
Enterprise Use Cases
Its core strengths position Magistral Small ideally for several high-value enterprise applications:
- Software Development: Designed for programmatic logic, project planning, and backend architecture design through sequenced, multi-step actions.
- Regulated Industries (Legal, Finance): Provides traceable, verifiable thought processes essential for compliance and auditability in high-stakes environments.
- Business Strategy: Capable of executing complex risk assessments, financial modeling, and operational optimization tasks involving multiple constraints.
Why it matters: By prioritizing verifiable, multi-step reasoning, Magistral Small is designed as a specialized tool for solving complex, structured problems where output accuracy and the ability to audit the intermediate steps are important requirements.
Conclusion
Magistral Small (24B) represents an advancement for smaller, openly available LLMs in the realm of complex reasoning. Through a targeted SFT+RL training process, it aims to deliver strong performance in domains like math and code. Its Apache 2.0 license and ability to be deployed efficiently on widely available hardware make it accessible to the global developer community.
Summary
- The model is Mistral AI’s 24B open-source (Apache 2.0) offering, focused on transparent, multi-step reasoning
- Training uses a blend of SFT (leveraging traces from Magistral Medium) and a custom RL pipeline
- Demonstrates strong performance in math and coding tasks, validating the hybrid training approach
- Its compact size and open license enable efficient, local deployment on hardware like the RTX 4090
Recommended Hashtags
#AI #OpenSourceAI #LLM #MagistralSmall #MistralAI #ReasoningModel #CodeAI #LLMOps #Apache20
References
- “Magistral: Reasoning Models with Better Thinking” | arXiv | 2025
- “Magistral - Mistral AI” | Mistral AI | 2025
- “mistralai/Magistral-Small-2506” | Hugging Face | 2025