Introduction

  • TL;DR: Alibaba’s Tongyi DeepResearch, launched in September 2025, delivers deep-research agent performance at unrivaled cost-efficiency. With 30.5B total parameters and just 3.3B active per token, it achieves state-of-the-art results, outperforming GPT-4o and DeepSeek-V3 on agentic research benchmarks. The model is fully open-source, built on a Mixture-of-Experts architecture, and trained on synthetic data.

What is Tongyi DeepResearch?

Model Architecture and Positioning

Tongyi DeepResearch is an agent-specialized LLM from Alibaba, optimized for long-horizon web search, evidence accumulation, and synthesis. Its 30.5B-parameter Mixture-of-Experts design activates only 3.3B parameters per token, cutting computational cost without sacrificing specialist capability, and its 128K context window supports extended multi-turn research workflows.[1][2][3][4][5][7]

Why it matters:
Its efficiency and agentic reasoning capacity set a new standard for enterprises with limited resources that need to deploy research-focused AI.
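
To make the active-parameter figure concrete, here is a minimal, illustrative sketch of the top-k expert routing idea behind sparse MoE layers. The expert count, hidden size, and top_k values are assumptions chosen for readability, not Tongyi DeepResearch's actual configuration.

```python
import torch
import torch.nn as nn

class TopKMoELayer(nn.Module):
    """Illustrative sparse MoE layer: only top_k experts run per token."""
    def __init__(self, dim=512, num_experts=64, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        weights, idx = self.router(x).softmax(-1).topk(self.top_k, dim=-1)
        weights = weights / weights.sum(-1, keepdim=True)  # renormalize selected experts
        out = torch.zeros_like(x)
        for t in range(x.size(0)):      # per-token dispatch (clarity over speed)
            for k in range(self.top_k):
                out[t] += weights[t, k] * self.experts[int(idx[t, k])](x[t])
        return out

layer = TopKMoELayer()
print(layer(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```

With num_experts=64 and top_k=2, only a small fraction of the expert parameters runs for any given token; the same sparsity principle is what lets a 30.5B-parameter model activate just 3.3B per token.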

Training Pipeline and Technical Features

Synthetic Data, Agentic Training, Dual Inference

Distinctive agentic mid-training and group-based reinforcement learning stages run entirely on synthetic trajectories, removing human labeling from every stage of the pipeline. Key capabilities include robust long-term planning, low hallucination rates, and support for two inference paradigms: a native ReAct mode and a Heavy mode that uses IterResearch for deeper synthesis tasks.[1][2][3][4][5][8]

Why it matters:
Open access to low-cost, scalable agentic training methods promotes innovation and rigorous experimentation across the community.
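
For readers unfamiliar with the native ReAct paradigm, the sketch below shows the generic thought → action → observation loop such agents execute. The `llm` and `web_search` callables are hypothetical placeholders standing in for any model and search tool; this is not Tongyi DeepResearch's actual interface.

```python
def react_agent(question, llm, web_search, max_steps=10):
    """Generic ReAct loop: alternate reasoning with tool calls until an answer emerges."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The model emits a thought plus either an action or a final answer.
        step = llm(transcript + "Thought:")
        transcript += f"Thought:{step}\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        if "Action: search[" in step:
            query = step.split("Action: search[", 1)[1].split("]", 1)[0]
            observation = web_search(query)  # evidence flows back into the context
            transcript += f"Observation: {observation}\n"
    return "No answer within step budget."
```

The Heavy/IterResearch paradigm differs by iteratively reconstructing a compact workspace rather than growing a single long transcript, which keeps deep research rounds inside a fixed context window.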

Benchmark Results and Efficiency

SOTA Metrics and Real-World Impact

Benchmark              Score (%)               Peer Comparison
Humanity’s Last Exam   32.9                    GPT-4o (lower)
BrowseComp             43.4 (EN) / 46.7 (ZH)   DeepSeek-V3 (lower)
xbench-DeepSearch      75.0                    GLM-4.5 (70.0)
FRAMES                 90.6                    Other agents

The model completes major research tasks on just two H100 GPUs for under $500 in compute, outperforming proprietary systems on multiple benchmarks while keeping inference scalable for research and enterprise use.[1][2][3][4][5]

Why it matters:
Demonstrates parity or superiority versus larger, closed-source LLMs, driving industry standards toward affordable advanced AI.
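
As a sanity check on the sub-$500 claim, here is a back-of-envelope estimate with assumed cloud pricing; the hourly rate and run length below are assumptions for illustration, not published figures.

```python
# Back-of-envelope compute cost; rate and duration are assumed, not published.
H100_RATE_USD_PER_HOUR = 3.00   # assumed on-demand cloud price per H100
NUM_GPUS = 2                    # the two-H100 setup cited above
RUN_HOURS = 72                  # assumed duration of a major task run

cost = H100_RATE_USD_PER_HOUR * NUM_GPUS * RUN_HOURS
print(f"Estimated compute cost: ${cost:,.2f}")  # -> $432.00, under the $500 mark
```

Under these assumptions, even a three-day run on rented hardware stays within the quoted budget.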

Open-Source Accessibility and Application

Adoption, Licensing, Enterprise Use

The full model, training, and inference stack is available on GitHub and Hugging Face under the Apache-2.0 license. Vertical adoption (academia, pharma, finance) is encouraged by low entry barriers, verifiable traceability, and alignment with new regulatory frameworks such as the EU AI Act (in force since August 2024).[1][2][3][4]

Why it matters:
Opens the field for SMEs and startups, democratizing access to advanced agentic AI in evidence-driven domains.
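
For teams that want to try the model, here is a minimal loading sketch using the Hugging Face transformers library. The repository ID below matches the public release name but should be verified against the official repo, and the generation settings are illustrative defaults, not recommended parameters.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Verify the repo ID against the official release before use.
model_id = "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Survey recent evidence on MoE routing strategies."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Apache-2.0 licensing means this stack can be fine-tuned and redistributed commercially, which is what lowers the entry barrier for the vertical deployments described above.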

Conclusion

  • Tongyi DeepResearch demonstrates that large and efficient agentic models can be affordable and reproducible.
  • Outperforms GPT-4o and DeepSeek-V3 on agentic search and research tasks as of November 2025.
  • Fully open-source release makes custom deployment and experimentation viable for organizations of all sizes.
  • Synthetic data, automated training, and high context support establish new market and technical standards.

Summary

  • 30.5B-parameter MoE architecture with 3.3B activated per token; high efficiency.
  • SOTA results on deep research benchmarks, surpassing GPT-4o.
  • Open-source, cost-effective agent for enterprise and research deployment.

#TongyiDeepResearch #Alibaba #AgenticLLM #MoE #OpenSource #DeepResearch #GPT4o #DeepSeekV3 #AIAgent #SyntheticData

References