Introduction
TL;DR
The Future of Life Institute released its 2025 AI Safety Index in December 2025, evaluating seven leading frontier AI companies—Anthropic, OpenAI, Google DeepMind, xAI, Meta, Zhipu AI, and DeepSeek. The findings are stark: no company achieved a grade higher than C+, and all scored at D or below in Existential Safety planning. While these firms publicly commit to achieving Artificial General Intelligence (AGI) within the decade, independent expert panels found they lack coherent, actionable plans to ensure such superintelligent systems remain under human control. The evaluation, conducted across 33 indicators spanning six critical safety domains, reveals a fundamental mismatch between corporate ambition and safety infrastructure, raising concerns about catastrophic risks from uncontrolled AI development.
Context and Key Keywords
The AI Safety Index represents the most comprehensive independent assessment of frontier AI companies’ safety and governance practices. Conducted by a panel of distinguished researchers—including MIT Professor Max Tegmark—the evaluation uses a standardized GPA grading system (A+ = 4.3 to F = 0) and measures companies on risk assessment rigor, current safety benchmarks, safety frameworks, existential risk preparation, governance accountability, and information transparency. The 2025 assessment comes at a critical juncture: as regulatory frameworks emerge globally (EU AI Act, UK AI Safety Institute), the gap between market-driven AI development and safety-aligned governance has become a central policy concern.
Why it matters: The companies evaluated control the most capable AI systems on the market and are in direct competition to develop AGI. Without binding safety standards, competitive pressures create a “race to the bottom” where safety investments become liabilities rather than competitive advantages.
Overall Results: A C+ at Best
Grading Summary
The 2025 AI Safety Index assigned the following overall grades:
| Company | Overall Grade | Overall Score (GPA scale) |
|---|---|---|
| Anthropic | C+ | 2.64/4.3 |
| OpenAI | C | 2.10/4.3 |
| Google DeepMind | C- | 1.76/4.3 |
| xAI | D | 1.23/4.3 |
| Meta | D | 1.06/4.3 |
| Zhipu AI | F | 0.62/4.3 |
| DeepSeek | F | 0.37/4.3 |
The highest achievable score across all companies was a C+, held by Anthropic. This ceiling reflects a conclusion by the expert review panel: even industry leaders fall substantially short of the safety standards necessary for safe superintelligence development.
The Existential Safety Collapse
The most alarming finding emerged in the Existential Safety domain, where all companies scored at D or below. This domain evaluates whether firms have developed credible, detailed strategies for mitigating catastrophic and existential AI risks, including plans for alignment, control, governance, and post-AGI societal management.
Results by Company:
- Anthropic: D (1.0/4.3)
- OpenAI: F (0.67/4.3)
- Google DeepMind: D- (0.23/4.3)
- xAI, Meta, Zhipu AI, DeepSeek: All F
One expert reviewer summed up the consensus: despite racing toward human-level AI, “none of the companies has anything like a coherent, actionable plan” for ensuring such systems remain controllable. Max Tegmark, MIT Professor and FLI President, described this disconnect as “deeply disturbing.”
Why it matters: Superintelligent systems, by definition, exceed human cognitive capacity in reasoning, problem-solving, and strategic thinking. If such systems become misaligned with human values or escape human oversight, the consequences could range from economic disruption to civilizational-level harm. The absence of credible control plans represents an extraordinary gamble with societal stakes.
Domain-Level Analysis: Where Systems Fall Behind
Risk Assessment: Only Three Companies Show Rigor
Risk assessment evaluates whether companies conduct systematic evaluations of dangerous capabilities before deploying frontier models. Priority domains include biological and chemical weapons development, offensive cyber operations, autonomous self-replication, and behaviors associated with goal misalignment or deception.
Key Findings:
Anthropic, OpenAI, Google DeepMind: Only three firms conduct substantive dangerous capability evaluations
- Anthropic: Unique in conducting human participant bioweapon uplift trials with domain experts
- OpenAI: Assessed risks on pre-mitigation (uncensored) models
- DeepMind: Partial evaluations but weaker disclosure
xAI, Meta: No publicly documented dangerous capability assessments
External Verification Crisis: Minimal third-party independent review of internal evaluations across all firms
One reviewer noted: “The methodology explicitly linking a given evaluation or experimental procedure to the risk, with limitations and qualifications, is usually absent. I have very low confidence that dangerous capabilities are being detected in time to prevent significant harm.”
This finding indicates that companies rely primarily on internal assessments, creating inherent incentive misalignment: firms have financial and reputational reasons to underreport alarming results.
Why it matters: Risk management begins with risk discovery. If companies cannot reliably identify dangerous capabilities before deployment, no downstream mitigation strategy can be effective. Reliance on internal evaluation without external verification leaves societies vulnerable to unknown hazards.
Safety Frameworks: Qualitative Commitments, No Quantitative Triggers
Safety framework evaluation, conducted in partnership with SaferAI, assesses whether companies have operationalized their risk management commitments through measurable criteria.
The Core Problem: Abstraction Without Measurement
Firms publish risk management frameworks, but fail to translate abstract “risk tolerances” into concrete, measurable thresholds. Effective risk governance requires:
- Key Risk Indicators (KRIs): Measurable signals that risk levels are approaching critical thresholds
- Key Control Indicators (KCIs): Measurable evidence that mitigation strategies are working
- If-Then Relationships: Concrete triggers (“If KRI crosses threshold X, then activate control measure Y”)
Reality:
- Anthropic, OpenAI: Published frameworks acknowledge importance of quantification but lack specificity
- xAI, Meta: Draft or minimal frameworks
- Zhipu AI, DeepSeek: No public frameworks
Example of inadequacy: A company states “we maintain safety as a core value.” Adequate governance would specify: “If our internal evaluations detect bioweapon-acceleration capabilities above threshold Z on three independent tests, we halt deployment for 90 days and commission external review.”
Without such specificity, safety decisions depend on executive discretion rather than consistent policy, and commitments dissolve under commercial pressure.
Why it matters: Quantified thresholds create accountability. They enable external parties (regulators, investors, boards) to verify whether a company’s stated commitments are operationalized. They also prevent “moving goalposts” when business incentives favor progress over caution.
Current Harms: Variable Safety Performance
This domain evaluates flagship model performance on third-party safety benchmarks, including Stanford’s HELM Safety Benchmark, AIR-Bench 2024, and TrustLLM.
Performance Tiers:
- Anthropic, OpenAI: B to B- (good safety benchmark performance)
- Google DeepMind: C+ (adequate)
- xAI, Meta: D+ (weak)
- Zhipu AI, DeepSeek: D (poor)
Specific Vulnerabilities:
- All models show susceptibility to algorithmic jailbreaking
- DeepSeek: Particularly vulnerable to basic adversarial attacks
- Watermarking: Only Google DeepMind (SynthID) shows robust implementation
- Privacy: Anthropic distinguishes itself by not training on user chat data by default; others do
One concern: models may perform well on standardized safety benchmarks while remaining vulnerable to novel attacks. Benchmark performance, while important, does not guarantee real-world safety.
Why it matters: Current harms are tangible risks (misinformation, fraud, illegal content generation) that can be empirically measured. Poor performance on established benchmarks signals that models may cause immediate harm once deployed at scale.
Governance & Accountability: Transparency Gaps
This domain evaluates whether corporate structures and operations prioritize accountability for AI system impacts, including whistleblower protections, governance structure, and regulatory engagement.
Whistleblowing Policy Transparency:
- OpenAI: Only company to publish a complete whistleblowing policy (though revelations of restrictive non-disparagement clauses initially emerged via media)
- Anthropic: Policy exists but is non-public; promised public release
- xAI: Non-public policy
- Meta, Google DeepMind, Zhipu AI, DeepSeek: Fragmented or absent policies; past retaliation incidents reported
Whistleblower protections matter because employees are often the first to observe safety problems. Public policies enable external monitoring and set credible deterrents to retaliation.
Corporate Structure:
- Anthropic: Public Benefit Corporation + Long-Term Benefit Trust (highest governance score)
- OpenAI: Public Benefit Corporation (undergoing restructuring; reviewers expressed concern about weakening safety governance)
- Google DeepMind, Meta, xAI: Pure for-profit structures (profit-maximization incentives may override safety)
Regulatory Engagement:
- Anthropic, OpenAI: Relatively supportive of international and U.S. state-level AI safety regulation
- Meta, Google DeepMind: Actively lobby against state-level safety regulations
A core observation: firms that lobby against regulation while simultaneously claiming strong safety commitments create credibility problems. Genuine commitment to safety would embrace external oversight as a validation mechanism.
Why it matters: Governance structure and accountability mechanisms shape corporate behavior more reliably than rhetorical safety commitments. Companies that resist regulation while underfunding safety raise reasonable suspicions about the sincere priority of safety in their decision-making.
Information Sharing: Selective Disclosure
This domain measures transparency regarding technical specifications, risk management practices, and incident reporting.
Transparency Rankings:
- Anthropic, OpenAI: A- (strong information sharing, participate in transparency initiatives)
- xAI: C+ (moderate)
- Meta: D (minimal)
- Zhipu AI, DeepSeek: F (essentially no disclosure)
Specific Gaps:
- System Prompt Transparency: Only Anthropic and xAI publish actual system prompts
- Incident Reporting: Most companies lack formal, public mechanisms for reporting safety-critical incidents
- Model Card Detail: Few companies provide comprehensive model card documentation
- Pre-Deployment Testing Results: Limited public disclosure of external evaluation findings
Why it matters: Information asymmetry advantages firms but disadvantages society. Regulators, investors, researchers, and the public cannot verify safety claims without transparency. Disclosure requirements are standard in safety-critical industries (pharmaceuticals, aviation) precisely because information transparency enables effective oversight.
Existential Risk Unprepared: The Critical Gap
The Stated Goal vs. Actual Planning
All seven companies evaluated either explicitly state or strongly imply that their goal is to develop AGI—a system matching or exceeding human-level intelligence across all domains. Five of them (Anthropic, OpenAI, Meta, xAI, and DeepSeek) have publicly indicated they expect AGI within the decade.
Yet when expert reviewers evaluated companies’ published strategies for ensuring AGI safety, control, and alignment, all scored in the F-D range. This gap—between stated ambitions and demonstrated preparation—was characterized by reviewers as “fundamentally unprepared for their own stated goals.”
What’s Missing: Technical Alignment Research
One component of existential safety is investment in fundamental alignment research. This includes work on:
- Mechanistic Interpretability: Understanding how neural networks reach decisions
- Scalable Oversight: Techniques for humans to meaningfully supervise superintelligent systems
- Goal Specification: Methods for reliably encoding human values into AI objectives
Finding: Companies vary significantly in research investment, but all fall short relative to the magnitude of the stated challenge. Anthropic and OpenAI publish alignment research, but reviewers note gaps and insufficiency. Most companies provide minimal public evidence of serious technical investment in controllability.
What’s Missing: Governance Plans
Another gap is concrete governance strategy for AGI. Critical questions remain unaddressed:
- If an AGI achieves capabilities exceeding human oversight ability, what mechanisms exist to prevent misuse or power consolidation?
- How will benefits be distributed if AGI renders human labor economically obsolete?
- How will international coordination mechanisms function to prevent AGI arms races?
Finding: No company has published a detailed, credible answer to any of these questions.
What’s Missing: Internal Monitoring and Control
The Index specifically evaluates whether companies implement technical protocols to detect and prevent model misalignment during internal use. This includes:
- Automated monitoring systems
- Red team exercises
- Staged safety testing before deployment escalation
Finding: Most companies acknowledge these practices exist but provide minimal public documentation, making independent verification impossible.
Why it matters: Superintelligence represents a qualitative shift in the type of risk. Current mitigation strategies—designed for systems humans can understand and predict—may be fundamentally inadequate for systems that exceed human cognitive capacity. The absence of concrete, evidence-based plans for maintaining control represents an extraordinary and unjustified risk.
Regulatory Context: Global Movement Toward Binding Standards
2025 Regulatory Landscape
As AI companies lag on self-governance, regulatory frameworks are consolidating globally:
EU AI Act (Effective 2025):
- Highest risk (prohibited) applications and high-risk systems subject to mandatory compliance
- Penalties up to 7% of global annual revenue for non-compliance
- Requires ex-ante risk assessments and third-party audits
UK Approach:
- Established UK AI Safety Institute (2023) focused on frontier AI risks
- Emphasis on red-teaming and adversarial evaluation
- Seeks to lead global standards development
U.S. Approach:
- NIST AI Risk Management Framework (AI RMF) provides non-binding guidance
- U.S. AI Safety Institute Consortium supporting testing and evaluation
- No federal binding regulation on frontier AI
China’s Approach:
- Interim Measures on Generative AI (2023) mandate data minimization and safety assessments
- National standards development for alignment capability evaluation underway
- Less emphasis on self-regulation; more direct regulatory oversight
International Cooperation:
- Bletchley Declaration (2023): 28 nations + EU committed to AI safety research and international coordination
- Singapore Consensus on Global AI Safety Research Priorities (2025)
- OECD monitoring of AI incidents and emerging practices
Implications for Companies
The regulatory environment is hardening. Companies that fail to meet emerging standards face:
- Market restrictions in EU and increasingly in other jurisdictions
- Reputational damage and investor pressure
- Potential civil and criminal liability
- Operational constraints and mandated capability limitations
Paradox: By resisting regulatory frameworks now, AI companies increase the likelihood of more restrictive, prescriptive regulation later. Proactive alignment with emerging standards would likely result in more commercially favorable rules.
Company-Specific Insights
Anthropic: Leading but Incomplete
Strengths:
- Best overall grade (C+)
- Only company conducting bioweapon-acceleration human uplift trials
- Not training on user chat data (privacy leadership)
- Public Benefit Corporation structure
- World-leading alignment research publication rate
Gaps:
- Whistleblower policy not yet publicly released (promised)
- Risk assessment methodology could be more transparent
- Safety framework still relies on qualitative rather than quantitative thresholds
Trajectory: Anthropic demonstrates serious commitment to safety institutionally and through research. However, even their strong performance is viewed as “barely adequate” by reviewers.
OpenAI: Transparency Leader with Governance Concerns
Strengths:
- Only company with published whistleblowing policy
- Robust external pre-deployment safety testing (partners with U.S. AI Safety Institute, UK AI Safety Institute, METR, Apollo Research)
- Assessed risks on uncensored model versions
- Strong safety benchmark performance (B grade)
- Comprehensive safety index survey engagement
Gaps:
- Recent loss of safety team capacity
- Governance restructuring raises concerns about dilution of safety mission
- Needs to maintain non-profit governance protections amid for-profit pressures
Trajectory: OpenAI established early leadership on transparency but faces credibility challenges as organizational safety culture appears to weaken relative to capability development priorities.
xAI: Nascent but Gaps Remain
Strengths:
- Published frontier AI safety framework
- System prompt transparency
- Safety framework formalized and public
Gaps:
- Risk assessment breadth and rigor limited
- Whistleblower policy not public
- Limited external testing for deployed models
- Framework still requires concrete operationalization
Trajectory: xAI is moving in the right direction but lags established leaders in systematic safety implementation.
Meta: Safety as Secondary
Strengths:
- Published frontier AI safety framework with clear thresholds
- Formal risk modeling mechanisms
Gaps:
- Very limited investment in technical safety research
- Open-weight model releases lack tamper-resistant safeguards
- Weak whistleblower policy and governance
- Limited information sharing (did not complete safety index survey)
- Leadership downplays frontier-level risks
Trajectory: Meta’s emphasis on model weight release prioritizes accessibility over controllability. This strategy, while beneficial for democratizing AI development, increases risks from malicious actors.
Expert Warnings and Policy Implications
Max Tegmark and the MIT Perspective
Max Tegmark, FLI President and prominent AI safety researcher, framed the findings starkly: “The AI race is happening faster than safety can catch up. Despite cases of self-harm and suicide linked to chatbots, U.S. AI companies remain less regulated than restaurants and continue lobbying against binding safety standards.”
Key points in his analysis:
- Regulatory Gap: The U.S. lacks enforceable AI development rules applicable to frontier systems
- Competitive Pressure: Without a regulatory floor, competitive dynamics incentivize “racing to the bottom” on safety investments
- Self-Regulation Failure: Voluntary safety commitments have proven unreliable and subject to revision when inconvenient
- International Stakes: The geopolitical competition for AI dominance drives acceleration absent coordinating mechanisms
The Reviewers’ Assessment
The six-member expert panel (including Dylan Hadfield-Menell from MIT, Jessica Newman from UC Berkeley, Stuart Russell from UC Berkeley, and others) converged on several conclusions:
- Capabilities are advancing faster than governance can respond
- No common regulatory floor enables divergent safety practices (leading firms adopt stronger controls while others neglect basics)
- The industry is unprepared for its stated objectives (AGI development without credible control plans)
- Transparency remains inadequate for external oversight (whistleblower policies, incident reporting, evaluation methodology)
- Technical safety research investment is insufficient relative to capability scaling efforts
Conclusion: Toward Systemic Stability
Summary of Key Findings
The 2025 AI Safety Index reveals an industry in crisis—not of immediate harms, but of systemic unpreparedness for its own ambitions. Even the highest-performing firms (Anthropic, OpenAI) score at levels that would be unacceptable in other safety-critical industries such as aviation or pharmaceuticals. The industry’s lowest scorers lack basic safety documentation and governance structures.
The most alarming gap is the existential safety domain: all firms pursuing AGI development lack credible, detailed, evidence-based plans for ensuring such systems remain aligned and controllable. This represents a profound mismatch between stated objectives and demonstrated capability to achieve them safely.
Necessary Reforms
For Companies:
- Operationalize safety commitments through quantified KRIs and KCIs with clear trigger thresholds
- Dramatically increase investment in technical safety research, particularly alignment and interpretability
- Implement public whistleblower policies meeting international best practices
- Commission independent pre-deployment safety evaluations and publish results without censorship
- Publish concrete, evidence-based strategies for AGI safety, governance, and control
For Regulators and Policymakers:
- Establish binding international standards for frontier AI development (following EU AI Act model but adapted for global context)
- Create independent oversight bodies with inspection and enforcement authority
- Require pre-deployment safety certification for high-capability models
- Fund independent technical safety research without corporate conflicts of interest
- Establish coordination mechanisms to prevent international AGI races
For Society:
- Demand transparency from AI companies through shareholder activism and consumer choice
- Support independent research on AI safety and governance
- Participate in policy consultations on AI regulation
- Monitor and publicize corporate safety investments and commitments
The Window for Proactive Governance
Tegmark noted that “the window for proactive governance is narrowing.” If frontier AI companies do not rapidly adopt stronger safety practices voluntarily, regulators will likely impose them. However, regulation developed under crisis conditions tends to be more restrictive and less nuanced than regulation developed proactively.
The current moment offers a choice: either the AI industry self-corrects through meaningful safety investment and transparent governance, or regulators will impose frameworks that may be less efficient but no less restrictive.
Looking Forward
The FLI AI Safety Index will continue as an annual evaluation. The 2025 findings set a baseline. Future iterations will track whether companies improve their safety practices or whether the gap between capability and governance continues to widen.
For now, the clearest takeaway is this: the most capable AI systems on the market are being developed and deployed by organizations that, by their own independent expert assessment, are fundamentally unprepared for the risks they are creating.
Summary
- No company achieved a safety grade higher than C+, indicating even industry leaders fall short of standards necessary for superintelligence development
- All firms scored F or D in Existential Safety, lacking credible AGI control plans despite pursuing AGI as a stated goal
- Risk assessment lags: Only three companies conduct substantive dangerous capability evaluations; most lack external verification
- Safety frameworks rely on qualitative commitments rather than quantified, measurable thresholds (KRIs/KCIs)
- Transparency gaps persist: Only one company (OpenAI) published a complete whistleblower policy; information sharing varies dramatically
- Regulatory pressures mounting: EU AI Act and international standards development are creating compliance pressures companies cannot ignore
- Proactive reform is urgent: Without rapid industry improvements, more restrictive regulation will follow
Recommended Hashtags
#AISafety #FrontierAI #Superintelligence #AIGovernance #AIRisk #FutureOfLife #AIRegulation #MachineAlignment #TechGovernance #ResponsibleAI
References
AI Safety Index - Summer 2025 Future of Life Institute | 2025-07-17 https://futureoflife.org/ai-safety-index-summer-2025/
AI Safety Index - Winter 2025 Future of Life Institute | 2025-12-01 https://futureoflife.org/ai-safety-index-winter-2025/
New report says OpenAI, xAI and Meta lag far behind global AI safety standards India Today | 2025-12-03 https://www.indiatoday.in/technology/news/story/new-report-says-openai-xai-and-meta-lag-far-behind
AI companies’ safety practices fail to meet standards BNN Bloomberg | 2025-12-03 https://www.bnnbloomberg.ca/business/2025/12/03/ai-companies-safety-practices-fail-to-meet-global-standards-study-shows/
AI companies’ safety practices fail to meet global standards - study shows Reuters | 2025-12-03 https://www.reuters.com/business/ai-companies-safety-practices-fail-meet-global-standards-study-shows-2025-12-03/
인류 보호를 위한 AI 기업의 노력을 평가하는 안전 보고서 카드 Neuron Expert | 2025-12-04 https://neuron.expert/news/a-safety-report-card-ranks-ai-company-efforts-to-protect-humanity/15543/ko/
The Winter 2025 AI Safety Index shows an industry whose capability outpaces safety P4S4L Substack | 2025-12-05 https://p4sc4l.substack.com/p/the-winter-2025-ai-safety-index-shows
첨단 인공지능 안전 및 신뢰성 기술 표준 동향 ETRI E-Trends | 2024-09-30 https://ettrends.etri.re.kr/ettrends/210/0905210023/
해외 AI안전연구소 추진 현황과 시사점 정보통신기획평가원(SPRI) | 2024 https://www.spri.kr/download/23535
AI 위험 관리 및 신뢰성 확보를 위한 기술표준화 현황 ETRI Technical Report | 2024