OpenAI unveils GPT-5.5, a powerful AI model for coding, knowledge work and research

OpenAI has unveiled GPT‑5.5, a model it describes as a significant leap forward in agentic AI, strong in "coding, computer use, knowledge work, and early scientific research" and capable of carrying out complex, multi-step work autonomously. The release also highlights how OpenAI's approach contrasts with that of its rival Anthropic.

The company describes GPT‑5.5 as its smartest model to date, one that can take a multi-part task and see it through to completion — planning, using tools, checking its own work, and navigating ambiguity along the way.

"It just goes and kind of figures it out, deals with ambiguity," OpenAI co-founder and president Greg Brockman told Bloomberg. "It's a much more intuitive experience."

The announcement marks what OpenAI considers a pivotal moment in how people and organisations interact with artificial intelligence on a day-to-day basis. It arrives days before a high-profile federal trial in Oakland, California, involving Elon Musk and OpenAI executives Sam Altman and Greg Brockman. The case will settle a years-long dispute between the co-founders of OpenAI over the group's founding mission.

A step change in coding ability

Nowhere are GPT‑5.5's gains more visible than in software engineering, which Brockman described as a market where the model is "extremely" good. On Terminal-Bench 2.0, a benchmark that tests complex command-line workflows, the model achieves an accuracy of 82.7 per cent. On SWE-Bench Pro, which evaluates the resolution of real-world GitHub issues, it reaches 58.6 per cent — solving more tasks end-to-end in a single pass than any previous OpenAI model.

Early testers, including senior engineers, reported that GPT‑5.5 felt noticeably stronger than its predecessor at reasoning and autonomy. One engineer at NVIDIA described losing access to the model as feeling like having "a limb amputated".

Crucially, GPT‑5.5 achieves all of this while using fewer tokens than GPT‑5.4, meaning it is simultaneously more capable and more efficient. Brockman told Bloomberg that the model will be used to power a so-called super app that OpenAI plans to launch, bringing together its chatbot, coding tool and web browser.

Knowledge work at scale

The model's strengths extend well beyond coding. It is designed to operate across the full breadth of everyday office software — using email, spreadsheets, calendars, and other applications to follow a user's commands on a computer, functioning as a personal digital assistant for office workers. OpenAI reports that over 85 per cent of its own workforce now uses Codex every week, spanning functions from finance and communications to data science and product management.

Internal teams have already put GPT‑5.5 to practical use. The finance team used it to review 24,771 tax forms, totalling 71,637 pages, completing the task two weeks faster than the previous year. A member of the go-to-market team automated the generation of weekly business reports, saving between five and 10 hours per week.

On the GDPval benchmark, which tests an agent's ability to produce well-specified knowledge work across 44 occupations, GPT‑5.5 scores 84.9 per cent.

On OSWorld-Verified, which measures whether a model can operate real computer environments independently, it reaches 78.7 per cent.

Helping scientific research

Among the most remarkable claims in OpenAI's announcement are those concerning scientific research. The company says GPT‑5.5 has demonstrated the ability to function as a genuine "research partner" rather than a search tool: critiquing manuscripts, stress-testing technical arguments, and working iteratively across code, notes, and documents.

The launch comes just days after OpenAI separately rolled out an early AI model designed to accelerate drug discovery, as tech companies scramble to demonstrate that artificial intelligence can deliver genuine scientific breakthroughs.

On GeneBench, a benchmark focused on multi-stage scientific data analysis in genetics and quantitative biology, GPT‑5.5 shows clear improvement over its predecessor. The tasks involved correspond to what human scientific experts might spend several days completing.

An internal version of the model helped discover a new mathematical proof related to Ramsey numbers — a longstanding problem in combinatorics. The proof was subsequently verified in Lean, a formal verification system, providing concrete evidence that GPT‑5.5 can contribute not merely code or explanation, but original, useful mathematical arguments.

Speed without sacrifice

One of the more technically notable achievements OpenAI highlights is that GPT‑5.5 matches the per-token latency of GPT‑5.4 in real-world serving, despite being a considerably more capable model. This was achieved by co-designing the model for deployment on NVIDIA GB200 and GB300 NVL72 systems, and by using the model itself — alongside Codex — to identify and implement infrastructure improvements. Token generation speeds were increased by over 20 per cent through custom load-balancing heuristics written by Codex after analysing weeks of production traffic patterns.

The release of GPT‑5.5 arrives against a backdrop of fierce competition in AI-powered cybersecurity.

Earlier this month, Anthropic unveiled a model called Mythos, which it described as uniquely advanced in cybersecurity and capable of detecting and exploiting vulnerabilities in critical software. Anthropic chose to share that model with only around 40 organisations maintaining critical infrastructure — including Apple, Amazon, Microsoft, and Google — arguing the approach would allow those organisations to patch security holes before malicious actors could exploit them. Some cybersecurity experts, however, questioned whether limiting access so severely would ultimately leave more organisations unable to defend themselves.

OpenAI has taken a different path. Rather than restricting its most capable model, it has released GPT‑5.5 to hundreds of millions of ChatGPT users, while separately distributing its own cyber-focused model, GPT‑5.4-Cyber, to hundreds of organisations — with plans to expand — and has worked to verify user identities to prevent misuse.

That said, OpenAI has added guardrails to GPT‑5.5 specifically aimed at preventing its use for cybersecurity tasks. The cyber-specific restrictions are dropped only for verified users through its Trusted Access for Cyber programme, meaning the two tiers of access serve distinct purposes. Notably, benchmark testing by Vals AI, a company that tracks AI performance, suggests GPT‑5.5 is not yet as powerful as Anthropic's Claude Mythos in this domain, according to The New York Times.

OpenAI has rated GPT‑5.5's cybersecurity and biology capabilities as "High" under its internal Preparedness Framework, and has introduced stricter classifiers for potential cyber risks, acknowledging that some users may initially find these disruptive. The company is also working with government partners to explore how the model might support the protection of critical infrastructure such as power grids and public data systems.

Availability and pricing

GPT‑5.5 is rolling out immediately to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex. Notably, OpenAI has not yet made the model available through its API, giving the company more time to study security issues before allowing third parties to embed the technology in their own applications and tools. That API access, priced at $5 per million input tokens and $30 per million output tokens, is expected to follow shortly.

A more powerful GPT‑5.5 Pro variant, aimed at higher-accuracy work, will be available via the API at $30 per million input tokens and $180 per million output tokens.

While the new model is priced higher than GPT‑5.4, OpenAI argues that its improved token efficiency means most users will see better results at comparable or lower overall cost.
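
To put the per-token prices in concrete terms, the short Python sketch below works out what a single job would cost at the quoted rates. The workload of 200,000 input tokens and 50,000 output tokens is a made-up illustration, and the dictionary keys are labels chosen for this example rather than official API model identifiers. GPT‑5.4 is left out because the announcement coverage does not state its price.

```python
from decimal import Decimal

# US dollars per million tokens (input, output), as quoted in the article.
# The keys are illustrative labels, not confirmed API model names.
PRICES = {
    "gpt-5.5": (Decimal("5"), Decimal("30")),
    "gpt-5.5-pro": (Decimal("30"), Decimal("180")),
}
MILLION = Decimal(1_000_000)


def request_cost(model, input_tokens, output_tokens):
    """Return the cost in dollars of one job at the quoted per-million rates."""
    input_price, output_price = PRICES[model]
    return (input_price * input_tokens + output_price * output_tokens) / MILLION


# Hypothetical workload: 200,000 tokens in, 50,000 tokens out.
standard = request_cost("gpt-5.5", 200_000, 50_000)
pro = request_cost("gpt-5.5-pro", 200_000, 50_000)

print(f"GPT-5.5:     ${standard:.2f}")  # $2.50
print(f"GPT-5.5 Pro: ${pro:.2f}")       # $15.00
```

Because output tokens cost six times as much as input tokens on both tiers, any reduction in the number of tokens a model generates to finish a task has an outsized effect on the final bill, which is the basis of OpenAI's efficiency argument.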
