
GPT-5.6 for Business: Sol, Terra, and Luna - Which Model Do You Actually Need?

OpenAI previewed GPT-5.6 on 26 June 2026 as a family of three models spanning workloads from high-stakes agentic work to bulk automation, each at its own price point. Here is the decision guide for teams planning to build on the API.

Which GPT-5.6 model should a business use? For most production workloads, Terra is the default choice: OpenAI positions it as competitive with GPT-5.5 at half the cost ($2.50 input / $15 output per million tokens), although it trails GPT-5.5 on Terminal-Bench 2.1 (82.5% vs 88.0%). Use Sol ($5/$30) only when you need maximum reasoning depth or orchestrated subagent workflows. Use Luna ($1/$6) for high-volume, low-complexity tasks such as classification, summarisation, or first drafts where latency and cost dominate.

What OpenAI Actually Released on 26 June 2026

GPT-5.6 is not a single model. OpenAI launched it as a three-tier family under a restricted preview shared first with approximately 20 trusted partner organisations, including the U.S. government, before broader access is opened 'in the coming weeks.' The three tiers are Sol (flagship), Terra (balanced), and Luna (fast and affordable). Each targets a distinct price-to-performance point, which matters enormously for teams designing cost-aware agent pipelines.

The release also introduced two advanced modes specific to Sol: a max reasoning effort setting that allocates additional compute time before the model responds, and Ultra mode, which replaces single-model responses with a coordinated system of subagents tackling subtasks in parallel. Sol Ultra scored 91.9% on Terminal-Bench 2.1, an agentic coding benchmark; standard Sol scored 88.8%. For context, GPT-5.5 scored 88.0% on the same benchmark.

Sol vs Terra vs Luna: Full Comparison

| Dimension | Sol (flagship) | Terra (balanced) | Luna (fast) |
|---|---|---|---|
| Input price (per 1M tokens) | $5.00 | $2.50 | $1.00 |
| Output price (per 1M tokens) | $30.00 | $15.00 | $6.00 |
| Cost vs Sol | Baseline | 2x cheaper | 5x cheaper |
| Terminal-Bench 2.1 score | 88.8% (91.9% Ultra) | 82.5% | 84.3% |
| Max reasoning effort mode | Yes (confirmed) | Not confirmed | Not confirmed |
| Ultra subagent mode | Yes (confirmed) | No | No |
| Context window | Not officially confirmed | Not officially confirmed | Not officially confirmed |
| Best for | Complex agents, security, deep reasoning | Production APIs, migration from GPT-5.5 | Classification, bulk drafts, tagging |
| Cerebras deployment (July) | 750 tokens/sec (confirmed) | Not announced | Not announced |

Note: context window sizes were not disclosed in OpenAI's 26 June preview post. Reports of 1.5 million tokens are unconfirmed leaks. Treat context window figures as speculative until general availability documentation is published.

Cost vs GPT-5.5: What the Numbers Mean

GPT-5.5 standard pricing is $5.00 input / $30.00 output per million tokens, the same as Sol. Terra costs exactly half that, but it is not at parity on every benchmark: GPT-5.5 scored 88.0% on Terminal-Bench 2.1 and Terra scored 82.5%. For business API calls that do not require Sol-level reasoning depth, migrating from GPT-5.5 to Terra cuts model costs by 50% with no architectural changes required, provided Terra holds up on your own evaluation set.

A practical routing model illustrates the savings further. Take a pipeline processing 10 million tokens per day that routes 10% of traffic to Sol, 30% to Terra, and 60% to Luna. It costs roughly $46 per day, compared to $125 per day if every call goes to Sol (both figures are consistent with a token mix of about 70% input and 30% output). The discipline is: classify task complexity first, then route to the cheapest tier that can handle it reliably.
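The arithmetic behind the routing example can be sketched in a few lines of Python. The per-tier prices come from the comparison table; the 70/30 input-to-output token split is an assumption (the article does not state one), chosen because it reproduces the $125 and $46 figures. The function name `daily_cost` and its parameters are illustrative only.

```python
# Prices in USD per 1M tokens, taken from the comparison table in this article.
PRICES = {
    "sol": {"input": 5.00, "output": 30.00},
    "terra": {"input": 2.50, "output": 15.00},
    "luna": {"input": 1.00, "output": 6.00},
}


def daily_cost(total_tokens_m, routing, input_share=0.7):
    """Return USD per day for `total_tokens_m` million tokens split across tiers.

    `routing` maps tier name -> fraction of traffic (fractions must sum to 1).
    `input_share` is an ASSUMPTION: the fraction of all tokens that are input.
    """
    if abs(sum(routing.values()) - 1.0) > 1e-9:
        raise ValueError("routing fractions must sum to 1")
    cost = 0.0
    for tier, fraction in routing.items():
        price = PRICES[tier]
        # Blended price per 1M tokens for this tier at the assumed token mix.
        blended = input_share * price["input"] + (1 - input_share) * price["output"]
        cost += total_tokens_m * fraction * blended
    return cost


all_sol = daily_cost(10, {"sol": 1.0})
routed = daily_cost(10, {"sol": 0.1, "terra": 0.3, "luna": 0.6})
print(f"all Sol: ${all_sol:.2f}/day, routed: ${routed:.2f}/day")
```

Changing `input_share` to match your own traffic shifts both totals; output-heavy workloads cost more on every tier, but the relative saving from routing stays the same because each tier's output price is six times its input price.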

OpenAI also confirmed improved prompt caching with a 30-minute minimum cache lifetime and a 90% discount on cached input reads. For agent loops that re-send long system prompts on every turn, this alone can substantially reduce effective per-call cost.
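To get a feel for the caching effect, here is a rough sketch of the input-side cost of an agent loop that re-sends the same system prompt on every turn. It uses the 90% read discount above and the 1.25x cache-write rate quoted in the FAQ later in this article. It assumes, optimistically, that the first turn writes the cache and every later turn is a cache hit; the function name and the 20,000-token prompt are hypothetical.

```python
def prompt_prefix_cost(prompt_tokens_m, turns, input_price,
                       write_multiplier=1.25, read_discount=0.90):
    """Return (uncached_usd, cached_usd) for re-sending one prompt prefix.

    ASSUMPTION: turn 1 pays the cache-write rate and all later turns hit
    the cache within its lifetime. Real hit rates will be lower.
    """
    base = prompt_tokens_m * input_price          # cost of one uncached send
    uncached = base * turns
    cached = base * write_multiplier + base * (1 - read_discount) * (turns - 1)
    return uncached, cached


# Hypothetical loop: 20,000-token system prompt, 20 turns, Sol input pricing.
uncached, cached = prompt_prefix_cost(0.02, 20, 5.00)
print(f"uncached: ${uncached:.3f}, cached: ${cached:.3f}")
```

Only the repeated prefix is modelled here; new user messages, tool results, and output tokens are billed on top and are unaffected by caching.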

Which Workload Belongs on Which Tier

Sol is the right choice when the cost of a wrong answer outweighs the cost of the model call. Practical examples include automated security vulnerability triage, legal contract review with tool use, multi-step agentic pipelines where each subtask conditions the next, and any workflow where you would previously have routed to GPT-5.5 Pro ($30/$180 per million tokens) for maximum output quality. Sol Ultra specifically suits workloads that benefit from parallelised subagent decomposition, such as large codebase refactors or complex research synthesis. Digiton uses this level of reasoning for its AI employees practice, where the agent must make consequential decisions across long contexts.

Terra is the default for production APIs that currently run on GPT-5.5 and where engineering teams want a straightforward cost reduction without re-evaluating every prompt. Research summaries, competitor analysis, structured data extraction, customer support deflection, and batch SEO rewrites all fit here. If you are uncertain which tier to start on for a new use case, Terra is the low-risk entry point.

Luna serves high-volume, latency-sensitive pipelines. Think millions of short classification calls per day, title variant generation, content tagging, and first-draft production at scale where a human or a stronger model reviews before publishing. At $1/$6, Luna costs one fifth of what Sol does and is the correct tier for the 'worker agent' layer in a multi-agent system where Sol or Terra acts as the orchestrator. Our AI automation ROI calculator can help you model cost across tiers for your specific call volumes.

How to Adopt GPT-5.6 in Your Organisation

During the limited preview period, most organisations will not have direct API access. The practical steps right now are: audit your current GPT-5.5 usage to classify calls by complexity, define a routing schema that maps complexity buckets to Sol, Terra, or Luna, and prepare to test on Terra first when access opens.
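The audit step can be prototyped before access opens, since it only needs your existing usage logs. The sketch below is one possible shape, not a recommended heuristic: the log fields, the `complexity_bucket` rules, and the bucket names are all assumptions made for illustration, and the tier labels are plain strings rather than real API model identifiers.

```python
from collections import defaultdict

# Hypothetical routing schema: complexity bucket -> GPT-5.6 tier label.
ROUTING_SCHEMA = {"low": "luna", "medium": "terra", "high": "sol"}


def complexity_bucket(call):
    """Toy heuristic (an assumption): bucket one logged call by complexity."""
    if call["tool_calls"] > 0 or call["output_tokens"] > 2000:
        return "high"
    if call["structured_output"]:
        return "medium"
    return "low"


def audit(calls):
    """Return each tier's share of total token volume under the schema."""
    tokens_by_tier = defaultdict(int)
    for call in calls:
        tier = ROUTING_SCHEMA[complexity_bucket(call)]
        tokens_by_tier[tier] += call["input_tokens"] + call["output_tokens"]
    total = sum(tokens_by_tier.values())
    return {tier: tokens / total for tier, tokens in tokens_by_tier.items()}


# Made-up log entries standing in for an export of current GPT-5.5 usage.
usage_log = [
    {"input_tokens": 300, "output_tokens": 100, "tool_calls": 0, "structured_output": False},
    {"input_tokens": 1500, "output_tokens": 500, "tool_calls": 0, "structured_output": True},
    {"input_tokens": 1200, "output_tokens": 400, "tool_calls": 3, "structured_output": False},
]
shares = audit(usage_log)
print(shares)
```

The resulting shares are the routing fractions you would plug into a cost model for the three tiers.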

Agentic workflows require additional planning. Sol Ultra's subagent architecture means your orchestration layer needs to handle parallel task results rather than a single response stream. Design for asynchronous results, implement fallbacks to standard Sol if Ultra latency is unacceptable, and gate subagent calls behind confidence thresholds so you do not spend Sol-level compute on tasks Luna can handle.
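As a minimal sketch of the asynchronous pattern with a latency fallback, the following uses Python's standard `asyncio`. The two `call_*` coroutines are hypothetical stand-ins that simulate latency with `asyncio.sleep`; they are not OpenAI SDK functions, and no Ultra API shape has been published. Confidence-threshold gating is left out to keep the example short.

```python
import asyncio


async def call_sol_ultra(task):
    """Hypothetical stand-in for a slow Ultra request (simulated latency)."""
    await asyncio.sleep(0.2)
    return f"ultra:{task}"


async def call_sol(task):
    """Hypothetical stand-in for a standard Sol request (simulated latency)."""
    await asyncio.sleep(0.01)
    return f"sol:{task}"


async def run_with_fallback(task, ultra_timeout_s):
    """Try Ultra first; fall back to standard Sol if it exceeds the budget."""
    try:
        return await asyncio.wait_for(call_sol_ultra(task), timeout=ultra_timeout_s)
    except asyncio.TimeoutError:
        return await call_sol(task)


async def run_all(tasks, ultra_timeout_s):
    """Run every task concurrently and collect results in input order."""
    return await asyncio.gather(
        *(run_with_fallback(t, ultra_timeout_s) for t in tasks)
    )


# With a 50 ms budget the simulated Ultra call times out and Sol answers.
results = asyncio.run(run_all(["refactor", "triage"], ultra_timeout_s=0.05))
print(results)
```

`asyncio.gather` preserves input order, so the orchestration layer can match results back to subtasks without extra bookkeeping.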

Security teams should note that Sol was specifically highlighted by OpenAI for cybersecurity vulnerability detection, which is part of why the U.S. government was involved in the initial rollout. If your use case touches infrastructure security or access control reasoning, Sol is worth the premium, and it warrants evaluation against the threat landscape described in our agentjacking defense guide.

Equally important: GPT-5.6's arrival does not change the underlying distribution challenge. If your organisation does not already appear in AI-generated answers about your industry, raw model capability gains do not solve that visibility gap. Our Google AI Mode survival playbook and the broader AI agency work we run out of Lisbon address that layer separately from the model selection question.

If you want a structured review of which GPT-5.6 tier fits your current workflows, or help designing the routing and caching layer to minimise cost without sacrificing output quality, get in touch with Digiton.

Frequently Asked Questions

What is GPT-5.6 and when did it launch?

GPT-5.6 is a family of three models, Sol, Terra, and Luna, previewed by OpenAI on 26 June 2026. It launched in a restricted preview to approximately 20 partner organisations, with broader API and ChatGPT availability expected in the weeks following the initial announcement.

What is the difference between Sol, Terra, and Luna?

Sol is the flagship tier for complex reasoning, security, and agentic workflows, priced at $5 input / $30 output per million tokens. Terra is the balanced mid-tier at $2.50/$15, positioned as competitive with GPT-5.5 at half the cost. Luna is the fast, affordable tier at $1/$6, suited to high-volume tasks where speed and unit cost matter more than maximum intelligence.

How does GPT-5.6 pricing compare to GPT-5.5?

GPT-5.5 standard pricing is $5 input / $30 output per million tokens, identical to Sol. Terra at $2.50/$15 therefore represents a 50% cost reduction for workloads that were previously running on GPT-5.5 standard, though Terra scored lower on Terminal-Bench 2.1 (82.5% vs 88.0%), so test it on your own workload first. Luna at $1/$6 is an 80% cost reduction relative to GPT-5.5.

What is Sol Ultra mode?

Sol Ultra is a mode where GPT-5.6 Sol orchestrates multiple subagents rather than producing a single response. It scored 91.9% on Terminal-Bench 2.1, versus 88.8% for standard Sol. It is designed for the most demanding agentic tasks where parallel decomposition of subtasks improves quality. Exact pricing for Ultra mode has not been published by OpenAI.

What is the context window for GPT-5.6 Sol, Terra, and Luna?

OpenAI did not disclose context window sizes in the 26 June 2026 preview announcement. Reports of 1.5 million tokens are unconfirmed and appear to come from early-access user observations rather than official documentation. Treat context window figures as unconfirmed until OpenAI publishes general availability specifications.

Which GPT-5.6 model should I use for AI agents?

For orchestrator agents that make high-stakes decisions or coordinate complex multi-step workflows, use Sol, and consider Ultra mode for tasks that benefit from parallel subagent decomposition. For worker agents handling routine subtasks within a pipeline, Luna is typically sufficient and dramatically reduces per-call cost. Terra works well as a middle tier for agents with moderate complexity requirements.

Can I access GPT-5.6 through the API now?

As of the 26 June 2026 launch date, access is limited to approximately 20 trusted partner organisations. OpenAI has stated that broader availability through the API, ChatGPT, and Codex is planned 'in the coming weeks.' A Cerebras-hosted version of Sol offering 750 tokens per second is confirmed for July 2026 for select customers.

Is Terra a direct replacement for GPT-5.5?

Terra is positioned as a cost-efficient alternative to GPT-5.5 standard, offering 'competitive performance at approximately 2x lower cost' according to OpenAI's framing. On Terminal-Bench 2.1, Terra scored 82.5% versus GPT-5.5's 88.0%, so there is a measurable gap on agentic coding tasks. Evaluate on your specific workload before treating it as a drop-in replacement.

How does prompt caching work with GPT-5.6?

GPT-5.6 introduces a 30-minute minimum cache lifetime for prompt caching, longer than previous models. Cache writes are billed at 1.25x the model's standard uncached input rate, while cache reads receive a 90% discount on cached input pricing. For agent loops that re-send long system prompts on every turn, caching meaningfully reduces effective per-call cost.

Why did the U.S. government get early access to GPT-5.6?

OpenAI coordinated with the U.S. government before the preview release, citing cybersecurity concerns given Sol's advanced capability for vulnerability detection and security-focused agentic tasks. OpenAI stated this staged approach is a 'short-term step' and that it does not intend government-gated releases to become the long-term default for future model launches.

What benchmarks has GPT-5.6 been tested on?

OpenAI published results on Terminal-Bench 2.1, an agentic coding benchmark. Sol Ultra scored 91.9%, Sol scored 88.8%, Luna scored 84.3%, and Terra scored 82.5%. GPT-5.5 scored 88.0% on the same benchmark. Additional benchmark improvements on FrontierMath and other evaluations have been reported but are not yet officially confirmed by OpenAI.

How should I route traffic across Sol, Terra, and Luna to minimise cost?

The recommended approach is to classify task complexity before routing. Default new calls to Luna, escalate to Terra when the task requires structured reasoning or output quality above a threshold, and reserve Sol for tasks where the cost of an error outweighs the model premium. Modelling a 10%/30%/60% Sol/Terra/Luna split can reduce costs by over 60% compared to routing everything through Sol.
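The escalation rule described above can be written as a small decision function. This is a sketch under stated assumptions: the `Task` fields, the 0.8 quality threshold, and the way the Sol premium is compared against an estimated error cost are all hypothetical and would need calibrating against your own evaluation data.

```python
from dataclasses import dataclass


@dataclass
class Task:
    needs_structured_reasoning: bool
    required_quality: float   # 0..1, a hypothetical internal quality target
    error_cost_usd: float     # estimated business cost of a wrong answer


def choose_tier(task, quality_threshold=0.8, sol_premium_usd=1.0):
    """Default to Luna, escalate to Terra, and reserve Sol for costly errors.

    ASSUMPTION: `sol_premium_usd` is the extra spend per task you accept
    for Sol; both thresholds are placeholders, not published guidance.
    """
    if task.error_cost_usd > sol_premium_usd:
        return "sol"
    if task.needs_structured_reasoning or task.required_quality > quality_threshold:
        return "terra"
    return "luna"


print(choose_tier(Task(False, 0.5, 0.01)))   # routine tagging call
print(choose_tier(Task(True, 0.6, 0.10)))    # structured extraction
print(choose_tier(Task(True, 0.9, 500.0)))   # security triage
```

Checking the error cost first means a high-stakes task goes to Sol even when it looks simple on the other two signals.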
