How large is the context window?

Gemini 3 Pro supports up to a 2 million-token context window — the largest in the current frontier tier.

Can it understand video?

Yes — it natively reads and reasons over video, including timestamps and on-screen text, not just extracted frames.

How much does Gemini 3 Pro cost?

Pricing is around $7 per million input tokens and $21 per million output tokens — aggressive for the context size on offer.
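To make those rates concrete, here is a minimal sketch of the arithmetic for a single request. It assumes the approximate $7 / $21 per 1M token figures quoted above apply flat across the whole request; the function name `estimate_cost_usd` is illustrative and not part of any Google SDK.

```python
# Approximate rates quoted in this profile, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 7.0
OUTPUT_USD_PER_MILLION = 21.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the approximate cost of one request in USD.

    Hypothetical helper: assumes flat per-token pricing with no tiers,
    caching discounts, or per-modality surcharges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return round(input_cost + output_cost, 4)


if __name__ == "__main__":
    # A completely full 2M-token prompt with a 4,000-token answer.
    print(estimate_cost_usd(2_000_000, 4_000))
```

Under these assumptions, input tokens dominate the bill for long-context work: a full window costs about $14 in input alone, while a few thousand output tokens add only cents.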

By Google DeepMind

Released December 2025 · v3.0

Live

Gemini 3 Pro

Long-context multimodal model with 2M-token windows and native video understanding.

Context: 2M tokens
Pricing · in / out: $7 / $21 per 1M
Modality: Text + Image + Video

At a glance

Gemini 3 Pro is Google DeepMind's long-context multimodal model, released in December 2025. Its headline feature is a 2 million-token context window with native video understanding — it can watch hours of footage and reason across an entire document set in one request.

It is tightly integrated with Google Workspace and Vertex AI, which makes it a natural fit for teams already on Google Cloud. Pricing is aggressive for the context size, so large-context workloads cost less than they would on comparable frontier models.

Key abilities

2M-token context

The largest production context window in the frontier tier — enough for entire codebases, video transcripts, or document libraries in a single pass.

Native video understanding

Reads and reasons over video directly, including timestamps and on-screen text — useful for review, summarisation, and search.

Workspace & Vertex integration

First-class hooks into Google Workspace and Vertex AI, with grounding against Google Search.

Strong multimodal grounding

Accurate citation and grounding across mixed text, image, and video inputs.

How teams use it

Research

Whole-library analysis

Loading hundreds of documents into a single 2M-token window to synthesise findings with citations.

2M tokens in one request
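Before loading a document library into one request, it helps to check what actually fits. The sketch below is one way to do that under stated assumptions: the 2M-token limit is the figure quoted in this profile, the four-characters-per-token ratio is a rough heuristic for English text rather than the model's real tokenizer, and `estimate_tokens` and `select_documents` are hypothetical helper names.

```python
CONTEXT_WINDOW_TOKENS = 2_000_000  # figure quoted in this profile
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def select_documents(docs, reserve_for_output=8_000, budget=CONTEXT_WINDOW_TOKENS):
    """Greedily keep documents, in the given order, while they fit.

    docs maps a document name to its text. Returns the names included,
    the names skipped, and the estimated tokens still unused.
    """
    remaining = budget - reserve_for_output
    included, skipped = [], []
    for name, text in docs.items():
        cost = estimate_tokens(text)
        if cost <= remaining:
            included.append(name)
            remaining -= cost
        else:
            skipped.append(name)
    return included, skipped, remaining


if __name__ == "__main__":
    library = {
        "annual_report": "x" * 4_000_000,  # about 1M estimated tokens
        "interviews": "y" * 4_000_000,     # about 1M estimated tokens
        "memo": "z" * 400,                 # about 100 estimated tokens
    }
    print(select_documents(library))
```

A real pipeline would replace the heuristic with an exact token count from the provider, but a cheap estimate like this is usually enough to decide whether a library needs to be split across several requests.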

Media

Video understanding & search

Transcribing, summarising, and making long-form video searchable by content and on-screen text.

Hours of footage reasoned over per call

Operations

Workspace automation

Drafting in Docs, analysing Sheets, and triaging Gmail with native Workspace context.

Native Google Workspace integration

Drawbacks

Ecosystem lock-in

Best value inside Google Cloud. Much of the advantage comes from Workspace and Vertex integration; outside that ecosystem the edge narrows.

Latency

Large contexts are slow. Filling the 2M-token window adds real latency, so it is better for back-office analysis than sub-second chat.

Closed weights

No self-hosting. Like other closed frontier models, Gemini 3 Pro cannot be self-hosted, and its weights are not available for direct fine-tuning.
