Gemini 3.1: Beginner Guide

Rui Dai

Most "getting started" guides skip the part where you actually get started. They spend 600 words on history, throw a wall of model specs at you, then end with "happy prompting!" — and you're still staring at a blank input box. Not this one. By the end of this guide you'll have Gemini 3.1 running a real prompt, either in your browser or via API, with the right model picked for your use case. I've been testing all three models since the March 3, 2026 launch — here's what you actually need to know.

What Is Gemini 3.1?

Gemini 3.1 is the latest generation of Google's AI model family, launched on March 3, 2026. It's not one model — it's three, each designed for a different job. All of them run on the same foundation of state-of-the-art multimodal reasoning, handling text, images, audio, video, and code.

What's new in 3.1 versus earlier Gemini versions? Three things worth caring about: better reasoning depth, a new thinking_level parameter that lets you control how hard the model thinks (and how much it costs), and significantly improved agentic capabilities — meaning it can plan and execute multi-step tasks, not just answer single questions.

All Gemini 3.1 models are currently in preview. Production apps that were using Gemini 3 Pro Preview need to migrate — Google deprecated that model on March 9, 2026.

Flash-Lite vs Flash vs Pro — Which One to Use

Here's the decision table I wish existed when I started:

| Model | Model String | Best For | Input Price | Output Price | Context Window |
|---|---|---|---|---|---|
| Gemini 3.1 Flash-Lite | gemini-3.1-flash-lite-preview | Translation, moderation, high-volume tasks | $0.25/1M | $1.50/1M | 1M tokens |
| Gemini 3 Flash | gemini-3-flash-preview | Balanced tasks, agentic workflows, prototyping | ~$0.30/1M | ~$2.50/1M | 1M tokens |
| Gemini 3.1 Pro | gemini-3.1-pro-preview | Complex reasoning, coding, research synthesis | $2.00/1M | $18.00/1M | 1M tokens |

The honest rule of thumb: start with Flash-Lite if you're doing repetitive or high-volume work. Use Flash if you're prototyping something that needs a bit more reasoning. Reach for Pro only when the task genuinely needs it — it's roughly 12x more expensive than Flash-Lite on output tokens.
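If you want to put numbers on that rule of thumb before committing, a quick back-of-envelope script helps. This is a rough sketch using the preview prices from the table above — the `estimate_cost` helper is illustrative, not part of the SDK, and prices may change while the models are in preview:

```python
# Rough per-request cost estimate using the preview prices from the table above.
# Prices are USD per 1M tokens; update these if Google's pricing changes.
PRICES = {
    "gemini-3.1-flash-lite-preview": {"input": 0.25, "output": 1.50},
    "gemini-3-flash-preview": {"input": 0.30, "output": 2.50},
    "gemini-3.1-pro-preview": {"input": 2.00, "output": 18.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 2,000-token prompt that produces a 500-token answer.
lite = estimate_cost("gemini-3.1-flash-lite-preview", 2_000, 500)
pro = estimate_cost("gemini-3.1-pro-preview", 2_000, 500)
print(f"Flash-Lite: ${lite:.5f}  Pro: ${pro:.5f}  ratio: {pro / lite:.0f}x")
```

Run this with your own typical token counts: the blended ratio shifts with your input/output mix, which is exactly why per-workload math beats a single headline multiplier.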

One thing that tripped me up: Flash and Flash-Lite both have free tiers in the Gemini API. Pro does not have a free API tier, but you can test it for free inside Google AI Studio.

How to Use Gemini 3.1 in Google AI Studio

Google AI Studio is the fastest way to try Gemini 3.1 without writing a line of code. No billing setup, no SDK install — just a browser.

Step 1: Sign In

Go to aistudio.google.com. Sign in with any Google account. That's it. New users get a default Google Cloud project and API key created automatically on first login.

One note: if you already have a Google Cloud account, you won't get a default project auto-created. You'll need to import one via the Projects view in the dashboard.

Step 2: Select Your Model

Once you're in, click Create new prompt (or open an existing one). In the top-right panel, you'll see a model selector dropdown. As of March 2026, the available Gemini 3 series options are:

  • Gemini 3.1 Pro Preview
  • Gemini 3 Flash Preview
  • Gemini 3.1 Flash-Lite Preview

Select the one that fits your use case using the table above. If you're just exploring, start with Gemini 3 Flash — it's free in AI Studio and handles most general tasks well.

You'll also see a Thinking toggle and a thinking level selector (minimal / low / medium / high). For quick tests, leave it on the default. For tasks that involve logic or multi-step reasoning, bump it up to medium or high and watch the difference.

Step 3: Run Your First Prompt

Type your prompt in the message box and hit Run. Here are three prompts worth testing to get a feel for the model's range:

Simple text task:

Summarize the key differences between REST and GraphQL APIs in three bullet points.

Reasoning task (good for comparing thinking levels):

A pipeline processes 10,000 requests per hour. Each request costs $0.003 to process.
If we reduce processing cost by 40%, how much do we save per day? Show your work.

Multimodal task (upload an image first using the paperclip icon):

Describe what's happening in this image and suggest three ways to improve it.

You'll see the response stream in real time. In the top-right corner, AI Studio shows you token usage and estimated cost — useful for calibrating which model tier actually makes sense for your workload before you commit to API calls.
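For the reasoning prompt above, it helps to know the correct answer before you compare thinking levels. The expected arithmetic is simple enough to check yourself:

```python
# Sanity check for the reasoning prompt: what answer should the model reach?
requests_per_day = 10_000 * 24          # 240,000 requests per day
daily_cost = requests_per_day * 0.003   # $720 per day at $0.003 per request
daily_savings = daily_cost * 0.40       # a 40% cost reduction saves $288/day
print(f"Daily cost: ${daily_cost:.2f}, savings at 40%: ${daily_savings:.2f}")
```

If a model at minimal thinking gets this wrong but nails it at medium, you've just learned something concrete about where the thinking level matters for your workloads.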

How to Use Gemini 3.1 via API

If you're building something — even a quick script — the API is the right path. Here's the minimal setup.

Get Your API Key

Head to aistudio.google.com/app/apikey. Click Create API key, select your project, and copy the key. Set it as an environment variable so you don't hardcode it anywhere:

# macOS / Linux
export GEMINI_API_KEY="your-key-here"

# Windows (Command Prompt)
set GEMINI_API_KEY=your-key-here

The Gemini API key documentation recommends always using environment variables — never hardcoding keys in source files, especially anything going to version control.
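A small guard in your script saves you from a confusing authentication error on the first API call. This helper is my own convention, not part of the SDK — it just checks the variable you exported above:

```python
import os

def require_api_key(var: str = "GEMINI_API_KEY") -> str:
    """Fail fast with a clear message if the API key isn't in the environment."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running this script.")
    return key
```

Call `require_api_key()` at the top of your script; the SDK client will still read the variable itself, so you never pass the key around in code.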

Make Your First Call

Install the SDK. You need Python 3.9+:

pip install -q -U google-genai

Then make your first call. This example uses Flash-Lite, which has a free tier and is fast enough that you'll see results in a couple of seconds:

from google import genai

# Client picks up GEMINI_API_KEY from environment automatically
client = genai.Client()

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents="Explain the difference between synchronous and asynchronous programming in plain English.",
)

print(response.text)

Want to use Pro instead? Swap the model string:

response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="Analyze the architectural trade-offs between microservices and a monolith for a team of 5 engineers.",
)

print(response.text)

Want to control thinking depth? Add the thinking_config parameter — useful when you're optimizing for cost vs. quality on a specific task:

from google import genai
from google.genai import types

client = genai.Client()

# High thinking — for complex reasoning
response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents="Design a database schema for a multi-tenant SaaS app with role-based access control.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_level="high")
    ),
)

print(response.text)

The full reference for available parameters — including streaming, function calling, and context caching — lives in the Gemini API reference docs. If you're working in JavaScript, Go, or Java, the SDK libraries page has install commands and equivalent code samples for each language.

Quick note on the Batch API: If you're running non-time-sensitive workloads at scale (think: processing thousands of documents overnight), the Batch API gives you 50% off standard pricing with a 24-hour turnaround target. Worth knowing before you architect anything high-volume.
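To see whether that discount moves the needle for you, run the numbers on a representative job. A rough sketch, assuming Flash-Lite preview prices from the table above and a flat 50% batch discount — the `job_cost` helper is illustrative, not an SDK function:

```python
# Back-of-envelope: is the Batch API's 50% discount worth re-architecting for?
# Assumes Flash-Lite preview prices; swap in your own model's rates.
INPUT_PER_M, OUTPUT_PER_M = 0.25, 1.50   # USD per 1M tokens, standard pricing
BATCH_DISCOUNT = 0.50                     # Batch API: 50% off standard pricing

def job_cost(docs: int, in_tok: int, out_tok: int, batch: bool = False) -> float:
    """Estimated USD cost for `docs` documents of in_tok/out_tok tokens each."""
    cost = docs * (in_tok * INPUT_PER_M + out_tok * OUTPUT_PER_M) / 1_000_000
    return cost * (1 - BATCH_DISCOUNT) if batch else cost

# 10,000 documents overnight, ~3,000 input and 500 output tokens each
standard = job_cost(10_000, 3_000, 500)
batched = job_cost(10_000, 3_000, 500, batch=True)
print(f"Standard: ${standard:.2f}  Batch: ${batched:.2f}")
```

At small scale the absolute savings are modest; at millions of documents the same 50% becomes the difference between two budget lines.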

Which Method Is Right for You?

No hedging — here's a direct answer:

| If you... | Use this |
|---|---|
| Want to test a prompt quickly with no setup | Google AI Studio |
| Are building an application or script | Gemini API |
| Need Pro but want to test free before billing | AI Studio first, then API |
| Are running high-volume / batch workloads | API + Batch API |
| Work in an enterprise with compliance requirements | Vertex AI (contact Google Cloud) |

One practical workflow I've settled into: prototype in AI Studio with the Thinking panel open so I can see how the model reasons, tweak the prompt until the output is right, then move to the API with the exact model string and thinking level I've validated. Skips a lot of trial-and-error on the code side.

The Gemini API quickstart is genuinely good if you want to go deeper after this — it covers streaming, multimodal inputs, and function calling with clean code examples. That's the next logical step once you've got the basics running.

You might also find these useful:

What Is Gemini 3.1 Flash-Lite? A Plain-English Guide

Gemini 3.1 Pro: Eval Harness

Gemini 3.1 Pro: Debug Workflow

Gemini 3.1 Pro: Vertex Access

Gemini 3.1 Pro: Builder’s Model

Written by Rui Dai, Engineer

Hey there! I'm an engineer who tests, researches, and evaluates AI tools. I design experiments to assess model performance, benchmark large language models, and analyze multi-agent systems in real-world workflows. My focus is capturing first-hand insights through hands-on experimentation and exploring practical applications of cutting-edge AI.