Models and pricing

Two surfaces are billed: agentic skills (per token, by the model in skill.yaml) and media generation (per call / second / megapixel, by the model slug you pass to media.run()). Deterministic Python skills are free at the platform level — you only pay for what they call into.

Money is tracked in MICROS (1 USD = 1,000,000 micros); the dashboard and API responses convert back to dollars.
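As a minimal sketch of the conversion described above, the helpers below move between dollars and micros. The function names are hypothetical and are not part of the platform SDK; `Decimal` is used only to avoid float rounding when parsing dollar strings.

```python
from decimal import Decimal

MICROS_PER_USD = 1_000_000  # from the text: 1 USD = 1,000,000 micros


def usd_to_micros(usd: str) -> int:
    """Convert a dollar amount given as a string (e.g. "3.60") to integer micros."""
    return int(Decimal(usd) * MICROS_PER_USD)


def micros_to_usd(micros: int) -> Decimal:
    """Convert integer micros back to a dollar amount."""
    return Decimal(micros) / MICROS_PER_USD


print(usd_to_micros("3.60"))     # 3600000
print(micros_to_usd(1_250_000))  # 1.25
```

Keeping amounts as integers means sums over many small charges do not accumulate floating-point error.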

Agentic skill models

The model: field in skill.yaml takes a public slug in family/variant form. Set it to one of the slugs listed in the tables below, for example:

yaml

# skills/my-skill/skill.yaml
model: claude/sonnet-4-7

Three families are available — Claude, GPT, and Gemini. Pick by need; you don't need to know how each one is served.

Claude — prices

Per 1 million tokens, rounded to the cent.

Slug	Model	Input	Output	Notes
`claude/opus-4-7`	Opus 4.7	$18.00	$90.00	Highest reasoning, 1M context tier available
`claude/opus-4-6`	Opus 4.6	$18.00	$90.00
`claude/opus-4-5`	Opus 4.5	$18.00	$90.00
`claude/sonnet-4-7`	Sonnet 4.7	$3.60	$18.00	Balanced — recommended default
`claude/sonnet-4-6`	Sonnet 4.6	$3.60	$18.00
`claude/sonnet-4-5`	Sonnet 4.5	$3.60	$18.00
`claude/haiku-4-5`	Haiku 4.5	$0.30	$1.50	Fastest, cheapest — good for narrow tools

Vision and PDF attachments are supported on every Claude slug above. See agent-attachments for how attachments flow through.

GPT — prices

Per 1 million tokens, rounded to the cent.

Slug	Model	Input	Output	Notes
`gpt/5`	GPT-5	$1.50	$12.00	Strong general reasoning
`gpt/5-mini`	GPT-5 mini	$0.30	$2.40	Cheap; good for narrow tools
`gpt/4o`	GPT-4o	$3.00	$12.00	Mature, multimodal
`gpt/4o-mini`	GPT-4o mini	$0.18	$0.72	Cheapest GPT slug

Vision (image) attachments supported. PDF attachments are not supported — convert pages to images first.

Gemini — prices

Per 1 million tokens, rounded to the cent.

Slug	Model	Input	Output	Notes
`gemini/2.5-pro`	2.5 Pro	$1.50	$12.00	Long-context reasoning
`gemini/2.5-flash`	2.5 Flash	$0.09	$0.36	Fast, very cheap
`gemini/2.0-flash`	2.0 Flash	$0.12	$0.48	Older fast tier

Vision (image) attachments supported. PDF attachments are not supported — convert pages to images first.

The rates above are indicative. The final cost of each job is shown on the job detail page, based on the tokens actually used.
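To get a rough feel for what a job might cost, the per-million rates can be applied to token counts. The sketch below is an estimate only: the rate table is a hand-copied subset of the tables above, the function name is hypothetical, and flooring to whole micros is an assumption (the platform's actual rounding is not documented here).

```python
# (input, output) rates in micros per 1M tokens, copied from the tables above.
# Indicative only; 1 USD = 1,000,000 micros.
RATES_MICROS_PER_MTOK = {
    "claude/sonnet-4-7": (3_600_000, 18_000_000),
    "claude/haiku-4-5": (300_000, 1_500_000),
    "gpt/5-mini": (300_000, 2_400_000),
    "gemini/2.5-flash": (90_000, 360_000),
}


def estimate_cost_micros(slug: str, input_tokens: int, output_tokens: int) -> int:
    """Estimate job cost in micros; floors to a whole micro (an assumption)."""
    in_rate, out_rate = RATES_MICROS_PER_MTOK[slug]
    total = input_tokens * in_rate + output_tokens * out_rate
    return total // 1_000_000


# 12,000 input tokens and 2,000 output tokens on two different tiers.
print(estimate_cost_micros("claude/sonnet-4-7", 12_000, 2_000))  # 79200 (about $0.08)
print(estimate_cost_micros("claude/haiku-4-5", 12_000, 2_000))   # 6600 (under $0.01)
```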

Defaults and fallbacks

If model: is omitted from skill.yaml, the platform default is used (claude/sonnet-4-6 today). The fallback is stable per deployment — you'll see it on the agent_start event.
Unknown slugs are rejected at deploy time with a clear error listing the available slugs.
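The two rules above can be pictured as a small resolution step. This is an illustrative sketch, not the platform's deploy-time code: `resolve_model`, the slug set, and the error wording are all assumptions, and the default shown is the one quoted in the text as current today.

```python
# Slugs copied from the pricing tables on this page.
AVAILABLE_SLUGS = {
    "claude/opus-4-7", "claude/opus-4-6", "claude/opus-4-5",
    "claude/sonnet-4-7", "claude/sonnet-4-6", "claude/sonnet-4-5",
    "claude/haiku-4-5",
    "gpt/5", "gpt/5-mini", "gpt/4o", "gpt/4o-mini",
    "gemini/2.5-pro", "gemini/2.5-flash", "gemini/2.0-flash",
}
DEFAULT_SLUG = "claude/sonnet-4-6"  # platform default named in the text


def resolve_model(skill_config: dict) -> str:
    """Return the slug to use, falling back to the default when model is omitted."""
    slug = skill_config.get("model") or DEFAULT_SLUG
    if slug not in AVAILABLE_SLUGS:
        available = ", ".join(sorted(AVAILABLE_SLUGS))
        raise ValueError(f"Unknown model slug {slug!r}. Available: {available}")
    return slug


print(resolve_model({}))                  # claude/sonnet-4-6
print(resolve_model({"model": "gpt/5"}))  # gpt/5
```

Running a check like this locally before deploying can catch a mistyped slug early; the deploy-time check remains the source of truth.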

Media generation

media.run(model, inputs) invokes any model we have a registered cost for. The full catalog of slugs and their rates is on the pricing page and is also available from GET /v1/pricing. Rates are static per model, so unknown slugs are rejected with HTTP 400 and a request can't slip through unpriced.

Four pricing shapes — most models use exactly one:

Shape	Formula	Used for
Per call	flat fee	Image generators billed per image
Per second	rate × `inference_time` (or input duration for video/audio)	Video and audio generation
Per megapixel	rate × (width × height ÷ 1,000,000)	Upscalers, some image models
Input-conditional	rate table indexed by inputs (audio on/off, quality × size, with/without video reference)	GPT Image 2, Kling v3 video, Veo 3, Seedance r2v

Cost is debited at job-completion time and shown on the job detail page.
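The four shapes in the table above can be sketched as small functions. Everything here is illustrative: the function names are hypothetical, the rates are made-up numbers rather than catalog prices, and the rounding behavior is an assumption.

```python
def per_call(flat_fee_micros: int) -> int:
    """Flat fee, independent of the inputs."""
    return flat_fee_micros


def per_second(rate_micros: int, seconds: float) -> int:
    """rate x billed seconds (inference_time, or input duration for video/audio)."""
    return round(rate_micros * seconds)


def per_megapixel(rate_micros: int, width: int, height: int) -> int:
    """rate x (width x height / 1,000,000), floored to whole micros."""
    return rate_micros * width * height // 1_000_000


def input_conditional(rate_table: dict, key) -> int:
    """Pick the rate that matches the request's inputs."""
    return rate_table[key]


# Hypothetical rates in micros, for illustration only.
audio_rates = {True: 400_000, False: 200_000}  # per second, keyed by generate_audio

print(per_call(40_000))                                    # 40000
print(per_second(50_000, 8))                               # 400000
print(per_megapixel(10_000, 2048, 2048))                   # 41943
print(per_second(input_conditional(audio_rates, True), 5)) # 2000000
```

The last line shows how an input-conditional model can still be billed per second: the inputs select the rate, and the duration scales it.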

Most model families ship multiple slugs — one per variant. The naming convention:

-t2v, -i2v, -r2v → text-to-video, image-to-video, reference-to-video
-edit → image-to-image edit
-fast- segment → the fast tier of that variant (lower per-second rate, slightly lower quality)
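As a rough illustration of the naming convention, a slug can be split into its parts and read back. The helper below is hypothetical and only understands the suffixes listed above; it assumes the variant is the last hyphen-separated segment and that fast appears as an inner segment.

```python
VARIANT_SUFFIXES = {
    "t2v": "text-to-video",
    "i2v": "image-to-video",
    "r2v": "reference-to-video",
    "edit": "image-to-image edit",
}


def describe_slug(slug: str) -> dict:
    """Split 'provider/name' and read the variant suffix and fast-tier marker."""
    provider, _, name = slug.partition("/")
    parts = name.split("-")
    return {
        "provider": provider,
        "name": name,
        "variant": VARIANT_SUFFIXES.get(parts[-1]),  # None when no known suffix
        "fast": "fast" in parts[1:-1],               # inner -fast- segment only
    }


print(describe_slug("bytedance/seedance-2-fast-i2v"))
print(describe_slug("openai/gpt-image-2"))
```

For the first slug this reports the image-to-video variant on the fast tier; for the second, no variant suffix is recognized and fast is False.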

Example

python

from puras import media

# Image at the default quality and size (high @ 1024x1024); the rate comes from the input-conditional table
img = media.run("openai/gpt-image-2", {"prompt": "a red bicycle"})

# Cheap edit
cheap = media.run("bytedance/seedream-v4-edit", {
    "image_url": img["output_url"],
    "prompt": "make it neon",
})

# Per-second video with audio on (billed at the audio-on rate)
clip = media.run("google/veo-3-t2v", {"prompt": "...", "duration": 5, "generate_audio": True})

# Image-to-video, fast tier
vid = media.run(
    "bytedance/seedance-2-fast-i2v",
    {"image_url": img["output_url"], "prompt": "spin slowly", "duration": 8},
)

The dashboard's Usage tab breaks down spend by model for every billable call so you can see exactly what cost what.

Free things

Function execution (deterministic Python skills) and worker time.
Drive uploads / signed URLs / list calls.
MCP server, dashboard, API key checks.
web_search, image_search, web_fetch, download_url, bash, file_read agent tools at the platform level (you only pay for what they trigger — e.g. a media.run inside a tool is billed normally).

Conventions

Default to Sonnet unless you have a reason. Opus's premium pays off for genuinely hard reasoning; Haiku and the mini tiers shine for narrow, structured tools where you can keep the prompt tight.
Budget caps live on the project, not the skill. A runaway agent will burn the project's balance, not just one skill's. Set per-project balance limits in the dashboard.

See concepts for how usage rolls up into the billing surface, and sdk-media for the full media.run contract.