Models and pricing

Every model you can put in skill.yaml, plus how media generation is billed.

Two surfaces are billed: agentic skills (per token, by the model in skill.yaml) and media generation (per call / second / megapixel, by the model slug you pass to media.run()). Deterministic Python skills are free at the platform level — you only pay for what they call into.

Money is tracked in micros (1 USD = 1,000,000 micros); the dashboard and API responses convert back to dollars.
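As a rough illustration of the micros convention, the conversion can be sketched as below. The helper names `usd_to_micros` and `micros_to_usd` are hypothetical and not part of the puras SDK; only the 1,000,000 factor comes from the text above.

```python
from decimal import Decimal

# From the text above: 1 USD = 1,000,000 micros.
MICROS_PER_USD = 1_000_000


def usd_to_micros(usd: str) -> int:
    """Convert a dollar amount given as a string (e.g. "3.60") to integer micros.

    Hypothetical helper; Decimal avoids binary floating-point rounding.
    """
    return int(Decimal(usd) * MICROS_PER_USD)


def micros_to_usd(micros: int) -> str:
    """Format integer micros as a dollar string with six decimal places."""
    return f"{Decimal(micros) / MICROS_PER_USD:.6f}"


print(usd_to_micros("3.60"))  # 3600000
print(micros_to_usd(1))       # 0.000001
```

Keeping amounts as integers means sums over many small charges stay exact, which is presumably why the platform tracks money this way.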

Agentic skill models

The model: field in skill.yaml takes a public slug in family/variant form. Set it to one of the slugs listed in the tables below, for example:

```yaml
# skills/my-skill/skill.yaml
model: claude/sonnet-4-7
```

Three families are available — Claude, GPT, and Gemini. Pick by need; you don't need to know how each one is served.

Claude — prices

Per 1 million tokens, rounded to the cent.

| Slug | Family | Input | Output | Notes |
| --- | --- | --- | --- | --- |
| claude/opus-4-7 | Opus 4.7 | $18.00 | $90.00 | Highest reasoning, 1M context tier available |
| claude/opus-4-6 | Opus 4.6 | $18.00 | $90.00 | |
| claude/opus-4-5 | Opus 4.5 | $18.00 | $90.00 | |
| claude/sonnet-4-7 | Sonnet 4.7 | $3.60 | $18.00 | Balanced; recommended default |
| claude/sonnet-4-6 | Sonnet 4.6 | $3.60 | $18.00 | |
| claude/sonnet-4-5 | Sonnet 4.5 | $3.60 | $18.00 | |
| claude/haiku-4-5 | Haiku 4.5 | $0.30 | $1.50 | Fastest, cheapest; good for narrow tools |

Vision and PDF attachments are supported on every Claude slug above. See agent-attachments for how attachments flow through.

GPT — prices

Per 1 million tokens, rounded to the cent.

| Slug | Family | Input | Output | Notes |
| --- | --- | --- | --- | --- |
| gpt/5 | GPT-5 | $1.50 | $12.00 | Strong general reasoning |
| gpt/5-mini | GPT-5 mini | $0.30 | $2.40 | Cheap; good for narrow tools |
| gpt/4o | GPT-4o | $3.00 | $12.00 | Mature, multimodal |
| gpt/4o-mini | GPT-4o mini | $0.18 | $0.72 | Cheapest in the catalog |

Vision (image) attachments supported. PDF attachments are not supported — convert pages to images first.

Gemini — prices

Per 1 million tokens, rounded to the cent.

| Slug | Family | Input | Output | Notes |
| --- | --- | --- | --- | --- |
| gemini/2.5-pro | 2.5 Pro | $1.50 | $12.00 | Long-context reasoning |
| gemini/2.5-flash | 2.5 Flash | $0.09 | $0.36 | Fast, very cheap |
| gemini/2.0-flash | 2.0 Flash | $0.12 | $0.48 | Older fast tier |

Vision (image) attachments supported. PDF attachments are not supported — convert pages to images first.
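The attachment rules stated for the three families can be captured in a small lookup, which may be handy for choosing a model before sending a PDF. This is only a sketch: `supports_attachment` is a hypothetical helper, not an SDK function, and it encodes nothing beyond the support notes on this page.

```python
# Attachment support per family, as stated in the notes on this page.
IMAGE_FAMILIES = {"claude", "gpt", "gemini"}
PDF_FAMILIES = {"claude"}


def supports_attachment(slug: str, kind: str) -> bool:
    """Return True if the family of `slug` accepts the attachment kind.

    `kind` is "image" or "pdf". Hypothetical helper for illustration only.
    """
    family = slug.split("/", 1)[0]
    if kind == "image":
        return family in IMAGE_FAMILIES
    if kind == "pdf":
        return family in PDF_FAMILIES
    raise ValueError(f"unknown attachment kind: {kind!r}")


print(supports_attachment("claude/haiku-4-5", "pdf"))  # True
print(supports_attachment("gpt/5", "pdf"))             # False
```

When the check returns False for a PDF, the documented workaround is to convert the pages to images first.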

Indicative rates. Final cost per job is shown on the job detail page based on the actual tokens used.
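To get a feel for what a job might cost, the per-million-token rates can be applied to token counts as in the sketch below. The rates are copied from the tables above (a subset only); the function name and the rounding to whole micros are assumptions, and the job detail page remains the authoritative figure.

```python
# Indicative (input, output) rates in micros per 1M tokens,
# copied from the tables above. Only a few slugs are included here.
RATES_MICROS_PER_MTOK = {
    "claude/sonnet-4-7": (3_600_000, 18_000_000),
    "claude/haiku-4-5": (300_000, 1_500_000),
    "gpt/4o-mini": (180_000, 720_000),
    "gemini/2.5-flash": (90_000, 360_000),
}


def estimate_cost_micros(slug: str, input_tokens: int, output_tokens: int) -> int:
    """Estimate the cost of one job in micros from its token counts.

    Rounding half up to whole micros is an assumption of this sketch.
    """
    in_rate, out_rate = RATES_MICROS_PER_MTOK[slug]
    total = input_tokens * in_rate + output_tokens * out_rate
    return (total + 500_000) // 1_000_000


# 12,000 input + 2,000 output tokens on Sonnet 4.7
print(estimate_cost_micros("claude/sonnet-4-7", 12_000, 2_000))  # 79200
```

The example works out to 79,200 micros, or about $0.0792: 12,000 input tokens at $3.60 per million is $0.0432, and 2,000 output tokens at $18.00 per million is $0.0360.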

Defaults and fallbacks

  • If model: is omitted from skill.yaml, the platform default is used (claude/sonnet-4-6 today). The fallback is stable per deployment — you'll see it on the agent_start event.
  • Unknown slugs are rejected at deploy time with a clear error listing the available slugs.
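A local pre-check can catch a mistyped slug before deploying. The sketch below mirrors the two rules above under stated assumptions: `resolve_model` is a hypothetical helper, the slug list is copied from the tables on this page, and the default reflects what the text says is used today. The platform's own deploy-time validation is what actually applies.

```python
from typing import Optional

# Slugs copied from the price tables on this page.
KNOWN_SLUGS = (
    "claude/opus-4-7", "claude/opus-4-6", "claude/opus-4-5",
    "claude/sonnet-4-7", "claude/sonnet-4-6", "claude/sonnet-4-5",
    "claude/haiku-4-5",
    "gpt/5", "gpt/5-mini", "gpt/4o", "gpt/4o-mini",
    "gemini/2.5-pro", "gemini/2.5-flash", "gemini/2.0-flash",
)

# Platform default per the text above; it may change over time.
PLATFORM_DEFAULT = "claude/sonnet-4-6"


def resolve_model(slug: Optional[str]) -> str:
    """Return the slug to use, falling back to the default when omitted.

    Raises ValueError listing the available slugs when the slug is unknown.
    """
    if slug is None:
        return PLATFORM_DEFAULT
    if slug not in KNOWN_SLUGS:
        available = ", ".join(KNOWN_SLUGS)
        raise ValueError(f"unknown model slug {slug!r}; available: {available}")
    return slug


print(resolve_model(None))     # claude/sonnet-4-6
print(resolve_model("gpt/5"))  # gpt/5
```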

Media generation

media.run(model, inputs) invokes any model we have a registered cost for. The full catalog of slugs and their rates is on the pricing page and is also available from GET /v1/pricing. Cost is static per model, and unknown slugs are rejected with HTTP 400, so a request can't slip through unpriced.

Four pricing shapes — most models use exactly one:

| Shape | Formula | Used for |
| --- | --- | --- |
| Per call | flat fee | Image generators billed per image |
| Per second | rate × inference_time (or input duration for video/audio) | Video and audio generation |
| Per megapixel | rate × (width × height ÷ 1,000,000) | Upscalers, some image models |
| Input-conditional | rate table indexed by inputs (audio on/off, quality × size, with/without video reference) | GPT Image 2, Kling v3 video, Veo 3, Seedance r2v |

Cost is debited at job-completion time and shown on the job detail page.
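The four pricing shapes can be written out as small functions to make the arithmetic concrete. This is a sketch under stated assumptions: every rate in the usage lines is a made-up placeholder rather than a real price, the function names are hypothetical, and rounding to whole micros is assumed. Real rates come from the pricing page or GET /v1/pricing.

```python
def per_call(fee_micros: int) -> int:
    """Per call: a flat fee regardless of inputs."""
    return fee_micros


def per_second(rate_micros_per_s: int, seconds: float) -> int:
    """Per second: rate x inference time (or input duration)."""
    return round(rate_micros_per_s * seconds)


def per_megapixel(rate_micros_per_mp: int, width: int, height: int) -> int:
    """Per megapixel: rate x (width x height / 1,000,000)."""
    return round(rate_micros_per_mp * (width * height) / 1_000_000)


def input_conditional(rate_table: dict, key, units: float = 1.0) -> int:
    """Input-conditional: look up the rate by an input-derived key, then scale."""
    return round(rate_table[key] * units)


# Placeholder rates only, NOT real prices.
print(per_second(50_000, 5))              # 250000
print(per_megapixel(10_000, 2048, 2048))  # 41943

# Hypothetical per-second rates keyed by whether audio is generated.
audio_rates = {True: 400_000, False: 200_000}
print(input_conditional(audio_rates, True, 5))  # 2000000
```

In the megapixel example, a 2048 × 2048 image is about 4.19 megapixels, so a placeholder rate of 10,000 micros per megapixel yields 41,943 micros.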

Most model families ship multiple slugs — one per variant. The naming convention:

  • -t2v, -i2v, -r2v → text-to-video, image-to-video, reference-to-video
  • -edit → image-to-image edit
  • -fast- segment → the fast tier of that variant (lower per-second rate, slightly lower quality)
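The naming convention above is regular enough to read mechanically. The sketch below splits a slug into segments and reports the mode suffix and whether a fast segment is present; `describe_variant` is a hypothetical helper, and it assumes the suffix is always the last hyphen-separated segment, as in the slugs shown on this page.

```python
# Suffix meanings taken from the naming convention above.
MODES = {
    "t2v": "text-to-video",
    "i2v": "image-to-video",
    "r2v": "reference-to-video",
    "edit": "image-to-image edit",
}


def describe_variant(slug: str) -> dict:
    """Infer the variant of a media slug from its name segments.

    Returns {"mode": <description or None>, "fast": <bool>}.
    """
    name = slug.split("/", 1)[-1]  # drop the vendor prefix, if any
    parts = name.split("-")
    return {
        "mode": MODES.get(parts[-1]),  # None when no known suffix is present
        "fast": "fast" in parts,       # matches a whole "fast" segment only
    }


print(describe_variant("bytedance/seedance-2-fast-i2v"))
# {'mode': 'image-to-video', 'fast': True}
print(describe_variant("openai/gpt-image-2"))
# {'mode': None, 'fast': False}
```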

Example

```python
from puras import media

# Per-call image, default high@1024x1024
img = media.run("openai/gpt-image-2", {"prompt": "a red bicycle"})

# Cheap edit
cheap = media.run("bytedance/seedream-v4-edit", {
    "image_url": img["output_url"],
    "prompt": "make it neon",
})

# Per-second video with audio on (billed at the audio-on rate)
clip = media.run("google/veo-3-t2v", {"prompt": "...", "duration": 5, "generate_audio": True})

# Image-to-video, fast tier
vid = media.run(
    "bytedance/seedance-2-fast-i2v",
    {"image_url": img["output_url"], "prompt": "spin slowly", "duration": 8},
)
```

The dashboard's Usage tab breaks down spend by model for every billable call so you can see exactly what cost what.

Free things

  • Function execution (deterministic Python skills) and worker time.
  • Drive uploads / signed URLs / list calls.
  • MCP server, dashboard, API key checks.
  • The web_search, image_search, web_fetch, download_url, bash, and file_read agent tools, at the platform level. You pay only for what they trigger: for example, a media.run call inside a tool is billed normally.

Conventions

  • Default to Sonnet unless you have a reason. Opus's premium pays off for genuinely hard reasoning; Haiku and the mini tiers shine for narrow, structured tools where you can keep the prompt tight.
  • Budget caps live on the project, not the skill. A runaway agent will burn the project's balance, not just one skill's. Set per-project balance limits in the dashboard.

See concepts for how usage rolls up into the billing surface, and sdk-media for the full media.run contract.