
Agent tools reference

Per-tool spec for every built-in tool the skill agent sees at runtime (bash, file_read, media, web_*). Auto-generated from the worker tool specs.

Every agentic skill (one with a markdown entrypoint) runs an Anthropic Messages loop with this set of built-in tools available, plus any user tools the skill declares under tools: and the auto-injected set_output tool when an output_schema is set.

The built-ins are platform-provided — you don't declare them in skill.yaml. Only bash can be turned off (via disable_bash: true); the rest are always on.

For guidance on which tool the agent should reach for when, and for the broader attachment model, see agent-attachments. For media model slugs and pricing, see media-models-reference. For the deterministic-skill equivalents (puras.media.run, etc.), see sdk-media and inputs-and-drive.

Index

| Tool | Kind |
| --- | --- |
| bash | shell |
| media | generate |
| web_search | web |
| image_search | web |
| web_fetch | web |
| file_read | file → context |
| download_url | url → drive |
| set_output | lifecycle |

Built-in tools

bash

Run a shell command in the job's working directory. The current directory contains a drive/ folder for files that should persist across jobs (synced to project storage). Anything written elsewhere is ephemeral. Returns combined stdout+stderr (last 8 KB) and the exit code.

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| command | string | | | Shell command to execute |
| timeout | integer | | | Max seconds (default 60, hard ceiling 600) |

Environment

  • cwd = the job's working directory. A drive/ subdirectory is synced to project storage — anything written there persists across jobs; everything else is ephemeral.
  • $PURAS_DEPLOYMENT_ROOT points at the deployment bundle.
  • $PYTHONPATH includes the deployment root and the workdir, so python -c "import <your_module>" works against bundled code.
  • Project secrets are injected as env vars (see concepts).
  • The skill's venv bin/ is on $PATH, so installed CLIs work.
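As a rough illustration of the environment described above, a Python script launched through bash could write its results under drive/ so they survive the job. This is only a sketch: the helper name persist_to_drive and the file logs/summary.txt are hypothetical, and only the drive/ convention comes from the list above.

```python
from pathlib import Path


def persist_to_drive(relative_path: str, text: str, workdir: str = ".") -> Path:
    """Write text under <workdir>/drive/ so it is synced to project storage.

    Hypothetical helper: only the drive/ convention is documented.
    """
    target = Path(workdir) / "drive" / relative_path
    # Create intermediate folders such as drive/logs/ if they do not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

Called as persist_to_drive("logs/summary.txt", "done\n") from the job's working directory, this would leave the file in drive/logs/, whereas the same write outside drive/ would be discarded when the job ends.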

Output

Combined stdout+stderr, last 8 KB only. Use redirection (> drive/log.txt) for larger output.
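To make the tail behaviour concrete, the sketch below shows one way "last 8 KB only" could work. It assumes 8 KB means 8192 bytes and that truncation is byte-based; neither detail is confirmed here, and tail_output is a hypothetical name rather than a platform function.

```python
TAIL_LIMIT = 8 * 1024  # assumption: 8 KB = 8192 bytes


def tail_output(combined: bytes, limit: int = TAIL_LIMIT) -> bytes:
    """Keep only the last `limit` bytes of combined stdout+stderr."""
    if len(combined) <= limit:
        return combined
    # Everything before the final `limit` bytes is dropped.
    return combined[-limit:]
```

The practical consequence is that the beginning of a long log is what gets lost, which is why redirecting to a file under drive/ is the safer choice for verbose commands.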

Disable

Set disable_bash: true in skill.yaml to remove bash from the agent's tool list — useful for skills that should only call user-defined tools.

media

Generate media (image, video, audio) by calling a registered model. Pass a model slug (e.g. 'openai/gpt-image-2', 'bytedance/seedream-v4-edit', 'kuaishou/kling-v3-i2v', 'bytedance/seedance-2-fast-t2v') and the inputs that model expects — we pass them through unchanged. Returns the drive path where the file was saved plus a fresh signed URL.

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| inputs | object | | | Inputs sent straight to the upstream model. Bring whatever the chosen model wants: 'prompt', 'image_url', 'duration', 'aspect_ratio', 'num_images', etc. |
| model | string | | | Registered model slug. Images: 'openai/gpt-image-2' (+ '-edit'), 'bytedance/seedream-v4' (+ '-edit'), 'google/imagen-4', 'kuaishou/kling-v3-image'. Video families with t2v/i2v/r2v + fast variants: 'bytedance/seedance-2-{t2v,i2v,r2v}[-fast]', 'kuaishou/kling-v3-{t2v,i2v}', 'google/veo-3-{t2v,i2v}[-fast]'. Pricing for some slugs is input-conditional: e.g. Kling/Veo charge more with generate_audio: true; the Seedance r2v price drops 40% when a video reference is supplied |
| kind | enum | | | Hint for content-type/extension. Default 'auto'. Values: "image", "video", "audio", "auto" |
| output_path | string | | | Optional drive subpath (e.g. 'renders/spin.mp4') |
| output_url_path | string | | | Optional jq-style path to the output URL in the upstream response (e.g. 'video.url', 'images[0].url'). Only set this if auto-detect picks the wrong field |

The output file is saved to the project drive and a fresh signed URL is returned. The agent typically passes the drive_path back to the user or follows up with file_read to look at the result itself.
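For orientation, here is a sketch of how a media tool input could be assembled from the fields in the table. The helper build_media_input is hypothetical, and the keys inside inputs are illustrative only; the per-slug schemas in media-models-reference are the authority on what each model accepts.

```python
from typing import Any, Optional

# Values listed for the `kind` field in the inputs table.
ALLOWED_KINDS = {"image", "video", "audio", "auto"}


def build_media_input(
    model: str,
    inputs: dict[str, Any],
    kind: str = "auto",
    output_path: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a media tool input (hypothetical helper, not a platform API)."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"kind must be one of {sorted(ALLOWED_KINDS)}")
    call: dict[str, Any] = {"model": model, "inputs": inputs, "kind": kind}
    if output_path is not None:
        # Optional drive subpath; omitted entirely when not requested.
        call["output_path"] = output_path
    return call


example = build_media_input(
    "kuaishou/kling-v3-i2v",
    # Illustrative inputs only; check the per-slug schema before relying on them.
    {"prompt": "slow turntable spin", "image_url": "https://example.com/frame.png"},
    kind="video",
    output_path="renders/spin.mp4",
)
```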

See also

  • media-models-reference — per-slug input schemas.
  • sdk-media — the deterministic-skill equivalent, puras.media.run(slug, **inputs). Same upstream call — just from Python rather than as an agent tool.

web_search

Search the web via the platform's search provider. Returns a list of results with title, url, and a short snippet.

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| query | string | | | Search query |
| max_results | integer | | 5 | Max results to return (1-20, default 5) |

Returns a list of {title, url, snippet}. To then load a result's full text, follow up with web_fetch.
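The search-then-fetch follow-up can be sketched as below. The result shape {title, url, snippet} is taken from the description above; the helper names pick_urls and to_web_fetch_inputs are hypothetical and only illustrate how one set of results might be turned into web_fetch inputs.

```python
def pick_urls(results: list[dict], limit: int = 3) -> list[str]:
    """Return up to `limit` distinct URLs from web_search results, in order."""
    urls: list[str] = []
    for result in results:
        if len(urls) >= limit:
            break
        url = result.get("url")
        # Skip entries without a URL and repeated URLs.
        if url and url not in urls:
            urls.append(url)
    return urls


def to_web_fetch_inputs(urls: list[str], max_chars: int = 20000) -> list[dict]:
    """Build one web_fetch input per URL, using the documented default size."""
    return [{"url": url, "max_chars": max_chars} for url in urls]
```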

image_search

Search for images on the web via the platform's search provider. Returns image URLs, thumbnails, dimensions, and source pages.

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| query | string | | | Image search query |
| max_results | integer | | 5 | Max results to return (1-20, default 5) |

Returns image URLs, thumbnails, dimensions, and source pages. To actually look at one of the images, follow up with download_url + file_read — see agent-attachments for the canonical search → download → look pattern.
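A minimal sketch of the download-then-look half of that pattern follows. It takes an image URL as a plain string because the exact field names in image_search results are not spelled out here. The {"tool": ..., "input": ...} wrapper and the helper name are hypothetical and are used only to show the order of the two calls.

```python
def download_then_look(image_url: str, save_as: str) -> list[dict]:
    """Plan a download_url call followed by a file_read call for one image."""
    if save_as.endswith("/"):
        # With a directory path the filename is inferred at download time,
        # so the file_read path would not be known in advance.
        raise ValueError("save_as must include a filename")
    return [
        {"tool": "download_url", "input": {"url": image_url, "path": save_as}},
        {"tool": "file_read", "input": {"paths": [save_as]}},
    ]
```

Using an explicit filename keeps the two steps aligned; when a directory path is used instead, the drive path returned by download_url is what should be passed to file_read.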

web_fetch

Fetch a web page (HTTP GET) and return its plain text content with scripts/styles stripped. Does NOT execute JavaScript — for SPAs that render client-side this will be mostly empty. Returns the final URL (after redirects), page title, and extracted text (truncated to max_chars).

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| url | string | | | The URL to fetch (http:// or https://) |
| max_chars | integer | | 20000 | Max chars of body text to return (500-200000, default 20000) |

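The documented limits for web_fetch can be checked locally before a call is made, as in the sketch below. The helper build_web_fetch_input is hypothetical, and raising on out-of-range values is an assumption made for the example; how the platform itself treats such values is not stated here.

```python
from urllib.parse import urlparse

MIN_CHARS, MAX_CHARS = 500, 200_000  # documented range for max_chars


def build_web_fetch_input(url: str, max_chars: int = 20_000) -> dict:
    """Validate and assemble a web_fetch input (local pre-check only)."""
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError("url must start with http:// or https://")
    if not MIN_CHARS <= max_chars <= MAX_CHARS:
        raise ValueError(f"max_chars must be within {MIN_CHARS}-{MAX_CHARS}")
    return {"url": url, "max_chars": max_chars}
```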

file_read

Read one or more files from the project's drive and attach them to the conversation. Images (jpg/png/gif/webp) and PDFs come back as vision/document blocks you can look at directly. Text files (code, markdown, JSON, CSV, etc.) are inlined as text. Use this when you need to actually inspect contents; to list files, use bash ls drive/ instead. Hard cap: 5 MB per file, 10 files per call.

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| paths | array<string> | | | Drive paths relative to the project drive root (e.g. ['uploads/photo.jpg', 'data/report.pdf']). A leading 'drive/' is accepted and stripped. (min 1 items, max 10 items) |

Returns a block list — one labeled header per file, then the content. Images/PDFs become vision/document blocks the model looks at directly; text files are inlined as text.

Constraints

  • Drive paths only. A leading drive/ is accepted and stripped. For arbitrary URLs, the agent should download_url first, then file_read.
  • Hard caps: 5 MB per file, 10 paths per call, text inlined up to 100k chars then truncated.
  • On non-vision models, image/document files in the path list are skipped with an error in the result; text files still come through.
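The path rules above can be sketched as a small normalizer. This is an illustration under stated assumptions, not the platform's implementation: normalize_file_read_paths is a hypothetical name, and rejecting an out-of-range list with ValueError is a choice made for the example.

```python
MAX_PATHS = 10  # documented cap on paths per call


def normalize_file_read_paths(paths: list[str]) -> list[str]:
    """Strip a leading 'drive/' and enforce the 1-10 paths-per-call limit."""
    if not 1 <= len(paths) <= MAX_PATHS:
        raise ValueError(f"expected 1 to {MAX_PATHS} paths, got {len(paths)}")
    prefix = "drive/"
    # Paths are relative to the drive root, so the prefix is redundant.
    return [p[len(prefix):] if p.startswith(prefix) else p for p in paths]
```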

See agent-attachments for the broader attachment model (including inputs.attachments at submit time).

download_url

Download a file from a URL via plain HTTP GET and save it to the project's drive bucket. Use this for images, PDFs, CSVs, etc. Returns the drive path and a fresh signed URL. Does NOT resolve share links (Google Drive/Dropbox/YouTube); only direct HTTP(S) URLs work. Hard cap: 50 MB per file.

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| path | string | | | Where to save in the drive. Either a full path with filename ('data/report.pdf') or a directory ending in '/' ('downloads/'), in which case the filename is inferred from the URL |
| url | string | | | Direct file URL (http:// or https://) |

Returns the resolved drive_path and a fresh signed output_url (TTL ~1h).

Constraints

  • Plain HTTP(S) only. Share links (Google Drive, Dropbox, YouTube) are not resolved.
  • 50 MB hard cap per file.
  • If path ends in /, the filename is inferred from the URL's last segment.
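The filename-inference rule can be pictured roughly as follows. This is a sketch, not the platform's code: resolve_drive_path is a hypothetical name, and ignoring the query string and percent-decoding the last URL segment are assumptions made for the example.

```python
import posixpath
from urllib.parse import unquote, urlparse


def resolve_drive_path(path: str, url: str) -> str:
    """Return the drive path a download would be saved to."""
    if not path.endswith("/"):
        # A full path with filename is used as given.
        return path
    # Directory form: take the last segment of the URL path as the filename.
    filename = posixpath.basename(unquote(urlparse(url).path))
    if not filename:
        raise ValueError("cannot infer a filename from this URL")
    return path + filename
```

For instance, resolve_drive_path("downloads/", "https://example.com/files/report.pdf") yields "downloads/report.pdf", while a URL whose path ends in a slash gives nothing to infer from and is better paired with an explicit filename.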

Lifecycle tool

set_output (auto-injected, conditional)

Record the final structured output for this job. Calling this ends the run. The argument must match the schema below.

When it's exposed

Only when the skill declares an output_schema in skill.yaml. The tool's input_schema is the skill's output_schema verbatim — so the agent gets a strongly-typed slot to fill before the run ends.

Semantics

  • Must be called exactly once.
  • Calling it ends the run; any subsequent tool calls in the same model response are ignored.
  • If the run ends without set_output being called, the job fails with the error "agent finished without calling set_output".
  • Validation: the input is enforced against the declared output_schema before the job's final output is recorded.

Example skill.yaml fragment

```yaml
entrypoint: SKILL.md
output_schema:
  type: object
  properties:
    caption: { type: string }
  required: [caption]
```

Inside the agent the model then calls set_output({"caption": "…"}) and the run terminates.
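To give a feel for what validation against the output_schema involves, the sketch below checks a candidate output against the caption schema from the fragment above. It is a deliberately tiny stand-in that handles only required keys and a few primitive types; check_output is a hypothetical helper and the platform's actual validator is not described here.

```python
# Python-side mirror of the output_schema in the skill.yaml fragment.
SCHEMA = {
    "type": "object",
    "properties": {"caption": {"type": "string"}},
    "required": ["caption"],
}

# Small subset of JSON Schema types, enough for this illustration.
JSON_TYPES = {"string": str, "object": dict, "array": list, "boolean": bool}


def check_output(value: object, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the output passes."""
    if not isinstance(value, dict):
        return ["output must be an object"]
    errors = []
    for key in schema.get("required", []):
        if key not in value:
            errors.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(spec.get("type"))
        if key in value and expected and not isinstance(value[key], expected):
            errors.append(f"{key}: expected {spec['type']}")
    return errors
```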