Agent files API — upload inputs and read agent-written artifacts

Upload PDFs, CSVs, transcripts, and briefs as agent inputs, and read back artifacts the agent wrote during a run. Files are addressed by path, scoped per workspace.

Conceptual overview

Files are identified by paths, not opaque IDs. A path looks like agent-workspace/notes.md or seed-docs/competitor-brief.pdf. Paths group naturally into folders:

/seed-docs/                 ← you upload these; agent reads them as input
/agent-workspace/           ← agent's own scratch space; writes notes here
/agent-output/              ← final outputs the agent surfaces in result.files

You don't have to use the same folder names; they're conventions, not enforced. But every file has a path, and the files field in the agent's response (files: ["path/to/file.md", ...]) is just a list of those paths.

Files are workspace-scoped. Within a workspace, paths must be unique.
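
Because folders are only a naming convention, a client can recover them from the paths alone. The following is a minimal sketch; group_by_folder is a hypothetical helper, not part of the API, and it treats the first path segment as the folder.

```python
from collections import defaultdict


def group_by_folder(paths):
    """Group workspace-relative paths by their first path segment."""
    groups = defaultdict(list)
    for path in paths:
        folder, sep, _rest = path.lstrip("/").partition("/")
        # Files at the workspace root have no folder segment.
        key = folder + "/" if sep else ""
        groups[key].append(path)
    return dict(groups)


files = [
    "seed-docs/competitor-brief.pdf",
    "agent-workspace/notes.md",
    "agent-workspace/decisions.md",
]
for folder, members in group_by_folder(files).items():
    print(folder, members)
```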

Endpoints

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /v1/agent/files | Upload a file |
| GET | /v1/agent/files | List files |
| GET | /v1/agent/files/{path} | Get one file (metadata) |
| GET | /v1/agent/files/{path}/raw | Get one file (binary content) |
| PATCH | /v1/agent/files/{path} | Update metadata or move (rename) |
| DELETE | /v1/agent/files/{path} | Delete |
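
Since {path} is embedded directly in the URL, a client needs a consistent way to build it. The sketch below is one possible approach. It assumes that slashes in the path stay literal (as in this page's curl examples) and that other reserved characters, such as spaces, are percent-encoded; the page does not specify the encoding rules. file_url is a hypothetical helper name.

```python
from urllib.parse import quote

# Base URL as used in this page's curl examples.
API_BASE = "https://api.citationbench.com/v1/agent/files"


def file_url(path: str, raw: bool = False) -> str:
    """Build the metadata URL (or the /raw URL) for a workspace-relative path."""
    relative = path.lstrip("/")  # paths are relative to the workspace root
    if not relative:
        raise ValueError("path must not be empty")
    # Assumption: "/" is kept literal, everything else reserved is percent-encoded.
    url = f"{API_BASE}/{quote(relative, safe='/')}"
    return url + "/raw" if raw else url


print(file_url("agent-workspace/notes.md"))
print(file_url("seed-docs/competitor brief.pdf", raw=True))
```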

POST /v1/agent/files

Multipart form upload.

curl -X POST https://api.citationbench.com/v1/agent/files \
  -H "Authorization: Bearer sk_live_***" \
  -H "X-Workspace-Id: ws_acme" \
  -F "path=seed-docs/competitor-brief.pdf" \
  -F "file=@./brief.pdf" \
  -F "tags=onboarding,acme-brief"

| Field | Required | Notes |
| --- | --- | --- |
| path | yes | Target path (relative to workspace root). Must not already exist (to replace content, delete the file and re-upload). |
| file | yes | The binary content |
| tags | no | Comma-separated tags |
| contentType | no | Auto-detected from the binary if absent |

Response

HTTP/1.1 201 Created
Location: /v1/agent/files/seed-docs/competitor-brief.pdf

{
  "path":        "seed-docs/competitor-brief.pdf",
  "size":        184320,
  "contentType": "application/pdf",
  "checksum":    "sha256:1d8b...",
  "tags":        ["onboarding", "acme-brief"],
  "uploadedBy":  "user_***",
  "uploadedAt":  "2026-05-24T08:01:42Z"
}
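
The checksum field lets you confirm that the stored bytes match what you sent. The sketch below assumes the value is the lowercase hex SHA-256 of the raw file content with a "sha256:" prefix, as the example response suggests; the page does not state this explicitly. matches_checksum is a hypothetical helper.

```python
import hashlib


def matches_checksum(content: bytes, checksum: str) -> bool:
    """Compare local bytes against a checksum of the form 'sha256:<hex>'."""
    algorithm, _, expected_hex = checksum.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported checksum algorithm: {algorithm!r}")
    # Assumption: the digest covers the raw, unencoded file bytes.
    return hashlib.sha256(content).hexdigest() == expected_hex.lower()


data = b"example file content"
reported = "sha256:" + hashlib.sha256(data).hexdigest()  # stand-in for the API response value
print(matches_checksum(data, reported))
```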

Inline base64 alternative (for small files)

curl -X POST https://api.citationbench.com/v1/agent/files \
  -H "Authorization: Bearer sk_live_***" \
  -H "Content-Type: application/json" \
  -d '{
    "path":        "seed-docs/keywords.csv",
    "contentBase64": "a2V5d29yZCxpbnRlbnQK...",
    "contentType": "text/csv"
  }'

Inline base64 uploads are limited to 5 MB. Use multipart for larger files.
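
A client can build the inline JSON body and enforce the size limit before sending anything. This is a sketch under assumptions: the page does not say whether the 5 MB limit applies to the raw or the encoded size, nor whether MB means 10^6 or 2^20 bytes, so the constant below (5 MiB of raw content) is a guess. inline_upload_body is a hypothetical helper.

```python
import base64
import json

# Assumption: "5 MB" is read as 5 MiB of raw (pre-encoding) content.
INLINE_LIMIT_BYTES = 5 * 1024 * 1024


def inline_upload_body(path: str, content: bytes, content_type: str) -> str:
    """Return the JSON request body for an inline base64 upload."""
    if len(content) > INLINE_LIMIT_BYTES:
        raise ValueError("too large for inline base64; use multipart upload instead")
    return json.dumps({
        "path": path,
        "contentBase64": base64.b64encode(content).decode("ascii"),
        "contentType": content_type,
    })


body = inline_upload_body("seed-docs/keywords.csv", b"keyword,intent\n", "text/csv")
print(body)
```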


GET /v1/agent/files

curl -G https://api.citationbench.com/v1/agent/files \
  -H "Authorization: Bearer sk_live_***" \
  --data-urlencode "prefix=agent-workspace/" \
  --data-urlencode "tag=onboarding" \
  --data-urlencode "limit=50"

| Param | Notes |
| --- | --- |
| prefix | Folder-style prefix (agent-workspace/, seed-docs/) |
| tag | Tag match |
| invocationId | Files written by a specific invocation |
| since / until | ISO timestamps for uploadedAt / writtenAt |
| limit, cursor | Pagination |

Response

{
  "data": [
    {
      "path": "agent-workspace/notes.md",
      "size": 12480,
      "contentType": "text/markdown",
      "writtenBy": "agent",
      "writtenByInvocationId": "inv_***",
      "writtenAt": "2026-05-24T08:08:14Z",
      "tags": []
    },
    {
      "path": "seed-docs/competitor-brief.pdf",
      "size": 184320,
      "contentType": "application/pdf",
      "uploadedBy": "user_***",
      "uploadedAt": "2026-05-24T08:01:42Z",
      "tags": ["onboarding"]
    }
  ],
  "nextCursor": null,
  "total": 2
}
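
To walk a long listing, a client can follow nextCursor until it comes back null. The sketch below assumes the cursor query parameter accepts the previous response's nextCursor value, which the page implies but does not spell out. fetch_page is a hypothetical callable that performs the GET and returns the parsed JSON; here it is faked with a dictionary so the example runs offline.

```python
from typing import Callable, Iterator, Optional


def iter_files(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every file entry, following nextCursor until it is null."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("nextCursor")
        if cursor is None:  # null nextCursor marks the last page
            break


# Fake two-page listing keyed by the cursor that requests it.
fake_pages = {
    None: {"data": [{"path": "agent-workspace/notes.md"}], "nextCursor": "c1"},
    "c1": {"data": [{"path": "seed-docs/competitor-brief.pdf"}], "nextCursor": None},
}
for entry in iter_files(lambda cursor: fake_pages[cursor]):
    print(entry["path"])
```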

GET /v1/agent/files/{path}

Returns metadata only.

curl https://api.citationbench.com/v1/agent/files/agent-workspace/notes.md \
  -H "Authorization: Bearer sk_live_***"

GET /v1/agent/files/{path}/raw

Returns the raw binary content.

curl https://api.citationbench.com/v1/agent/files/agent-workspace/notes.md/raw \
  -H "Authorization: Bearer sk_live_***" \
  -o notes.md

For markdown, text, and JSON, the response is the raw body with the matching Content-Type header. Binary files (PDF, images) are returned the same way; just save the response to disk.


PATCH /v1/agent/files/{path}

Update metadata (tags) or move/rename the file.

curl -X PATCH https://api.citationbench.com/v1/agent/files/agent-workspace/notes.md \
  -H "Authorization: Bearer sk_live_***" \
  -H "Content-Type: application/json" \
  -d '{
    "newPath": "archive/2026-05/notes.md",
    "tags":    ["archived"]
  }'

To overwrite content, delete and re-upload.
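
The same move can be prepared from Python. The sketch below only constructs the request object with the standard library and does not send it. It assumes the endpoint expects a JSON body with an explicit Content-Type: application/json header. build_move_request is a hypothetical helper.

```python
import json
import urllib.request


def build_move_request(path, new_path, tags, api_key):
    """Build (but do not send) a PATCH request that moves a file and sets its tags."""
    body = json.dumps({"newPath": new_path, "tags": tags}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.citationbench.com/v1/agent/files/{path}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the JSON body needs an explicit JSON content type.
            "Content-Type": "application/json",
        },
    )


req = build_move_request(
    "agent-workspace/notes.md", "archive/2026-05/notes.md", ["archived"], "sk_live_***"
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req)
```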


DELETE /v1/agent/files/{path}

curl -X DELETE https://api.citationbench.com/v1/agent/files/agent-workspace/notes.md \
  -H "Authorization: Bearer sk_live_***"

Returns 204 No Content. The file is soft-deleted for 7 days, then garbage-collected.


MCP

> Upload my client brief and use it as input to the bootstrap_brand skill.

Claude calls agent.files.upload (after asking you for the file path) then agent.invoke with files: ["seed-docs/your-brief.pdf"].

> Show me everything the agent wrote during the last bootstrap invocation.

Claude calls agent.files.list, filtering by that invocation's invocationId, and renders each file.


How agents read files

When you pass files: ["seed-docs/competitor-brief.pdf"] to agent.invoke, the agent receives them as context. Different skills handle different content types:

  • Text / markdown / JSON / CSV — read into the LLM context directly
  • PDF — extracted to text via the OCR pipeline, then read
  • Image (PNG, JPG) — handed to the LLM as a vision input
  • Audio (WAV, MP3) — transcribed first, then read

Total context size across all input files is capped per skill (defaults to 100k tokens). Skills declare their own caps in inputSchema.fileLimits.
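
The handling rules and the context cap can be checked on the client before invoking a skill. This is a rough sketch: the MIME types are assumed equivalents of the formats listed above, the per-file token counts are caller-supplied estimates (the page does not describe how tokens are counted), and both function names are hypothetical.

```python
# Assumed MIME types for the formats listed above; the API's exact set is not documented here.
DIRECT_TYPES = {"text/plain", "text/markdown", "text/csv", "application/json"}
IMAGE_TYPES = {"image/png", "image/jpeg"}
AUDIO_TYPES = {"audio/wav", "audio/mpeg"}


def input_handling(content_type: str) -> str:
    """Describe how a file of this content type reaches the model."""
    if content_type in DIRECT_TYPES:
        return "read directly"
    if content_type == "application/pdf":
        return "OCR to text, then read"
    if content_type in IMAGE_TYPES:
        return "vision input"
    if content_type in AUDIO_TYPES:
        return "transcribe, then read"
    return "unsupported"


def fits_cap(token_counts: dict, cap: int = 100_000) -> bool:
    """True if the summed per-file token estimates stay within the skill's cap."""
    return sum(token_counts.values()) <= cap


print(input_handling("application/pdf"))
print(fits_cap({"seed-docs/acme-brief.pdf": 42_000, "brand/voice-samples.md": 9_500}))
```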

How agents write files

During execution, agents can write any path under agent-workspace/ (their scratch space) or agent-output/ (which surfaces in the response files). Common patterns:

  • agent-workspace/notes.md — running narration
  • agent-workspace/decisions.md — why-this-not-that for key choices
  • agent-output/<resource>.csv — final structured outputs
  • agent-output/draft-v1.md — draft artifacts

Every file written carries the writing invocation's ID, so you can audit who wrote what.
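
For an audit, list entries can be grouped by the invocation that wrote them. The sketch below relies on the writtenByInvocationId field shown in the list response and assumes that uploaded files omit it, as in that example. files_by_invocation is a hypothetical helper.

```python
from collections import defaultdict


def files_by_invocation(entries):
    """Map each invocation ID to the paths it wrote; uploaded files are skipped."""
    written = defaultdict(list)
    for entry in entries:
        invocation_id = entry.get("writtenByInvocationId")
        if invocation_id is not None:  # uploads carry uploadedBy instead
            written[invocation_id].append(entry["path"])
    return dict(written)


entries = [
    {"path": "agent-workspace/notes.md", "writtenBy": "agent", "writtenByInvocationId": "inv_1"},
    {"path": "agent-output/draft-v1.md", "writtenBy": "agent", "writtenByInvocationId": "inv_1"},
    {"path": "seed-docs/competitor-brief.pdf", "uploadedBy": "user_1"},
]
print(files_by_invocation(entries))
```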


Errors

| Status | Code | Cause |
| --- | --- | --- |
| 400 | validation_error | Missing path or content |
| 409 | file_exists | Path is already in use; pick a new path, or use PATCH to rename the existing file |
| 404 | file_not_found | No file exists at the given path |
| 413 | file_too_large | Single file exceeds workspace storage limit |
| 415 | unsupported_media_type | Type the agent can't parse |
| 507 | storage_quota_exceeded | Workspace total storage exceeded |

Cost

| Action | Credits |
| --- | --- |
| Upload | free |
| List | free |
| Get metadata | free |
| Get raw content | free |
| Delete | free |
| Storage | included in plan; over-quota billed at $0.02/GB/mo |

Agent reads of files (during invocations) are part of the invocation's LLM/processing cost.
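
As a worked example of the storage rate, the sketch below estimates a monthly overage charge. It assumes only the portion above the plan quota is billed, with no proration, and that the result is rounded to cents; the page does not confirm any of this. monthly_overage_usd is a hypothetical helper.

```python
OVERAGE_USD_PER_GB_MONTH = 0.02  # rate from the cost table


def monthly_overage_usd(used_gb: float, quota_gb: float) -> float:
    """Estimate the monthly over-quota storage charge in USD."""
    # Assumption: only storage above the plan quota is billed.
    over_gb = max(0.0, used_gb - quota_gb)
    return round(over_gb * OVERAGE_USD_PER_GB_MONTH, 2)


print(monthly_overage_usd(used_gb=150, quota_gb=100))  # 50 GB over quota
print(monthly_overage_usd(used_gb=80, quota_gb=100))   # within quota
```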

Use cases (string things together)

A. Onboard a client by uploading their brief

# 1. Upload
curl -X POST .../v1/agent/files \
  -F "path=seed-docs/acme-brief.pdf" \
  -F "file=@./acme-brief.pdf"

# 2. Invoke with the file as context
curl -X POST .../v1/agent/invoke -d '{
  "skill": "bootstrap_brand",
  "input": { "domain": "acme.com" },
  "files": ["seed-docs/acme-brief.pdf"]
}'

# The agent reads the brief alongside its crawl + research, producing a more tailored output.

B. Audit what the agent wrote

INV=inv_***
curl -G .../v1/agent/files --data-urlencode "invocationId=$INV"
# → list of every file the invocation wrote

# Read one
curl .../v1/agent/files/agent-workspace/decisions.md/raw

C. Pre-load voice samples

curl -X POST .../v1/agent/files \
  -F "path=brand/voice-samples.md" \
  -F "file=@./voice-samples.md" \
  -F "tags=brand-voice"

# Then any content skill that mentions reading `tag:brand-voice` will use it.

D. Hand-off between agents

A custom skill can read files written by a prior invocation:

# custom skill definition
inputSchema:
  properties:
    upstreamInvocationId: { type: string }
on_start:
  - load_files: { invocationId: "{{ input.upstreamInvocationId }}" }

The agent runs with full access to the prior agent's notes.
