Picsha AI
Developer Documentation

picsha.ai API Specification (v1)

1. Product Philosophy

"S3 with a Brain." Picsha.ai is a serverless, usage-based media backend designed for AI Agents and "Vibe Coders." It abstracts complex processing pipelines (TUS, Rekognition, LibreOffice, Vector Search) into simple, developer-friendly endpoints.

2. Base URL & Authentication

  • Base URL: https://api.picsha.ai/v1
  • Authentication: Bearer token, sent in the Authorization header.
    Authorization: Bearer sk_live_51Mx...
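
A minimal Python sketch of attaching the Bearer token to a request, using only the standard library. The helper name build_request and the placeholder key are illustrative assumptions, not part of the API; the request is built but not sent.

```python
import urllib.request

BASE_URL = "https://api.picsha.ai/v1"  # base URL from this section


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Return an unsent request that carries the Bearer token header."""
    req = urllib.request.Request(BASE_URL + path, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    return req


# Placeholder key for illustration only; substitute a real secret key.
req = build_request("/usage", "sk_live_placeholder")
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen would perform the call.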
    

3. Core Endpoints

A. Ingest (The "Magic" Upload)

POST /assets

Handles all file ingestion types (Multipart, URL, Raw) and triggers the AI processing pipeline based on configuration.

Headers:

  • Content-Type: multipart/form-data OR application/json (for URL fetch)

Parameters (JSON Body for URL / Config):

{
  "url": "https://example.com/files/quarterly_report.docx",
  "config": {
    "auto_tag": true,          // Triggers AWS Rekognition (Faces/Objects)
    "auto_summarize": true,    // Triggers Claude Sonnet (for Docs/PDFs)
    "vectorize": true,         // Triggers Amazon Titan (for Similarity Search)
    "location_lookup": true,   // Triggers Google Maps Reverse Geocoding
    "pre_render_sizes": true   // Pre-generates common responsive sizes and thumbnails
  },
  "tags": ["finance", "report", "q4"],
  "metadata": {
    "project_id": "my_replit_app_123"
  }
}
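
The URL-ingest body above could be assembled and posted roughly as follows. This is a sketch under assumptions: build_ingest_request is a hypothetical helper, the key is a placeholder, and the optional fields are omitted from the body when not supplied (whether the API treats missing and false flags identically is not stated here).

```python
import json
import urllib.request

ASSETS_URL = "https://api.picsha.ai/v1/assets"


def build_ingest_request(api_key, source_url, config=None, tags=None, metadata=None):
    """Build (but do not send) a POST /assets request for URL ingestion."""
    body = {"url": source_url}
    if config is not None:
        body["config"] = config
    if tags is not None:
        body["tags"] = tags
    if metadata is not None:
        body["metadata"] = metadata
    req = urllib.request.Request(
        ASSETS_URL, data=json.dumps(body).encode("utf-8"), method="POST"
    )
    req.add_header("Authorization", f"Bearer {api_key}")
    req.add_header("Content-Type", "application/json")
    return req


ingest_req = build_ingest_request(
    "sk_live_placeholder",  # placeholder key
    "https://example.com/files/quarterly_report.docx",
    config={"auto_summarize": True, "vectorize": True},
    tags=["finance", "report", "q4"],
    metadata={"project_id": "my_replit_app_123"},
)
# urllib.request.urlopen(ingest_req) would perform the call.
```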

Response (Success):

{
  "id": "as_8f92k1",
  "status": "ready",
  "created_at": "2025-11-20T10:00:00Z",
  "urls": {
    "original": "https://cdn.picsha.ai/as_8f92k1/source.docx",
    "pdf_view": "https://cdn.picsha.ai/as_8f92k1/view.pdf",      // LibreOffice Output
    "thumbnail": "https://cdn.picsha.ai/as_8f92k1/thumb.webp"    // Sharp/WebP Output
  },
  "meta": {
    "format": "docx",
    "size": 40922,
    "exif": { "author": "Graphx", "created": "2025-11-19" }     // ExifTool Output
  },
  "ai": {
    "summary": "A report outlining Q4 marketing strategy...",      // Claude Sonnet Output
    "tags": ["finance", "report", "strategy", "q4"],               // Rekognition Output
    "safe_search": "verified"
  },
  "location": {
    "lat": 42.373, "lon": -71.109,
    "label": "Cambridge, MA, USA"                                  // Google Maps Output
  },
  "cost": "$0.0042"                                                // Transaction Cost
}
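
One way a client might read this response, sketched in Python. It assumes the cost field is always a "$"-prefixed decimal string as in the sample, and that pdf_view may be absent for some asset types; summarize_asset is a hypothetical helper. The sample is trimmed and its // annotations are dropped so that it parses as strict JSON.

```python
import json
from decimal import Decimal

# Trimmed copy of the sample response, without the // annotations.
SAMPLE = """
{
  "id": "as_8f92k1",
  "status": "ready",
  "urls": {
    "original": "https://cdn.picsha.ai/as_8f92k1/source.docx",
    "pdf_view": "https://cdn.picsha.ai/as_8f92k1/view.pdf"
  },
  "ai": {"tags": ["finance", "report", "strategy", "q4"]},
  "cost": "$0.0042"
}
"""


def summarize_asset(asset: dict) -> dict:
    """Pull out the fields a caller typically needs first."""
    urls = asset.get("urls", {})
    return {
        "id": asset["id"],
        "ready": asset.get("status") == "ready",
        # Prefer the rendered PDF when present, else fall back to the original.
        "preview_url": urls.get("pdf_view") or urls.get("original"),
        # Decimal avoids float rounding on sub-cent amounts.
        "cost_usd": Decimal(asset["cost"].lstrip("$")),
    }


info = summarize_asset(json.loads(SAMPLE))
print(info["id"], info["ready"], info["cost_usd"])
```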

B. Resumable Uploads (TUS Protocol)

POST /upload/resumable

  • Standard TUS 1.0 Protocol endpoint.
  • Supports large files (>100MB) and unstable connections.
  • Implementation: Wraps internal tus-node-server.
  • SDK Support: Compatible with uppy and picsha-uploader SDK.
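
For clients that do not use an SDK, the headers defined by the public TUS 1.0.0 protocol can be sketched as below. The TUS header names and values are from the protocol itself; that this endpoint expects the Bearer token and reads a "filename" metadata key are assumptions, and both helper names are hypothetical.

```python
import base64

TUS_VERSION = "1.0.0"


def creation_headers(api_key: str, file_size: int, filename: str) -> dict:
    """Headers for the initial POST that creates the upload resource."""
    encoded = base64.b64encode(filename.encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"Bearer {api_key}",      # assumed to be required here
        "Tus-Resumable": TUS_VERSION,
        "Upload-Length": str(file_size),           # total size in bytes
        "Upload-Metadata": f"filename {encoded}",  # key name is an assumption
    }


def chunk_headers(api_key: str, offset: int) -> dict:
    """Headers for each PATCH that appends bytes at the given offset."""
    return {
        "Authorization": f"Bearer {api_key}",      # assumed to be required here
        "Tus-Resumable": TUS_VERSION,
        "Upload-Offset": str(offset),
        "Content-Type": "application/offset+octet-stream",
    }


create = creation_headers("sk_live_placeholder", 250_000_000, "keynote.mp4")
patch = chunk_headers("sk_live_placeholder", 0)
```

In the TUS flow, the creation response carries a Location header naming the upload URL; PATCH requests go to that URL, and a HEAD request to it reports the current Upload-Offset when resuming after a dropped connection.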

4. Delivery & Transformation

GET /assets/{id}/render

Dynamic, edge-cached image transformations using Sharp and Bedrock Core.

Forced Downloads

You can force the browser to securely download an asset rather than displaying it by appending the download parameter:

  • ?download=true

Note: The API will automatically generate a secure filename based on the original asset's name and format (e.g. original-altered.webp). It ignores any custom filename string passed to it.
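
A small Python sketch of building a render URL with the download flag. render_url is a hypothetical helper, and the asset ID is reused from the ingest example.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.picsha.ai/v1"


def render_url(asset_id: str, download: bool = False) -> str:
    """Return the render URL, optionally flagged as a forced download."""
    url = f"{BASE_URL}/assets/{asset_id}/render"
    if download:
        url += "?" + urlencode({"download": "true"})
    return url


print(render_url("as_8f92k1", download=True))
```

No filename argument is offered, since the note above says a custom filename is ignored.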

For comprehensive details on standard parameters (dimensions, cropping, formats), smart AI cropping, watermarking, and Generative AI modifiers (background removal and the MIMI natural language orchestrator), see the Transformations & Generative AI guide.


5. Search (Vector & Semantic)

POST /search

Hybrid search engine combining Elasticsearch (Metadata) and Amazon Titan (Vectors).

Request:

{
  "query": "Find photos of the team meeting in Cambridge", // Natural Language
  "filters": {
    "type": "image",
    "date_after": "2025-10-01"
  },
  "limit": 10
}
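
The search request above might be built like this in Python. build_search_request is a hypothetical helper, the key is a placeholder, and the default limit of 10 simply mirrors the sample rather than a documented default.

```python
import json
import urllib.request

SEARCH_URL = "https://api.picsha.ai/v1/search"


def build_search_request(api_key, query, filters=None, limit=10):
    """Build (but do not send) a POST /search request."""
    body = {"query": query, "limit": limit}
    if filters:
        body["filters"] = filters
    req = urllib.request.Request(
        SEARCH_URL, data=json.dumps(body).encode("utf-8"), method="POST"
    )
    req.add_header("Authorization", f"Bearer {api_key}")
    req.add_header("Content-Type", "application/json")
    return req


search_req = build_search_request(
    "sk_live_placeholder",  # placeholder key
    "Find photos of the team meeting in Cambridge",
    filters={"type": "image", "date_after": "2025-10-01"},
)
```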

Response:

{
  "results": [
    {
      "id": "as_7g12m4",
      "score": 0.92, // Relevance confidence
      "url": "https://cdn.picsha.ai/as_7g12m4/thumb.webp"
    }
  ]
}
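
Callers will often want to discard weak matches. The sketch below filters and orders results by score; the 0.8 cutoff is an arbitrary illustration, not a recommended value, and top_hits is a hypothetical helper. The second result is invented only to exercise the filter.

```python
def top_hits(response: dict, min_score: float = 0.8) -> list:
    """Keep results at or above min_score, best match first."""
    hits = [r for r in response.get("results", []) if r["score"] >= min_score]
    return sorted(hits, key=lambda r: r["score"], reverse=True)


response = {
    "results": [
        {"id": "as_7g12m4", "score": 0.92,
         "url": "https://cdn.picsha.ai/as_7g12m4/thumb.webp"},
        # Made-up low-scoring entry for illustration.
        {"id": "as_example", "score": 0.41, "url": "https://cdn.picsha.ai/as_example/thumb.webp"},
    ]
}

hits = top_hits(response)
print([h["id"] for h in hits])
```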

6. Billing & Usage

GET /usage

Real-time cost tracking using AWS Cost Explorer Tags.

Response:

{
  "period": "current_month",
  "total_spend": "$4.12",
  "currency": "USD",
  "breakdown": {
    "storage_gb": 10.5,
    "ai_operations": 450,
    "bandwidth_gb": 2.1
  }
}
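
A usage response like the one above could drive a simple budget check. This sketch assumes total_spend is always a "$"-prefixed decimal string, as in the sample; over_budget and the budget figures are hypothetical.

```python
from decimal import Decimal

usage = {
    "period": "current_month",
    "total_spend": "$4.12",
    "currency": "USD",
    "breakdown": {"storage_gb": 10.5, "ai_operations": 450, "bandwidth_gb": 2.1},
}


def over_budget(usage: dict, budget_usd: Decimal) -> bool:
    """Return True when reported spend exceeds the given budget."""
    spend = Decimal(usage["total_spend"].lstrip("$"))
    return spend > budget_usd


print(over_budget(usage, Decimal("5.00")))
```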