picsha.ai API Specification (v1)
1. Product Philosophy
"S3 with a Brain." Picsha.ai is a serverless, usage-based media backend designed for AI Agents and "Vibe Coders." It abstracts complex processing pipelines (TUS, Rekognition, LibreOffice, Vector Search) into simple, developer-friendly endpoints.
2. Base URL & Authentication
- Base URL: https://api.picsha.ai/v1
- Authentication: Bearer token via header:
  Authorization: Bearer sk_live_51Mx...
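As a minimal sketch of the header scheme above, the following Python builds (but does not send) an authenticated request with the standard library. The helper name build_request and the placeholder key are illustrative assumptions, not part of the spec.

```python
import urllib.request

BASE_URL = "https://api.picsha.ai/v1"  # base URL as stated in this spec


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build, without sending, a request carrying the Bearer token header."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        method=method,
        headers={"Authorization": f"Bearer {api_key}"},
    )


# "sk_live_placeholder" is a stand-in; substitute a real secret key.
req = build_request("/usage", api_key="sk_live_placeholder")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the resulting object to urllib.request.urlopen would perform the call; keeping construction separate makes the header logic easy to inspect and test.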
3. Core Endpoints
A. Ingest (The "Magic" Upload)
POST /assets
Handles all file ingestion types (Multipart, URL, Raw) and triggers the AI processing pipeline based on configuration.
Headers:
Content-Type: multipart/form-data OR application/json (for URL fetch)
Parameters (JSON Body for URL / Config):
{
"url": "https://example.com/files/quarterly_report.docx",
"config": {
"auto_tag": true, // Triggers AWS Rekognition (Faces/Objects)
"auto_summarize": true, // Triggers Claude Sonnet (for Docs/PDFs)
"vectorize": true, // Triggers Amazon Titan (for Similarity Search)
"location_lookup": true, // Triggers Google Maps Reverse Geocoding
"pre_render_sizes": true // Pre-generates common responsive sizes and thumbnails
},
"tags": ["finance", "report", "q4"],
"metadata": {
"project_id": "my_replit_app_123"
}
}
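The JSON body above could be assembled for a URL ingest roughly as follows. This is a sketch under assumptions: the helper build_url_ingest is hypothetical, the key is a placeholder, and the request is only constructed, never sent. Field names (url, config, tags, metadata) are taken from the example body.

```python
import json
import urllib.request

ASSETS_URL = "https://api.picsha.ai/v1/assets"  # base URL + POST /assets


def build_url_ingest(api_key, source_url, config=None, tags=None, metadata=None):
    """Build, without sending, a POST /assets request that ingests a remote URL."""
    body = {"url": source_url}
    # Optional sections are included only when the caller supplies them.
    if config is not None:
        body["config"] = config
    if tags is not None:
        body["tags"] = tags
    if metadata is not None:
        body["metadata"] = metadata
    return urllib.request.Request(
        ASSETS_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_url_ingest(
    "sk_live_placeholder",  # placeholder key
    "https://example.com/files/quarterly_report.docx",
    config={"auto_summarize": True, "vectorize": True},
    tags=["finance", "report", "q4"],
    metadata={"project_id": "my_replit_app_123"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Note that the example body in this spec carries // comments for explanation; a real request body is presumably plain JSON without them, which is what json.dumps produces.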
Response (Success):
{
"id": "as_8f92k1",
"status": "ready",
"created_at": "2025-11-20T10:00:00Z",
"urls": {
"original": "https://cdn.picsha.ai/as_8f92k1/source.docx",
"pdf_view": "https://cdn.picsha.ai/as_8f92k1/view.pdf", // LibreOffice Output
"thumbnail": "https://cdn.picsha.ai/as_8f92k1/thumb.webp" // Sharp/WebP Output
},
"meta": {
"format": "docx",
"size": 40922,
"exif": { "author": "Graphx", "created": "2025-11-19" } // ExifTool Output
},
"ai": {
"summary": "A report outlining Q4 marketing strategy...", // Claude Sonnet Output
"tags": ["finance", "report", "strategy", "q4"], // Rekognition Output
"safe_search": "verified"
},
"location": {
"lat": 42.373, "lon": -71.109,
"label": "Cambridge, MA, USA" // Google Maps Output
},
"cost": "$0.0042" // Transaction Cost
}
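A client consuming this response might pick a displayable URL as sketched below. The spec only shows the "ready" status and one set of urls keys, so two things here are assumptions: that other status values exist, and that some urls entries (such as pdf_view) may be absent for non-document assets. The function name preview_url is hypothetical.

```python
from typing import Optional


def preview_url(asset: dict) -> Optional[str]:
    """Return the most useful viewable URL for a parsed asset response, or None."""
    # Assumption: anything other than "ready" means derived files may not exist yet.
    if asset.get("status") != "ready":
        return None
    urls = asset.get("urls", {})
    # Prefer the rendered PDF, then the thumbnail, then the original file.
    for key in ("pdf_view", "thumbnail", "original"):
        if key in urls:
            return urls[key]
    return None


asset = {
    "id": "as_8f92k1",
    "status": "ready",
    "urls": {
        "original": "https://cdn.picsha.ai/as_8f92k1/source.docx",
        "pdf_view": "https://cdn.picsha.ai/as_8f92k1/view.pdf",
    },
}
print(preview_url(asset))
```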
B. Resumable Uploads (TUS Protocol)
POST /upload/resumable
- Standard TUS 1.0 Protocol endpoint.
- Supports large files (>100MB) and unstable connections.
- Implementation: Wraps internal tus-node-server.
- SDK Support: Compatible with uppy and picsha-uploader SDK.
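For readers not using an SDK, the headers a TUS 1.0 client sends can be sketched as below. The header names and the base64 encoding of Upload-Metadata values come from the TUS 1.0.0 protocol (core and creation extension). That this endpoint expects the same Bearer token, and that it has the creation extension enabled, are assumptions; the function names are hypothetical.

```python
import base64

TUS_VERSION = "1.0.0"


def tus_creation_headers(api_key: str, file_size: int, filename: str) -> dict:
    """Headers for the initial POST that creates an upload resource."""
    # TUS metadata is "key base64(value)" pairs separated by commas.
    encoded_name = base64.b64encode(filename.encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"Bearer {api_key}",  # assumption: same auth as other endpoints
        "Tus-Resumable": TUS_VERSION,
        "Upload-Length": str(file_size),
        "Upload-Metadata": f"filename {encoded_name}",
    }


def tus_patch_headers(api_key: str, offset: int) -> dict:
    """Headers for a PATCH that appends bytes starting at the given offset."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Tus-Resumable": TUS_VERSION,
        "Upload-Offset": str(offset),
        "Content-Type": "application/offset+octet-stream",
    }


print(tus_creation_headers("sk_live_placeholder", 250_000_000, "video.mp4"))
print(tus_patch_headers("sk_live_placeholder", 0))
```

After an interruption, a TUS client issues a HEAD request to the upload URL, reads the Upload-Offset the server reports, and resumes the PATCH from that offset.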
4. Delivery & Transformation
GET /assets/{id}/render
Dynamic, edge-cached image transformations using Sharp and Bedrock Core.
Forced Downloads
You can force the browser to securely download an asset rather than displaying it by appending the download parameter:
?download=true
Note: The API automatically generates a secure filename from the original asset's name and format (e.g. original-altered.webp). Any custom filename string passed with the request is ignored.
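A render URL with the download flag could be built as in this short sketch. The helper name render_url is an assumption; the path and the download=true parameter are taken from this section.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.picsha.ai/v1"  # base URL as stated in this spec


def render_url(asset_id: str, download: bool = False) -> str:
    """Build the GET /assets/{id}/render URL, optionally forcing a download."""
    url = f"{BASE_URL}/assets/{quote(asset_id, safe='')}/render"
    if download:
        # The spec's flag is the literal lowercase string "true".
        url += "?" + urlencode({"download": "true"})
    return url


print(render_url("as_8f92k1"))
print(render_url("as_8f92k1", download=True))
```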
For comprehensive details on standard parameters (dimensions, cropping, formats), smart AI cropping, watermarking, and Generative AI modifiers (background removal and the MIMI natural language orchestrator), see the Transformations & Generative AI guide.
5. Search (Vector & Semantic)
POST /search
Hybrid search engine combining Elasticsearch (Metadata) and Amazon Titan (Vectors).
Request:
{
"query": "Find photos of the team meeting in Cambridge", // Natural Language
"filters": {
"type": "image",
"date_after": "2025-10-01"
},
"limit": 10
}
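The request body above might be assembled with light validation, as in the following sketch. The function build_search_body is hypothetical; the field names come from the example request, and the assumption that date_after takes an ISO 8601 calendar date is inferred from the sample value "2025-10-01".

```python
from datetime import date
from typing import Optional


def build_search_body(
    query: str,
    asset_type: Optional[str] = None,
    date_after: Optional[date] = None,
    limit: int = 10,
) -> dict:
    """Assemble the JSON-serializable body for POST /search."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    filters = {}
    if asset_type is not None:
        filters["type"] = asset_type
    if date_after is not None:
        # date.isoformat() yields YYYY-MM-DD, matching the sample value.
        filters["date_after"] = date_after.isoformat()
    body = {"query": query, "limit": limit}
    if filters:
        body["filters"] = filters
    return body


print(build_search_body(
    "Find photos of the team meeting in Cambridge",
    asset_type="image",
    date_after=date(2025, 10, 1),
))
```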
Response:
{
"results": [
{
"id": "as_7g12m4",
"score": 0.92, // Relevance confidence
"url": "https://cdn.picsha.ai/as_7g12m4/thumb.webp"
}
]
}
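Because each result carries a relevance score, a client may want to drop weak matches. The sketch below assumes that higher scores mean better matches and that 0.8 is a reasonable cutoff; neither is stated in this spec, and top_matches is a hypothetical helper.

```python
def top_matches(response: dict, min_score: float = 0.8) -> list:
    """Keep results at or above min_score, best first."""
    hits = [
        r for r in response.get("results", [])
        if r.get("score", 0.0) >= min_score
    ]
    return sorted(hits, key=lambda r: r["score"], reverse=True)


response = {
    "results": [
        {"id": "as_1", "score": 0.50},
        {"id": "as_7g12m4", "score": 0.92},
        {"id": "as_3", "score": 0.85},
    ]
}
print([r["id"] for r in top_matches(response)])
```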
6. Billing & Usage
GET /usage
Real-time cost tracking using AWS Cost Explorer Tags.
Response:
{
"period": "current_month",
"total_spend": "$4.12",
"currency": "USD",
"breakdown": {
"storage_gb": 10.5,
"ai_operations": 450,
"bandwidth_gb": 2.1
}
}
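Since total_spend arrives as a formatted string, a budget check needs to parse it first. The following is a sketch only: spend_remaining is a hypothetical helper, and it assumes the value is always a "$"-prefixed decimal as in the sample response.

```python
from decimal import Decimal


def spend_remaining(usage: dict, budget: Decimal) -> Decimal:
    """Return budget minus the current total_spend from a GET /usage response."""
    # Decimal avoids float rounding on currency amounts.
    spent = Decimal(usage["total_spend"].lstrip("$"))
    return budget - spent


usage = {"period": "current_month", "total_spend": "$4.12", "currency": "USD"}
print(spend_remaining(usage, Decimal("10.00")))
```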