The Complete Media Layer for the AI Revolution

The Agentic Media Server

Store it. Encode it. Understand it.
A flat REST API for your frontends, and a native MCP server for your AI agents.
Hardware-accelerated. Generative. Zero config.


ingest.ts

// The "Magic" Upload
const response = await fetch('https://api.picsha.ai/v1/assets', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_...',
    'Content-Type': 'application/json'   // Required so the JSON body is parsed as JSON
  },
  body: JSON.stringify({
    url: "https://example.com/media/raw_footage.mp4",
    config: {
      auto_tag: true,          // Triggers AWS Rekognition
      auto_summarize: true,    // Triggers Claude 4.6 Sonnet
      vectorize: true,         // Triggers Amazon Titan
      location_lookup: true,   // Triggers Google Maps
      adaptive_stream: true,   // Triggers AWS MediaConvert
      generative_edit: {
        fill: true,            // Amazon Nova Canvas
        remove: ["watermark"]  // Amazon Nova Canvas
      }
    }
  })
});

Response

200 OK

{
  "id": "as_8f92k1",
  "status": "ready",
  "urls": {
    "original": "https://cdn.picsha.ai/as_8f92k1/source.mp4",
    "hls_stream": "https://cdn.picsha.ai/as_8f92k1/playlist.m3u8",
    "thumbnail": "https://cdn.picsha.ai/as_8f92k1/thumb.webp"
  },
  "ai": {
    "summary": "A 10-minute vlog detailing a trip to...",
    "tags": ["travel", "vlog", "ocean", "sunny"],
    "generative_edits": ["watermark_removed", "aspect_filled"]
  },
  "location": {
    "label": "Honolulu, HI, USA"
  },
  "cost": "$0.0075"
}
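
For typed frontends, the sample response above can be modeled roughly as follows. This is a sketch inferred only from the fields shown in that sample: which fields are optional is an assumption, and `playbackUrl` and `costInDollars` are hypothetical helpers, not part of any SDK.

```typescript
// Shape inferred from the sample response; fields not shown there are not modeled.
// Marking fields optional is an assumption (e.g. an asset ingested without AI options).
interface PicshaAsset {
  id: string;
  status: string;
  urls: { original: string; hls_stream?: string; thumbnail?: string };
  ai?: { summary?: string; tags?: string[]; generative_edits?: string[] };
  location?: { label: string };
  cost?: string;
}

// Hypothetical helper: prefer the adaptive HLS stream, fall back to the original file.
function playbackUrl(asset: PicshaAsset): string {
  return asset.urls.hls_stream ?? asset.urls.original;
}

// Hypothetical helper: turn a "$0.0075"-style cost string into a number of dollars.
function costInDollars(asset: PicshaAsset): number | undefined {
  if (asset.cost === undefined) return undefined;
  const value = Number(asset.cost.replace(/^\$/, ""));
  return Number.isNaN(value) ? undefined : value;
}
```

Keeping the response shape in one interface means a change in the payload surfaces as a compile error rather than a broken player at runtime.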

Built-in Semantic Intelligence

Intelligence isn't an add-on; it's the foundation of every asset.

Resilient Ingest

Resumable uploads over the TUS protocol and robust encoding of complex assets (HEIC, RAW, PSD, very large video files), all handled in queued workers.
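
To make the resumable part concrete, here is a minimal sketch of the client-side bookkeeping the tus 1.0.0 core protocol implies: split the remaining bytes into chunks and send each one as a PATCH carrying its offset. The header names come from the tus specification; the helper names are hypothetical, and the sketch makes no network calls because the upload endpoint is not described here.

```typescript
const TUS_VERSION = "1.0.0";

interface Chunk {
  offset: number; // byte position where this chunk starts
  length: number; // number of bytes in this chunk
}

// Split the remaining bytes into PATCH-sized chunks, starting from the offset
// the server last acknowledged (0 for a fresh upload).
function planChunks(totalSize: number, chunkSize: number, resumeOffset = 0): Chunk[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  const chunks: Chunk[] = [];
  for (let offset = resumeOffset; offset < totalSize; offset += chunkSize) {
    chunks.push({ offset, length: Math.min(chunkSize, totalSize - offset) });
  }
  return chunks;
}

// Headers the tus 1.0.0 core protocol requires on each PATCH request.
function patchHeaders(chunk: Chunk): Record<string, string> {
  return {
    "Tus-Resumable": TUS_VERSION,
    "Upload-Offset": String(chunk.offset),
    "Content-Type": "application/offset+octet-stream",
  };
}
```

After a dropped connection, a client would ask the server for the current offset with a HEAD request and call `planChunks` again with that value, so only the missing bytes are resent.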

Omni-Format Insights

Powered by Amazon Titan, AWS Rekognition, and Claude 4.6 Sonnet. We natively summarize documents, transcribe audio, and auto-tag objects.

Cross-Modal Search

Every asset is vectorized and indexed in OpenSearch from day one, so a query like "Find the photo of the smiling dog on a beach" runs as a hybrid semantic search out of the box.
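
As a rough illustration of what "hybrid" can mean, the sketch below blends a keyword-overlap score with cosine similarity between embeddings. It is a generic toy, not Picsha's or OpenSearch's ranking: the `Doc` shape, the `alpha` weight, and the scoring functions are all assumptions chosen for clarity.

```typescript
interface Doc {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors; 0 if either has zero length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB);
}

// Fraction of query terms that appear as whole words in the document text.
function keywordScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const words = new Set(text.toLowerCase().split(/\W+/));
  return terms.filter((term) => words.has(term)).length / terms.length;
}

// Weighted blend: alpha weights the semantic score, (1 - alpha) the keyword score.
function hybridRank(query: string, queryEmbedding: number[], docs: Doc[], alpha = 0.5) {
  return docs
    .map((doc) => ({
      id: doc.id,
      score:
        alpha * cosine(queryEmbedding, doc.embedding) +
        (1 - alpha) * keywordScore(query, doc.text),
    }))
    .sort((x, y) => y.score - x.score);
}
```

In a real deployment the embeddings come from the embedding model and the blending happens inside the search engine; the point here is only that neither signal alone decides the ranking.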

Dynamic /Render

On-the-fly transformations via URL. Access standard transforms, smart cropping, background removal, and MIMI natural language editing instantly via our flat REST API.
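
URL-driven transforms are easiest to work with through a small builder. The sketch below is illustrative only: the base URL and every parameter name (`w`, `crop`, and so on) are hypothetical placeholders, since the actual /render parameter scheme is not described here.

```typescript
// All parameter names passed in are hypothetical placeholders, not documented options.
type RenderParams = Record<string, string | number | boolean>;

// Append transform parameters to a render URL as a query string.
function buildRenderUrl(baseUrl: string, params: RenderParams): string {
  const url = new URL(baseUrl);
  // Sort keys so the same transform always produces the same URL,
  // which keeps CDN cache keys stable.
  for (const key of Object.keys(params).sort()) {
    url.searchParams.set(key, String(params[key]));
  }
  return url.toString();
}
```

Because the transform lives entirely in the URL, the result can be dropped straight into an `<img>` tag or cached at the edge without any server-side state.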


The Flat REST API

A comprehensive JSON REST API designed for human developers, with secure webhooks for event streams, built-in DAM scaling, and predictable latency.
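
"Secure" webhooks usually means signed payloads. The sketch below shows one common pattern, a hex-encoded HMAC-SHA256 over the raw request body, verified in constant time. The signing scheme is an assumption made for illustration; how Picsha actually signs webhooks, and which header carries the signature, is not stated here.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: the sender signs the raw body with HMAC-SHA256 and sends the hex digest.
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Recompute the signature locally and compare it to the received one.
function verifyWebhook(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Verification should run against the raw request bytes before any JSON parsing, since re-serializing the body can change whitespace and invalidate the signature.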


Drop-in React SDK

Instantly bring the platform into your applications using <PicshaImage /> and <PicshaUploadWidget />. Zero backend wiring required.

Model Context Protocol (MCP): Natively Supported

The visual cortex for autonomous agents.

As AI agents become the primary consumers of APIs, they need a place to "see," manipulate, and store media. Picsha ships a native Model Context Protocol (MCP) server that exposes 13+ tools over Server-Sent Events (SSE).

Zero-config Claude Desktop Integration Bridge.
Agents can dynamically invoke Generative Inpainting, Summarization, and DAM curation.
Natural language search filters dynamically compiled by LLMs.

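// Connect an AI agent via the standard MCP TypeScript SDK
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

const transport = new SSEClientTransport(
  new URL("https://api.picsha.ai/v1/mcp/sse"),
  // The SDK takes extra request headers through requestInit
  { requestInit: { headers: { "Authorization": `Bearer ${TOKEN}` } } }
);

// "my-agent" is a placeholder client name
const mcpClient = new Client({ name: "my-agent", version: "1.0.0" });
await mcpClient.connect(transport);

// 13 native tools exposed instantly:
// ["search_assets", "trigger_url_ingest", "generate_render_url", "moderate_asset", ...]
const { tools } = await mcpClient.listTools();

// Run deep searches contextually
const result = await mcpClient.callTool({
  name: "search_assets",
  arguments: { query: "sunset photos", mode: "ai" }
});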
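
The result of a tools/call request follows the MCP result shape: a `content` array of typed blocks plus an optional `isError` flag. A minimal sketch of unpacking it is shown below; `toolResultText` is a hypothetical helper, and what the `search_assets` tool actually puts in its text blocks is not specified here.

```typescript
// Minimal view of an MCP tool result: typed content blocks and an error flag.
interface ContentBlock {
  type: string;
  text?: string;
}

interface ToolResult {
  content: ContentBlock[];
  isError?: boolean;
}

// Join all text blocks into one string; throw if the tool reported an error.
function toolResultText(result: ToolResult): string {
  const text = result.content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("\n");
  if (result.isError) throw new Error(text || "tool call failed");
  return text;
}
```

Non-text blocks such as images are skipped here, so an agent that needs them would handle those block types separately.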
