Picsha AI
Developer Documentation

Python SDK

The official @picsha-ai/python-sdk is designed for data scientists, computational biologists, and ML engineers. It simplifies integrating complex imaging datasets into Python machine learning pipelines.

Installation

We recommend using poetry:

poetry add picsha

Or pip:

pip install picsha

Quickstart (Synchronous)

import picsha

client = picsha.Client(api_key="sk_your_key_here")

# 1. Semantic Search for biological datasets
results = client.search(
    query="fluorescent cancer cell cultures",
    mode="ai"
)

# 2. Get transformed URLs for Pandas or PyTorch integration
for asset in results.assets:
    url = asset.generate_url(width=512, height=512, format="webp")
    print(url)

High-Throughput (Asynchronous)

For batch processing, such as uploading thousands of scientific imaging files in parallel, use the built-in AsyncClient, which is built on httpx and asyncio:

import picsha
import asyncio

async def main():
    async with picsha.AsyncClient(api_key="sk_your_key_here") as client:
        # Uploading large RAW/HEIC files asynchronously
        upload_result = await client.upload(
            file_path="./data/sample_01.heic",
            tags=["assay:123", "cancer_cells"]
        )
        print(f"Uploaded: {upload_result.asset.id}")

asyncio.run(main())

Key Use Cases

  • "Dark Data" Retrieval: Heavy files (RAW, HEIC, TIFF) are encoded automatically on upload and their metadata is extracted, so they can be discovered through natural-language vector search.
  • On-the-Fly ML Pre-Processing: Use asset.generate_url(width=512, height=512, format="webp") to have the Picsha edge servers standardize dimensions and formats before the byte stream reaches your PyTorch/TensorFlow pipeline.
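
The pre-processing use case can be sketched as a small generator that requests every asset at the same size and format and yields the downloaded bytes. This is an illustration under stated assumptions rather than SDK functionality: fetch_bytes and iter_standardized_images are hypothetical helpers, the download uses the standard library's urllib, and the only SDK call relied on is asset.generate_url(width=..., height=..., format=...) as shown above. Decoding the bytes into tensors is left to whichever image library the pipeline already uses.

```python
import urllib.request
from typing import Callable, Iterable, Iterator, Tuple


def fetch_bytes(url: str, timeout: float = 30.0) -> bytes:
    """Download one URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def iter_standardized_images(
    assets: Iterable[object],
    size: int = 512,
    image_format: str = "webp",
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> Iterator[Tuple[str, bytes]]:
    """Yield (url, image_bytes) for each asset at a fixed size and format.

    Each asset is assumed to expose generate_url(width, height, format).
    The fetch argument is injectable so the loop can be tested offline.
    """
    for asset in assets:
        url = asset.generate_url(width=size, height=size, format=image_format)
        yield url, fetch(url)
```

Because every image arrives with the same dimensions and format, the consuming code can stack the decoded arrays into a batch without a separate resize step.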