Python SDK
The official @picsha-ai/python-sdk is tailored specifically for data scientists, computational biologists, and ML engineers. It simplifies the integration of complex imaging datasets into Python machine learning pipelines.
Installation
We recommend using poetry:
poetry add picsha
Or pip:
pip install picsha
Quickstart (Synchronous)
import picsha
client = picsha.Client(api_key="sk_your_key_here")
# 1. Semantic Search for biological datasets
results = client.search(
    query="fluorescent cancer cell cultures",
    mode="ai"
)
# 2. Get transformed URLs for Pandas or PyTorch integration
for asset in results.assets:
    url = asset.generate_url(width=512, height=512, format="webp")
    print(url)
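For Pandas work, it can help to collect the transformed URLs into rows rather than printing them. The following is a minimal sketch, not part of the SDK: build_manifest is a hypothetical helper, and it assumes each search-result asset exposes an id attribute (as the uploaded asset does in the async example) alongside generate_url.

```python
from __future__ import annotations

from typing import Any, Iterable


def build_manifest(
    assets: Iterable[Any], size: int = 512, fmt: str = "webp"
) -> list[dict[str, Any]]:
    """Return one row per asset: its id and a resized, re-encoded URL.

    Hypothetical helper. Assumes each asset has generate_url(width, height,
    format) as in the quickstart; the id attribute is an assumption and
    falls back to None if missing.
    """
    rows = []
    for asset in assets:
        rows.append(
            {
                "asset_id": getattr(asset, "id", None),
                "url": asset.generate_url(width=size, height=size, format=fmt),
            }
        )
    return rows
```

The result is a plain list of dicts, so pandas.DataFrame(build_manifest(results.assets)) yields a two-column frame that can be joined against assay metadata.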
High-Throughput (Asynchronous)
For batch processing or uploading thousands of scientific imaging files in parallel, you can use the built-in AsyncClient powered by httpx and asyncio:
import picsha
import asyncio
async def main():
    async with picsha.AsyncClient(api_key="sk_your_key_here") as client:
        # Uploading large RAW/HEIC files asynchronously
        upload_result = await client.upload(
            file_path="./data/sample_01.heic",
            tags=["assay:123", "cancer_cells"]
        )
        print(f"Uploaded: {upload_result.asset.id}")
asyncio.run(main())
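The example above uploads a single file. To upload many files in parallel without opening thousands of connections at once, one common pattern is to cap concurrency with an asyncio.Semaphore. This is a sketch under assumptions: upload_many and max_concurrency are hypothetical names, and the only SDK call it relies on is client.upload(file_path=..., tags=...) as shown above.

```python
import asyncio
from typing import Any, Iterable, List


async def upload_many(
    client: Any,
    paths: Iterable[str],
    tags: List[str],
    max_concurrency: int = 8,
) -> List[Any]:
    """Upload every path, with at most max_concurrency uploads in flight.

    Hypothetical helper, not an SDK function. Results come back in the
    same order as paths; a failed upload appears as the raised exception
    object instead of aborting the whole batch.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def upload_one(path: str) -> Any:
        # Wait for a free slot before starting this upload.
        async with semaphore:
            return await client.upload(file_path=path, tags=tags)

    return await asyncio.gather(
        *(upload_one(path) for path in paths),
        return_exceptions=True,
    )
```

Inside main(), this could be called as results = await upload_many(client, paths, tags=["assay:123"]), then filtered with isinstance(result, Exception) to find the files that need a retry.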
Key Use Cases
- "Dark Data" Retrieval: Automatically encode heavy files (RAW, HEIC, TIFF) upon upload, extracting metadata so they can be discovered via natural language vector searches.
- On-the-Fly ML Pre-Processing: Use asset.generate_url(width=512, height=512, format="webp") to have the Picsha edge servers standardize dimensions and formats before the byte stream hits your PyTorch/TensorFlow pipeline.
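As an illustration of the pre-processing use case, the sketch below wraps a list of assets in a lazy, index-addressable container that fetches the already-resized bytes on demand. It is an assumption-laden example, not SDK code: UrlImageDataset and fetch_bytes are hypothetical names, the download uses the standard library's urllib rather than anything Picsha provides, and decoding the bytes into tensors is left to your own pipeline.

```python
from typing import Any, Callable, Iterable
from urllib.request import urlopen


def fetch_bytes(url: str, timeout: float = 30.0) -> bytes:
    """Download the raw bytes behind a URL using only the standard library."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


class UrlImageDataset:
    """Lazy container with __len__ and __getitem__ (hypothetical helper).

    URLs are generated once up front; image bytes are only downloaded
    when an index is accessed. The fetch function is injectable so it
    can be swapped for a cached or authenticated downloader.
    """

    def __init__(
        self,
        assets: Iterable[Any],
        size: int = 512,
        fmt: str = "webp",
        fetch: Callable[[str], bytes] = fetch_bytes,
    ) -> None:
        # Ask for uniform dimensions and format so every sample matches.
        self._urls = [
            asset.generate_url(width=size, height=size, format=fmt)
            for asset in assets
        ]
        self._fetch = fetch

    def __len__(self) -> int:
        return len(self._urls)

    def __getitem__(self, index: int) -> bytes:
        return self._fetch(self._urls[index])
```

Because every URL requests the same width, height, and format, each sample arrives with the same dimensions and encoding, which removes the need for a resize step on the training machine.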