How It Works

Overview

BiomAPI receives a biometry file, routes it to the appropriate extraction engine, returns structured data in a standardized response, and optionally encrypts and stores it under a BiomPIN for secure sharing. Every successful response follows the same StandardAPIResponse shape regardless of which engine processed the file.

File Routing

When you POST a file to /api/v1/biom/process, the server inspects the file extension and selects one of two engines:

Extension	Engine	What happens
PDF, JPG, PNG, GIF, BMP	BiomAI	LLM extraction via Gemini
JSON	BiomJSON	Schema validation + metadata preservation

Routing is purely by extension — no content sniffing. Unsupported extensions are rejected immediately with 400.
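As a rough illustration of extension-only routing, the table above can be expressed as a lookup. This is a minimal sketch, not the server's code: the function name `select_engine`, the `UnsupportedExtension` error, and case-insensitive matching are all assumptions.

```python
from pathlib import Path

# Extension-to-engine map copied from the routing table above.
ENGINE_BY_EXTENSION = {
    ".pdf": "BiomAI",
    ".jpg": "BiomAI",
    ".png": "BiomAI",
    ".gif": "BiomAI",
    ".bmp": "BiomAI",
    ".json": "BiomJSON",
}


class UnsupportedExtension(ValueError):
    """Hypothetical error that a server would map to HTTP 400."""


def select_engine(filename: str) -> str:
    # Only the extension is inspected; the file contents are never examined.
    # Lower-casing the extension is an assumption, not documented behavior.
    ext = Path(filename).suffix.lower()
    try:
        return ENGINE_BY_EXTENSION[ext]
    except KeyError:
        raise UnsupportedExtension(f"unsupported extension: {ext or '(none)'}") from None
```

Because nothing but the name is checked, a PDF renamed to `.json` would be handed to BiomJSON and fail schema validation there rather than at the routing step.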

The Three Processing Paths

BiomAI — LLM Extraction

BiomAI transmits the file bytes directly to the Google Gemini API alongside an extraction prompt. Gemini responds with a structured BiometryReport JSON. The server validates this through Pydantic and wraps it in a StandardAPIResponse.

Images are sent at HIGH resolution; PDFs at MEDIUM.
If Gemini returns 429 or 503 (overload), the server retries up to 3 times with exponential backoff and jitter before returning an error to the caller.
A hard server-side timeout is applied (default 30 s); exceeding it returns 504.
BYOK: pass X-Gemini-API-Key to use your own Gemini quota — the server creates a temporary extraction client with your key and tracks usage under a separate biomai_byok rate limit bucket
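The retry behavior described above might look roughly like the following. It is a sketch under stated assumptions: `UpstreamError`, `call_with_retries`, the 1-second base delay, and the "full jitter" strategy are illustrative choices, and the 30 s hard timeout is not modeled.

```python
import random
import time

RETRYABLE_STATUSES = {429, 503}


class UpstreamError(Exception):
    """Hypothetical error carrying the HTTP status returned by the LLM API."""

    def __init__(self, status: int):
        super().__init__(f"upstream returned {status}")
        self.status = status


def call_with_retries(call, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on 429/503 retry up to max_retries times, then re-raise."""
    attempt = 0
    while True:
        try:
            return call()
        except UpstreamError as exc:
            if exc.status not in RETRYABLE_STATUSES or attempt >= max_retries:
                raise
            # Exponential backoff: the delay ceiling doubles on each retry.
            # Full jitter: sleep a random amount between 0 and that ceiling.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
            attempt += 1
```

With `max_retries=3` the upstream call is attempted at most four times: once initially and three times as retries. The jitter spreads out clients that were throttled at the same moment so they do not all retry together.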

BiomJSON — Validation and Round-Trips

BiomJSON validates JSON payloads against the BiometryReport schema. It’s the engine for the round-trip workflow: extract a PDF with BiomAI → download JSON → edit locally → re-upload.

Metadata preservation: When a re-uploaded JSON contains BiomAI provenance metadata, BiomJSON reconstructs and preserves it — you don’t lose the original LLM metrics (model, token counts, timing) just because the data passed through an editor. The input_schema_version field is populated with the schema version declared in the uploaded JSON, making schema drift detectable.

If the uploaded JSON has no recognizable BiomAI provenance (or comes from a different origin), it’s attributed as BiomDIRECT.

Schema versioning: BiomJSON checks that the major version of the uploaded JSON’s schema_version matches the server’s current version. Minor/patch differences are tolerated; major version mismatch returns 422.
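The major-version rule can be sketched in a few lines. This assumes `schema_version` is a dotted `MAJOR.MINOR.PATCH` string; the helper names are hypothetical.

```python
def major_version(version: str) -> int:
    # Assumes a dotted version string such as "2.1.0".
    return int(version.split(".", 1)[0])


def is_compatible(uploaded_version: str, server_version: str) -> bool:
    # Minor and patch differences are tolerated; only the major part must match.
    return major_version(uploaded_version) == major_version(server_version)
```

For example, an upload declaring `2.1.0` would be accepted by a server on `2.4.3`, while one declaring `1.9.0` would be the 422 case.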

BiomDIRECT — Direct Data Entry

BiomDIRECT is an attribution label, not a separate engine. Any biometry data not extracted by the Gemini LLM is tagged method: "BiomDIRECT" in the response metadata:

Manual transcription via the web UI Transcribe tab
JSON constructed by an external script, EHR export, or automated pipeline
Re-uploaded JSONs without BiomAI provenance

BiomDIRECT metadata is minimal: method, timestamp, filename, and input_schema_version. This creates a complete audit trail — every result is attributable to either an LLM run or a direct data construction.

BiomPIN

After any successful extraction, the server can optionally encrypt the full response and store it under a PIN. BiomPIN is opt-in: pass biompin=true in the request.

Two-part PIN design

word-word  -  123456
└──────┘      └────┘
 share_id   numeric PIN
(stored)   (never stored)

The word-word share ID is the database primary key — it’s stored in plain text and used to look up the record. The 6-digit numeric PIN is the encryption secret; it is never stored anywhere on the server. The decryption key is derived from the numeric PIN using Argon2id (a memory-hard key derivation function), with the SHA-256 hash of the share ID as salt.

Because the numeric PIN is never stored, the server cannot decrypt data without it. Even full database read access doesn’t compromise stored biometry data.
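To make the two-part design concrete, here is a minimal sketch of splitting a BiomPIN and deriving a key. Several things are assumptions: the exact separator format, the helper names, the use of the third-party argon2-cffi package, and every Argon2id cost parameter shown. Only the overall shape (Argon2id over the numeric PIN, salted with SHA-256 of the share ID) comes from the description above.

```python
import hashlib


def split_biompin(biompin: str) -> tuple[str, str]:
    """Split 'word-word-123456' into (share_id, numeric_pin)."""
    # The numeric PIN is whatever follows the last hyphen.
    share_id, _, pin = biompin.strip().rpartition("-")
    share_id, pin = share_id.strip(), pin.strip()
    if not share_id or len(pin) != 6 or not (pin.isascii() and pin.isdigit()):
        raise ValueError("expected <word-word>-<6 digits>")
    return share_id, pin


def salt_for(share_id: str) -> bytes:
    # The salt is the SHA-256 hash of the share ID (32 bytes).
    return hashlib.sha256(share_id.encode("utf-8")).digest()


def derive_key(biompin: str) -> bytes:
    # Requires the third-party argon2-cffi package; imported lazily so the
    # helpers above work without it.
    from argon2.low_level import Type, hash_secret_raw

    share_id, pin = split_biompin(biompin)
    return hash_secret_raw(
        secret=pin.encode("utf-8"),
        salt=salt_for(share_id),
        time_cost=3,            # assumed; the real parameters are not documented here
        memory_cost=64 * 1024,  # in KiB; assumed
        parallelism=1,          # assumed
        hash_len=32,            # 256-bit key; assumed
        type=Type.ID,           # Argon2id
    )
```

Only the share ID would be needed to find the record; the key itself can be recomputed only by someone who also holds the six digits.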

Brute-force protection

After 3 wrong numeric PIN attempts, the database record is permanently deleted. This eliminates the stored ciphertext, making further brute-force attempts pointless. The response is 404 — there is no lockout period.
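The delete-on-third-failure policy can be sketched with an in-memory store. This is an illustration only: `PinStore`, `WrongPin`, `RecordNotFound`, and the `try_decrypt` callback are hypothetical, and two details are assumptions rather than documented behavior: that the third wrong attempt itself yields the 404, and that the counter is not reset by a successful retrieval.

```python
MAX_WRONG_ATTEMPTS = 3


class RecordNotFound(Exception):
    """Hypothetical error that a server would map to HTTP 404."""


class WrongPin(Exception):
    """Hypothetical error for a wrong PIN while the record still exists."""


class PinStore:
    def __init__(self, try_decrypt):
        # try_decrypt(ciphertext, pin) returns the plaintext,
        # or None when the PIN is wrong.
        self._try_decrypt = try_decrypt
        self._records = {}

    def put(self, share_id, ciphertext):
        self._records[share_id] = {"ciphertext": ciphertext, "wrong_attempts": 0}

    def retrieve(self, share_id, pin):
        record = self._records.get(share_id)
        if record is None:
            raise RecordNotFound(share_id)
        plaintext = self._try_decrypt(record["ciphertext"], pin)
        if plaintext is not None:
            return plaintext
        record["wrong_attempts"] += 1
        if record["wrong_attempts"] >= MAX_WRONG_ATTEMPTS:
            # Third strike: the ciphertext is deleted for good, so there is
            # nothing left to brute-force and no lockout timer to manage.
            del self._records[share_id]
            raise RecordNotFound(share_id)
        raise WrongPin(share_id)
```

Once the record is gone, even the correct PIN produces the same not-found result as a share ID that never existed.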

Expiry and cleanup

Records expire after 744 hours (31 days) by default. Expired records are purged lazily after each new store operation — there is no background cleanup process.
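Lazy purging means the cleanup piggybacks on writes. A minimal in-memory sketch, with a hypothetical `ExpiringStore` class and an injectable `now` for testing, could look like this (retrieval and its own expiry handling are deliberately left out):

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(hours=744)  # 31 days


class ExpiringStore:
    def __init__(self, ttl=DEFAULT_TTL):
        self._ttl = ttl
        self._records = {}  # share_id -> (expires_at, payload)

    def store(self, share_id, payload, now=None):
        now = now or datetime.now(timezone.utc)
        self._records[share_id] = (now + self._ttl, payload)
        # Lazy cleanup: expired rows are removed only as a side effect of a
        # store operation, so no background worker is needed.
        expired = [key for key, (expires_at, _) in self._records.items()
                   if expires_at <= now]
        for key in expired:
            del self._records[key]

    def share_ids(self):
        return set(self._records)
```

One consequence of this design is that on an idle server an expired record can remain on disk until the next store operation triggers the purge.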

Rate Limiting

BiomAPI tracks usage across four independent buckets:

Bucket	Covers
`biomai`	PDF/image extraction, shared server Gemini quota
`biomai_byok`	PDF/image extraction, user-supplied Gemini key
`biomjson`	JSON validation — no LLM call
`retrieve`	BiomPIN retrieval

Applied after processing: Rate limits are consumed only after a successful operation. A file that fails validation, triggers an LLM error, or times out does not count against your quota. This is intentional: failed attempts shouldn’t penalize legitimate usage.

Dual tracking: Every request is tracked by both the client IP and the authenticated user ID (if present). Public callers share per-IP limits; authenticated callers have custom per-user quotas. Both are checked independently.

Sliding window: The window is a continuous 24 hours (not a midnight calendar reset). Usage timestamps roll off exactly 24 hours after recording.
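The three properties above (consume on success, independent keys, 24-hour sliding window) can be sketched as a small limiter. The class and method names are hypothetical, a single limit is used for every key for brevity, and timestamps are plain seconds supplied by the caller.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 60 * 60


class SlidingWindowLimiter:
    def __init__(self, limit, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # (bucket, key) -> usage timestamps

    def _prune(self, hits, now):
        # A timestamp rolls off exactly `window` seconds after it was recorded.
        while hits and hits[0] <= now - self.window:
            hits.popleft()

    def allowed(self, bucket, key, now):
        hits = self._hits[(bucket, key)]
        self._prune(hits, now)
        return len(hits) < self.limit

    def record_success(self, bucket, key, now):
        # Called only after the operation succeeded, so failed requests
        # never consume quota.
        self._hits[(bucket, key)].append(now)
```

For dual tracking, a caller would check `allowed` once with the client IP as the key and once with the user ID, and proceed only if both pass; after a successful operation it would call `record_success` for each.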

Response Structure

Every successful response is a StandardAPIResponse with four top-level fields:

data         → BiometryReport  (biometer, patient, right_eye, left_eye)
extra_data   → ExtraReport | null  (notes, posterior_keratometry)
metadata     → ResponseMetadata  (schema_version, app_version, extraction)
biompin      → BiomPINInfo | null  (pin, expires_at, db_id)

Why data and extra_data are separate: BiometryReport in data contains the 12 core measurements present on virtually every device. extra_data holds optional, device-dependent fields — currently posterior keratometry (PK1/PK2) and notes. This separation means adding new optional fields doesn’t require bumping the core schema version.

Why metadata.extraction is a discriminated union: metadata.extraction is either BiomAIMetadata (with full LLM metrics) or BiomDIRECTMetadata (minimal). Clients can branch on method to decide whether to surface token usage, processing time, and similar details. The type is determined by what actually happened, not by the endpoint called.
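On the client side, branching on the discriminator might look like the sketch below. It works on the parsed JSON response as a plain dict. The `"BiomAI"` method value and the `model` key are assumptions for illustration; only `method: "BiomDIRECT"` and the `filename` field are named in the text above.

```python
def describe_extraction(response: dict) -> str:
    extraction = response["metadata"]["extraction"]
    method = extraction["method"]
    if method == "BiomAI":
        # LLM metrics are only present on this branch.
        return f"LLM extraction ({extraction.get('model', 'unknown model')})"
    if method == "BiomDIRECT":
        return f"Direct entry ({extraction.get('filename', 'unknown file')})"
    raise ValueError(f"unknown extraction method: {method!r}")
```

Raising on an unrecognized method, rather than guessing, keeps a client from silently mislabeling results if a new attribution type is introduced later.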

The db_id field: biompin.db_id (also returned as db_id by GET /api/v1/status) identifies the database instance. It’s stable across server restarts but changes when the BiomPIN database is wiped. Client apps should check this value on startup to detect a database reset and purge stale local history entries. The BiomAPI web app handles this automatically.
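A client-side startup check could be as small as the following sketch. The shape of `local_state` (a dict with `db_id` and `history` keys) and the function name are hypothetical; fetching the current `db_id` from the status endpoint is left to the caller.

```python
def reconcile_history(local_state: dict, server_db_id: str) -> bool:
    """Purge local history if the server's db_id changed.

    Returns True when a purge happened.
    """
    if local_state.get("db_id") == server_db_id:
        return False
    # The BiomPIN database was reset (or this is the first run): stored
    # share IDs can no longer resolve, so drop them and remember the new ID.
    local_state["history"] = []
    local_state["db_id"] = server_db_id
    return True
```

On a first run there is no remembered `db_id`, so the function clears an already empty history and simply records the current value.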