How It Works
Overview
BiomAPI receives a biometry file, routes it to the appropriate extraction engine, returns structured data in a standardized response, and optionally encrypts and stores it under a BiomPIN for secure sharing. Every successful response follows the same StandardAPIResponse shape regardless of which engine processed the file.
File Routing
When you POST /api/v1/biom/process, the server inspects the file extension and selects one of two engines:
| Extension | Engine | What happens |
|---|---|---|
| PDF, JPG, PNG, GIF, BMP | BiomAI | LLM extraction via Gemini |
| JSON | BiomJSON | Schema validation + metadata preservation |
Routing is purely by extension — no content sniffing. Unsupported extensions are rejected immediately with 400.
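The routing rule above can be sketched as a plain lookup table. This is an illustration only, not the server's code: `select_engine` and `UnsupportedExtension` are hypothetical names, and lower-casing the extension before lookup is an assumption the text does not confirm.

```python
from pathlib import Path

# Extension-to-engine mapping taken from the routing table above.
ENGINE_BY_EXTENSION = {
    ".pdf": "BiomAI",
    ".jpg": "BiomAI",
    ".png": "BiomAI",
    ".gif": "BiomAI",
    ".bmp": "BiomAI",
    ".json": "BiomJSON",
}


class UnsupportedExtension(ValueError):
    """Hypothetical error that a server would translate into HTTP 400."""


def select_engine(filename: str) -> str:
    """Pick an engine from the file extension alone; file content is never inspected."""
    # Assumption: matching is case-insensitive.
    ext = Path(filename).suffix.lower()
    if ext not in ENGINE_BY_EXTENSION:
        raise UnsupportedExtension(f"unsupported extension: {ext or '(none)'}")
    return ENGINE_BY_EXTENSION[ext]
```

Because only the extension is consulted, a PDF renamed to `.json` would be handed to BiomJSON and fail schema validation there rather than being rerouted.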
The Three Processing Paths
BiomAI — LLM Extraction
BiomAI transmits the file bytes directly to the Google Gemini API alongside an extraction prompt. Gemini responds with a structured BiometryReport JSON. The server validates this through Pydantic and wraps it in a StandardAPIResponse.
- Images are sent at HIGH resolution; PDFs at MEDIUM
- If Gemini returns 429 or 503 (overload), the server retries up to 3 times with exponential backoff and jitter before returning an error to the caller
- A hard server-side timeout is applied (default 30 s); breach returns 504
- BYOK: pass `X-Gemini-API-Key` to use your own Gemini quota; the server creates a temporary extraction client with your key and tracks usage under a separate `biomai_byok` rate limit bucket
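The retry behaviour in the list above could look roughly like the following. It is a sketch under stated assumptions: "up to 3 times" is read as three retries after the first attempt, the 1-second base delay and the jitter formula are invented for illustration, and `call_with_retries` and `send` are hypothetical names.

```python
import random
import time

# Status codes treated as transient overload, per the list above.
RETRYABLE_STATUS = {429, 503}


def call_with_retries(send, max_retries=3, base_delay=1.0,
                      sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-retryable status or retries run out.

    send() must return a (status, body) tuple. The last result is returned
    to the caller even if it is still a 429 or 503.
    """
    attempt = 0
    while True:
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt == max_retries:
            return status, body
        # Exponential backoff (base, 2x, 4x, ...) multiplied by a random
        # jitter factor in [1, 2) so that clients do not retry in lockstep.
        delay = base_delay * (2 ** attempt) * (1 + rng())
        sleep(delay)
        attempt += 1
```

Passing `sleep` and `rng` as parameters keeps the function testable without real waiting.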
BiomJSON — Validation and Round-Trips
BiomJSON validates JSON payloads against the BiometryReport schema. It’s the engine for the round-trip workflow: extract a PDF with BiomAI → download JSON → edit locally → re-upload.
Metadata preservation: When a re-uploaded JSON contains BiomAI provenance metadata, BiomJSON reconstructs and preserves it — you don’t lose the original LLM metrics (model, token counts, timing) just because the data passed through an editor. The input_schema_version field is populated with the schema version declared in the uploaded JSON, making schema drift detectable.
If the uploaded JSON has no recognizable BiomAI provenance (or comes from a different origin), it’s attributed as BiomDIRECT.
Schema versioning: BiomJSON checks that the major version of the uploaded JSON’s schema_version matches the server’s current version. Minor/patch differences are tolerated; major version mismatch returns 422.
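A minimal sketch of the major-version check described above, assuming `schema_version` is a dotted `MAJOR.MINOR.PATCH` string (the text does not state the exact format). The function names are hypothetical.

```python
def major_version(schema_version: str) -> int:
    """Return the leading MAJOR component of a dotted version string."""
    return int(schema_version.split(".", 1)[0])


def is_schema_compatible(uploaded_version: str, server_version: str) -> bool:
    """True when the major versions match; minor and patch are ignored."""
    return major_version(uploaded_version) == major_version(server_version)
```

Under this reading, an upload declaring `2.1.0` would be accepted by a server at `2.4.3`, while one declaring `1.9.0` would be rejected with 422.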
BiomDIRECT — Direct Data Entry
BiomDIRECT is an attribution label, not a separate engine. Any biometry data not extracted by the Gemini LLM is tagged method: "BiomDIRECT" in the response metadata:
- Manual transcription via the web UI Transcribe tab
- JSON constructed by an external script, EHR export, or automated pipeline
- Re-uploaded JSONs without BiomAI provenance
BiomDIRECT metadata is minimal: method, timestamp, filename, and input_schema_version. This creates a complete audit trail — every result is attributable to either an LLM run or a direct data construction.
BiomPIN — Secure Sharing
After any successful extraction, the server can optionally encrypt the full response and store it under a PIN. BiomPIN is opt-in (pass biompin=true in the request).
Two-part PIN design
    word-word - 123456
    └───────┘   └────┘
    share_id    numeric PIN
    (stored)    (never stored)

The word-word share ID is the database primary key; it is stored in plain text and used to look up the record. The 6-digit numeric PIN is the encryption secret; it is never stored anywhere on the server. The decryption key is derived from the numeric PIN using Argon2id (a memory-hard key derivation function), with the SHA-256 hash of the share ID as salt.
Because the numeric PIN is never stored, the server cannot decrypt data without it. Even full database read access doesn’t compromise stored biometry data.
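To make the two-part design concrete, the sketch below splits a full PIN into its parts and computes the salt. Several things here are assumptions: the full PIN is taken to be written as `word-word-123456` with no spaces, the share ID is encoded as UTF-8 before hashing, and the function names are hypothetical. The Argon2id step itself is not shown because it needs a third-party library and its parameters are not given here.

```python
import hashlib


def split_biompin(full_pin: str) -> tuple[str, str]:
    """Split 'word-word-123456' into (share_id, numeric_pin).

    Assumption: the last hyphen separates the share ID from the 6-digit PIN.
    """
    share_id, sep, numeric_pin = full_pin.strip().rpartition("-")
    is_six_digits = (
        len(numeric_pin) == 6 and numeric_pin.isascii() and numeric_pin.isdigit()
    )
    if not sep or not share_id or not is_six_digits:
        raise ValueError("expected the form word-word-123456")
    return share_id, numeric_pin


def salt_for_share_id(share_id: str) -> bytes:
    """SHA-256 of the share ID, used as the KDF salt (UTF-8 is an assumption)."""
    return hashlib.sha256(share_id.encode("utf-8")).digest()
```

Only `share_id` would ever reach the database; `numeric_pin` and the salt would be fed to Argon2id to derive the key, which is then discarded after use.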
Brute-force protection
After 3 wrong numeric PIN attempts, the database record is permanently deleted. This eliminates the stored ciphertext, making further brute-force attempts pointless. The response is 404; there is no lockout period.
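The delete-on-third-failure rule might be modelled as below, using an in-memory dictionary in place of the real database. `ShareStore` and `try_decrypt` are hypothetical, and the 401 returned for the first two wrong attempts is an assumption: the text only specifies 404 once the record is gone.

```python
class ShareStore:
    """In-memory stand-in for the BiomPIN table, for illustration only."""

    MAX_WRONG_ATTEMPTS = 3

    def __init__(self):
        self._records = {}

    def put(self, share_id, ciphertext):
        self._records[share_id] = {"ciphertext": ciphertext, "wrong": 0}

    def retrieve(self, share_id, try_decrypt):
        """try_decrypt(ciphertext) returns the plaintext, or None on a wrong PIN."""
        record = self._records.get(share_id)
        if record is None:
            return 404, None
        plaintext = try_decrypt(record["ciphertext"])
        if plaintext is not None:
            # Whether a correct PIN resets the counter is not specified;
            # this sketch leaves the counter unchanged.
            return 200, plaintext
        record["wrong"] += 1
        if record["wrong"] >= self.MAX_WRONG_ATTEMPTS:
            # Permanent deletion: nothing is left to brute-force.
            del self._records[share_id]
            return 404, None
        return 401, None  # assumed status for a wrong PIN before deletion
```

After the third failure even the correct PIN yields 404, because the ciphertext no longer exists.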
Expiry and cleanup
Records expire after 744 hours (31 days) by default. Expired records are purged lazily after each new store operation; there is no background cleanup process.
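Lazy purging can be sketched as follows. The class and its in-memory storage are hypothetical; the only behaviour taken from the text is the 744-hour default and the fact that cleanup runs as part of a store operation rather than on a timer.

```python
import time

DEFAULT_TTL_HOURS = 744  # 31 days, the default stated above


class ExpiringStore:
    """Illustrative in-memory store with purge-on-write semantics."""

    def __init__(self, ttl_hours=DEFAULT_TTL_HOURS, clock=time.time):
        self._ttl_seconds = ttl_hours * 3600
        self._clock = clock
        self._records = {}  # share_id -> (expires_at, ciphertext)

    def store(self, share_id, ciphertext):
        now = self._clock()
        self._records[share_id] = (now + self._ttl_seconds, ciphertext)
        # Lazy cleanup: expired rows are removed only here, after a store.
        expired = [k for k, (exp, _) in self._records.items() if exp <= now]
        for key in expired:
            del self._records[key]

    def get(self, share_id):
        record = self._records.get(share_id)
        if record is None or record[0] <= self._clock():
            return None  # expired rows are hidden even before they are purged
        return record[1]

    def __len__(self):
        return len(self._records)
```

One consequence of this design is that an expired record can remain on disk until the next store operation, although it is no longer retrievable.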
Rate Limiting
BiomAPI tracks usage across four independent engine buckets:
| Bucket | Covers |
|---|---|
| `biomai` | PDF/image extraction, shared server Gemini quota |
| `biomai_byok` | PDF/image extraction, user-supplied Gemini key |
| `biomjson` | JSON validation, no LLM call |
| `retrieve` | BiomPIN retrieval |
Post-processing application: Rate limits are consumed only after a successful operation. A file that fails validation, triggers an LLM error, or times out does not count against your quota. This is intentional — failed attempts shouldn’t penalize legitimate usage.
Dual tracking: Every request is tracked by both the client IP and the authenticated user ID (if present). Public callers share per-IP limits; authenticated callers have custom per-user quotas. Both are checked independently.
Sliding window: The window is a continuous 24 hours (not a midnight calendar reset). Usage timestamps roll off exactly 24 hours after recording.
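The sliding-window and post-processing rules can be combined in a small sketch. The class name, the per-key deque, and the split into `allowed` and `record_success` are illustrative choices, not the server's design; dual tracking would amount to checking one such limiter keyed by IP and another keyed by user ID.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 3600  # continuous 24-hour window


class SlidingWindowLimiter:
    """Per-key sliding-window counter where only successes consume quota."""

    def __init__(self, limit, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # key -> timestamps of successes

    def _prune(self, key, now):
        hits = self._hits[key]
        # A timestamp rolls off once it is a full window old.
        while hits and hits[0] <= now - self.window:
            hits.popleft()

    def allowed(self, key, now):
        """Check quota before doing the work; does not consume anything."""
        self._prune(key, now)
        return len(self._hits[key]) < self.limit

    def record_success(self, key, now):
        """Call only after the operation succeeded, so failures stay free."""
        self._hits[key].append(now)
```

A request handler would call `allowed` first, run the extraction, and call `record_success` only if the extraction returned a valid result.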
Response Structure
Every successful response is a StandardAPIResponse with four top-level fields:

    data       → BiometryReport      (biometer, patient, right_eye, left_eye)
    extra_data → ExtraReport | null  (notes, posterior_keratometry)
    metadata   → ResponseMetadata    (schema_version, app_version, extraction)
    biompin    → BiomPINInfo | null  (pin, expires_at, db_id)

Why data and extra_data are separate: BiometryReport in data contains the 12 core measurements present on virtually every device. extra_data holds optional, device-dependent fields, currently posterior keratometry (PK1/PK2) and notes. This separation means adding new optional fields doesn’t require bumping the core schema version.
Why the metadata discriminated union: metadata.extraction is either BiomAIMetadata (with full LLM metrics) or BiomDIRECTMetadata (minimal). Clients can branch on method to decide whether to surface token usage, processing time, etc. The type is determined by what actually happened, not by the endpoint called.
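On the client side, branching on `method` might look like the sketch below, which works on the parsed JSON as a plain dictionary. The `"BiomAI"` method value and the `model` key are assumptions inferred from the surrounding text, and `summarize_extraction` is a hypothetical helper.

```python
def summarize_extraction(response: dict) -> str:
    """Return a one-line provenance summary for a parsed StandardAPIResponse."""
    extraction = response["metadata"]["extraction"]
    method = extraction["method"]
    if method == "BiomAI":
        # Assumed key name; LLM metrics exist only on this branch.
        model = extraction.get("model", "unknown model")
        return f"Extracted by LLM ({model})"
    if method == "BiomDIRECT":
        version = extraction.get("input_schema_version", "unknown")
        return f"Entered directly (input schema {version})"
    raise ValueError(f"unexpected extraction method: {method!r}")
```

Raising on an unknown method, instead of silently defaulting, makes a future third metadata variant visible to the client immediately.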
The db_id field: biompin.db_id (and GET /api/v1/status’s db_id) identifies the database instance. It’s stable across server restarts but changes when the BiomPIN database is wiped. Client apps should check this value on startup to detect a database reset and purge stale local history entries. The BiomAPI web app handles this automatically.
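A client-side startup check for a database reset could be as small as the following. The shape of the local state (a dictionary with `db_id` and `history` keys) is invented for illustration; only the comparison against the server's `db_id` comes from the text.

```python
def reconcile_history(local_state: dict, server_db_id: str) -> dict:
    """Drop locally cached BiomPIN history if the server database was reset.

    local_state is a hypothetical structure: {"db_id": str, "history": list}.
    """
    if local_state.get("db_id") != server_db_id:
        # The stored share IDs point at records that no longer exist.
        return {"db_id": server_db_id, "history": []}
    return local_state
```

The `server_db_id` argument would come from `GET /api/v1/status` at startup, or from `biompin.db_id` in any response that includes a BiomPIN.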