Base URL
All API requests should be made to:
Endpoints
| Method | Endpoint | Description | Auth Required |
|---|---|---|---|
| POST | /v1/extract | Extract data from documents | Yes |
| POST | /v1/playground | Test extraction (rate limited) | No |
| GET | /health | Health check | No |
| GET | / | API information | No |
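
As a rough sketch of how the endpoint table maps onto requests, the snippet below builds (but does not send) a health-check request using only Python's standard library. The base URL `https://api.example.com` and the helper `build_request` are placeholders for illustration; the real base URL is not shown in this section.

```python
from urllib.parse import urljoin
from urllib.request import Request

# Placeholder: substitute the real base URL of the API.
BASE_URL = "https://api.example.com"

# Endpoints from the table above: path -> (HTTP method, auth required).
ENDPOINTS = {
    "/v1/extract": ("POST", True),
    "/v1/playground": ("POST", False),
    "/health": ("GET", False),
    "/": ("GET", False),
}


def build_request(path: str) -> Request:
    """Build an unsent request for one of the documented endpoints."""
    method, _auth_required = ENDPOINTS[path]
    return Request(urljoin(BASE_URL, path), method=method)


health = build_request("/health")
print(health.get_method(), health.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it. Endpoints marked as requiring auth need credentials added first; the authentication scheme is not described in this section.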
Request Format
All document extraction requests use multipart/form-data encoding:
Parameters
| Parameter | Type | Description |
|---|---|---|
| file | File | The document to process (PDF or DOCX, max 10MB) |
| output_schema | String | JSON Schema defining the extraction structure |
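
To make the two parameters concrete, here is a minimal sketch that encodes a `file` part and an `output_schema` part as a multipart/form-data body with the standard library. The helper `build_multipart`, the invoice schema, and the placeholder file bytes are all illustrative assumptions, not part of the API.

```python
import json
import uuid


def build_multipart(file_name: str, file_bytes: bytes, output_schema: dict):
    """Encode the `file` and `output_schema` fields as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    # output_schema is sent as a string, so the JSON Schema is serialized first.
    schema_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="output_schema"\r\n'
        "\r\n"
        f"{json.dumps(output_schema)}\r\n"
    ).encode("utf-8")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = schema_part + file_header + file_bytes + closing
    return body, f"multipart/form-data; boundary={boundary}"


# Hypothetical schema: the fields to extract from an invoice.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}

body, content_type = build_multipart(
    "invoice.pdf", b"%PDF-1.7 placeholder bytes", invoice_schema
)
print(content_type)
print(len(body), "bytes")
```

The body would be sent as a POST to /v1/extract with the returned value as the `Content-Type` header. An HTTP client library that handles multipart encoding would normally replace the manual encoding shown here.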
Response Format
All successful responses follow this structure:
Response Fields
The extracted data matching your schema structure.
Processing information for the request.
Content Types
Request
- Content-Type: multipart/form-data
Response
- Content-Type: application/json
Supported File Types
| Format | Extension | Max Size | Processing |
|---|---|---|---|
| PDF | .pdf | 10 MB | Native multimodal AI |
| Microsoft Word | .docx | 10 MB | Markdown conversion |
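
The limits in the table above can be checked on the client before uploading. The following is a minimal sketch: `check_upload` is a hypothetical helper, and it assumes "10 MB" means 10 * 1024 * 1024 bytes, which this section does not confirm.

```python
from pathlib import Path

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".pdf", ".docx"}


def check_upload(file_name: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(file_name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; the limit is {MAX_BYTES}")
    return problems


print(check_upload("report.PDF", 2_000_000))
print(check_upload("notes.txt", 11 * 1024 * 1024))
```

A check like this only saves a round trip; the server remains the authority on what it accepts.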
SDKs
We provide official SDKs for popular languages:
Python
Use Pydantic models for type-safe extraction
Node.js
Use Zod schemas with full TypeScript support
