API Overview - Parsefy

Base URL

All API requests should be made to:

https://api.parsefy.io

Overview

Parsefy is an API that turns financial PDFs (invoices, receipts, bills) into structured JSON with validation and risk signals.

Endpoints

Method	Endpoint	Description	Auth Required
`POST`	`/v1/extract`	Extract data from documents	Yes
`POST`	`/v1/playground`	Test extraction (rate limited)	No
`GET`	`/health`	Health check	No
`GET`	`/`	API information	No

Request Format

All document extraction requests use multipart/form-data encoding:

curl -X POST "https://api.parsefy.io/v1/extract" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F 'output_schema={"type": "object", "properties": {...}}' \
  -F "confidence_threshold=0.85"
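
For readers who prefer Python to curl, the same request can be sketched with only the standard library. This is a rough illustration, not the official SDK: the helper name `build_extract_request` is hypothetical, and the file part is assumed to be a PDF. The function builds the request without sending it, so it can be inspected first.

```python
import json
import uuid
import urllib.request

API_URL = "https://api.parsefy.io/v1/extract"


def build_extract_request(pdf_bytes, filename, schema, api_key, confidence_threshold=0.85):
    """Build (but do not send) a multipart/form-data POST for /v1/extract."""
    boundary = uuid.uuid4().hex
    # Plain text form fields, mirroring the -F flags of the curl command.
    fields = {
        "output_schema": json.dumps(schema),
        "confidence_threshold": str(confidence_threshold),
    }
    parts = []
    for name, value in fields.items():
        parts.append((
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'
        ).encode("utf-8"))
    # The file part carries the raw document bytes (assumed here to be a PDF).
    parts.append((
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/pdf\r\n\r\n'
    ).encode("utf-8") + pdf_bytes + b"\r\n")
    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response.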

Parameters

Parameter	Type	Default	Description
`file`	File	required	The document to process (PDF or DOCX, max 10 MB)
`output_schema`	String	required	JSON Schema, serialized as a string, defining the extraction structure
`confidence_threshold`	Number	`0.85`	Minimum confidence score (0.0 to 1.0) required before results are accepted
`enable_verification`	Boolean	`false`	Enable math verification (totals, subtotals, taxes)
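
Because output_schema travels as a string, the schema is typically built as a native structure and serialized just before the request is assembled. The sketch below assumes a schema for the invoice fields used elsewhere on this page. The schema itself, the `required` list, and the `"true"` string encoding for the boolean flag are illustrative assumptions, not confirmed API behavior.

```python
import json

# Hypothetical schema covering the invoice fields shown in this page's examples.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "subtotal": {"type": "number"},
        "tax": {"type": "number"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}

# Form fields are text, so every value is serialized to a string.
# Assumption: the boolean flag is accepted as the literal string "true".
form_fields = {
    "output_schema": json.dumps(invoice_schema),
    "confidence_threshold": "0.85",
    "enable_verification": "true",
}
```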

Response Format

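Every successful response includes field-level confidence scores and supporting evidence. The example below also shows the verification block, which is returned only when enable_verification is true: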

{
  "object": {
    "invoice_number": "INV-2024-0042",
    "subtotal": 1150.00,
    "tax": 100.00,
    "total": 1250.00,
    "_meta": {
      "confidence_score": 0.94,
      "field_confidence": [
        { "field": "$.invoice_number", "score": 0.98, "reason": "Exact match", "page": 1, "text": "Invoice # INV-2024-0042" },
        { "field": "$.subtotal", "score": 0.95, "reason": "Exact match", "page": 1, "text": "Subtotal: $1,150.00" },
        { "field": "$.tax", "score": 0.95, "reason": "Exact match", "page": 1, "text": "Tax: $100.00" },
        { "field": "$.total", "score": 0.92, "reason": "Formatting ambiguous", "page": 1, "text": "Total: $1,250.00" }
      ],
      "issues": []
    }
  },
  "metadata": {
    "processing_time_ms": 2340,
    "credits": 1,
    "fallback_triggered": false
  },
  "verification": {
    "status": "PASSED",
    "checks_passed": 1,
    "checks_failed": 0,
    "cannot_verify_count": 0,
    "checks_run": [
      {
        "type": "HORIZONTAL_SUM",
        "status": "PASSED",
        "fields": ["total", "subtotal", "tax"],
        "passed": true,
        "delta": 0.0,
        "expected": 1250.00,
        "actual": 1250.00
      }
    ]
  }
}

Response Fields

Field	Type	Description
`object`	object (required)	The extracted data matching your schema structure.
`object._meta`	object	Automatically injected quality metrics and evidence.
`object._meta.confidence_score`	number	Overall extraction confidence from 0.0 to 1.0
`object._meta.field_confidence`	array	Per-field confidence with evidence (field path, score, reason, page, source text)
`object._meta.issues`	array	Array of human-readable issue descriptions
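
One common use of the per-field evidence is to route uncertain fields to manual review. The following is a minimal sketch, assuming the response has already been decoded into a dictionary. The helper name `low_confidence_fields` and the review threshold are hypothetical, and the sample data is trimmed from the example response above.

```python
def low_confidence_fields(response, threshold=0.85):
    """Return (field path, score, reason) for entries scoring below threshold."""
    meta = response["object"]["_meta"]
    return [
        (entry["field"], entry["score"], entry["reason"])
        for entry in meta["field_confidence"]
        if entry["score"] < threshold
    ]


# Trimmed copy of the example response shown earlier on this page.
sample = {
    "object": {
        "_meta": {
            "confidence_score": 0.94,
            "field_confidence": [
                {"field": "$.invoice_number", "score": 0.98, "reason": "Exact match"},
                {"field": "$.subtotal", "score": 0.95, "reason": "Exact match"},
                {"field": "$.tax", "score": 0.95, "reason": "Exact match"},
                {"field": "$.total", "score": 0.92, "reason": "Formatting ambiguous"},
            ],
            "issues": [],
        }
    }
}

print(low_confidence_fields(sample, threshold=0.95))
# → [('$.total', 0.92, 'Formatting ambiguous')]
```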

Field	Type	Description
`metadata`	object (required)	Processing information for the request.
`metadata.processing_time_ms`	integer	Total processing time in milliseconds
`metadata.credits`	integer	Credits consumed (~1 per page, more if fallback triggered)
`metadata.fallback_triggered`	boolean	Whether the Tier 2 fallback model was used
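
When processing documents in bulk, the metadata block can be aggregated to track credit consumption and how often the fallback model ran. This is a small sketch under the assumption that each response has been decoded into a dictionary; `summarize_usage` is a hypothetical helper name.

```python
def summarize_usage(responses):
    """Aggregate the metadata blocks of several extraction responses."""
    metas = [r["metadata"] for r in responses]
    return {
        "requests": len(metas),
        "total_credits": sum(m["credits"] for m in metas),
        "total_time_ms": sum(m["processing_time_ms"] for m in metas),
        # Count how many requests fell back to the Tier 2 model.
        "fallbacks": sum(1 for m in metas if m["fallback_triggered"]),
    }
```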

Field	Type	Description
`verification`	object	Math verification results (only present if enable_verification was true).
`verification.status`	string	Overall status: PASSED, FAILED, PARTIAL, CANNOT_VERIFY, or NO_RULES
`verification.checks_passed`	integer	Number of verification checks that passed
`verification.checks_failed`	integer	Number of verification checks that failed
`verification.cannot_verify_count`	integer	Number of checks that could not be verified
`verification.checks_run`	array	Detailed results for each verification check
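
A caller will usually want to gate downstream processing on the verification outcome. The sketch below is one possible approach, not prescribed behavior: `review_verification` is a hypothetical helper, and it assumes that failing entries in checks_run carry the same keys as the passing entry in the example response.

```python
def review_verification(response):
    """Return (ok, problems) for a response's optional verification block."""
    verification = response.get("verification")
    if verification is None:
        # The block is absent unless enable_verification was true.
        return False, ["verification was not requested"]
    problems = [
        f'{check["type"]} on {", ".join(check["fields"])}: '
        f'expected {check.get("expected")}, got {check.get("actual")}'
        for check in verification["checks_run"]
        if not check["passed"]
    ]
    # Only an overall PASSED status with no failing checks counts as ok.
    return verification["status"] == "PASSED" and not problems, problems
```

Statuses other than PASSED (FAILED, PARTIAL, CANNOT_VERIFY, NO_RULES) are all treated as not ok here, which is a conservative choice; an integration may reasonably handle NO_RULES differently.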

Content Types

Request

Content-Type: multipart/form-data

Response

Content-Type: application/json

Supported File Types

Format	Extension	Max Size	Processing
PDF	`.pdf`	10 MB	Native multimodal AI
Microsoft Word	`.docx`	10 MB	Markdown conversion
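
The limits in the table above can be checked client-side before uploading, which avoids a round trip for files that would be rejected. This is a minimal sketch: `check_upload` is a hypothetical helper, and it assumes "10 MB" means 10 × 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import Path

# Assumption: "10 MB" is interpreted as 10 MiB; the exact byte limit is not documented here.
MAX_BYTES = 10 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".pdf", ".docx"}


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```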

SDKs

We provide official SDKs for popular languages:

Python

Use Pydantic models for type-safe extraction

Node.js

Use Zod schemas with full TypeScript support

Quick Links

Authentication

API key setup and usage

Rate Limits

Request and credit limits

Errors

Error codes and handling
