Base URL

All API requests should be made to:
https://api.parsefy.io

Overview

Parsefy is an API that turns financial documents (invoices, receipts, bills) into structured JSON with validation and risk signals.

Endpoints

Method  Endpoint        Description                     Auth Required
POST    /v1/extract     Extract data from documents     Yes
POST    /v1/playground  Test extraction (rate limited)  No
GET     /health         Health check                    No
GET     /               API information                 No

Request Format

All document extraction requests use multipart/form-data encoding:
curl -X POST "https://api.parsefy.io/v1/extract" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F 'output_schema={"type": "object", "properties": {...}}' \
  -F "confidence_threshold=0.85"
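
For callers not using curl, the same request can be assembled by hand. The sketch below builds the multipart/form-data body with the Python standard library and returns a request object without sending it. The function name `build_extract_request` and the `application/octet-stream` content type on the file part are assumptions made for illustration; only the URL, header, and field names come from the curl example above.

```python
import json
import uuid
import urllib.request

BASE_URL = "https://api.parsefy.io"  # base URL from this page


def build_extract_request(file_bytes, filename, output_schema, api_key,
                          confidence_threshold=0.85):
    """Build (but do not send) a multipart/form-data POST to /v1/extract."""
    boundary = uuid.uuid4().hex  # random boundary separating the form parts
    text_fields = {
        # output_schema travels as a JSON string, as in the curl example
        "output_schema": json.dumps(output_schema),
        "confidence_threshold": str(confidence_threshold),
    }

    parts = []
    for name, value in text_fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode("utf-8"))

    # File part: the generic content type here is an assumption.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing boundary

    return urllib.request.Request(
        BASE_URL + "/v1/extract",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the body easy to inspect before any credits are spent.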

Parameters

Parameter             Type     Default   Description
file                  File     required  The document to process (PDF or DOCX, max 10 MB)
output_schema         String   required  JSON Schema defining the extraction structure
confidence_threshold  Number   0.85      Minimum confidence before accepting results
enable_verification   Boolean  false     Enable math verification (totals, subtotals, taxes)

Response Format

All successful responses include field-level confidence and evidence:
{
  "object": {
    "invoice_number": "INV-2024-0042",
    "subtotal": 1150.00,
    "tax": 100.00,
    "total": 1250.00,
    "_meta": {
      "confidence_score": 0.94,
      "field_confidence": [
        { "field": "$.invoice_number", "score": 0.98, "reason": "Exact match", "page": 1, "text": "Invoice # INV-2024-0042" },
        { "field": "$.subtotal", "score": 0.95, "reason": "Exact match", "page": 1, "text": "Subtotal: $1,150.00" },
        { "field": "$.tax", "score": 0.95, "reason": "Exact match", "page": 1, "text": "Tax: $100.00" },
        { "field": "$.total", "score": 0.92, "reason": "Formatting ambiguous", "page": 1, "text": "Total: $1,250.00" }
      ],
      "issues": []
    }
  },
  "metadata": {
    "processing_time_ms": 2340,
    "credits": 1,
    "fallback_triggered": false
  },
  "verification": {
    "status": "PASSED",
    "checks_passed": 1,
    "checks_failed": 0,
    "cannot_verify_count": 0,
    "checks_run": [
      {
        "type": "HORIZONTAL_SUM",
        "status": "PASSED",
        "fields": ["total", "subtotal", "tax"],
        "passed": true,
        "delta": 0.0,
        "expected": 1250.00,
        "actual": 1250.00
      }
    ]
  }
}
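
One way to use the per-field scores is to flag fields for manual review. The sketch below walks `_meta.field_confidence` in a parsed response and returns the entries that score below a cutoff. The helper name `low_confidence_fields` and the client-side cutoff are illustrative assumptions. The sample dict is an abbreviated copy of the response above.

```python
def low_confidence_fields(response, threshold):
    """Return (field, score, reason) for each field scored below threshold."""
    meta = response["object"]["_meta"]
    return [
        (entry["field"], entry["score"], entry["reason"])
        for entry in meta["field_confidence"]
        if entry["score"] < threshold
    ]


# Abbreviated copy of the example response shown above.
sample = {
    "object": {
        "subtotal": 1150.00,
        "total": 1250.00,
        "_meta": {
            "confidence_score": 0.94,
            "field_confidence": [
                {"field": "$.subtotal", "score": 0.95, "reason": "Exact match"},
                {"field": "$.total", "score": 0.92, "reason": "Formatting ambiguous"},
            ],
            "issues": [],
        },
    }
}

for field, score, reason in low_confidence_fields(sample, threshold=0.95):
    print(f"{field}: {score} ({reason})")
# prints "$.total: 0.92 (Formatting ambiguous)"
```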

Response Fields

object (object, required)
  The extracted data matching your schema structure.
metadata (object, required)
  Processing information for the request.
verification (object)
  Math verification results (only present if enable_verification was true).
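
Because `verification` is optional, client code should not index into it unconditionally. The sketch below shows one defensive way to read it, plus a local recheck of the totals. Both helper names are hypothetical. The rule that subtotal plus tax should equal total is inferred from the numbers in the example response and is not a documented definition of HORIZONTAL_SUM.

```python
def failed_checks(response):
    """Return the failed verification checks, or None if verification is absent."""
    verification = response.get("verification")
    if verification is None:
        # enable_verification was not set, so there is nothing to report
        return None
    return [
        check for check in verification.get("checks_run", [])
        if not check.get("passed", False)
    ]


def sum_matches(extracted, tolerance=0.005):
    """Local recheck: subtotal + tax should equal total (assumed rule)."""
    expected = extracted["subtotal"] + extracted["tax"]
    return abs(expected - extracted["total"]) <= tolerance
```

Returning `None` when the key is missing keeps "not verified" distinct from "verified with no failures", which is an empty list.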

Content Types

Request

  • Content-Type: multipart/form-data

Response

  • Content-Type: application/json

Supported File Types

Format          Extension  Max Size  Processing
PDF             .pdf       10 MB     Native multimodal AI
Microsoft Word  .docx      10 MB     Markdown conversion
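
The limits in this table can be checked before uploading, so that an oversized or unsupported file does not cost a round trip. This is a minimal client-side sketch. The helper name `check_upload` is hypothetical, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption, since the page does not say which definition of megabyte applies.

```python
import os

MAX_BYTES = 10 * 1024 * 1024  # "10 MB" read as 10 MiB; an assumption
ALLOWED_EXTENSIONS = {".pdf", ".docx"}  # from the table above


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For a file on disk, `check_upload(path, os.path.getsize(path))` supplies both arguments. The check looks only at the file name and size, not at the file's actual contents.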

SDKs

We provide official SDKs for popular languages: