Parsefy provides field-level confidence scoring with evidence tracking. Every extracted field comes with:
A confidence score (0.0 to 1.0)
The source text evidence
The page number where it was found
A reason explaining the score
Our goal: 0% silent errors. If a required field can’t be extracted with sufficient confidence, the API either triggers a fallback model or fails with clear reasons. It never returns unreliable data silently.
Lower confidence_threshold = faster and cheaper (accepts Tier 1 more often).
Higher confidence_threshold = more accurate but more expensive (triggers Tier 2 fallback more often).
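To make the trade-off concrete, the sketch below estimates how often the fallback would fire at different thresholds. It is illustrative only: `fallback_rate` and the sample scores are hypothetical and not part of the Parsefy API. It assumes you have recorded, for a set of your own documents, the lowest Tier 1 confidence among the required fields (a null required field can be recorded as 0.0).

```python
from typing import Sequence


def fallback_rate(min_required_scores: Sequence[float], confidence_threshold: float) -> float:
    """Fraction of documents whose weakest required field is below the threshold."""
    if not min_required_scores:
        return 0.0
    triggered = sum(1 for score in min_required_scores if score < confidence_threshold)
    return triggered / len(min_required_scores)


# Hypothetical data: lowest required-field confidence per document from Tier 1.
sample_scores = [0.95, 0.91, 0.88, 0.82, 0.79, 0.97, 0.86, 0.93]

for threshold in (0.80, 0.85, 0.90):
    rate = fallback_rate(sample_scores, threshold)
    print(f"confidence_threshold={threshold:.2f}: fallback on {rate:.1%} of documents")
```

Running this kind of estimate on a representative sample before choosing a threshold gives a rough idea of how much Tier 2 usage to expect.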
Parsefy uses a two-tier model architecture for reliability:
1. Tier 1 Extraction: Your document is first processed by a fast, efficient model.
2. Confidence Check: If any required field returns null or falls below confidence_threshold, the extraction is automatically re-run.
3. Tier 2 Fallback: A more powerful (and more expensive) model processes the document for improved accuracy.
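The three steps above can be sketched as plain control flow. This is a simplified model of the behavior described here, not Parsefy's actual implementation: `FieldResult`, `needs_fallback`, `extract_two_tier`, and the two stand-in model functions are hypothetical names, and the 0.85 default is only an example value.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class FieldResult:
    value: Optional[object]  # None models a field that came back null
    score: float             # confidence between 0.0 and 1.0


Extraction = Dict[str, FieldResult]


def needs_fallback(extraction: Extraction, required: Set[str], confidence_threshold: float) -> bool:
    """True if any required field is missing, null, or below the threshold."""
    for name in required:
        field = extraction.get(name)
        if field is None or field.value is None or field.score < confidence_threshold:
            return True
    return False


def extract_two_tier(
    document: str,
    tier1: Callable[[str], Extraction],
    tier2: Callable[[str], Extraction],
    required: Set[str],
    confidence_threshold: float = 0.85,  # example value, not a documented default
) -> Tuple[Extraction, bool]:
    """Run Tier 1, then re-run with Tier 2 only when the confidence check fails."""
    first_pass = tier1(document)
    if not needs_fallback(first_pass, required, confidence_threshold):
        return first_pass, False
    return tier2(document), True


# Stand-in models for illustration only.
def fast_model(document: str) -> Extraction:
    return {"invoice_number": FieldResult("INV-1", 0.97), "total": FieldResult(120.0, 0.62)}


def strong_model(document: str) -> Extraction:
    return {"invoice_number": FieldResult("INV-1", 0.99), "total": FieldResult(120.0, 0.96)}


result, fallback_triggered = extract_two_tier(
    "invoice.pdf", fast_model, strong_model, required={"invoice_number", "total"}
)
print(fallback_triggered, result["total"].score)
```

Only required fields take part in the check, which is why the choice of required versus optional fields affects how often the more expensive tier runs.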
Important: If a required field can’t be extracted with sufficient confidence, it triggers the fallback model. This directly affects billing, because the fallback model is more expensive. See the section on Required vs Optional Fields.
The metadata.fallback_triggered field tells you whether the fallback was used.
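A minimal sketch of reading that flag, using a stand-in response object rather than a real API call. The helper `used_fallback` is hypothetical, and the attribute path follows the field name given above; the Python SDK may expose the same data under a different attribute (the extraction example in this page reads `result.meta`), so adjust accordingly.

```python
from types import SimpleNamespace


def used_fallback(result) -> bool:
    """Return True if the response metadata reports that Tier 2 was used."""
    metadata = getattr(result, "metadata", None)
    return bool(getattr(metadata, "fallback_triggered", False))


# Stand-in for an API response, for illustration only.
result = SimpleNamespace(metadata=SimpleNamespace(fallback_triggered=True))

if used_fallback(result):
    print("Tier 2 fallback was used for this document")
else:
    print("Tier 1 result was accepted")
```

Logging this flag per document is one way to track how often the more expensive tier is being used.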
If a required field returns null or falls below the confidence_threshold, the API triggers the fallback model (Tier 2), which is significantly more expensive.
Mark fields as optional if they might be missing in >20% of your documents:
```typescript
import { z } from 'zod';

const schema = z.object({
  // REQUIRED - Always present on invoices, keep required
  invoice_number: z.string().describe('The invoice number'),
  total: z.number().describe('Total amount including tax'),

  // OPTIONAL - May not appear on all documents, mark optional!
  vendor: z.string().optional().describe('Vendor name'),         // Not all invoices have vendor name
  tax_id: z.string().optional().describe('Tax ID number'),       // Rarely present
  notes: z.string().optional().describe('Additional notes'),     // Usually empty
  due_date: z.string().optional().describe('Payment due date'),  // Sometimes missing
});
```
Rule of thumb: If a field might be missing in >20% of your documents, mark it as optional.
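One way to apply the rule of thumb is to measure how often each field is actually missing in a labeled sample of your documents. The sketch below is illustrative only: `missing_rates`, `suggest_optional`, and the sample data are hypothetical and not part of Parsefy. It assumes each sample is a dict in which an absent or `None` value means the field does not appear in the document.

```python
from typing import Dict, List, Optional, Sequence


def missing_rates(samples: Sequence[Dict[str, Optional[object]]], fields: Sequence[str]) -> Dict[str, float]:
    """Fraction of sample documents in which each field is absent or None."""
    total = len(samples)
    if total == 0:
        return {name: 0.0 for name in fields}
    return {
        name: sum(1 for doc in samples if doc.get(name) is None) / total
        for name in fields
    }


def suggest_optional(
    samples: Sequence[Dict[str, Optional[object]]],
    fields: Sequence[str],
    max_missing_rate: float = 0.20,  # the >20% rule of thumb
) -> List[str]:
    """Fields missing in more than max_missing_rate of the samples."""
    rates = missing_rates(samples, fields)
    return [name for name in fields if rates[name] > max_missing_rate]


# Hypothetical hand-labeled sample of five invoices.
samples = [
    {"invoice_number": "INV-1", "total": 100.0, "vendor": "Acme", "tax_id": None},
    {"invoice_number": "INV-2", "total": 250.0, "vendor": None, "tax_id": None},
    {"invoice_number": "INV-3", "total": 75.5, "vendor": "Globex", "tax_id": "DE123"},
    {"invoice_number": "INV-4", "total": 12.0, "vendor": None, "tax_id": None},
    {"invoice_number": "INV-5", "total": 990.0, "vendor": "Initech"},
]
fields = ["invoice_number", "total", "vendor", "tax_id"]

print(suggest_optional(samples, fields))
```

Fields returned by `suggest_optional` are candidates for `.optional()` in the schema, so that their absence does not trigger the Tier 2 fallback.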
```python
# Assumes `client` is an initialized Parsefy client and `Invoice` is your
# schema model, both defined earlier.
result = client.extract(
    file="document.pdf",
    schema=Invoice,
    confidence_threshold=0.85,
    enable_verification=True,  # Enable math verification
)

if result.error is None:
    # Overall confidence from meta
    if result.meta:
        print(f"Overall confidence: {result.meta.confidence_score}")

        # Check individual field confidence
        for fc in result.meta.field_confidence:
            print(f"{fc.field}: {fc.score} ({fc.reason}) - '{fc.text}'")
            if fc.score < 0.80:
                print(f"  Low confidence on {fc.field}")

        # Check for issues
        if result.meta.issues:
            print("Issues:", result.meta.issues)

    # Check verification results
    if result.verification:
        print(f"Verification: {result.verification.status}")
        for check in result.verification.checks_run:
            print(f"{check.type}: {'PASSED' if check.passed else 'FAILED'}")
```