All errors follow a consistent JSON format:
{
"detail": "Human-readable error message"
}
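Because every error body has the same shape, a client can read the message with a single helper. The following is a minimal sketch, assuming the raw response body is available as a string; `error_detail` is a hypothetical helper name and is not part of the Parsefy SDK.

```python
import json


def error_detail(body: str) -> str:
    """Return the "detail" message from an error body, or a fallback string."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all (for example, an HTML error page from a proxy).
        return body.strip() or "Unknown error"
    if isinstance(payload, dict) and isinstance(payload.get("detail"), str):
        return payload["detail"]
    return "Unknown error"


print(error_detail('{"detail": "Invalid or missing API key"}'))
```

The fallback branches are there so that logging code never fails while trying to report a failure.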
HTTP Status Codes
| Code | Name | Description |
|---|---|---|
| 200 | OK | Request succeeded |
| 400 | Bad Request | Invalid request (file type, schema, or empty file) |
| 401 | Unauthorized | Invalid or missing API key |
| 403 | Forbidden | Origin not allowed (playground) |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Server Error | Extraction or processing failed |
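One way to act on the status codes above is to sort them into three outcomes: success, worth retrying, or needs a change to the request. This is a sketch under assumptions: `classify_status` and its return labels are hypothetical, and treating every 5xx code as retryable is a design choice rather than documented API behavior.

```python
def classify_status(status_code: int) -> str:
    """Map an HTTP status code to a coarse handling strategy."""
    if status_code == 200:
        return "ok"
    if status_code == 429 or status_code >= 500:
        # Rate limits and server errors may clear up on their own.
        return "retry"
    # 400, 401 and 403 will fail again until the request itself changes.
    return "fix_request"


for code in (200, 400, 401, 403, 429, 500):
    print(code, classify_status(code))
```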
Error Types
400 Bad Request
Returned when the request is malformed or contains invalid data.
{
"detail": "Invalid file type. Supported formats: PDF, DOCX"
}
Solution: Ensure you’re uploading a .pdf or .docx file.
{
"detail": "File size exceeds maximum limit of 10MB"
}
Solution: Compress or split your document to be under 10MB.
{
"detail": "File is empty or cannot be read"
}
Solution: Ensure the file has content and is not corrupted.
{
"detail": "Invalid JSON schema: Expecting property name enclosed in double quotes"
}
Solution: Validate your JSON schema syntax. Use a JSON validator.
{
"detail": "No file provided in request"
}
Solution: Include the file field in your multipart form data.
{
"detail": "No output_schema provided in request"
}
Solution: Include the output_schema field in your request.
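Schema syntax errors can be caught locally before the request is sent. The sketch below uses only the Python standard library and checks JSON syntax, not whether the schema is meaningful to the API; `check_schema_syntax` is a hypothetical helper name.

```python
import json
from typing import Optional


def check_schema_syntax(schema_text: str) -> Optional[str]:
    """Return None if the text is valid JSON, otherwise a short error message."""
    try:
        json.loads(schema_text)
    except json.JSONDecodeError as exc:
        # lineno and colno point at the first character the parser rejected.
        return f"{exc.msg} (line {exc.lineno}, column {exc.colno})"
    return None


print(check_schema_syntax('{"total": {"type": "number"}}'))
print(check_schema_syntax("{total: 1}"))
```

The first call passes and the second is rejected because the property name is not enclosed in double quotes.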
401 Unauthorized
Authentication failed.
{
"detail": "Invalid or missing API key"
}
Common causes:
- Missing Authorization header
- Incorrect API key
- Typo in the Bearer token format
- Using an expired or revoked key
Solution: Verify your API key and ensure the header format is Authorization: Bearer pk_your_key.
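Several of these causes come down to how the header string is assembled. As an illustration, the hypothetical helper below (not part of the SDK) builds the header from a raw key and rejects two common mistakes: an empty key and a key that already includes the Bearer prefix.

```python
def build_auth_header(api_key: str) -> dict:
    """Build the Authorization header from a raw API key."""
    key = api_key.strip()  # stray whitespace or newlines often come from env files
    if not key:
        raise ValueError("API key is empty")
    if key.lower().startswith("bearer "):
        raise ValueError("Pass the raw key, not the full header value")
    return {"Authorization": f"Bearer {key}"}


print(build_auth_header("pk_your_key"))
```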
403 Forbidden
Origin not allowed (playground endpoint only).
{
"detail": "Origin not allowed"
}
Solution: The playground is only accessible through parsefy.io. For production use, join the waitlist to get an API key and use /v1/extract.
429 Too Many Requests
Rate limit exceeded.
{
"detail": "Rate limit exceeded. Please retry after 1 second."
}
Or for credit limits:
{
"detail": "Daily credit limit exceeded. Resets at midnight UTC."
}
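Solution: For rate limits, implement retry logic with exponential backoff. For the daily credit limit, retrying will not succeed until credits reset at midnight UTC. See Rate Limits.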
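A sketch of the delay calculation for exponential backoff follows. The base delay of 1 second matches the "retry after 1 second" message above; the 30-second cap, the jitter amount, and both function names are assumptions made for illustration. Detecting the credit limit by message text is fragile and assumes the wording shown above stays stable.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  jitter: bool = True) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # Add up to 25% extra so that many clients do not retry in lockstep.
        delay += random.uniform(0, delay * 0.25)
    return delay


def is_daily_credit_limit(detail: str) -> bool:
    """True when the 429 is a credit limit, where short retries cannot help."""
    return "daily credit limit" in detail.lower()


print([backoff_delay(n, jitter=False) for n in range(6)])
```

Without jitter the delays double from 1 second until they reach the cap.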
500 Server Error
An unexpected error occurred during processing.
{
"detail": "Extraction failed: Unable to process document"
}
Common causes:
- Corrupted document
- Document format not actually PDF/DOCX
- Encrypted or password-protected file
- Temporary service issues
Solution: Verify your document is valid and try again. If the issue persists, contact support.
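A file whose extension says .pdf or .docx may still contain something else. One quick local check is to look at the file signature: PDF files begin with `%PDF-`, and DOCX files are ZIP archives that begin with `PK\x03\x04`. The helper names below are hypothetical, and a ZIP signature is only consistent with DOCX rather than proof of it.

```python
from typing import Optional

PDF_MAGIC = b"%PDF-"
ZIP_MAGIC = b"PK\x03\x04"  # DOCX is a ZIP container, so it starts like any ZIP


def sniff_format(header: bytes) -> Optional[str]:
    """Guess the container format from the first bytes of a file."""
    if header.startswith(PDF_MAGIC):
        return "pdf"
    if header.startswith(ZIP_MAGIC):
        return "zip"  # consistent with DOCX, but not proof of it
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read just enough of the file to check its signature."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(8))


print(sniff_format(b"%PDF-1.7\n"))
```

This check does not detect encryption or corruption further into the file, so it narrows down the causes listed above rather than ruling them all out.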
SDK Error Handling
Python
from parsefy import Parsefy, APIError, ValidationError, ParsefyError

client = Parsefy()

try:
    # Invoice is your schema, defined elsewhere
    result = client.extract(file="document.pdf", schema=Invoice)

    # Check for extraction-level errors
    if result.error is not None:
        print(f"Extraction error: [{result.error.code}] {result.error.message}")
    else:
        # Success
        print(result.data)
except ValidationError as e:
    # Client-side validation failed
    # (file not found, wrong file type, missing API key, etc.)
    print(f"Validation error: {e.message}")
except APIError as e:
    # HTTP error from API (401, 429, 500, etc.)
    print(f"API error {e.status_code}: {e.message}")
except ParsefyError as e:
    # Other Parsefy-related errors
    print(f"Parsefy error: {e.message}")
JavaScript/TypeScript
import { Parsefy, APIError, ValidationError, ParsefyError } from 'parsefy';

const client = new Parsefy();

try {
  // schema is your extraction schema, defined elsewhere
  const { object, error, metadata } = await client.extract({
    file: './document.pdf',
    schema,
  });

  // Check for extraction-level errors
  if (error) {
    console.error(`Extraction error: [${error.code}] ${error.message}`);
    console.log(`Tokens used: ${metadata.inputTokens} in, ${metadata.outputTokens} out`);
  } else {
    // Success
    console.log(object);
  }
} catch (err) {
  if (err instanceof APIError) {
    // HTTP error from API (401, 429, 500, etc.)
    console.error(`API Error ${err.statusCode}: ${err.message}`);
  } else if (err instanceof ValidationError) {
    // Client-side validation failed
    console.error(`Validation Error: ${err.message}`);
  } else if (err instanceof ParsefyError) {
    // Other Parsefy-related errors
    console.error(`Parsefy Error: ${err.message}`);
  }
}
Even when the HTTP request succeeds (200), the extraction itself might fail. These errors are returned in the response:
{
  "object": null,
  "metadata": {
    "processing_time_ms": 1500,
    "input_tokens": 500,
    "output_tokens": 0,
    "credits": 1,
    "fallback_triggered": false
  },
  "error": {
    "code": "EXTRACTION_FAILED",
    "message": "Unable to extract data from document"
  }
}
Error Codes
| Code | Description |
|---|---|
| EXTRACTION_FAILED | General extraction failure |
| LLM_ERROR | AI model error during processing |
| PARSING_ERROR | Failed to parse AI response into schema |
| TIMEOUT_ERROR | Extraction timed out |
When extraction fails, you’re still charged credits for the processing attempt. The metadata field shows resource usage.
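For clients that call the HTTP API directly rather than through an SDK, the extraction-level error has to be read from the 200 response body. The sketch below assumes the response shape shown above; `summarize_response` is a hypothetical helper, and treating only `TIMEOUT_ERROR` as retryable is an assumption, not documented behavior.

```python
import json

RETRYABLE_CODES = {"TIMEOUT_ERROR"}  # assumption: other codes fail the same way again


def summarize_response(body: str) -> dict:
    """Reduce a 200 response body to the fields needed for error handling."""
    payload = json.loads(body)
    error = payload.get("error")
    metadata = payload.get("metadata") or {}
    return {
        "ok": error is None,
        "code": error["code"] if error else None,
        "retryable": bool(error) and error["code"] in RETRYABLE_CODES,
        # Credits are reported even when extraction fails.
        "credits": metadata.get("credits", 0),
    }


failed = '{"object": null, "metadata": {"credits": 1}, "error": {"code": "EXTRACTION_FAILED", "message": "Unable to extract data from document"}}'
print(summarize_response(failed))
```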
Best Practices
Always Check Errors
Check both HTTP errors (try/catch) and extraction errors (response.error).
Implement Retries
Use exponential backoff for 429 and 500 errors.
Log Errors
Log error details for debugging. Include the document name and schema.
Validate Inputs
Check file type and size before sending to avoid 400 errors.
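The input check can be sketched with the standard library alone. The allowed extensions and the 10MB limit come from the 400 error messages on this page; reading "10MB" as 10,000,000 bytes is an assumption (the stricter interpretation), and the function names are hypothetical.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".pdf", ".docx"}
MAX_BYTES = 10 * 1000 * 1000  # assumption: "10MB" read as 10,000,000 bytes


def precheck(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks sendable."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("Unsupported file type (expected .pdf or .docx)")
    if size_bytes == 0:
        problems.append("File is empty")
    elif size_bytes > MAX_BYTES:
        problems.append("File is larger than 10MB")
    return problems


def precheck_file(path: str) -> list:
    """Run the same checks against a file on disk."""
    file = Path(path)
    if not file.is_file():
        return [f"File not found: {path}"]
    return precheck(file.name, file.stat().st_size)


print(precheck("report.txt", 0))
```

Returning all problems at once, rather than raising on the first, lets a caller show the user everything that needs fixing in one pass.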
Comprehensive Error Handling Example
import time
from parsefy import Parsefy, APIError, ValidationError

def extract_with_handling(file_path, schema, max_retries=3):
    client = Parsefy()

    for attempt in range(max_retries):
        try:
            result = client.extract(file=file_path, schema=schema)

            if result.error is not None:
                # Extraction failed but request succeeded
                if result.error.code == "TIMEOUT_ERROR":
                    # Might succeed on retry
                    continue
                # Non-retryable extraction error
                raise RuntimeError(f"Extraction failed: {result.error.message}")

            # Success
            return result.data

        except APIError as e:
            if e.status_code == 429 or e.status_code >= 500:
                # Rate limited or server error - back off exponentially, then retry
                time.sleep(2 ** attempt)
                continue
            # Client error - don't retry
            raise
        except ValidationError:
            # Invalid input - don't retry
            raise

    raise RuntimeError(f"Failed after {max_retries} attempts")