Get Parse Result
Check the status and retrieve results of parsing jobs
Overview
The Get Parse Job Status endpoint allows you to check the current status of parsing jobs and retrieve the complete results when processing is complete. This endpoint is specifically designed for the parsing API and returns comprehensive document analysis including text extraction, image recognition, table parsing, and OCR data.
Parameters
Job ID returned by POST /parse.
[...] Defaults to false.
Include the chunks array in the response. Defaults to true.
[...] Defaults to false.
Include file URLs (pdf_url, file_url, output_file_url, segment image, configuration.input_file_url) in the response. Defaults to false, in which case these fields are returned as null so the response (and any log of it) does not expose the storage bucket, region, or path. exports URLs are always returned regardless of this setting. Can also be set via the include-url header.
[...] keep_segment_types). Omit to include every type.
[...] merge_tables. No merging happens at read time; the job’s own setting takes precedence, so passing this for a job created without merge_tables has no effect. Defaults to false.
Response
Current job status: Starting, Queued, Processing, Succeeded, Failed, or Cancelled. Jobs created through the v2 presigned-upload flow can also report AwaitingUpload before the file is uploaded.
Citation or job metadata. Populated when xml_citation is enabled or from the job record.
ISO 8601 timestamp when processing started. null until a worker picks the job up (Starting and Queued).
ISO 8601 timestamp when processing completed. Present when status is Succeeded or Failed.
[...] 0 until status is Succeeded.
[...] 0 until processing begins; populated while status is Processing.
Array of document chunks with segments and extracted content. Present when status is Succeeded (chunk content is only downloaded for succeeded jobs).
[...] null unless include_url=true is set.
Presigned download URLs for exported file formats. Only present when export_format was specified in the parse request and the export has completed. Keys are format names (e.g. "docx"), values are presigned S3 URLs valid for 1 hour. If export failed, contains {"docx_error": "..."} instead. Always returned with URLs, regardless of include_url.
[...] null unless include_url=true is set.
Error or status detail message. Present when status is Failed.
[...] output_file=true the parsed content is replaced by a result object carrying output_file_url (a presigned URL to the raw output JSON) instead of inline chunks; and a merge_tables job whose merged result is already available may return the content nested under an output key with a task_id instead of the standard top-level fields.
Job Status Values
Starting
AwaitingUpload
Queued
Processing
Succeeded
Failed
Cancelled
Polling Strategy
For long-running parsing jobs, implement a polling strategy to check status periodically.
Segment Types
When a job succeeds, the response includes detailed analysis of different document segments:
- Title: Top-level document titles, distinct from section headers.
- SectionHeader: Document headers and titles that define section boundaries.
- Text: Regular text content including paragraphs, sentences, and individual text elements.
- ListItem: Individual items within ordered or unordered lists.
- Table: Tabular data with structured rows and columns.
- Picture: Images and graphics within the document, including logos, charts, and illustrations.
- Caption: Text captions associated with images or figures.
- Formula: Mathematical or chemical formulas detected within the document.
- Footnote: Footnote text appearing at the bottom of a page.
- PageHeader: Recurring header content appearing at the top of pages.
- PageFooter: Recurring footer content appearing at the bottom of pages.
- Page: A full-page segment when the document is processed without fine-grained layout analysis.
Each segment includes:
- segment_type: Type of content detected
- content: Extracted text content
- image: URL to extracted image (if applicable)
- page_number: Page where the segment appears
- confidence: Confidence score for the extraction
- bbox: Precise coordinates of the segment
- html: HTML-formatted content
- markdown: Markdown-formatted content
- ocr: Detailed OCR data with individual text elements
- chart_data: Structured chart data (chart type, series, labels, legend) for Picture segments identified as charts, when chart extraction is enabled
- cell_references: Spreadsheet cell-range references ({sheet, address, ref}) for Excel segments
- references: Research-paper citations attached to the segment, when xml_citation is enabled
- merged_page_bboxes: Per-page bounding boxes for tables merged across page breaks, when merge_tables is enabled
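The segment fields listed above can be consumed with a few lines of code. The following is a minimal sketch rather than an official client: it assumes the decoded response is a Python dict, that each entry of chunks holds its segments under a `segments` key (a hypothetical name; this page only says that chunks contain segments), and that each segment carries the segment_type and markdown fields described above.

```python
from collections import defaultdict


def group_segments_by_type(job):
    """Group every segment in a decoded job response by its segment_type."""
    grouped = defaultdict(list)
    for chunk in job.get("chunks") or []:
        # "segments" is an assumed key name for the per-chunk segment list.
        for segment in chunk.get("segments") or []:
            grouped[segment.get("segment_type")].append(segment)
    return dict(grouped)


def tables_as_markdown(job):
    """Return the markdown of every Table segment that has markdown content."""
    tables = group_segments_by_type(job).get("Table", [])
    return [segment["markdown"] for segment in tables if segment.get("markdown")]
```

Because chunks is only populated once status is Succeeded, both helpers return empty results for jobs that are still running.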
Error Handling
Common Error Scenarios
- Job Not Found: Invalid or expired job ID returns a 404 response.
- Unauthorized: Missing or invalid API key returns a 401 response.
- Forbidden: Valid API key but no permission to access this task returns a 403 response.
- Rate Limiting: This GET endpoint is not rate limited by the application; only the submit endpoints are. Poll responsibly regardless.
- Client-Side Polling Timeout: The job did not complete within the time your polling logic allows. This is not a server-returned error; implement a reasonable client-side timeout and handle it gracefully.
- Server Error: Internal processing error returns a 500 response.
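One way to act on these scenarios is to translate the HTTP status code into a single exception type. This is a sketch under stated assumptions: `ParseApiError` and `error_for_status` are hypothetical names, and treating only 5xx responses as retryable is an illustrative policy, not documented API behavior.

```python
# Status codes and meanings taken from the list above.
ERROR_MEANINGS = {
    401: "Unauthorized: missing or invalid API key",
    403: "Forbidden: the API key cannot access this task",
    404: "Job Not Found: invalid or expired job ID",
    500: "Server Error: internal processing error",
}


class ParseApiError(Exception):
    """Hypothetical exception carrying the HTTP status of a failed request."""

    def __init__(self, status_code, message):
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code
        # Assumption: only server-side errors are worth retrying.
        self.retryable = status_code >= 500


def error_for_status(status_code):
    """Build a ParseApiError describing the given HTTP status code."""
    message = ERROR_MEANINGS.get(status_code, "Unexpected error")
    return ParseApiError(status_code, message)
```

With the standard library this could be used as `except urllib.error.HTTPError as exc: raise error_for_status(exc.code) from exc`. A client-side polling timeout is not covered here because the server never returns it.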
Best Practices
- Polling Frequency: Check status every 5-10 seconds for long-running jobs
- Timeout Handling: Implement reasonable timeouts to prevent infinite polling
- Error Recovery: Handle failed jobs gracefully with retry logic
- API Key Security: Keep your API key secure and never expose it in client-side code
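The polling and timeout practices above can be sketched as a small loop. This is an illustration, not an official client: `BASE_URL` is a placeholder host, the `Authorization` header name is an assumption (this page only specifies the `Bearer <your_api_key>` value), and `fetch_job` and `poll_job` are hypothetical helper names. The `GET /parse/{job_id}` path and the status names come from this page.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical; substitute the real API host
TERMINAL_STATUSES = {"Succeeded", "Failed", "Cancelled"}


def fetch_job(job_id, api_key):
    """Fetch GET /parse/{job_id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/parse/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},  # header name assumed
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def poll_job(fetch, interval=5.0, timeout=600.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch() every `interval` seconds until the job reaches a terminal status.

    Raises TimeoutError if the client-side deadline would pass before the next check.
    """
    deadline = clock() + timeout
    while True:
        job = fetch()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() + interval > deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout} seconds")
        sleep(interval)
```

A call might look like `job = poll_job(lambda: fetch_job("your-job-id", "your-api-key"))`. Passing the fetch step in as a callable keeps the loop independent of the HTTP library and lets the API key stay in server-side code.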
Rate Limits
- Concurrent Jobs: Limited number of active parsing jobs per API key
- Request Frequency: Avoid excessive polling (recommended: 5-10 second intervals)
Authorizations
API key for authentication. Use 'Bearer <your_api_key>'
Path Parameters
Job ID returned by POST /parse.
Query Parameters
Include the chunks array in the response. Defaults to true.
Return segment images as base64-encoded data URIs instead of S3 presigned URLs.
Defaults to false.
Return a presigned S3 URL to the raw output JSON file instead of inlining the
full response body. Also accepted as the output-file request header.
Defaults to false.
Apply enhanced table post-processing when assembling the response — improves
cell-merge accuracy and structure recovery for complex tables, at the cost
of extra latency. Also accepted as the enhanced-table request header.
Defaults to false.
Apply the cross-page table-merge post-processing pass when assembling the
response. Has no effect unless the job was parsed with merge_tables=true
(the merge work runs at parse time). Defaults to false.
Include file URLs (pdf_url, file_url, segment images, exports,
configuration.input_file_url) in the response. When false (default),
every URL-bearing field is nulled so the response — and any log of it —
does not leak the storage bucket/region/path. Also accepted as the
include-url request header.
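The query parameters above are appended to the request URL. The sketch below shows one way to build that URL. `BASE_URL` and `build_result_url` are hypothetical, only the include_url and output_file names are taken from this page, and serializing booleans as lowercase `true`/`false` is an assumption suggested by the `include_url=true` notation used here.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.example.com"  # hypothetical; substitute the real API host


def build_result_url(job_id, **flags):
    """Build the GET /parse/{job_id} URL, adding only the flags that were set."""
    # Flags left as None are omitted so the server-side defaults apply.
    query = {
        name: str(value).lower()
        for name, value in flags.items()
        if value is not None
    }
    url = f"{BASE_URL}/parse/{quote(job_id, safe='')}"
    return f"{url}?{urlencode(query)}" if query else url
```

For example, `build_result_url(job_id, include_url=True)` requests the URL-bearing fields that are otherwise returned as null.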
Response
Job status and results. Output fields (chunks, total_chunks, page_count, pdf_url) are present only when status is Succeeded.
Response body for GET /parse/{job_id}.
Fields marked as optional appear only when the job has reached the relevant status.
ISO 8601 timestamp when the job was created.
Job identifier.
Citation or job metadata. Populated when xml_citation is enabled or from the job record.
Current job status: Starting, Queued, Processing, Succeeded, Failed, or Cancelled. Jobs created through the v2 presigned-upload flow can also report AwaitingUpload.
Array of document chunks with segments and extracted content. Present when status is Succeeded.
Configuration used for this job (mirrors the parameters submitted at creation time).
The effective merge_tables value lives at configuration.merge_tables.
Credits used for this job.
Presigned download URLs for exported file formats.
Only present when export_format was specified in the parse request and the export has completed.
Keys are format names (e.g. "docx"), values are presigned S3 URLs valid for 1 hour.
Example: {"docx": "https://s3.amazonaws.com/..."}.
If export failed, contains {"docx_error": "..."} instead.
Original file name from the job record.
MIME type of the uploaded file.
S3 URL of the original uploaded file.
ISO 8601 timestamp when processing completed. Present when status is Succeeded or Failed.
Error or status detail message. Present when status is Failed.
Number of pages in the document. Present when status is Succeeded.
Presigned S3 URL to the generated PDF. Present when status is Succeeded.
ISO 8601 timestamp when processing started. Present once a worker has picked the job up, that is, when status is not Starting or Queued.
Total number of document chunks. Present when status is Succeeded.
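The exports field described above mixes download URLs with error entries, so a client may want to separate the two before downloading. This is a minimal sketch, assuming the decoded response is a Python dict and that every failed export uses a key ending in `_error`, as in the `{"docx_error": "..."}` example. `split_exports` is a hypothetical helper name.

```python
def split_exports(job):
    """Separate successful export URLs from export errors.

    Keys ending in "_error" (e.g. "docx_error") are treated as failures and
    reported under the bare format name.
    """
    urls, errors = {}, {}
    # exports is absent unless export_format was requested and the export finished.
    for key, value in (job.get("exports") or {}).items():
        if key.endswith("_error"):
            errors[key[: -len("_error")]] = value
        else:
            urls[key] = value
    return urls, errors
```

Since the presigned URLs are valid for 1 hour, it is safer to download the exported files soon after fetching the response than to store the URLs for later use.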

