This quickstart covers the parsing endpoint and is the fastest way to try Unsiloed AI. If you’d rather start with another capability, see the Extraction quickstart, the Classification guide, or the Splitting guide.
By the end of this guide, you’ll have a working script that uploads a PDF to the /parse endpoint, polls until parsing finishes, and saves the parsed result to disk as both JSON and Markdown. The full script is shown directly below if you’d rather copy it and skip the walkthrough.
Set UNSILOED_API_KEY in your environment and save the document you want to parse as document.pdf in the same directory before running.
parse_document.py
import json
import os
import time
import requests

API_KEY = os.environ["UNSILOED_API_KEY"]
BASE_URL = "https://prod.visionapi.unsiloed.ai"

with open("document.pdf", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/parse",
        headers={"api-key": API_KEY},
        files={"file": ("document.pdf", f, "application/pdf")},
    )
response.raise_for_status()

job_id = response.json()["job_id"]
print(f"Job submitted: {job_id}")

max_attempts = 60  # roughly 5 minutes at 5 seconds per poll
attempts = 0
while True:
    poll = requests.get(
        f"{BASE_URL}/parse/{job_id}",
        headers={"api-key": API_KEY},
    )
    # Error bodies are plain text, so fail on 4xx/5xx before decoding JSON.
    poll.raise_for_status()
    result = poll.json()
    print(f"Status: {result['status']}")
    if result["status"] == "Succeeded":
        break
    if result["status"] == "Failed":
        raise RuntimeError(result.get("message", "parse job failed"))
    attempts += 1
    if attempts >= max_attempts:
        raise TimeoutError("Parse job did not finish within 5 minutes")
    time.sleep(5)

with open("result.json", "w") as f:
    json.dump(result, f, indent=2)

with open("output.md", "w", encoding="utf-8") as f:
    f.write("\n\n".join(chunk["embed"] for chunk in result["chunks"]))

print(f"Saved {result['total_chunks']} chunks to result.json and output.md")

Step 1: Set Up Your Environment

Before writing any code, we need three things: an API key, a document, and a Python runtime.

1.1 Get an Unsiloed AI API Key

To get API access, sign up on Unsiloed AI. Export your key as an environment variable named UNSILOED_API_KEY so it stays out of source control:
export UNSILOED_API_KEY="your-api-key"

1.2 Pick a Document to Parse

The /parse endpoint supports PDF, DOCX, PPTX, JPG, PNG, and other formats. The walkthrough below assumes a PDF saved as document.pdf in your working directory. To use a different format, update the filename and content type in the snippets to match your file. If you don’t have a document handy, download our sample PDF (a one-page Q1 2024 Sales Report) and save it as document.pdf.
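If you switch formats, the content type in the upload tuple has to change along with the filename. A minimal sketch that derives it from the file extension using Python's standard mimetypes module; the helper name guess_content_type and the application/octet-stream fallback are illustrative assumptions, not part of the Unsiloed AI API:

```python
import mimetypes


def guess_content_type(filename: str) -> str:
    """Guess a MIME type from the file extension (hypothetical helper)."""
    content_type, _encoding = mimetypes.guess_type(filename)
    # The fallback is an assumption; the API may expect a more specific type.
    return content_type or "application/octet-stream"


if __name__ == "__main__":
    for name in ("document.pdf", "scan.png", "photo.jpg"):
        print(name, "->", guess_content_type(name))
```

The returned string can take the place of "application/pdf" in the files tuple used in the upload step.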

1.3 Install Dependencies

You need Python 3.8 or newer. Install the requests package:
pip install requests

Step 2: Submit a Document

The /parse endpoint accepts a multipart upload and returns a job_id we can poll for results. All requests go to https://prod.visionapi.unsiloed.ai with the API key in the api-key header.

2.1 Set Up the Script

Create a file called parse_document.py and start with the imports and configuration:
parse_document.py
import json
import os
import time
import requests

API_KEY = os.environ["UNSILOED_API_KEY"]
BASE_URL = "https://prod.visionapi.unsiloed.ai"
API_KEY reads your key from the environment so it doesn’t get hard-coded into the file, and BASE_URL points at the Unsiloed AI production endpoint. We’ll reuse both in every request below.
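Note that os.environ["UNSILOED_API_KEY"] raises a bare KeyError when the variable is missing. A minimal sketch of a friendlier check; load_api_key is a hypothetical helper name, and the env parameter exists only so the function can be exercised without touching the real environment:

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return the API key or raise a readable error (hypothetical helper)."""
    key = env.get("UNSILOED_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "UNSILOED_API_KEY is not set; export it before running the script"
        )
    return key
```

In the script, API_KEY = load_api_key() could replace the direct os.environ lookup.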

2.2 Upload the Document

Send the file as a multipart upload to /parse. The endpoint expects the document under the form field name file.
Add the upload to parse_document.py:
parse_document.py
with open("document.pdf", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/parse",
        headers={"api-key": API_KEY},
        files={"file": ("document.pdf", f, "application/pdf")},
    )
response.raise_for_status()
The raise_for_status() call raises an HTTPError on any 4xx or 5xx response, so we don’t need to check .status_code ourselves.

2.3 Capture the Job ID

Next, read and print the job_id:
parse_document.py
job_id = response.json()["job_id"]
print(f"Job submitted: {job_id}")
Run the script:
python parse_document.py
The output should be a single line like Job submitted: 1699d429-9c2e-464e-b311-d4b68a8444b8.

Step 3: Poll for Results

The job runs asynchronously. We GET /parse/{job_id} repeatedly until the status is Succeeded, then save the parsed output to disk. A status of Succeeded means the result is ready; Failed means the job errored; any other value (Starting, Processing, and so on) means the job is still running.

3.1 Write the Polling Loop

Add a polling loop to the script. The max_attempts cap raises a TimeoutError after 60 polls, so the script doesn’t wait forever on a job that hangs:
parse_document.py
max_attempts = 60  # roughly 5 minutes at 5 seconds per poll
attempts = 0
while True:
    poll = requests.get(
        f"{BASE_URL}/parse/{job_id}",
        headers={"api-key": API_KEY},
    )
    # Error bodies are plain text, so fail on 4xx/5xx before decoding JSON.
    poll.raise_for_status()
    result = poll.json()
    print(f"Status: {result['status']}")
    if result["status"] == "Succeeded":
        break
    if result["status"] == "Failed":
        raise RuntimeError(result.get("message", "parse job failed"))
    attempts += 1
    if attempts >= max_attempts:
        raise TimeoutError("Parse job did not finish within 5 minutes")
    time.sleep(5)
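
The loop above can also be wrapped in a reusable function. A minimal sketch, not an official client: poll_until_done, fetch, and sleep are hypothetical names, and the HTTP call is injected as a callable so the status handling can be tested without network access:

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch: Callable[[], Dict],
    max_attempts: int = 60,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch() until the job reports Succeeded; raise on Failed or timeout."""
    for _ in range(max_attempts):
        result = fetch()
        status = result.get("status")
        if status == "Succeeded":
            return result
        if status == "Failed":
            raise RuntimeError(result.get("message", "parse job failed"))
        # Any other status (Starting, Processing, and so on) means still running.
        sleep(delay_seconds)
    raise TimeoutError(f"Parse job did not finish after {max_attempts} polls")
```

In the script, fetch would be a small function that GETs /parse/{job_id} with the api-key header and returns the decoded JSON.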

3.2 Save the Parsed Output

Persist the result to disk so downstream code can read it. We’ll write two files: result.json (the full response, including job metadata and segment-level layout) and output.md (the concatenated Markdown, suitable for previewing or feeding into a RAG pipeline).
Finally, write the result to disk:
parse_document.py
with open("result.json", "w") as f:
    json.dump(result, f, indent=2)

with open("output.md", "w", encoding="utf-8") as f:
    f.write("\n\n".join(chunk["embed"] for chunk in result["chunks"]))

print(f"Saved {result['total_chunks']} chunks to result.json and output.md")
Run the script:
python parse_document.py
You should see a few Status: Processing lines, then Status: Succeeded, then a final summary line. The two output files appear in the working directory.

Error Responses

Failures fall into two buckets: HTTP errors raised before the job is queued, and a Failed status on a job that started but couldn’t complete.

HTTP Errors

The /parse endpoint returns plain-text bodies on HTTP errors, not JSON. Calling response.json() on one raises a JSON decoding error (a ValueError subclass in requests), so check the status code before parsing. The common cases:
  • 401 Unauthorized: body is Invalid API key: Invalid API key. The api-key header is missing or wrong.
  • 400 Bad Request: body is Content type error. The upload is malformed, usually because the request isn’t sent as multipart form data.
  • 404 Not Found: body is Task not found. The job_id you polled doesn’t exist.
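
Because these bodies are plain text, it helps to surface them in the exception instead of attempting a JSON decode. A minimal sketch, assuming only that the response object exposes .status_code, .text, and .json() as a requests.Response does; ParseAPIError and json_or_raise are hypothetical names:

```python
class ParseAPIError(Exception):
    """Raised when the API answers with a non-2xx status (hypothetical class)."""

    def __init__(self, status_code: int, body: str):
        super().__init__(f"HTTP {status_code}: {body}")
        self.status_code = status_code
        self.body = body


def json_or_raise(response):
    """Return response.json() on success; otherwise raise with the text body.

    `response` is any object with .status_code, .text and .json(),
    such as a requests.Response.
    """
    if not 200 <= response.status_code < 300:
        raise ParseAPIError(response.status_code, response.text.strip())
    return response.json()
```

Unlike raise_for_status(), this keeps the server's message (for example Task not found) attached to the exception, which makes a wrong job_id or a bad key easier to diagnose from logs.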

Failed Jobs

A job that was accepted but couldn’t be processed comes back with status: "Failed". The response shape matches a successful one, but chunks is empty and the message field describes what went wrong. For example, submitting a corrupt PDF returns:
{
  "job_id": "7b31a7d7-e810-4a0b-931e-fbed0879bab2",
  "status": "Failed",
  "file_name": "bad.pdf",
  "message": "Failed to initialize task",
  "chunks": [],
  "total_chunks": 0
}

Response Shape

A successful response contains job metadata plus a chunks[] array. Each chunk has an embed Markdown string and an array of layout segments with their original positions on the page.
{
  "job_id": "1699d429-9c2e-464e-b311-d4b68a8444b8",
  "status": "Succeeded",
  "file_name": "document.pdf",
  "page_count": 1,
  "total_chunks": 1,
  "credit_used": 1,
  "chunks": [
    {
      "chunk_id": "6b2eca3a-d14f-4164-ba9a-0a3a58fcaf45",
      "embed": "## Q1 2024 Sales Report\nThe following table summarises...",
      "segments": [
        {
          "segment_id": "c60d89b1-373e-428d-9950-544e7c903b61",
          "segment_type": "SectionHeader",
          "content": "Q1 2024 Sales Report",
          "markdown": "## Q1 2024 Sales Report",
          "html": "<h2>Q1 2024 Sales Report</h2>",
          "bbox": { "left": 427.6, "top": 67.8, "width": 344.7, "height": 36.5 },
          "page_number": 1,
          "image": "https://s3.us-east-1.amazonaws.com/...",
          "ocr": [ "..." ],
          "confidence": 0.35
        }
      ]
    }
  ],
  "pdf_url": "https://s3.us-east-1.amazonaws.com/..."
}
The fields you’ll actually use depend on what you’re building. They fall into three broad categories:
For RAG and embeddings:
  • chunks[].embed: the chunk’s content rolled up as Markdown, ready to pass to an embedder. This is the field the walkthrough writes to output.md.
For layout, source highlighting, and visual overlays:
  • chunks[].segments[]: the layout primitives the chunk is built from
  • segments[].segment_type: the region’s type, for example Text, Table, or SectionHeader (see Element Types for the full set)
  • segments[].bbox: the segment’s position on its page; pair it with page_number to identify which page
  • segments[].markdown, html, content: the segment rendered as Markdown, HTML, or plain text
  • segments[].image: signed URL to a cropped image of the segment
  • segments[].ocr: word-level OCR boxes for highlighting matches in the source PDF
  • segments[].confidence: the parser’s confidence in this segment’s classification, on a 0-1 scale. Lower values flag ambiguous regions but don’t necessarily mean the content is wrong, so treat it as a debugging signal rather than a hard threshold.
For job and usage tracking:
  • status: Succeeded, Failed, or one of the in-progress values (Starting, Processing, and so on)
  • total_chunks: number of chunks in the result
  • credit_used: credits consumed by this job
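
The layout fields above can be walked with a few small helpers. A minimal sketch, assuming the response shape shown earlier; the function names are hypothetical, and bbox_corners assumes left/top and width/height are expressed in the same units:

```python
from typing import Dict, Iterator, List, Tuple


def iter_segments(result: Dict) -> Iterator[Dict]:
    """Yield every segment from every chunk in a parse result."""
    for chunk in result.get("chunks", []):
        yield from chunk.get("segments", [])


def segments_of_type(result: Dict, segment_type: str) -> List[Dict]:
    """Collect segments whose segment_type matches, e.g. 'Table'."""
    return [s for s in iter_segments(result) if s.get("segment_type") == segment_type]


def bbox_corners(bbox: Dict) -> Tuple[float, float, float, float]:
    """Convert left/top/width/height into (x0, y0, x1, y1)."""
    x0, y0 = bbox["left"], bbox["top"]
    return (x0, y0, x0 + bbox["width"], y0 + bbox["height"])
```

For example, segments_of_type(result, "Table") returns only the table regions, and bbox_corners(segment["bbox"]) gives corner coordinates of the kind most drawing libraries expect for a highlight rectangle.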

Sample Markdown Output

Running the script with the sample PDF writes this to output.md:
## Q1 2024 Sales Report
The following table summarises regional sales performance for Q1
2024.
| Region | Sales Rep | Units Sold | Revenue ($) | Target ($) | % of Target |
| --- | --- | --- | --- | --- | --- |
| North | Alice Brown | 1,240 | 186,000 | 175,000 | 106% |
| South | Bob Smith | 980 | 147,000 | 160,000 | 92% |
| East | Carol Jones | 1,510 | 226,500 | 200,000 | 113% |
| West | David Lee | 870 | 130,500 | 150,000 | 87% |
| Central | Eve Martinez | 1,100 | 165,000 | 155,000 | 106% |
This is chunks[].embed joined with blank lines. The parser keeps headings, paragraphs, and tables as Markdown, so the output is ready to embed for RAG without further processing.

Next Steps

For more on parsing, including element types, processing modes, response format, and presigned URLs, see the Parsing overview.

Parsing Options

Configure chunking strategies, segment filters, and the OCR backend.

Structured Extraction

Pull typed fields out of a document using a JSON schema.

API Reference

Browse the full request and response specs for every endpoint.

FAQ

Check limits, supported formats, and answers to common questions.