Classification picks the best-fit label for a document from a list of candidate categories we supply. The endpoint returns the matched category and a confidence score, ready to feed into routing logic. For raw Markdown or structured field extraction instead, see the Parse quickstart or the Extraction quickstart.
This quickstart builds a script that submits a document to /classify, waits for the verdict, and saves the matched category and confidence score to disk. The accordion below has the full script if you’d rather copy and run it directly.
Show the Full Script
Set UNSILOED_API_KEY in your environment and save the document you want to classify as document.pdf in the same directory before running.
Step 1: Set Up Your Environment
Before writing any code, gather three things: an API key, a document, and the runtime for the chosen language.
1.1 Get an Unsiloed AI API Key
To get API access, sign up on Unsiloed AI. Export your key as an environment variable named UNSILOED_API_KEY so it stays out of source control.
1.2 Pick a Document to Classify
The /classify endpoint supports PDF, DOCX, PPTX, JPG, PNG, and other formats. The walkthrough below assumes a PDF saved as document.pdf in your working directory. To use a different format, update the filename and content type in the snippets to match your file.
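If you switch formats often, the filename and content type can be derived from the path instead of edited by hand. The sketch below uses Python's standard mimetypes module; the helper name upload_file_parts is hypothetical, and the source does not list which content-type strings the endpoint expects, so treat the guessed value as a starting point.

```python
import mimetypes
import os


def upload_file_parts(path):
    """Return (filename, content_type) for the multipart file field."""
    # guess_type looks only at the file extension, not the file contents.
    content_type, _ = mimetypes.guess_type(path)
    if content_type is None:
        raise ValueError(f"Cannot infer a content type for {path!r}; set it by hand")
    return os.path.basename(path), content_type
```

Calling upload_file_parts("document.pdf") yields the filename together with application/pdf, which matches the PDF assumed in the rest of the walkthrough.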
If you don’t have a document handy, download our sample PDF (a one-page lab report from Riverside Diagnostic Laboratory) and save it as document.pdf. The walkthrough scores it against three candidate categories so we can see a clear winner.
1.3 Install Dependencies
You need Python 3.8 or newer. Install the requests package.
Step 2: Submit a Document With Categories
The request bundles two fields: pdf_file for the document and categories for a JSON-stringified array of category objects, each with a name and an optional description. The categories list is the model’s entire vocabulary for this call, so clear and distinct names matter more than they might appear. The endpoint returns a job_id to poll. All requests go to https://prod.visionapi.unsiloed.ai with the API key in the api-key header.
2.1 Set Up the Script
Create a file called classify_document.py and start with the imports, configuration, and category list.
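A minimal sketch of that opening section follows. The category names are partly assumed: Medical Record appears in the sample output, while Invoice and Contract are hypothetical stand-ins for the other two candidates, so substitute your own labels.

```python
import json
import os

# Read the key from the environment; an empty string means it was never exported.
API_KEY = os.environ.get("UNSILOED_API_KEY", "")
BASE_URL = "https://prod.visionapi.unsiloed.ai"

# Candidate labels for this call. "Medical Record" matches the sample output;
# "Invoice" and "Contract" are hypothetical placeholders.
# Each object may also carry an optional "description" key.
categories = [
    {"name": "Medical Record"},
    {"name": "Invoice"},
    {"name": "Contract"},
]

# The endpoint expects the list as a JSON string in the categories form field.
categories_json = json.dumps(categories)
```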
API_KEY reads your key from the environment so it doesn’t get hard-coded into the file, and BASE_URL points at the Unsiloed AI production endpoint. The categories list defines the candidate labels the model picks from. Only the names guide the result; a description key is accepted but not used by classification.
2.2 Upload the Document
Send the file and the JSON-encoded category list as a multipart upload to /classify. The document goes under pdf_file and the categories under categories.
Continue the file by uploading the document.
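One way to write the upload step is sketched below. The function name submit_document and its http parameter are hypothetical; http exists only so a stand-in client can be passed during testing, and it falls back to the requests package otherwise. The field names, header, and job_id key come from the description above.

```python
import json
import os


def submit_document(path, categories, api_key, base_url, http=None):
    """Upload one PDF to /classify and return the job_id from the response."""
    if http is None:
        import requests as http  # the package installed in step 1.3

    with open(path, "rb") as f:
        response = http.post(
            f"{base_url}/classify",
            headers={"api-key": api_key},
            # The document goes under pdf_file, the label list under categories.
            files={"pdf_file": (os.path.basename(path), f, "application/pdf")},
            data={"categories": json.dumps(categories)},
        )
    response.raise_for_status()  # raises HTTPError on 4xx and 5xx responses
    return response.json()["job_id"]
```

In the script, a call such as submit_document("document.pdf", categories, API_KEY, BASE_URL) returns the identifier to poll in Step 3, which can be printed right away to confirm the upload was accepted.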
raise_for_status() throws an HTTPError on any 4xx or 5xx response, so there’s no need to check .status_code separately.
2.3 Capture the Job ID
Next, read and print the job_id. Run the script. The output should be a single line like Job submitted: 2c231adf-ad5e-4e2e-8c0c-10cd7025c09b.
Step 3: Poll for Results
The job runs asynchronously. Call GET /classify/{job_id} repeatedly until the status is completed, then save the classification to disk.
A status of completed means the result is ready. A status of failed means the job errored. Any other value (such as processing) means the job is still running.
3.1 Write the Polling Loop
Drop in a polling loop. The max_attempts cap stops the loop if the job hangs.
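A sketch of such a loop is shown here. The function name poll_job, the default of 60 attempts, and the 2-second delay are assumptions chosen for illustration, not values from the API documentation; the http parameter is a hypothetical hook for passing a stand-in client.

```python
import time


def poll_job(job_id, api_key, base_url, http=None, max_attempts=60, delay=2.0):
    """GET /classify/{job_id} until the job completes, fails, or attempts run out."""
    if http is None:
        import requests as http  # the package installed in step 1.3

    url = f"{base_url}/classify/{job_id}"
    for _ in range(max_attempts):
        response = http.get(url, headers={"api-key": api_key})
        response.raise_for_status()
        job = response.json()
        status = job.get("status")
        print(f"Status: {status}")
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"Job failed: {job.get('error')}")
        # Any other status (such as processing) means the job is still running.
        time.sleep(delay)
    raise TimeoutError(f"Job {job_id} did not finish after {max_attempts} attempts")
```

Raising on failed and on timeout keeps the calling code simple: anything the function returns is a completed job.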
3.2 Save the Classification
Persist the result to disk so downstream code can read it. The full response, including the per-page breakdown, goes to classification.json.
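The save step can be sketched as a small helper, assuming the completed response carries result.classification and result.confidence as described under Response Shape. The function name save_classification is hypothetical.

```python
import json


def save_classification(job, path="classification.json"):
    """Write the full job response to disk and return a one-line summary."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(job, f, indent=2)
    result = job["result"]
    # Confidence is on a 0-1 scale, so format it as a percentage.
    return (
        f"Classification: {result['classification']} "
        f"({result['confidence']:.2%} confidence)"
    )
```

Printing the returned string gives the summary line, and the file holds the complete response for downstream code.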
Finally, write the result to disk and print a summary. Run the script. You should see one or two Status: processing lines, then Status: completed, then a summary line like Classification: Medical Record (100.00% confidence). The classification.json file appears in the working directory.
Error Responses
Failures fall into two buckets: HTTP errors raised before the job is queued, and a failed status on a job that started but could not complete.
HTTP Errors
The /classify endpoint returns JSON error bodies under a detail field. The common cases are:
- 401 Unauthorized: {"detail":"Invalid API key"}. The api-key header is missing or wrong.
- 400 Bad Request: {"detail":"Either pdf_file or file_url must be provided"} or {"detail":"At least one category is required"}. The submit form is missing a required field.
- 422 Unprocessable Entity: {"detail":[{"type":"missing","loc":["body","categories"],"msg":"Field required","input":null}]}. A required form field, usually categories, is missing entirely.
- 404 Not Found: {"detail":"Job not found"}. The job_id you polled doesn’t exist.
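Because detail is a plain string for most errors but a list of validation entries for 422, logging code has to handle both shapes. The helper below is a sketch under that assumption; the name describe_error is hypothetical, and it takes an already-parsed JSON body.

```python
def describe_error(status_code, body):
    """Turn a /classify error body into one readable line."""
    detail = body.get("detail") if isinstance(body, dict) else None
    if isinstance(detail, list):
        # 422 bodies carry a list of entries, each with loc and msg keys.
        parts = []
        for item in detail:
            location = ".".join(str(p) for p in item.get("loc", []))
            parts.append(f"{location}: {item.get('msg', '')}")
        message = "; ".join(parts)
    elif detail:
        message = str(detail)
    else:
        message = "no detail provided"
    return f"HTTP {status_code}: {message}"
```

With requests, the status code and body are available on the response attached to the HTTPError, so the helper fits inside an except block around the submit or poll call.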
Failed Jobs
A job that was accepted but could not be processed returns status: "failed" on the polling endpoint. The response shape matches a successful one, but result is absent and the error field describes what went wrong.
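Since result is absent on a failed job, code that reads the response should check status before indexing into result. A minimal sketch, with the hypothetical name read_outcome:

```python
def read_outcome(job):
    """Return (classification, confidence) for a completed job, None while it runs."""
    status = job.get("status")
    if status == "failed":
        # result is absent on failed jobs, so read error instead.
        raise RuntimeError(job.get("error") or "job failed without an error message")
    if status != "completed":
        return None
    result = job["result"]
    return result["classification"], result["confidence"]
```

Returning None for in-progress jobs lets the same helper be called on every poll response without special-casing.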
Response Shape
A completed job returns job metadata plus a nested result object that contains the overall classification, a confidence score, and per-page results.
- result.classification: the overall predicted category for the document, drawn from the name values you submitted. This is the field the walkthrough prints.
- result.confidence: confidence score for the overall classification, on a 0-1 scale. Treat it as a soft signal: high values rarely need review, low values flag documents worth a human look.
- result.page_results[]: the per-page classifications the overall result is built from
  - page_results[].page: 1-indexed page number
  - page_results[].classification: the predicted category for that page
  - page_results[].raw_result: the model’s raw output before normalization to a category name; usually identical to classification
  - page_results[].confidence: the page-level confidence score on a 0-1 scale
- status: completed, failed, or an in-progress value such as processing
- progress: human-readable progress message
- error: error message if the job failed, otherwise null
- result.total_pages and result.processed_pages: how much of the document the classifier got through
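To act on the per-page fields, routing code might flag pages whose confidence falls below a cutoff. The sketch below assumes the field names listed above; the function name pages_needing_review and the 0.8 threshold are arbitrary choices for illustration, not recommendations from the API.

```python
def pages_needing_review(job, threshold=0.8):
    """List (page, classification, confidence) for pages below the threshold."""
    result = job.get("result") or {}  # result is absent on failed jobs
    flagged = []
    for page in result.get("page_results", []):
        if page["confidence"] < threshold:
            flagged.append((page["page"], page["classification"], page["confidence"]))
    return flagged
```

An empty list means every page cleared the cutoff, so the document can be routed on result.classification alone.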
Sample Output
Running the script against the sample lab report and the three categories above writes the verdict to classification.json. The lab report lands in the Medical Record bucket with full confidence. Swap in your own document and category list to see how the classifier handles ambiguous cases.
Next Steps
For more on classification, including category design tips and the canonical response shape, see the Classification overview.
Classification Overview
Understand how the classifier scores pages and when to reach for it.
Response Format
Browse the full classification response with examples for each job state.
API Reference
Browse the full request and response specs for the classify endpoint.
Splitting
Split a mixed bundle into separate documents by section.

