How to Perform Multi-Page OCR Using Python

This guide shows how to use Python and the Eden AI API to perform multi-page OCR. You'll learn how to launch an OCR job, retrieve results, and process large documents efficiently using Eden AI’s asynchronous job handling and multiple providers.

Optical Character Recognition (OCR) is a powerful technique for extracting text from images or scanned documents. With Eden AI’s Multi-Page OCR capabilities, you can easily process documents spanning multiple pages with just a few lines of Python code.

In this tutorial, you’ll learn how to implement multi-page OCR using the Eden AI API in Python, including launching a job and retrieving results.

What is Multipage OCR?

Multipage OCR (Optical Character Recognition) is a technology that allows users to extract text from documents with multiple pages, such as PDFs or image-based files.

By scanning each page of the document, it recognizes and converts printed text into machine-readable, editable, and searchable formats, making it easier to work with large or scanned documents without manual data entry.

How to use Multipage OCR

Set Up Your Eden AI Account

1. Sign Up: If you don't have an Eden AI account, create one for free. Once you have signed up, you can obtain your API key, which you will use to access Multipage OCR.

2. Access OCR Tools: Once logged in, go to the document parsing section of the platform.

3. Choose the Multipage OCR Feature: Select the Multipage OCR tool. You can also explore advanced parsing options based on your specific requirements.

Implementing Multipage OCR using Python

Step 1: Installing the Requests Library

Before getting started, ensure the requests module is installed. This is the library used to make HTTP requests to Eden AI's endpoints.


pip install requests

Eden AI uses asynchronous processing for large or complex files (like multi-page PDFs). This means you first launch the job, and then poll the API to retrieve the results once the processing is complete. This two-step flow helps with performance, reliability, and scalability.

Launching the OCR Job (POST Request)

The first step is to submit your document to Eden AI for OCR processing.


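import requests

# Replace YOUR_API_KEY with the API key from your Eden AI account
headers = {"Authorization": "Bearer YOUR_API_KEY"}

url = "https://api.edenai.run/v2/ocr/ocr_async"
json_payload = {
    "providers": "amazon",  # OCR engine to use
    "file_url": "URL_OF_YOUR_DOCUMENT",  # Replace with the URL of your PDF or multi-page image
}

# Launch the asynchronous OCR job
response = requests.post(url, json=json_payload, headers=headers)
response.raise_for_status()  # Stop here on an HTTP error, such as an invalid API key

result = response.json()
print(result)  # The response includes the public_id of the job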

What This Does:

  • Authorization: Uses your API key to authenticate.
  • file_url: Link to your PDF or multi-page image.
  • providers: Specifies which OCR engine to use (Amazon in this case).
  • This POST request launches an asynchronous OCR job, and returns a public_id used to retrieve results later.
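To reuse the job ID in the next step, you can pull it out of the launch response. The snippet below is a minimal sketch: the helper name `extract_job_id` is hypothetical, and it assumes the response body is a JSON object with a top-level `public_id` key, as described above.

```python
def extract_job_id(launch_response: dict) -> str:
    """Return the public_id from a launch response, or raise if it is missing."""
    job_id = launch_response.get("public_id")
    if not job_id:
        # Assumption: a body without public_id means the launch did not succeed,
        # so the whole body is included to keep any error message visible.
        raise ValueError(f"No public_id in response: {launch_response!r}")
    return str(job_id)
```

Calling `extract_job_id(result)` on the parsed POST response gives you the value to place in the URL of the GET request.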

Retrieving OCR Results (GET Request)

Once the job is submitted, you’ll get a public_id. Use it to retrieve the result.


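import requests

public_id = "YOUR_PUBLIC_ID"  # Replace with the public_id returned by the POST request
url = f"https://api.edenai.run/v2/ocr/ocr_async/{public_id}/"

headers = {
    "accept": "application/json",
    "Authorization": "Bearer YOUR_API_KEY",  # The GET request must be authenticated too
}

response = requests.get(url, headers=headers)
response.raise_for_status()  # Stop here on an HTTP error

print(response.json())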
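Because the job runs asynchronously, a single GET request may arrive before processing has finished. A common approach is to poll until the status changes. The following is a rough sketch rather than an official client: `wait_for_job` and `make_fetcher` are hypothetical helper names, and the in-progress status value `"processing"` is an assumption you should check against the responses you actually receive.

```python
import time
from typing import Callable


def wait_for_job(fetch: Callable[[], dict],
                 interval: float = 5.0,
                 timeout: float = 300.0,
                 pending: tuple = ("processing",),  # assumed in-progress status
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until the job leaves a pending state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch()
        if body.get("status") not in pending:
            return body  # finished or failed: let the caller inspect the status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still pending after {timeout} seconds")
        sleep(interval)


def make_fetcher(public_id: str, api_key: str) -> Callable[[], dict]:
    """Build a fetch() function that performs the GET request for one job."""
    import requests  # imported lazily so wait_for_job has no network dependency

    url = f"https://api.edenai.run/v2/ocr/ocr_async/{public_id}/"
    headers = {"accept": "application/json", "Authorization": f"Bearer {api_key}"}

    def fetch() -> dict:
        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()
        return response.json()

    return fetch
```

With these in place, `wait_for_job(make_fetcher(public_id, api_key))` blocks until the job is no longer pending. Passing the fetch function in as an argument keeps the polling logic separate from the HTTP call, which makes it easy to try out with canned responses.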

Interpreting the Results

Here’s what a typical response might include:


{
  "status": "completed",
  "results": {
    "amazon": {
      "extracted_text": "Page 1 text...\nPage 2 text...",
      "pages": [
        {"page_number": 1, "text": "Page 1 text"},
        {"page_number": 2, "text": "Page 2 text"}
      ]
    }
  }
}

Key Fields:

  • status: Shows if the job is completed.
  • extracted_text: Full text extracted across all pages.
  • pages: A breakdown of text per page — useful for pagination or summaries.
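To work with the text page by page, you can walk the `pages` list. This sketch assumes the response has exactly the shape shown in the sample above (`results`, then the provider name, then `pages` with `page_number` and `text`). The function name `iter_pages` is hypothetical, and real responses may carry additional or differently named fields.

```python
def iter_pages(response: dict, provider: str):
    """Yield (page_number, text) pairs for one provider, in page order."""
    provider_result = response.get("results", {}).get(provider, {})
    pages = provider_result.get("pages", [])
    for page in sorted(pages, key=lambda p: p["page_number"]):
        yield page["page_number"], page["text"]


# Sample response copied from the example above
sample = {
    "status": "completed",
    "results": {
        "amazon": {
            "extracted_text": "Page 1 text...\nPage 2 text...",
            "pages": [
                {"page_number": 1, "text": "Page 1 text"},
                {"page_number": 2, "text": "Page 2 text"},
            ],
        }
    },
}

for number, text in iter_pages(sample, "amazon"):
    print(f"Page {number}: {text}")
```

If the provider is missing from the response, the function yields nothing instead of raising, which is convenient when a provider failed or has not returned yet.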

Going Further

For better management of your OCR tasks, Eden AI provides additional endpoints for tracking and cleaning up your jobs:

  1. OCR Async List Job (GET):
    Retrieve a list of all jobs launched for OCR. Use the job IDs to track the status and retrieve the results.
    API Documentation
  2. OCR Async Delete Jobs (DELETE):
    Delete jobs that are no longer needed, keeping your workspace organized and clutter-free.
    API Documentation

These endpoints enhance flexibility and control, helping you manage and clean up OCR tasks efficiently. For further details, refer to the full documentation!
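As a rough illustration, listing and deleting jobs could look like the sketch below. It assumes that both endpoints share the base URL used to launch a job and differ only in the HTTP method. This is not confirmed in this tutorial, so check the API documentation before relying on it. The helper names `auth_headers`, `list_jobs`, and `delete_jobs` are hypothetical.

```python
# Assumption: listing and deleting jobs use the same base URL as launching one,
# differing only in the HTTP method. Verify this against the API documentation.
OCR_ASYNC_URL = "https://api.edenai.run/v2/ocr/ocr_async"


def auth_headers(api_key: str) -> dict:
    """Build the headers shared by every authenticated request."""
    return {"accept": "application/json", "Authorization": f"Bearer {api_key}"}


def list_jobs(api_key: str) -> dict:
    """Return the JSON body describing the OCR jobs launched so far."""
    import requests  # imported lazily so auth_headers works without it

    response = requests.get(OCR_ASYNC_URL, headers=auth_headers(api_key), timeout=30)
    response.raise_for_status()
    return response.json()


def delete_jobs(api_key: str) -> int:
    """Delete stored OCR jobs and return the HTTP status code."""
    import requests

    response = requests.delete(OCR_ASYNC_URL, headers=auth_headers(api_key), timeout=30)
    response.raise_for_status()
    return response.status_code
```

Deleting jobs is destructive, so it is worth calling `list_jobs` first and confirming that you no longer need any of the results.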

Why Eden AI is the Best Tool for Multipage OCR

Eden AI provides several advantages.

Multiple AI Providers

You can choose between different AI services, helping you compare results for the best performance.
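For example, you could send the same document to several providers and compare what each one returns. The sketch below makes two assumptions: that several providers can be requested in one call as a comma-separated string, and that the response keeps the `results` and `extracted_text` layout shown earlier. The function names and the provider name `other_provider` are placeholders, not real identifiers.

```python
def build_payload(file_url: str, providers: list) -> dict:
    """Build a launch payload that asks several providers to process one file."""
    # Assumption: multiple providers are passed as one comma-separated string.
    return {"providers": ",".join(providers), "file_url": file_url}


def text_length_by_provider(response: dict) -> dict:
    """Map each provider in a response to the length of its extracted text."""
    results = response.get("results", {})
    return {name: len(result.get("extracted_text", "")) for name, result in results.items()}


payload = build_payload("URL_OF_YOUR_DOCUMENT", ["amazon", "other_provider"])
print(payload["providers"])
```

Text length is only a crude signal. It can flag a provider that returned almost nothing, but judging accuracy still means reading the output against the source document.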

Easy Integration

Streamline development with one API key that gives access to multiple AI services. Skip the complexity of separate integrations and launch faster.

Cost Efficiency

Only pay for what you use. No upfront costs, just flexible access to multiple AI services with a single API key.

Conclusion

In just two steps, launching the OCR job and retrieving the results, you can extract structured text from multi-page documents using Eden AI and Python.

This is a powerful tool for automating workflows like document analysis, data extraction, and digital archiving.

Multi-page OCR doesn’t have to be complicated. With Eden AI’s simple API and Python’s ease of use, you can integrate this functionality into your tools or workflows with minimal setup.
