Download cross-platform Python OCR library


Are you looking for an evaluation version of a product?

If so, you can download any of the versions below for testing. The product will function as normal, apart from the evaluation limitations. At the time of purchase, we provide a license file via email that allows the product to work at full capacity. If you would also like an evaluation license to test without any restrictions for 30 days, please follow the directions provided here.

Are you having trouble downloading?

If you experience errors when you try to download a file, make sure your network policies (enforced by your company or ISP) allow downloading ZIP and/or MSI files.


Download Aspose.OCR for Python via Java for image recognition.

Search and extract text from scans, photos, PDF documents, screenshots and other graphical files on any platform with Python 3.6 and Java Runtime Environment (JRE) 8. You can use the same library regardless of the operating system and do not have to adjust the code.

Aspose.OCR for Python via Java can be installed from PyPI or from a local file with one of the following pip commands:


     pip install aspose-ocr-python-java

     pip install <downloaded-package-path>
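If the installation succeeds but the library fails to start, it may help to confirm the two prerequisites named above: Python 3.6 and a Java runtime. The following is a minimal sketch using only the Python standard library. The helper names (`python_is_supported`, `find_java`, `java_version_banner`) are hypothetical and not part of Aspose.OCR, and the sketch assumes the `java` executable is expected on the PATH.

```python
import shutil
import subprocess
import sys


def python_is_supported(minimum=(3, 6)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


def find_java():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


def java_version_banner(java_path):
    """Return the text printed by `java -version` (it is written to stderr)."""
    completed = subprocess.run(
        [java_path, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr, where the banner appears
        universal_newlines=True,   # decode bytes to str; works on Python 3.6
        check=False,
    )
    return completed.stdout.strip()


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    java_path = find_java()
    if java_path is None:
        print("No java executable found on PATH")
    else:
        print(java_version_banner(java_path))
```

The sketch only reports what it finds; comparing the Java banner against version 8 is left to the reader, because the banner format differs between Java vendors.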

Key features

Global applications - supporting over 130 languages, the library allows you to recognize texts in Latin, Cyrillic and Asian scripts.
Read everything - retrieve text from any file obtained through a scanner or camera, and process images directly from web links.
Reliable results - achieve the highest recognition accuracy for all images, including those that are out-of-focus, rotated, distorted, and noisy.
Batch processing - bulk-recognize all images from folders and archives; read multi-page PDF documents, TIFF images and DjVu files.
Layout detection - identify and categorize content blocks in images to ensure the correct order of extracted text, regardless of layout.
Suitable for any content - image preprocessing and customizable document structure detection enable text extraction from virtually any image, ranging from high-quality scans to street photos.
Universal - use the same code and one library package on any platform.
Optimized - the library balances recognition speed, quality, and resource utilization for each specific use case.

Supported file formats

.PDF - Portable Document Format
.JPG - JPEG, the most popular format for smartphone photos
.PNG - Portable Network Graphics, 24-bit with transparency
.TIFF - Tagged Image File Format, commonly used for high quality scanning
.GIF - Graphics Interchange Format, limited to 256 colors
.BMP - Bitmap image file
.WBMP - Monochrome graphics file optimized for mobile devices

Multi-page PDF documents and TIFF images are fully supported.
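When processing a folder, it can be useful to pick out only the files whose extensions appear in the list above before handing them to the library. The sketch below does this with the standard library alone. The function name `find_supported_files` is hypothetical, and the extension set simply mirrors the list above; common aliases such as .jpeg or .tif are not included because this page does not mention them.

```python
from pathlib import Path

# Extensions copied from the "Supported file formats" list, in lower case.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".png", ".tiff", ".gif", ".bmp", ".wbmp"}


def find_supported_files(folder):
    """Return the files in `folder` with a supported extension, sorted by path.

    The comparison ignores case, so "SCAN.PNG" is treated like "scan.png".
    Subfolders are not searched.
    """
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Each returned path can be converted with `str(path)` if the consuming code expects plain file names.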

Code snippet

Aspose.OCR for Python via Java is extremely easy to use, regardless of the application’s scale and complexity. Let’s try to create a very simple application that can extract text from images and output it to the console.

Install the latest version of the aspose-ocr-python-java package using pip.
Import the aspose module into the application.
Create an instance of the AsposeOcr class.
Create an instance of the OcrInput class and add one or more images to it.
Extract text from the images using the recognize method.
Output the extracted text to the console.

Full code:

import aspose as ocr

# Initialize OCR engine
api = ocr.AsposeOcr()

# Initialize OCR input with automatic skew correction
filters = ocr.PreprocessingFilter()
filters.add(ocr.PreprocessingFilter.auto_skew())
ocr_input = ocr.OcrInput(ocr.InputType.SINGLE_IMAGE, filters)
ocr_input.add("1.png")
ocr_input.add("2.jpg")

# Recognize images, passing the settings so the PHOTO mode takes effect
settings = ocr.RecognitionSettings()
settings.set_detect_areas_mode(ocr.DetectAreasMode.PHOTO)
result = api.recognize(ocr_input, settings)

# Print result
print(result[0].recognition_text)
print(result[1].recognition_text)
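The code above prints exactly two results by index, which matches the two images added to the input. For a varying number of images, a loop is more convenient. The following sketch assumes only what the code above shows: that each result exposes a `recognition_text` attribute. It also assumes the returned collection can be iterated, which this page does not state. The helper name `collect_text` is hypothetical, and the stand-in objects exist only so the sketch runs without the library.

```python
from types import SimpleNamespace


def collect_text(results):
    """Return the recognition_text of every result, in the original order."""
    return [item.recognition_text for item in results]


if __name__ == "__main__":
    # Stand-in objects that mimic the one attribute used above.
    fake_results = [
        SimpleNamespace(recognition_text="text of the first image"),
        SimpleNamespace(recognition_text="text of the second image"),
    ]
    for text in collect_text(fake_results):
        print(text)
```

With the real library, the list returned by the recognize call would take the place of `fake_results`.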