Aspose.OCR for Python via Java 24.7.0 - Release Notes

Deprecation warning

What was changed

| Key | Summary | Category |
| --- | --- | --- |
| OCRPY‑71 | Added Arabic language recognition and recognition of texts in mixed Arabic/English. | New feature |
| OCRPY‑71 | Added Persian (Farsi) language recognition and recognition of texts in mixed Persian/English. | New feature |
| OCRPY‑71 | Added Urdu language recognition and recognition of texts in mixed Urdu/English. | New feature |
| OCRPY‑71 | Added Uyghur language recognition and recognition of texts in mixed Uyghur/English. | New feature |
| OCRPY‑71 | Automatic detection of problematic areas of an image that can significantly impact the accuracy of OCR. | New feature |
| OCRPY‑71 | Embedding of user-specified fonts in recognition results saved as PDFs. | New feature |
| OCRPY‑71 | Improved saving of recognition results as searchable PDFs. | Enhancement |

Public API changes and backwards compatibility

This section lists all public API changes introduced in Aspose.OCR for Python via Java 24.7.0 that may affect the code of existing applications.

Added public APIs:

The following public APIs have been added in the Aspose.OCR for Python via Java 24.7.0 release:

detect_defects() method

Automatically finds potentially problematic areas of an image and returns the type of each defect together with its coordinates.

DefectType enumeration

Image defects that can be detected automatically:

| Defect | Value | Description |
| --- | --- | --- |
| Salt-and-pepper noise | SALT_PEPPER_NOISE | Appears as random white and black pixels scattered across the area. Often occurs in digital photographs. |
| Low contrast between text and background | LOW_CONTRAST | Highlights and shadows typically appear on curved pages. |
| Blur | BLUR | The entire image or some of its areas are out of focus. Important: This detection algorithm can only identify the entire image as blurry. Specific areas cannot be detected. |
| Glare | GLARE | Highlight areas in an image caused by uneven lighting, such as spot lights or flash. |
| All supported defects | ALL | All above-mentioned defects. |

DefectAreas class

Image areas containing a certain type of defect.

| Property | Type | Description |
| --- | --- | --- |
| defectType | DefectType | Defect type (see DefectType enumeration above). |
| rectangles | Rectangle[] | Image areas where the defect was found. |

DefectOutput class

Defect detection results for a single image or page.

| Property | Type | Description |
| --- | --- | --- |
| source | string | The full path to the file or URL, if any. Empty for streams, byte arrays, and Base64 encoded files. |
| page | int | The page number for multi-page images and PDFs. |
| defectAreas | DefectAreas[] | The array of image defects and areas where they were found (see DefectAreas class above). |
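
To show how the three types above fit together, here is a minimal sketch that models them with plain Python dataclasses and summarizes detection results per page. The classes are hypothetical stand-ins that only mirror the documented property names; they are not the library's own classes. The `Rectangle` fields, the `summarize` helper, and the sample data are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DefectType(Enum):
    # Stand-in mirroring the documented enumeration values (ALL omitted).
    SALT_PEPPER_NOISE = "SALT_PEPPER_NOISE"
    LOW_CONTRAST = "LOW_CONTRAST"
    BLUR = "BLUR"
    GLARE = "GLARE"


@dataclass
class Rectangle:
    # Assumed fields; the release notes do not describe Rectangle.
    x: int
    y: int
    width: int
    height: int


@dataclass
class DefectAreas:
    defectType: DefectType
    rectangles: List[Rectangle] = field(default_factory=list)


@dataclass
class DefectOutput:
    source: str
    page: int
    defectAreas: List[DefectAreas] = field(default_factory=list)


def summarize(outputs: List[DefectOutput]) -> List[str]:
    """Return one line per defect type found on each page (hypothetical helper)."""
    lines = []
    for out in outputs:
        for areas in out.defectAreas:
            lines.append(
                f"{out.source or '<stream>'} page {out.page}: "
                f"{areas.defectType.name} in {len(areas.rectangles)} area(s)"
            )
    return lines


# Invented sample data, for illustration only.
sample = [
    DefectOutput(
        source="scan.png",
        page=0,
        defectAreas=[
            DefectAreas(DefectType.GLARE, [Rectangle(10, 20, 100, 50)]),
            DefectAreas(DefectType.BLUR, [Rectangle(0, 0, 800, 600)]),
        ],
    )
]
for line in summarize(sample):
    print(line)
```

The nesting is the point here: one `DefectOutput` per image or page, one `DefectAreas` entry per defect type, and a list of rectangles under each. For `BLUR`, the note above implies the reported area would cover the whole image.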

save_multipage_document_user_font() method

Saves recognition results as a PDF document with an embedded TrueType (.TTF) or OpenType (.OTF) font.
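
Since only TrueType and OpenType fonts are mentioned, an application might want to check the font file extension before attempting to save. The sketch below is one way to do that; `is_supported_font` is a hypothetical helper, not part of the library, and it inspects only the file name, not the file contents.

```python
from pathlib import Path

# Extensions named in the release notes: TrueType (.TTF) and OpenType (.OTF).
SUPPORTED_FONT_SUFFIXES = {".ttf", ".otf"}


def is_supported_font(path: str) -> bool:
    """Return True if the file name ends in .ttf or .otf (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_FONT_SUFFIXES


print(is_supported_font("fonts/AdobeMingStd-Light.otf"))
```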

Updated public APIs:

The following public APIs have been changed in the Aspose.OCR for Python via Java 24.7.0 release:

Language enumeration

Aspose.OCR for Python via Java can now recognize 4 new alphabets, including texts in mixed languages:

| Value | Alphabet |
| --- | --- |
| Language.ARA | Arabic and English |
| Language.PES | Persian (Farsi) and English |
| Language.UIG | Uyghur and English |
| Language.URD | Urdu and English |
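
If an application lets users pick a language by name, the new values can be looked up from a small table. The following is a minimal sketch under that assumption; `NEW_LANGUAGE_VALUES` and `language_value_name` are hypothetical names, and the function returns only the name of the enumeration member as a string.

```python
# Maps a human-readable language name to the Language member name
# listed in the table above. Hypothetical helper, not a library API.
NEW_LANGUAGE_VALUES = {
    "arabic": "ARA",
    "persian": "PES",
    "farsi": "PES",
    "uyghur": "UIG",
    "urdu": "URD",
}


def language_value_name(language: str) -> str:
    """Return the Language member name for one of the newly added languages."""
    key = language.strip().lower()
    if key not in NEW_LANGUAGE_VALUES:
        raise ValueError(f"Not one of the languages added in this release: {language!r}")
    return NEW_LANGUAGE_VALUES[key]


print(language_value_name("Farsi"))
```

With the library imported as `ocr`, the returned name could presumably be resolved with `getattr(ocr.Language, name)` and passed to `set_language()`.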

Removed public APIs:

No changes

Examples

The code samples below illustrate the changes introduced in this release:

Recognize Arabic text

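import aspose as ocr

# Initialize the OCR engine
api = ocr.AsposeOcr()
# Add the image to the recognition batch
images = ocr.OcrInput(ocr.InputType.SINGLE_IMAGE)
images.add("source.png")

# Select Arabic (also covers mixed Arabic/English text)
recognitionSettings = ocr.RecognitionSettings()
recognitionSettings.set_language(ocr.Language.ARA)

# Recognize the image and print the text of the first result
result = api.recognize(images, recognitionSettings)
print(result[0].recognition_text)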

Embed custom font into saved PDF

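import aspose as ocr

api = ocr.AsposeOcr()
images = ocr.OcrInput(ocr.InputType.PDF)
images.add("source.pdf")

# Use default recognition settings
recognitionSettings = ocr.RecognitionSettings()
result = api.recognize(images, recognitionSettings)
# Save the results as a PDF with the specified font embedded
api.save_multipage_document_user_font("results.pdf", ocr.Format.PDF, result, "fonts/AdobeMingStd-Light.otf")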