
Aspose.OCR for Python via .NET 24.5.0 - Release Notes

Deprecation warning

What was changed

Key | Summary | Category
OCRPY‑68 | Automatic detection of problematic areas of an image that can significantly impact the accuracy of OCR. | New feature
OCRPY‑69 | Added recognition of Arabic text and recognition of texts in mixed Arabic/English. | New feature
OCRPY‑69 | Added Persian (Farsi) language recognition and recognition of texts in mixed Persian/English. | New feature
OCRPY‑69 | Added Urdu language recognition and recognition of texts in mixed Urdu/English. | New feature
OCRPY‑69 | Added Uyghur language recognition and recognition of texts in mixed Uyghur/English. | New feature
OCRPY‑69 | Significantly improved recognition of languages based on the Latin alphabet. | Enhancement
OCRPY‑69 | Added support for TIFF images with 16 bits per pixel bit depth. | Enhancement
OCRPY‑69 | Improved saving of recognition results as searchable PDFs. | Enhancement
OCRPY‑69 | Improved DetectAreasMode.PHOTO document areas detection mode. | Enhancement
OCRPY‑69 | Fixed character bounding boxes detection. | Fix

Public API changes and backwards compatibility

This section lists all public API changes introduced in Aspose.OCR for Python via .NET 24.5.0 that may affect the code of existing applications.

Added public APIs:

detect_defects() method

Automatically finds potentially problematic areas of an image and returns the type of each defect together with its coordinates.

DefectType enumeration

Image defects that can be detected automatically:

Defect | Value | Description
Salt-and-pepper noise | DefectType.SALT_PEPPER_NOISE | Appears as random white and black pixels scattered across the area. Often occurs in digital photographs.
Low contrast between text and background | DefectType.LOW_CONTRAST | Highlights and shadows typically appear on curved pages.
Blur | DefectType.BLUR | The entire image or some of its areas are out of focus. Important: This detection algorithm can only identify the entire image as blurry. Specific areas cannot be detected.
Glare | DefectType.GLARE | Highlight areas in an image caused by uneven lighting, such as spot lights or flash.
All supported defects | DefectType.ALL | All above-mentioned defects.
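
To make the salt-and-pepper description concrete, the sketch below scores a grayscale pixel grid by counting isolated pure-black or pure-white pixels. It is an illustration only and is not the algorithm the library uses; the function name `salt_pepper_ratio`, the `jump` threshold, and the list-of-lists image layout are all assumptions made for this example.

```python
from statistics import median
from typing import List


def salt_pepper_ratio(pixels: List[List[int]], jump: int = 100) -> float:
    """Return the share of interior pixels that look like salt-and-pepper noise.

    A pixel counts as noise when it is pure black (0) or pure white (255)
    and differs from the median of its 8 neighbours by more than `jump`.
    This is an illustrative heuristic, not the library's detector.
    """
    # Images smaller than 3x3 have no interior pixels to inspect.
    if len(pixels) < 3 or len(pixels[0]) < 3:
        return 0.0
    height, width = len(pixels), len(pixels[0])
    interior = (height - 2) * (width - 2)
    noisy = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            value = pixels[y][x]
            if value not in (0, 255):
                continue
            neighbours = [
                pixels[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)
            ]
            if abs(value - median(neighbours)) > jump:
                noisy += 1
    return noisy / interior


# 5x5 mid-gray patch with one white speck and one black speck
patch = [[128] * 5 for _ in range(5)]
patch[1][1] = 255
patch[3][3] = 0
print(salt_pepper_ratio(patch))
```

In this patch, 2 of the 9 interior pixels are flagged. A real detector would need to handle colour images and legitimately black text, which this sketch ignores.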

DefectAreas class

Image areas containing a certain type of defect.

Property | Description
defect_type | Type of defect (DefectType enumeration value).
rectangles | Image areas where the defect was found.

DefectOutput class

Defect detection results for a single image or page.

Property | Description
source | The full path to the file or URL, if any.
page | The page number for multi-page images and PDFs.
defect_areas | The list of image defects and areas where they were found.
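
The sketch below shows one way to walk a list of detection results using only the properties listed above. The two dataclasses are hypothetical stand-ins so that the example runs without the library installed; the `(x, y, width, height)` tuple used for rectangles, the string defect types, and the `summarize` helper are assumptions of this sketch, not part of the documented API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed rectangle layout for this sketch: (x, y, width, height).
Rect = Tuple[int, int, int, int]


@dataclass
class DefectAreas:
    """Stand-in mirroring the documented DefectAreas properties."""
    defect_type: str
    rectangles: List[Rect] = field(default_factory=list)


@dataclass
class DefectOutput:
    """Stand-in mirroring the documented DefectOutput properties."""
    source: str
    page: int
    defect_areas: List[DefectAreas] = field(default_factory=list)


def summarize(outputs: List[DefectOutput]) -> List[str]:
    """Build one report line per defect type found on each page."""
    lines = []
    for out in outputs:
        for areas in out.defect_areas:
            lines.append(
                f"{out.source} (page {out.page}): "
                f"{areas.defect_type} in {len(areas.rectangles)} area(s)"
            )
    return lines


# Hand-made sample data standing in for real detection results
sample = [
    DefectOutput(
        source="source.png",
        page=0,
        defect_areas=[
            DefectAreas("LOW_CONTRAST", [(10, 20, 200, 50), (10, 300, 180, 40)]),
            DefectAreas("GLARE", [(400, 0, 120, 120)]),
        ],
    )
]
report = summarize(sample)
for line in report:
    print(line)
```

With the real library, the same two nested loops should apply to the value returned by detect_defects(), provided the results are iterable in the way the Examples section indexes them.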

Updated public APIs:

The following public APIs have been updated in this release:

Language enumeration

Aspose.OCR for Python via .NET 24.5.0 adds support for several new languages:

Value | Language
Language.ARA | Arabic, including texts in mixed Arabic/English
Language.PES | Persian (Farsi), including texts in mixed Persian/English
Language.UIG | Uyghur, including texts in mixed Uyghur/English
Language.URD | Urdu, including texts in mixed Urdu/English
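
If the recognition language comes from configuration or user input, a small lookup helper can turn a three-letter code into the matching value. The sketch below uses a hypothetical stand-in `Language` enum containing only the four new values so that it runs on its own; the `language_for` helper is an assumption of this example, and whether the library's own enumeration supports lookup by name in the same way has not been verified here.

```python
from enum import Enum


class Language(Enum):
    """Stand-in listing only the four values added in 24.5.0."""
    ARA = "Arabic"
    PES = "Persian (Farsi)"
    UIG = "Uyghur"
    URD = "Urdu"


def language_for(code: str) -> Language:
    """Look up a Language member by its name, ignoring case and spaces."""
    try:
        return Language[code.strip().upper()]
    except KeyError:
        supported = ", ".join(member.name for member in Language)
        raise ValueError(
            f"Unsupported language code {code!r}; expected one of: {supported}"
        ) from None


chosen = language_for("urd")
print(chosen.name, "-", chosen.value)
```

Raising a ValueError for unknown codes surfaces a misconfigured language early instead of letting recognition run with an unintended setting.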

Removed public APIs:

No changes.

Changes to application logic

We have significantly improved the OCR model for all languages based on the Latin alphabet:

  • English
  • Indonesian
  • Italian
  • Malay (Melayu)
  • Hausa
  • Swahili
  • Yoruba
  • Oromo
  • Dutch
  • Malagasy
  • Zhuang
  • Somali
  • Chichewa (Chewa, Nyanja)
  • Rwanda
  • Min Bei
  • Zulu
  • Min Dong
  • Hiligaynon
  • Hmong
  • Shona (Karanga)
  • Xhosa
  • Betawi
  • Afrikaans
  • Minangkabau
  • Sotho (Southern)
  • Bikol
  • Kanuri
  • Tswana
  • Luo
  • Sukuma
  • Tsonga
  • Bemba (Chibemba)
  • Nandi
  • Palembang
  • Umbundu
  • Sotho (Northern)
  • Waray-Waray
  • Lamani (Lambadi)
  • Musi
  • Pu-Xian
  • Kapampangan
  • Bouyei (Buyi, Giáy)
  • Ndebele
  • Sasak
  • Swati (Swazi)
  • Gusii
  • Meru
  • Wolaytta
  • Dong
  • Pangasinan
  • Makassar (Makasar)
  • Tumbuka
  • Serer-Sine
  • LaTonga
  • Luguru
  • Latin

Examples

The code samples below illustrate the changes introduced in this release:

Recognize Arabic text

# Assumed import: adjust if your installation exposes these types elsewhere
from aspose.ocr import *

# Instantiate Aspose.OCR API
api = AsposeOcr()
# Add image to the recognition batch
# (named ocr_input so the built-in input() used below is not shadowed)
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("source.png")
# Enable Arabic text recognition
recognition_settings = RecognitionSettings()
recognition_settings.language = Language.ARA
# Recognize the image
result = api.recognize(ocr_input, recognition_settings)
# Print recognition result
print(result[0].recognition_text)
input("Press Enter to continue...")

Detect shadows and highlights

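# Assumed import: adjust if your installation exposes these types elsewhere
from aspose.ocr import *

# Instantiate Aspose.OCR API
api = AsposeOcr()
# Add image to the recognition batch
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("source.png")
# Find shadows and highlights
defects = api.detect_defects(ocr_input, DefectType.LOW_CONTRAST)
# Print the source of the first result and the type of its first defect
print(defects[0].source)
print(defects[0].defect_areas[0].defect_type)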