RecognitionSettings Struct Reference

Settings for the image recognition. Contains elements that allow customizing the recognition process.

#include <aspose_ocr.h>

Public Attributes

bool all_image = false
 Disabled (false) by default. When enabled, the whole image is recognized as a single area. Has no effect in asposeocr_recognize_receipt and aspose::ocr::recognize_receipt.
 
bool correct_skew = true
 Enabled (true) by default. Detects the image orientation and auto-rotates the image if needed.
 
bool upscale_small_font = false
 Allows you to use additional algorithms specifically for small font recognition. Useful for images with small-size characters.
 
bool lines_filtration = false
 Disabled (false) by default. Allows recognizing text in tables (regions surrounded by lines).
 
const wchar_t * alphabet = NULL
 Empty by default (all characters allowed). Set of characters allowed in the recognition result.
 
const wchar_t * ignoredCharacters = L""
 L"" by default (no characters ignored). Sets a blacklist of characters to exclude from recognition.
 
export_format format = export_format::text
 Result format: plain text or JSON-formatted text, stored in a wchar_t* buffer. Defaults to plain text. Supported formats: text, json.
 
rect * rectangles = NULL
 Areas to recognize. Example: rect rectangles[2] = { { 3, 50, 100, 70 }, { 3, 160, 100, 75 } };
 
size_t rectangles_size = 0
 Number of areas in the rectangles array.
 
rect * preprocess_area = NULL
 User-defined area to be pre-processed. Example: rect area = { 3, 50, 100, 100 };
 
double skew = 0
 Rotates the image by the specified angle. Has no effect if rectangles are specified.
 
language language_alphabet = language::none
 Multi-language by default. Language used for OCR. Supported languages: English (en), German (de), Portuguese (pt), Spanish (es), French (fr), Italian (it), Czech (cze), Danish (dan), Dutch (dum), Estonian (est), Finnish (fin), Latvian (lav), Lithuanian (lit), Norwegian (nor), Polish (pol), Romanian (rum), Serbo-Croatian (srp_hrv), Slovak (slk), Slovene (slv), Swedish (swe), Chinese (chi)
 
file_format save_format = file_format::txt
 Result save format for the "page_save" method. Defaults to txt. Supported formats: file_format::docx, file_format::txt, file_format::pdf, file_format::xlsx, file_format::json, file_format::xml, file_format::rtf. Has no effect in other methods.
 
int threshold_value = 0
 Sets a custom threshold value for image binarization. Valid range: 1 to 255.
 
custom_preprocessing_filters filters
 Prepares the image for OCR by applying custom pre-processing filters. Up to 12 filters can be set. Example: RecognitionSettings settings; settings.filters.filter_1 = OCR_IMG_PREPROCESS_GRAYSCALE; settings.filters.filter_2 = OCR_IMG_PREPROCESS_SCALE(2); settings.filters.filter_3 = OCR_IMG_PREPROCESS_THRESHOLD(200);
 
characters_allowed_type allowed_characters = characters_allowed_type::ALL
 Allowed characters set. Determines the type of characters allowed for recognition result. allowed_characters contains enum characters_allowed_type value.
 
bool auto_contrast = false
 Allows using an additional contrast correction algorithm for the image before recognition.
 
bool auto_denoising = false
 Enables the use of an additional neural network for the image before recognition. Useful for images with noise, spots, flares, gradients, or foreign elements.
 
detect_areas_mode_enum detect_areas_mode = detect_areas_mode_enum::DOCUMENT
 Selects the area detection mode that best matches the document type: document, photo, plain text, column, or image. Has no effect in asposeocr_recognize_receipt and aspose::ocr::recognize_receipt.
 
unsigned int defects = defect_type::ASPOSE_OCR_NONE
 Determines which types of image defects to detect. Use case 1: defects = defect_type::ASPOSE_OCR_DETECT_DARK_IMAGES | defect_type::ASPOSE_OCR_DETECT_SALT_PEPPER_NOISE; Use case 2: defects = defect_type::ASPOSE_OCR_DETECT_ALL;
 

Detailed Description

Settings for the image recognition. Contains elements that allow customizing the recognition process.

Member Data Documentation

◆ all_image

bool RecognitionSettings::all_image = false

Disabled (false) by default. When enabled, the whole image is recognized as a single area. Has no effect in asposeocr_recognize_receipt and aspose::ocr::recognize_receipt.

◆ allowed_characters

characters_allowed_type RecognitionSettings::allowed_characters = characters_allowed_type::ALL

Allowed characters set. Determines the type of characters allowed for recognition result. allowed_characters contains enum characters_allowed_type value.

◆ alphabet

const wchar_t* RecognitionSettings::alphabet = NULL

Empty by default (all characters allowed). Set of characters allowed in the recognition result.

◆ auto_contrast

bool RecognitionSettings::auto_contrast = false

Allows using an additional contrast correction algorithm for the image before recognition.

◆ auto_denoising

bool RecognitionSettings::auto_denoising = false

Enables the use of an additional neural network for the image before recognition. Useful for images with noise, spots, flares, gradients, or foreign elements.

◆ correct_skew

bool RecognitionSettings::correct_skew = true

Enabled (true) by default. Detects the image orientation and auto-rotates the image if needed.

◆ defects

unsigned int RecognitionSettings::defects = defect_type::ASPOSE_OCR_NONE

Determines which types of image defects to detect. Use case 1: defects = defect_type::ASPOSE_OCR_DETECT_DARK_IMAGES | defect_type::ASPOSE_OCR_DETECT_SALT_PEPPER_NOISE; Use case 2: defects = defect_type::ASPOSE_OCR_DETECT_ALL;
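Because defects is an unsigned int that combines defect_type values with bitwise OR, it can be built and inspected like any bitmask. The following is a minimal sketch, not the library's own code: the defect_type namespace and DefectSettingsSketch are local stand-ins so the snippet compiles without aspose_ocr.h, the numeric flag values are assumptions, and has_defect_flag and make_defect_settings are hypothetical helpers.

```cpp
#include <cassert>

// Local stand-in for the library's defect_type flags. The names appear in
// the reference above; the numeric values here are ASSUMED for illustration.
namespace defect_type {
    constexpr unsigned int ASPOSE_OCR_NONE                     = 0u;
    constexpr unsigned int ASPOSE_OCR_DETECT_SALT_PEPPER_NOISE = 1u << 0;
    constexpr unsigned int ASPOSE_OCR_DETECT_DARK_IMAGES       = 1u << 1;
}

// Stand-in that mirrors only the documented `defects` field and its default.
struct DefectSettingsSketch {
    unsigned int defects = defect_type::ASPOSE_OCR_NONE;
};

// Hypothetical helper: true if every bit of `flag` is set in `mask`.
bool has_defect_flag(unsigned int mask, unsigned int flag) {
    return flag != 0u && (mask & flag) == flag;
}

// Hypothetical helper: requests dark-image and salt-and-pepper-noise
// detection by OR-ing the two flags, as in "use case 1" above.
DefectSettingsSketch make_defect_settings() {
    DefectSettingsSketch settings;
    settings.defects = defect_type::ASPOSE_OCR_DETECT_DARK_IMAGES
                     | defect_type::ASPOSE_OCR_DETECT_SALT_PEPPER_NOISE;
    assert(settings.defects != defect_type::ASPOSE_OCR_NONE);
    return settings;
}
```

With the real header, the same OR expression would be assigned to RecognitionSettings::defects directly.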

◆ detect_areas_mode

detect_areas_mode_enum RecognitionSettings::detect_areas_mode = detect_areas_mode_enum::DOCUMENT

Selects the area detection mode that best matches the document type: document, photo, plain text, column, or image. Has no effect in asposeocr_recognize_receipt and aspose::ocr::recognize_receipt.

◆ filters

custom_preprocessing_filters RecognitionSettings::filters

Prepares the image for OCR by applying custom pre-processing filters. Up to 12 filters can be set. Example: RecognitionSettings settings; settings.filters.filter_1 = OCR_IMG_PREPROCESS_GRAYSCALE; settings.filters.filter_2 = OCR_IMG_PREPROCESS_SCALE(2); settings.filters.filter_3 = OCR_IMG_PREPROCESS_THRESHOLD(200);

◆ format

export_format RecognitionSettings::format = export_format::text

Result format: plain text or JSON-formatted text, stored in a wchar_t* buffer. Defaults to plain text. Supported formats: text, json.

◆ ignoredCharacters

const wchar_t* RecognitionSettings::ignoredCharacters = L""

L"" by default (no characters ignored). Sets a blacklist of characters to exclude from recognition.

◆ language_alphabet

language RecognitionSettings::language_alphabet = language::none

Multi-language by default. Language used for OCR. Supported languages: English (en), German (de), Portuguese (pt), Spanish (es), French (fr), Italian (it), Czech (cze), Danish (dan), Dutch (dum), Estonian (est), Finnish (fin), Latvian (lav), Lithuanian (lit), Norwegian (nor), Polish (pol), Romanian (rum), Serbo-Croatian (srp_hrv), Slovak (slk), Slovene (slv), Swedish (swe), Chinese (chi)

◆ lines_filtration

bool RecognitionSettings::lines_filtration = false

Disabled (false) by default. Allows recognizing text in tables (regions surrounded by lines).

When false, tables are not detected and lines are not removed, which improves performance; set to true to enable table detection and line removal.

◆ preprocess_area

rect* RecognitionSettings::preprocess_area = NULL

User-defined area to be pre-processed. Example: rect area = { 3, 50, 100, 100 };

◆ rectangles

rect* RecognitionSettings::rectangles = NULL

Areas to recognize. Example: rect rectangles[2] = { { 3, 50, 100, 70 }, { 3, 160, 100, 75 } };

◆ rectangles_size

size_t RecognitionSettings::rectangles_size = 0

Number of areas in the rectangles array.
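Since rectangles is a raw pointer, rectangles_size has to be kept in step with the array it points to. The sketch below shows one way to set both together. It makes several assumptions: rect and AreaSettingsSketch are local stand-ins so the snippet compiles without aspose_ocr.h, the rect field names and their order (x, y, width, height) are a guess, rectangles_size is taken to be an element count, and set_areas is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the library's rect type. The field names and order are
// ASSUMED; the reference only shows four-integer initializers.
struct rect {
    int x;
    int y;
    int width;
    int height;
};

// Stand-in that mirrors only the two documented fields and their defaults.
struct AreaSettingsSketch {
    rect*       rectangles      = nullptr;
    std::size_t rectangles_size = 0;
};

// Hypothetical helper: points the settings at a caller-owned array and
// records its element count in one step. The settings do not copy the
// array, so it must stay alive for as long as the settings are in use.
void set_areas(AreaSettingsSketch& settings, rect* areas, std::size_t count) {
    assert(areas != nullptr || count == 0);
    settings.rectangles      = areas;
    settings.rectangles_size = count;
}
```

A call such as set_areas(settings, areas, sizeof(areas) / sizeof(areas[0])) derives the count from the array itself rather than from a hand-written number.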

◆ save_format

file_format RecognitionSettings::save_format = file_format::txt

Result save format for the "page_save" method. Defaults to txt. Supported formats: file_format::docx, file_format::txt, file_format::pdf, file_format::xlsx, file_format::json, file_format::xml, file_format::rtf. Has no effect in other methods.

◆ skew

double RecognitionSettings::skew = 0

Rotates the image by the specified angle. Has no effect if rectangles are specified.

◆ threshold_value

int RecognitionSettings::threshold_value = 0

Sets a custom threshold value for image binarization. Valid range: 1 to 255.
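The documented range can be checked before a value is stored. This is a minimal sketch under stated assumptions: ThresholdSettingsSketch is a local stand-in for the single documented field, and set_threshold, kMinThreshold and kMaxThreshold are hypothetical names. How the library itself treats out-of-range values, or the default of 0, is not stated in this reference.

```cpp
// Bounds taken from the documented range for threshold_value.
constexpr int kMinThreshold = 1;
constexpr int kMaxThreshold = 255;

// Stand-in that mirrors only the documented field and its default.
struct ThresholdSettingsSketch {
    int threshold_value = 0;
};

// Hypothetical helper: stores `value` and returns true only when it lies
// in [1, 255]; otherwise leaves the settings unchanged and returns false.
bool set_threshold(ThresholdSettingsSketch& settings, int value) {
    if (value < kMinThreshold || value > kMaxThreshold) {
        return false;
    }
    settings.threshold_value = value;
    return true;
}
```

Rejecting the value instead of clamping it keeps a caller's mistake visible rather than silently changing the binarization result.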

◆ upscale_small_font

bool RecognitionSettings::upscale_small_font = false

Allows you to use additional algorithms specifically for small font recognition. Useful for images with small-size characters.