RecognitionSettings Struct Reference

Settings for the image recognition. Contains elements that allow customizing the recognition process.

Public Attributes

bool all_image = false
 Disabled (false) by default. When enabled, the whole image is recognized as a single area.
 
bool correct_skew = true
 Enabled (true) by default. Detects the image orientation and rotates the image automatically if needed.
 
bool lines_filtration = false
 Disabled (false) by default. When enabled, allows recognizing text in tables (regions surrounded by lines).
 
const wchar_t * alphabet = NULL
 Not set by default (all characters allowed). Whitelist of characters permitted during recognition.
 
const wchar_t * ignoredCharacters = L""
 L"" by default (no characters ignored). Blacklist of characters excluded from recognition.
 
export_format format = export_format::text
 Result format written to the wchar_t* buffer: plain text or JSON-formatted text. Defaults to plain text. Supported formats: text, json.
 
rect * rectangles = NULL
 Areas of the image to recognize. Example: rect rectangles[2] = { { 3, 50, 100, 70 }, { 3, 160, 100, 75 } };
 
size_t rectangles_size = 0
 Number of areas in the rectangles array.
 
double skew = 0
 Rotates the image by the specified angle. Has no effect if rectangles are specified.
 
language language_alphabet = language::none
 Multi-language by default. Language used for OCR. Supported languages: English (en), German (de), Portuguese (pt), Spanish (es), French (fr), Italian (it), Czech (cze), Danish (dan), Dutch (dum), Estonian (est), Finnish (fin), Latvian (lav), Lithuanian (lit), Norwegian (nor), Polish (pol), Romanian (rum), Serbo-Croatian (srp_hrv), Slovak (slk), Slovene (slv), Swedish (swe), Chinese (chi)
 
file_format save_format = file_format::txt
 File format used by the "page_save" method. Defaults to txt. Supported formats: docx, txt, pdf. Has no effect on other methods.
 
int threshold_value = 0
 Sets a custom threshold value for image binarization. Valid range: 1 to 255.
 

Detailed Description

Settings for the image recognition. Contains elements that allow customizing the recognition process.
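
As an orientation aid, the members listed above can be mirrored in a small stand-in declaration. This is a sketch under stated assumptions, not the library header: RecognitionSettingsSketch and make_table_digit_settings are hypothetical names, the rect field names are a guess (the page only shows a four-number initializer), and the language enum is reduced to the single enumerator that appears on this page.

```cpp
#include <cstddef>

// Local stand-ins that mirror the members documented on this page.
// They are NOT the library's definitions; real code includes the library header.
enum class export_format { text, json };
enum class file_format { docx, txt, pdf };
enum class language { none };  // other enumerators omitted: their spelling is not shown here
struct rect { int x, y, width, height; };  // field names are an assumption

struct RecognitionSettingsSketch {
    bool all_image = false;
    bool correct_skew = true;
    bool lines_filtration = false;
    const wchar_t* alphabet = nullptr;
    const wchar_t* ignoredCharacters = L"";
    export_format format = export_format::text;
    rect* rectangles = nullptr;
    std::size_t rectangles_size = 0;
    double skew = 0;
    language language_alphabet = language::none;
    file_format save_format = file_format::txt;
    int threshold_value = 0;
};

// Starts from the documented defaults and changes only the members that a
// digits-only scan containing tables might need.
inline RecognitionSettingsSketch make_table_digit_settings() {
    RecognitionSettingsSketch s;
    s.lines_filtration = true;       // recognize text inside tables
    s.alphabet = L"0123456789";      // restrict results to digits
    s.format = export_format::json;  // JSON-formatted result
    return s;
}
```

Starting from a default-constructed object and overriding individual members keeps every other setting at its documented default.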

Member Data Documentation

§ all_image

bool RecognitionSettings::all_image = false

Disabled (false) by default. When enabled, the whole image is recognized as a single area.

§ alphabet

const wchar_t* RecognitionSettings::alphabet = NULL

Not set by default (all characters allowed). Whitelist of characters permitted during recognition.

§ correct_skew

bool RecognitionSettings::correct_skew = true

Enabled (true) by default. Detects the image orientation and rotates the image automatically if needed.

§ format

export_format RecognitionSettings::format = export_format::text

Result format written to the wchar_t* buffer: plain text or JSON-formatted text. Defaults to plain text. Supported formats: text, json.

§ ignoredCharacters

const wchar_t* RecognitionSettings::ignoredCharacters = L""

L"" by default (no characters ignored). Blacklist of characters excluded from recognition.
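
The alphabet and ignoredCharacters members act as a whitelist and a blacklist. The sketch below models one plausible reading of how the two combine. It is an illustration of the documented meaning, not the library's implementation: is_candidate_symbol is a hypothetical helper, and the rule that the blacklist wins when a character appears in both strings is an assumption.

```cpp
#include <cwchar>

// Illustrative model only: returns true when `c` would remain a candidate
// symbol under the two settings as this page describes them.
// Assumptions: a NULL or empty `alphabet` means "no whitelist", and the
// blacklist takes precedence over the whitelist.
bool is_candidate_symbol(wchar_t c, const wchar_t* alphabet, const wchar_t* ignored) {
    if (c == L'\0') return false;  // wcschr would otherwise match the terminator
    if (ignored != nullptr && std::wcschr(ignored, c) != nullptr) return false;
    if (alphabet == nullptr || alphabet[0] == L'\0') return true;
    return std::wcschr(alphabet, c) != nullptr;
}
```

With alphabet set to L"0123456789" and ignoredCharacters left empty, only digits pass this check; adding L"7" to ignoredCharacters removes that digit again.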

§ language_alphabet

language RecognitionSettings::language_alphabet = language::none

Multi-language by default. Language used for OCR. Supported languages: English (en), German (de), Portuguese (pt), Spanish (es), French (fr), Italian (it), Czech (cze), Danish (dan), Dutch (dum), Estonian (est), Finnish (fin), Latvian (lav), Lithuanian (lit), Norwegian (nor), Polish (pol), Romanian (rum), Serbo-Croatian (srp_hrv), Slovak (slk), Slovene (slv), Swedish (swe), Chinese (chi)

§ lines_filtration

bool RecognitionSettings::lines_filtration = false

Disabled (false) by default. When enabled, allows recognizing text in tables (regions surrounded by lines).

Set to false to skip table detection and line removal, which improves performance; set to true to enable them. The default is false.

§ rectangles

rect* RecognitionSettings::rectangles = NULL

Areas of the image to recognize. Example: rect rectangles[2] = { { 3, 50, 100, 70 }, { 3, 160, 100, 75 } };

§ rectangles_size

size_t RecognitionSettings::rectangles_size = 0

Number of areas in the rectangles array.
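
Because rectangles is a raw pointer and rectangles_size carries the element count, the two members are presumably set together. The sketch below shows one way to keep them in sync. It uses local stand-in types rather than the library header; RecognitionSettingsSketch, set_areas, and example_areas are hypothetical names, and the rect field names are an assumption, since this page only shows a four-number initializer.

```cpp
#include <cstddef>

// Stand-in types for illustration only; the real definitions come from the
// library header. The field names of rect are an assumption.
struct rect { int x, y, width, height; };

struct RecognitionSettingsSketch {
    rect*       rectangles      = nullptr;  // documented default: NULL
    std::size_t rectangles_size = 0;        // documented default: 0
};

// Points the settings at a caller-owned array and records its length.
// The array must outlive every use of the settings object, because only
// the pointer is stored.
template <std::size_t N>
void set_areas(RecognitionSettingsSketch& settings, rect (&areas)[N]) {
    settings.rectangles      = areas;
    settings.rectangles_size = N;
}

// The same two areas as in the rectangles description.
static rect example_areas[2] = { { 3, 50, 100, 70 }, { 3, 160, 100, 75 } };
```

Deducing the count from the array type avoids a hand-written size that can drift out of date when areas are added or removed.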

§ save_format

file_format RecognitionSettings::save_format = file_format::txt

File format used by the "page_save" method. Defaults to txt. Supported formats: docx, txt, pdf. Has no effect on other methods.

§ skew

double RecognitionSettings::skew = 0

Rotates the image by the specified angle. Has no effect if rectangles are specified.

§ threshold_value

int RecognitionSettings::threshold_value = 0

Sets a custom threshold value for image binarization. Valid range: 1 to 255.
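
The default of 0 lies outside the documented 1 to 255 range, so a caller may want to check a value before assigning it. The helpers below are a minimal sketch with hypothetical names. They assume that 0 means "no custom threshold"; this page does not state that, and it does not say how the library treats out-of-range values.

```cpp
// Assumption: 0 (the documented default) means "no custom threshold",
// because the documented range for custom values starts at 1.
bool is_acceptable_threshold(int value) {
    return value == 0 || (value >= 1 && value <= 255);
}

// Clamps a requested custom threshold into the documented 1..255 range.
int clamp_threshold(int requested) {
    if (requested < 1) return 1;
    if (requested > 255) return 255;
    return requested;
}
```

A caller could assign clamp_threshold(requested) to threshold_value when a custom threshold is wanted, and leave the member at 0 otherwise.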