Aspose::Pdf::Text::ParagraphAbsorber Class Reference

Represents an absorber of page structure objects such as sections and paragraphs. Searches for sections and paragraphs of text and provides access to the rectangles and polygons that describe them in text coordinate space. Also searches for text segments and provides access to the search results via TextFragment collections grouped by structure element.

#include "ParagraphAbsorber.h"

Inherits System::Object.

Public Member Functions

ASPOSE_PDF_SHARED_API System::SharedPtr< System::Collections::Generic::List< System::SharedPtr< PageMarkup > > > get_PageMarkups () const
 Gets the collection of PageMarkup objects that were absorbed.

ASPOSE_PDF_SHARED_API int32_t get_SectionsSearchDepth () const
 Gets the number of sequential searches performed for finer structure elements. The default search depth is 3, meaning three searches for horizontally divided sections (headers, paragraphs, etc.) and three searches for vertically divided ones (columns).

ASPOSE_PDF_SHARED_API void set_SectionsSearchDepth (int32_t value)
 Sets the number of sequential searches performed for finer structure elements. The default search depth is 3, meaning three searches for horizontally divided sections (headers, paragraphs, etc.) and three searches for vertically divided ones (columns).

ASPOSE_PDF_SHARED_API bool get_IsMulticolumnParagraphsAllowed () const
 Gets a value indicating whether the starting text lines of the next section may be treated as a continuation of the last paragraph of the previous section.

ASPOSE_PDF_SHARED_API void set_IsMulticolumnParagraphsAllowed (bool value)
 Sets a value indicating whether the starting text lines of the next section may be treated as a continuation of the last paragraph of the previous section.

ASPOSE_PDF_SHARED_API System::SharedPtr< Aspose::Pdf::Text::ParagraphAbsorberOptions > get_ParagraphAbsorberOptions () const
 Gets the ParagraphAbsorberOptions.

ASPOSE_PDF_SHARED_API void set_ParagraphAbsorberOptions (System::SharedPtr< Aspose::Pdf::Text::ParagraphAbsorberOptions > value)
 Sets the ParagraphAbsorberOptions.

ASPOSE_PDF_SHARED_API ParagraphAbsorber ()
 Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page.

ASPOSE_PDF_SHARED_API ParagraphAbsorber (int32_t sectionsSearchDepth)
 Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page with the specified search depth.

ASPOSE_PDF_SHARED_API ParagraphAbsorber (System::SharedPtr< Aspose::Pdf::Text::ParagraphAbsorberOptions > paragraphAbsorberOptions)
 Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page with the specified options.

ASPOSE_PDF_SHARED_API ParagraphAbsorber (int32_t sectionsSearchDepth, System::SharedPtr< Aspose::Pdf::Text::ParagraphAbsorberOptions > paragraphAbsorberOptions)
 Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page with the specified search depth and options.

ASPOSE_PDF_SHARED_API void Visit (System::SharedPtr< Document > doc)
 Searches for sections and paragraphs in the specified Document.

ASPOSE_PDF_SHARED_API void Visit (System::SharedPtr< Page > page)
 Searches for sections and paragraphs on the specified Page.
 
- Public Member Functions inherited from System::Object
ASPOSECPP_SHARED_API Object ()
 Creates object. Initializes all internal data structures.

virtual ASPOSECPP_SHARED_API ~Object ()
 Destroys object. Frees all internal data structures.

ASPOSECPP_SHARED_API Object (Object const &x)
 Copy constructor. Copies nothing; it only initializes the new object and enables copy construction of subclasses.

Object & operator= (Object const &x)
 Assignment operator. Copies nothing; it only enables assignment of subclasses.

Object * SharedRefAdded ()
 Increments shared reference count. Shouldn't be called directly; instead, use smart pointers or ThisProtector.

int SharedRefRemovedSafe ()
 Decrements and returns shared reference count. Shouldn't be called directly; instead, use smart pointers or ThisProtector.

int RemovedSharedRefs (int count)
 Decreases shared reference count by specified value.

Detail::SmartPtrCounter * WeakRefAdded ()
 Increments weak reference count. Shouldn't be called directly; instead, use smart pointers or ThisProtector.

void WeakRefRemoved ()
 Decrements weak reference count. Shouldn't be called directly; instead, use smart pointers or ThisProtector.

Detail::SmartPtrCounter * GetCounter ()
 Gets reference counter data structure associated with the object.

int SharedCount () const
 Gets current value of shared reference counter.

ASPOSECPP_SHARED_API void Lock ()
 Implements C# lock() statement locking. Call directly or use LockContext sentry object.

ASPOSECPP_SHARED_API void Unlock ()
 Implements C# lock() statement unlocking. Call directly or use LockContext sentry object.

virtual ASPOSECPP_SHARED_API bool Equals (ptr obj)
 Compares objects using C# Object.Equals semantics.

virtual ASPOSECPP_SHARED_API int32_t GetHashCode () const
 Analog of C# Object.GetHashCode() method. Enables hashing of custom objects.

virtual ASPOSECPP_SHARED_API String ToString () const
 Analog of C# Object.ToString() method. Enables converting custom objects to string.

virtual ASPOSECPP_SHARED_API ptr MemberwiseClone () const
 Analog of C# Object.MemberwiseClone() method. Enables cloning custom types.

virtual ASPOSECPP_SHARED_API const TypeInfo & GetType () const
 Gets actual type of object. Analog of C# System.Object.GetType() call.

virtual ASPOSECPP_SHARED_API bool Is (const TypeInfo &targetType) const
 Checks whether the object is an instance of the type described by targetType. Analog of the C# 'is' operator.

virtual ASPOSECPP_SHARED_API void SetTemplateWeakPtr (uint32_t argument)
 Sets the n-th template argument to be a weak pointer (rather than shared). Allows switching pointers in containers to weak mode.

virtual ASPOSECPP_SHARED_API bool FastCast (const Details::FastRttiBase &helper, void **out_ptr) const
 For internal purposes only.

template<>
bool Equals (float const &objA, float const &objB)
 Emulates C#-style floating point comparison where two NaNs are considered equal even though according to IEC 60559:1989 NaN is not equal to any value, including NaN.

template<>
bool Equals (double const &objA, double const &objB)
 Emulates C#-style floating point comparison where two NaNs are considered equal even though according to IEC 60559:1989 NaN is not equal to any value, including NaN.

template<>
bool ReferenceEquals (String const &str, std::nullptr_t)
 Specialization of Object::ReferenceEquals for case of string and nullptr.

template<>
bool ReferenceEquals (String const &str1, String const &str2)
 Specialization of Object::ReferenceEquals for case of strings.
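
The C#-style floating-point comparison described for the two Equals specializations above can be illustrated with the standard library alone. This is a minimal sketch of the semantics, not the library's implementation; the function name EqualsTreatingNaNsAsEqual is a hypothetical helper introduced only for this example.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not part of System::Object): returns true when the
// two values compare equal with ==, or when both of them are NaN.
inline bool EqualsTreatingNaNsAsEqual(double a, double b) {
    return a == b || (std::isnan(a) && std::isnan(b));
}

// Convenience constant for experimenting with the helper.
inline double QuietNaN() {
    return std::numeric_limits<double>::quiet_NaN();
}
```

With the built-in operator, QuietNaN() == QuietNaN() is false, while the helper reports the two NaNs as equal and still reports a NaN and an ordinary number as different.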
 

Static Protected Member Functions

static System::ArrayPtr< int32_t > GetGridIndicesForRect (System::SharedPtr< Rectangle > rect, double scaleFactor)
 
static System::SharedPtr< Rectangle > GetRectFromGridByIndices (System::Details::ArrayView< int32_t > indices, double scaleFactor)
 
static int32_t GetGridIndexForX (double x, double scaleFactor)
 
static bool IsNullOrWhitespace (System::String str)
 
static bool IsCapitalLetter (char16_t c)
 

Additional Inherited Members

- Public Types inherited from System::Object
typedef SmartPtr< Object > ptr
 Alias for smart pointer type.

- Static Public Member Functions inherited from System::Object
static bool ReferenceEquals (ptr const &objA, ptr const &objB)
 Compares objects by reference.

template<typename T >
static std::enable_if<!IsSmartPtr< T >::value, bool >::type ReferenceEquals (T const &objA, T const &objB)
 Compares objects by reference.

template<typename T >
static std::enable_if<!IsSmartPtr< T >::value, bool >::type ReferenceEquals (T const &objA, std::nullptr_t)
 Reference-compares value type object with nullptr.

template<typename T1 , typename T2 >
static std::enable_if< IsSmartPtr< T1 >::value &&IsSmartPtr< T2 >::value, bool >::type Equals (T1 const &objA, T2 const &objB)
 Compares reference type objects in C# style.

template<typename T1 , typename T2 >
static std::enable_if<!IsSmartPtr< T1 >::value &&!IsSmartPtr< T2 >::value, bool >::type Equals (T1 const &objA, T2 const &objB)
 Compares value type objects in C# style.

static const TypeInfo & Type ()
 Implements C# typeof(System.Object) construct.
 

Detailed Description

Represents an absorber of page structure objects such as sections and paragraphs. Searches for sections and paragraphs of text and provides access to the rectangles and polygons that describe them in text coordinate space. Also searches for text segments and provides access to the search results via TextFragment collections grouped by structure element.

When the search is completed, the ParagraphAbsorber::PageMarkups collection contains PageMarkup objects that represent the page structure as collections of MarkupSection and MarkupParagraph. The TextFragment object provides access to the text and text properties of each search occurrence, and allows you to edit the text and change the text state (font, font size, color, etc.).
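
The call order described above (construct the absorber, call Visit, then read the page markups) can be sketched as follows. This is a self-contained illustration that deliberately does not use the Aspose.PDF headers: every type in the sketch namespace, the Page input shape, the sections and paragraphs fields, and the CollectParagraphTexts helper are hypothetical stand-ins that only mirror member names listed on this page. The stand-in Visit copies text that is already split, instead of performing any real structure search.

```cpp
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-ins: these are NOT the Aspose.PDF types, only minimal
// shapes that mirror the names used on this page.
struct MarkupParagraph { std::string text; };
struct MarkupSection   { std::vector<MarkupParagraph> paragraphs; };
struct PageMarkup      { std::vector<MarkupSection> sections; };

// Hypothetical input: a page whose sections and paragraphs are already known.
struct Page { std::vector<std::vector<std::string>> sections; };

class ParagraphAbsorber {
public:
    // Default depth of 3 matches the default documented on this page.
    explicit ParagraphAbsorber(std::int32_t sectionsSearchDepth = 3)
        : depth_(sectionsSearchDepth) {}

    std::int32_t get_SectionsSearchDepth() const { return depth_; }
    void set_SectionsSearchDepth(std::int32_t value) { depth_ = value; }

    // Appends one PageMarkup per visited page. The real structure search is
    // replaced here by copying the pre-split input.
    void Visit(const Page& page) {
        PageMarkup markup;
        for (const auto& sectionTexts : page.sections) {
            MarkupSection section;
            for (const auto& text : sectionTexts) {
                section.paragraphs.push_back(MarkupParagraph{text});
            }
            markup.sections.push_back(section);
        }
        markups_.push_back(markup);
    }

    const std::vector<PageMarkup>& get_PageMarkups() const { return markups_; }

private:
    std::int32_t depth_;
    std::vector<PageMarkup> markups_;
};

// Walks markups -> sections -> paragraphs and collects the paragraph text.
inline std::vector<std::string> CollectParagraphTexts(const ParagraphAbsorber& absorber) {
    std::vector<std::string> texts;
    for (const auto& markup : absorber.get_PageMarkups()) {
        for (const auto& section : markup.sections) {
            for (const auto& paragraph : section.paragraphs) {
                texts.push_back(paragraph.text);
            }
        }
    }
    return texts;
}

} // namespace sketch
```

With the real library the same three steps would presumably be applied to System::SharedPtr instances of Document or Page; the actual accessors of PageMarkup, MarkupSection and MarkupParagraph are documented on their own reference pages and are not assumed here.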

Constructor & Destructor Documentation

◆ ParagraphAbsorber() [1/4]

ASPOSE_PDF_SHARED_API Aspose::Pdf::Text::ParagraphAbsorber::ParagraphAbsorber ( )

Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page.

◆ ParagraphAbsorber() [2/4]

ASPOSE_PDF_SHARED_API Aspose::Pdf::Text::ParagraphAbsorber::ParagraphAbsorber ( int32_t sectionsSearchDepth)

Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page with the specified search depth.

See the ParagraphAbsorber::SectionsSearchDepth property for more details about this parameter.

Parameters
sectionsSearchDepth    Number of sequential searches to perform for finer structure elements.

◆ ParagraphAbsorber() [3/4]

ASPOSE_PDF_SHARED_API Aspose::Pdf::Text::ParagraphAbsorber::ParagraphAbsorber ( System::SharedPtr< Aspose::Pdf::Text::ParagraphAbsorberOptions > paragraphAbsorberOptions)

Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page with the specified options.

Parameters
paragraphAbsorberOptions    The ParagraphAbsorberOptions.

◆ ParagraphAbsorber() [4/4]

ASPOSE_PDF_SHARED_API Aspose::Pdf::Text::ParagraphAbsorber::ParagraphAbsorber ( int32_t sectionsSearchDepth,
System::SharedPtr< Aspose::Pdf::Text::ParagraphAbsorberOptions > paragraphAbsorberOptions
)

Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page with the specified search depth and options.

Parameters
sectionsSearchDepth         Number of sequential searches to perform for finer structure elements.
paragraphAbsorberOptions    The ParagraphAbsorberOptions.

Member Function Documentation

◆ get_IsMulticolumnParagraphsAllowed()

ASPOSE_PDF_SHARED_API bool Aspose::Pdf::Text::ParagraphAbsorber::get_IsMulticolumnParagraphsAllowed ( ) const

Gets a value indicating whether the starting text lines of the next section may be treated as a continuation of the last paragraph of the previous section.

◆ get_PageMarkups()

ASPOSE_PDF_SHARED_API System::SharedPtr<System::Collections::Generic::List<System::SharedPtr<PageMarkup> > > Aspose::Pdf::Text::ParagraphAbsorber::get_PageMarkups ( ) const

Gets the collection of PageMarkup objects that were absorbed.

◆ get_ParagraphAbsorberOptions()

ASPOSE_PDF_SHARED_API System::SharedPtr<Aspose::Pdf::Text::ParagraphAbsorberOptions> Aspose::Pdf::Text::ParagraphAbsorber::get_ParagraphAbsorberOptions ( ) const

Gets the ParagraphAbsorberOptions.

◆ get_SectionsSearchDepth()

ASPOSE_PDF_SHARED_API int32_t Aspose::Pdf::Text::ParagraphAbsorber::get_SectionsSearchDepth ( ) const

Gets the number of sequential searches performed for finer structure elements. The default search depth is 3, meaning three searches for horizontally divided sections (headers, paragraphs, etc.) and three searches for vertically divided ones (columns).

Increasing this value may slightly reduce performance with no visible change in the search result. Decreasing it may lead to incorrect detection of paragraphs within sections. Setting a value below the default is not recommended unless you only want the 'rough' elements of the page structure.
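
To make the trade-off above more concrete, here is a toy sketch of how a pass limit can bound the granularity of a repeated splitting search. It is not Aspose.PDF's algorithm, which this page does not describe: the one-dimensional model (line positions split at their widest gap), the functions SplitOnce and SplitByDepth, and the minGap threshold are all assumptions made only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a group is a sorted list of line positions.
using Group = std::vector<double>;

// One search pass: split every group once at its widest gap, provided that
// gap is at least minGap. Groups without such a gap are kept unchanged.
inline std::vector<Group> SplitOnce(const std::vector<Group>& groups, double minGap) {
    std::vector<Group> result;
    for (const Group& g : groups) {
        std::size_t cut = 0;  // index where the second half starts; 0 means "no cut"
        double widest = 0.0;
        for (std::size_t i = 1; i < g.size(); ++i) {
            const double gap = g[i] - g[i - 1];
            if (gap >= minGap && gap > widest) {
                widest = gap;
                cut = i;
            }
        }
        if (cut == 0) {
            result.push_back(g);
            continue;
        }
        const auto middle = g.begin() + static_cast<std::ptrdiff_t>(cut);
        result.emplace_back(g.begin(), middle);
        result.emplace_back(middle, g.end());
    }
    return result;
}

// Runs `depth` sequential passes. Extra passes cost time but change nothing
// once no group contains a gap of at least minGap.
inline std::vector<Group> SplitByDepth(Group positions, int depth, double minGap) {
    assert(depth >= 0);
    std::sort(positions.begin(), positions.end());
    std::vector<Group> groups{positions};
    for (int pass = 0; pass < depth; ++pass) {
        groups = SplitOnce(groups, minGap);
    }
    return groups;
}
```

For the positions 0, 1, 2, 10, 11, 30, 31, 32 with minGap 5, one pass yields two rough groups, two passes yield three, and any further passes leave the result unchanged. In this toy model that mirrors the note above: too small a depth gives only rough elements, while a larger depth adds work without changing the outcome.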

◆ GetGridIndexForX()

static int32_t Aspose::Pdf::Text::ParagraphAbsorber::GetGridIndexForX ( double x,
double scaleFactor
)
static protected

◆ GetGridIndicesForRect()

static System::ArrayPtr<int32_t> Aspose::Pdf::Text::ParagraphAbsorber::GetGridIndicesForRect ( System::SharedPtr< Rectangle > rect,
double scaleFactor
)
static protected

◆ GetRectFromGridByIndices()

static System::SharedPtr<Rectangle> Aspose::Pdf::Text::ParagraphAbsorber::GetRectFromGridByIndices ( System::Details::ArrayView< int32_t > indices,
double scaleFactor
)
static protected

◆ IsCapitalLetter()

static bool Aspose::Pdf::Text::ParagraphAbsorber::IsCapitalLetter ( char16_t c)
static protected

◆ IsNullOrWhitespace()

static bool Aspose::Pdf::Text::ParagraphAbsorber::IsNullOrWhitespace ( System::String str)
static protected

◆ set_IsMulticolumnParagraphsAllowed()

ASPOSE_PDF_SHARED_API void Aspose::Pdf::Text::ParagraphAbsorber::set_IsMulticolumnParagraphsAllowed ( bool  value)

Sets a value indicating whether the starting text lines of the next section may be treated as a continuation of the last paragraph of the previous section.

◆ set_ParagraphAbsorberOptions()

ASPOSE_PDF_SHARED_API void Aspose::Pdf::Text::ParagraphAbsorber::set_ParagraphAbsorberOptions ( System::SharedPtr< Aspose::Pdf::Text::ParagraphAbsorberOptions > value)

Sets the ParagraphAbsorberOptions.

◆ set_SectionsSearchDepth()

ASPOSE_PDF_SHARED_API void Aspose::Pdf::Text::ParagraphAbsorber::set_SectionsSearchDepth ( int32_t  value)

Sets the number of sequential searches performed for finer structure elements. The default search depth is 3, meaning three searches for horizontally divided sections (headers, paragraphs, etc.) and three searches for vertically divided ones (columns).

Increasing this value may slightly reduce performance with no visible change in the search result. Decreasing it may lead to incorrect detection of paragraphs within sections. Setting a value below the default is not recommended unless you only want the 'rough' elements of the page structure.

◆ Visit() [1/2]

ASPOSE_PDF_SHARED_API void Aspose::Pdf::Text::ParagraphAbsorber::Visit ( System::SharedPtr< Document > doc)

Searches for sections and paragraphs in the specified Document.

Parameters
doc    PDF document object.

◆ Visit() [2/2]

ASPOSE_PDF_SHARED_API void Aspose::Pdf::Text::ParagraphAbsorber::Visit ( System::SharedPtr< Page > page)

Searches for sections and paragraphs on the specified Page.

Parameters
page    PDF document page object.