PdfExtractor Class
Class for extracting images and text from a PDF document.
Inheritance Hierarchy

Namespace: Aspose.Pdf.Facades
Assembly: Aspose.PDF (in Aspose.PDF.dll) Version: 19.10
Syntax
public sealed class PdfExtractor : Facade

The PdfExtractor type exposes the following members.

Constructors
  Name    Description
Public methodPdfExtractor
Initializes a new PdfExtractor object.
Public methodPdfExtractor(Document)
Initializes a new PdfExtractor object based on the specified document.
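The two constructors suggest two ways to attach an extractor to a document. The following is a minimal sketch rather than a definitive usage pattern. It assumes a file named sample.pdf exists (a hypothetical path) and that Aspose.Pdf.Document can be opened from a file path.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Facades;

class ConstructorSketch
{
    static void Main()
    {
        // Path 1: create an empty extractor, then bind a file to it.
        PdfExtractor fromFile = new PdfExtractor();
        fromFile.BindPdf("sample.pdf"); // hypothetical input path
        fromFile.Close();

        // Path 2: wrap a Document that is already open.
        Document document = new Document("sample.pdf");
        PdfExtractor fromDocument = new PdfExtractor(document);
        fromDocument.Close(); // Close disposes the bound Document
    }
}
```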
Properties
  Name    Description
Public propertyDocument
Gets the document the facade is working on.
(Inherited from Facade.)
Public propertyCode exampleEndPage
Gets or sets the end page of the page range on which the extraction is performed.
Examples
C#
PdfExtractor ext = new PdfExtractor();
ext.BindPdf("sample.pdf");
ext.StartPage = 2;
ext.EndPage = 3;
ext.ExtractText();
Public propertyExtractImageMode
Sets the mode for the image extraction process.
Public propertyCode exampleExtractTextMode
Sets the mode for the text extraction result.
Public propertyIsBidi
True when the text contains Hebrew or Arabic symbols. This case needs special handling because string functions change their behaviour and process the text from right to left (except numbers and other non-text characters).
Public propertyPassword
Gets or sets the password of the input file.
Public propertyResolution
Gets or sets the resolution for extracted images. The default value is 150. Images extracted at a higher resolution are clearer, but a higher resolution also increases the time and memory needed to extract them. A resolution of 150 or 300 is usually enough to get a clear image.
Public propertyCode exampleStartPage
Gets or sets the start page of the page range on which the extraction is performed.
Examples
C#
PdfExtractor ext = new PdfExtractor();
ext.BindPdf("sample.pdf");
ext.StartPage = 2;
ext.EndPage = 5;
ext.ExtractText();
Public propertyTextSearchOptions
Gets or sets text search options.
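The page range and Resolution properties can be combined with the image methods listed on this page. The sketch below is illustrative only. The input path and output file names are hypothetical, and the .jpg extension is an assumption about the default output format.

```csharp
using Aspose.Pdf.Facades;

class ImageRangeSketch
{
    static void Main()
    {
        PdfExtractor extractor = new PdfExtractor();
        extractor.BindPdf("sample.pdf"); // hypothetical input path

        // Restrict the operation to pages 2 through 3.
        extractor.StartPage = 2;
        extractor.EndPage = 3;

        // Raise the resolution above the documented default of 150.
        extractor.Resolution = 300;

        // ExtractImage must be called before HasNextImage and GetNextImage.
        extractor.ExtractImage();

        int index = 1;
        while (extractor.HasNextImage())
        {
            // Output file name is an assumption made for this sketch.
            extractor.GetNextImage("image" + index + ".jpg");
            index++;
        }

        extractor.Close();
    }
}
```

Higher Resolution values cost more time and memory, so 300 is used here only to show that the property is set before ExtractImage is called.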
Methods
  Name    Description
Public methodCode exampleBindPdf(Stream)
Binds a PDF document from a stream.
(Overrides Facade.BindPdf(Stream).)
Public methodCode exampleBindPdf(String)
Binds the input PDF file.
(Overrides Facade.BindPdf(String).)
Public methodBindPdf(Document)
Initializes the facade.
(Inherited from Facade.)
Public methodClose
Disposes the Aspose.Pdf.Document bound to the facade.
(Inherited from Facade.)
Public methodDispose
Disposes the facade.
(Inherited from Facade.)
Public methodEquals
Determines whether the specified Object is equal to the current Object.
(Inherited from Object.)
Public methodExtractAttachment
Extracts attachments from a PDF document.
Public methodExtractAttachment(String)
Extracts an attachment from the PDF file by attachment name.
Public methodCode exampleExtractImage
Extracts images from the PDF file.
Public methodCode exampleExtractText
Extracts text from a PDF document using Unicode encoding.
Public methodCode exampleExtractText(Encoding)
Extracts text from a PDF document using the specified encoding.
Public methodCode exampleGetAttachment
Saves all attachment files to streams.
Public methodGetAttachment(String)
Stores the attachment in a file.
Public methodGetAttachmentInfo
Gets the list of attachments.
Public methodCode exampleGetAttachNames
Returns the list of attachment names in the PDF file. Note: ExtractAttachment must be called before using this method.
Public methodGetHashCode
Serves as a hash function for a particular type.
(Inherited from Object.)
Public methodGetNextImage(Stream)
Retrieves the next image from the PDF file and stores it in a stream.
Public methodCode exampleGetNextImage(String)
Retrieves the next image from the PDF document. Note: ExtractImage must be called before using this method.
Public methodGetNextImage(Stream, ImageFormat)
Retrieves the next image from the PDF file and stores it in a stream with the given image format.
Public methodGetNextImage(String, ImageFormat)
Retrieves the next image from the PDF document with the given image format. Note: ExtractImage must be called before using this method.
Public methodCode exampleGetNextPageText(Stream)
Saves one page's text to a stream.
Public methodCode exampleGetNextPageText(String)
Saves one page's text to a file.
Public methodGetText(Stream)
Public methodGetText(String)
Public methodGetText(Stream, Boolean)
Public methodGetType
Gets the type of the current instance.
(Inherited from Object.)
Public methodCode exampleHasNextImage
Checks whether more images are available in the PDF document. Note: ExtractImage must be called before using this method.
Public methodCode exampleHasNextPageText
Indicates whether more page text is available.
Public methodToString
Returns a string that represents the current object.
(Inherited from Object.)
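Several of the methods above are meant to be called in sequence: an Extract call first, then the matching HasNext, GetNext, or Get call. The following sketch shows one plausible ordering for page-by-page text and for attachment names. File names are hypothetical, and the element type returned by GetAttachNames is not stated on this page, so the loop uses var.

```csharp
using System;
using Aspose.Pdf.Facades;

class ExtractionSketch
{
    static void Main()
    {
        PdfExtractor extractor = new PdfExtractor();
        extractor.BindPdf("sample.pdf"); // hypothetical input path
        try
        {
            // Text: extract once, then save each page to its own file.
            extractor.ExtractText();
            int page = 1;
            while (extractor.HasNextPageText())
            {
                // Output file name is an assumption made for this sketch.
                extractor.GetNextPageText("page" + page + ".txt");
                page++;
            }

            // Attachments: ExtractAttachment must run before GetAttachNames.
            extractor.ExtractAttachment();
            foreach (var name in extractor.GetAttachNames())
            {
                Console.WriteLine(name);
            }
        }
        finally
        {
            // Release the bound document even if extraction fails.
            extractor.Close();
        }
    }
}
```

The try/finally block is a design choice for this sketch: it relies only on the Close method listed above to release the document.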
See Also