TextDevice class

Represents a device that converts PDF document pages into text.

public sealed class TextDevice : PageDevice

Constructors

Name                                          Description
TextDevice()                                  Initializes a new instance of the TextDevice with the Raw text formatting mode and Unicode text encoding.
TextDevice(Encoding)                          Initializes a new instance of the TextDevice for the specified encoding.
TextDevice(TextExtractionOptions)             Initializes a new instance of the TextDevice with text extraction options.
TextDevice(TextExtractionOptions, Encoding)   Initializes a new instance of the TextDevice for the specified encoding with text extraction options.

Properties

Name                             Description
Encoding { get; set; }           Gets or sets the encoding of the extracted text.
ExtractionOptions { get; set; }  Gets or sets the text extraction options.

Methods

Name                             Description
override Process(Page, Stream)   Converts the page and saves it as a text stream.
Process(Page, string)            Performs the device operation on the given page and saves the result to a file.

Remarks

The TextDevice object is used to extract text from a PDF page.

Examples

The example demonstrates how to extract text from the first page of a PDF document.

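using System.IO;
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Devices;

// inFile is the path to the source PDF document
Document doc = new Document(inFile);
string extractedText;

using (MemoryStream ms = new MemoryStream())
{
    // create a text device (Raw formatting mode and Unicode encoding by default)
    TextDevice device = new TextDevice();

    // convert the first page (page numbering starts at 1) and save its text to the stream
    device.Process(doc.Pages[1], ms);

    // decode the stream contents with the same encoding the device wrote with
    extractedText = Encoding.Unicode.GetString(ms.ToArray());
}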
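
The other constructor and Process overloads listed above can be combined in a similar way. The following is a minimal sketch, not a definitive implementation: it assumes that TextExtractionOptions lives in the Aspose.Pdf.Text namespace and accepts a TextExtractionOptions.TextFormattingMode value in its constructor, and the file names input.pdf and page1.txt are hypothetical placeholders.

```csharp
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Devices;
using Aspose.Pdf.Text;

public static class TextDeviceFileExample
{
    public static void Main()
    {
        // Hypothetical paths, used only for illustration.
        string inFile = "input.pdf";
        string outFile = "page1.txt";

        using (Document doc = new Document(inFile))
        {
            // Assumption: the options constructor takes a formatting mode,
            // and Pure is one of the available modes.
            TextExtractionOptions options =
                new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Pure);

            // Use the TextDevice(TextExtractionOptions, Encoding) overload
            // so the output is written as UTF-8 instead of the default Unicode.
            TextDevice device = new TextDevice(options, Encoding.UTF8);

            // Use the Process(Page, string) overload to write the text
            // of the first page (pages are numbered from 1) directly to a file.
            device.Process(doc.Pages[1], outFile);
        }
    }
}
```

Writing straight to a file avoids the intermediate MemoryStream when the extracted text is not needed in memory. When reading the file back, decode it with the same encoding that was passed to the device.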
