public final class TextDevice extends PageDevice
Represents a class for converting PDF document pages into text.
The example demonstrates how to extract text from the first page of a PDF document.
Document doc = new Document(inFile);
String extractedText;
ByteArrayOutputStream ms = new ByteArrayOutputStream();
try
{
// create text device
TextDevice device = new TextDevice();
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), ms);
// use the extracted text
extractedText = Encoding.getUnicode().getString(ms.toByteArray());
ms.close();
} catch (IOException e) {
e.printStackTrace();
}
The TextDevice object is used to extract text from a PDF page.
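The bytes that process(...) writes to the stream have to be decoded with the same encoding the device was created with. The following is a minimal sketch of that decoding step using only the Java standard library, so it runs without the Aspose library. It assumes that the default "Unicode" encoding corresponds to UTF-16LE; this page does not state that mapping, so treat it as an assumption. StreamDecodeSketch, fakeProcess and extractViaStream are hypothetical names, and fakeProcess merely stands in for device.process(page, ms).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamDecodeSketch {

    // Hypothetical stand-in for device.process(page, output):
    // writes the page text to the stream in the given encoding.
    public static void fakeProcess(String pageText, Charset encoding, OutputStream output)
            throws IOException {
        output.write(pageText.getBytes(encoding));
    }

    // Collects the bytes in memory and decodes them with the SAME charset
    // that was used to produce them.
    public static String extractViaStream(String pageText, Charset encoding) {
        // try-with-resources closes the stream even if writing fails
        try (ByteArrayOutputStream ms = new ByteArrayOutputStream()) {
            fakeProcess(pageText, encoding, ms);
            return new String(ms.toByteArray(), encoding);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the device's default "Unicode" encoding is UTF-16LE.
        Charset assumedDefault = StandardCharsets.UTF_16LE;
        System.out.println(extractViaStream("Hello, PDF", assumedDefault));
    }
}
```

Compared with the example above, the try-with-resources form removes the explicit ms.close() call and keeps the decoded text in scope as a return value.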
| Constructor and Description |
|---|
| TextDevice()<br>Initializes a new instance of the TextDevice with the Raw text formatting mode and Unicode text encoding. |
| TextDevice(Charset encoding)<br>Initializes a new instance of the TextDevice for the specified encoding. |
| TextDevice(TextEncodingInternal encoding)<br>Initializes a new instance of the TextDevice for the specified encoding. |
| TextDevice(TextExtractionOptions extractionOptions)<br>Initializes a new instance of the TextDevice with text extraction options. |
| TextDevice(TextExtractionOptions extractionOptions, Charset encoding)<br>Initializes a new instance of the TextDevice for the specified encoding with text extraction options. |
| TextDevice(TextExtractionOptions extractionOptions, TextEncodingInternal encoding)<br>Initializes a new instance of the TextDevice for the specified encoding with text extraction options. |
| Modifier and Type | Method and Description |
|---|---|
| Charset | getEncoding()<br>Gets encoding of extracted text. |
| TextEncodingInternal | getEncodingInternal()<br>Gets encoding of extracted text. |
| TextExtractionOptions | getExtractionOptions()<br>Gets text extraction options. |
| void | process(Page page, OutputStream output)<br>Convert page and save it as text stream. |
| void | processInternal(Page page, com.aspose.ms.System.IO.Stream output)<br>Convert page and save it as text stream. |
| void | setEncoding(Charset value)<br>Sets encoding of extracted text. |
| void | setEncodingInternal(TextEncodingInternal value)<br>Sets encoding of extracted text. |
| void | setExtractionOptions(TextExtractionOptions value)<br>Sets text extraction options. |
Inherited methods: process, process
public TextDevice(TextExtractionOptions extractionOptions)
Initializes a new instance of the TextDevice with text extraction options.
extractionOptions - Text extraction options.
public TextDevice()
Initializes a new instance of the TextDevice with the Raw text formatting mode and
Unicode text encoding.
public TextDevice(TextEncodingInternal encoding)
Initializes a new instance of the TextDevice for the specified encoding.
encoding - Encoding of extracted text.
public TextDevice(Charset encoding)
Initializes a new instance of the TextDevice for the specified encoding.
encoding - Encoding of extracted text.
public TextDevice(TextExtractionOptions extractionOptions, TextEncodingInternal encoding)
Initializes a new instance of the TextDevice for the specified encoding with text
extraction options.
extractionOptions - Text extraction options.
encoding - Encoding of extracted text.
public TextDevice(TextExtractionOptions extractionOptions, Charset encoding)
Initializes a new instance of the TextDevice for the specified encoding with text
extraction options.
extractionOptions - Text extraction options.
encoding - Encoding of extracted text.
public TextExtractionOptions getExtractionOptions()
Gets text extraction options.
The example demonstrates how to extract text in raw order.
Document doc = new Document(inFile);
String extractedText;
// create text device
TextDevice device = new TextDevice(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Raw));
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), outFile);
public void setExtractionOptions(TextExtractionOptions value)
Sets text extraction options.
value - TextExtractionOptions element
The example demonstrates how to extract text in raw order.
Document doc = new Document(inFile);
String extractedText;
// create text device
TextDevice device = new TextDevice(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Raw));
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), outFile);
public TextEncodingInternal getEncodingInternal()
Gets encoding of extracted text.
The example demonstrates how to represent extracted text in UTF-8 encoding.
Document doc = new Document(inFile);
String extractedText;
// create text device
TextDevice device = new TextDevice(java.nio.charset.Charset.forName("UTF-8"));
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), outFile);
public Charset getEncoding()
Gets encoding of extracted text.
The example demonstrates how to represent extracted text in UTF-8 encoding.
Document doc = new Document(inFile);
String extractedText;
// create text device
TextDevice device = new TextDevice(java.nio.charset.Charset.forName("UTF-8"));
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), outFile);
public void setEncodingInternal(TextEncodingInternal value)
Sets encoding of extracted text.
value - TextEncodingInternal element
The example demonstrates how to represent extracted text in UTF-8 encoding.
Document doc = new Document(inFile);
String extractedText;
// create text device
TextDevice device = new TextDevice(TextEncodingInternal.getUTF8());
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), outFile);
public void setEncoding(Charset value)
Sets encoding of extracted text.
value - Charset element
The example demonstrates how to represent extracted text in UTF-8 encoding.
Document doc = new Document(inFile);
String extractedText;
// create text device
TextDevice device = new TextDevice(java.nio.charset.Charset.forName("UTF-8"));
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), outFile);
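A Charset for UTF-8 can also be taken from the constant java.nio.charset.StandardCharsets.UTF_8, which avoids the name lookup done by Charset.forName("UTF-8"). The sketch below uses only the Java standard library (no Aspose types) to show that both expressions yield an equal Charset, and that bytes written as UTF-8 come back intact only when they are read with UTF-8. Utf8Sketch and its methods are hypothetical names introduced for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8Sketch {

    // True when the name-based lookup and the constant refer to the same charset.
    public static boolean sameCharset() {
        return Charset.forName("UTF-8").equals(StandardCharsets.UTF_8);
    }

    // Decodes raw bytes with the given charset.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "Resume" with accented e characters, written with escapes so the
        // source file's own encoding does not matter.
        String original = "R\u00e9sum\u00e9";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        System.out.println(sameCharset());
        // Matching charset: the text round-trips unchanged.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8).equals(original));
        // Mismatched charset: each accented character decodes as two characters.
        System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1).length());
    }
}
```

Whatever Charset is passed to the constructor or to setEncoding, the consumer of the output file or stream needs to use the same one.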
public void processInternal(Page page, com.aspose.ms.System.IO.Stream output)
Convert page and save it as text stream.
The example demonstrates how to extract text on the first PDF document page.
Document doc = new Document(inFile);
String extractedText;
ByteArrayOutputStream ms = new ByteArrayOutputStream();
// create text device
TextDevice device = new TextDevice();
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), ms);
// use the extracted text
extractedText = Encoding.getUnicode().getString(ms.toByteArray());
ms.close();
processInternal in class PageDevice
page - The page to convert.
output - Result stream.
public void process(Page page, OutputStream output)
Convert page and save it as text stream.
The example demonstrates how to extract text on the first PDF document page.
Document doc = new Document(inFile);
String extractedText;
ByteArrayOutputStream ms = new ByteArrayOutputStream();
// create text device
TextDevice device = new TextDevice();
// convert the page and save text to the stream
device.process(doc.getPages().get_Item(1), ms);
// use the extracted text
extractedText = Encoding.getUnicode().getString(ms.toByteArray());
ms.close();
process in class PageDevice
page - The page to convert.
output - Result stream.
Copyright © 2025 Aspose. All Rights Reserved.