public class FieldStart
A complete field in a Microsoft Word document is a complex structure consisting of a field start character, field code, field separator character, field result and field end character. Some fields only have field start, field code and field end.
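The structure described above can be illustrated on plain text, without a document. The following is a minimal sketch, not part of the Aspose.Words API: the class `FieldTextSketch` and its methods are hypothetical, and it assumes that the field start, separator and end appear in text as the control characters U+0013, U+0014 and U+0015. It handles a single, non-nested field only.

```java
import java.util.Optional;

/** Hypothetical helper (not an Aspose.Words class): splits the text of one non-nested field. */
final class FieldTextSketch {
    // Assumed control characters that mark the field start, separator and end in text.
    static final char FIELD_START = '\u0013';
    static final char FIELD_SEPARATOR = '\u0014';
    static final char FIELD_END = '\u0015';

    private FieldTextSketch() {
    }

    /** Returns the text between the field start and the separator, or the field end if there is no separator. */
    static String fieldCode(String fieldText) {
        int start = fieldText.indexOf(FIELD_START);
        int end = fieldText.indexOf(FIELD_END, start + 1);
        if (start < 0 || end < 0) {
            throw new IllegalArgumentException("Text does not contain a complete field");
        }
        int separator = fieldText.indexOf(FIELD_SEPARATOR, start + 1);
        boolean hasSeparator = separator >= 0 && separator < end;
        return fieldText.substring(start + 1, hasSeparator ? separator : end);
    }

    /** Returns the text between the separator and the field end, or empty if the field has no separator. */
    static Optional<String> fieldResult(String fieldText) {
        int start = fieldText.indexOf(FIELD_START);
        int end = fieldText.indexOf(FIELD_END, start + 1);
        if (start < 0 || end < 0) {
            throw new IllegalArgumentException("Text does not contain a complete field");
        }
        int separator = fieldText.indexOf(FIELD_SEPARATOR, start + 1);
        if (separator < 0 || separator > end) {
            return Optional.empty();
        }
        return Optional.of(fieldText.substring(separator + 1, end));
    }

    public static void main(String[] args) {
        // A field with all five parts: start, code, separator, result, end.
        String page = FIELD_START + " PAGE " + FIELD_SEPARATOR + "3" + FIELD_END;
        System.out.println("Code: '" + fieldCode(page) + "', result: " + fieldResult(page));
    }
}
```

A field that has only a start, a code and an end yields an empty `Optional` from `fieldResult`, which mirrors the null check on `getSeparator()` in the example below.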
To easily insert a new field into a document, use the
Example:
Shows how to work with a document's field collection.

    public void fieldCollection() throws Exception {
        // Open a document that has fields.
        Document doc = new Document(getMyDir() + "Document.ContainsFields.docx");

        // Get the collection that contains all the fields in the document.
        FieldCollection fields = doc.getRange().getFields();
        Assert.assertEquals(fields.getCount(), 6);

        // Iterate over the field collection and print the contents and type of every field
        // using a custom visitor implementation.
        FieldVisitor fieldVisitor = new FieldVisitor();

        Iterator<Field> fieldEnumerator = fields.iterator();
        while (fieldEnumerator.hasNext()) {
            // Call next() exactly once per iteration so that no field is skipped.
            Field currentField = fieldEnumerator.next();
            if (currentField != null) {
                currentField.getStart().accept(fieldVisitor);
                // Fields without a result have no separator node.
                if (currentField.getSeparator() != null) {
                    currentField.getSeparator().accept(fieldVisitor);
                }
                currentField.getEnd().accept(fieldVisitor);
            } else {
                System.out.println("There are no fields in the document.");
            }
        }

        System.out.println(fieldVisitor.getText());

        // Get a field to remove itself.
        fields.get(0).remove();
        Assert.assertEquals(fields.getCount(), 5);

        // Remove a field by reference.
        Field lastField = fields.get(3);
        fields.remove(lastField);
        Assert.assertEquals(fields.getCount(), 4);

        // Remove a field by index.
        fields.removeAt(2);
        Assert.assertEquals(fields.getCount(), 3);

        // Remove all fields from the document.
        fields.clear();
        Assert.assertEquals(fields.getCount(), 0);
    }

    /**
     * Document visitor implementation that prints field info.
     */
    public static class FieldVisitor extends DocumentVisitor {
        public FieldVisitor() {
            mBuilder = new StringBuilder();
        }

        /**
         * Gets the plain text of the document that was accumulated by the visitor.
         */
        public String getText() {
            return mBuilder.toString();
        }

        /**
         * Called when a FieldStart node is encountered in the document.
         */
        public int visitFieldStart(final FieldStart fieldStart) {
            mBuilder.append("Found field: " + fieldStart.getFieldType() + "\r\n");
            mBuilder.append("\tField code: " + fieldStart.getField().getFieldCode() + "\r\n");
            mBuilder.append("\tDisplayed as: " + fieldStart.getField().getResult() + "\r\n");

            return VisitorAction.CONTINUE;
        }

        /**
         * Called when a FieldSeparator node is encountered in the document.
         */
        public int visitFieldSeparator(final FieldSeparator fieldSeparator) {
            mBuilder.append("\tFound separator: " + fieldSeparator.getText() + "\r\n");

            return VisitorAction.CONTINUE;
        }

        /**
         * Called when a FieldEnd node is encountered in the document.
         */
        public int visitFieldEnd(final FieldEnd fieldEnd) {
            mBuilder.append("End of field: " + fieldEnd.getFieldType() + "\r\n");

            return VisitorAction.CONTINUE;
        }

        private final StringBuilder mBuilder;
    }
Example:
Finds all hyperlinks in a Word document and changes their URL and display name.

    import com.aspose.words.*;
    import org.testng.annotations.Test;

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    /**
     * Shows how to replace hyperlinks in a Word document.
     */
    public class ExReplaceHyperlinks extends ApiExampleBase {
        /**
         * Finds all hyperlinks in a Word document and changes their URL and display name.
         */
        @Test
        public void replaceHyperlinks() throws Exception {
            // Specify your document name here.
            Document doc = new Document(getMyDir() + "ReplaceHyperlinks.doc");

            // Hyperlinks in a Word document are fields, so select all field start nodes to find them.
            NodeList fieldStarts = doc.selectNodes("//FieldStart");
            for (FieldStart fieldStart : (Iterable<FieldStart>) fieldStarts) {
                if (fieldStart.getFieldType() == FieldType.FIELD_HYPERLINK) {
                    // The field is a hyperlink field; use the "facade" class to work with it.
                    Hyperlink hyperlink = new Hyperlink(fieldStart);

                    // Some hyperlinks can be local (links to bookmarks inside the document); ignore these.
                    if (hyperlink.isLocal()) continue;

                    // The Hyperlink class makes it easy to set the target URL and the display name.
                    hyperlink.setTarget(NEW_URL);
                    hyperlink.setName(NEW_NAME);
                }
            }

            doc.save(getArtifactsDir() + "ReplaceHyperlinks.doc");
        }

        private static final String NEW_URL = "http://www.aspose.com";
        private static final String NEW_NAME = "Aspose - The .NET & Java Component Publisher";
    }

    /**
     * This "facade" class makes it easier to work with a hyperlink field in a Word document.
     * <p>
     * A hyperlink is represented by a HYPERLINK field in a Word document. A field in Aspose.Words
     * consists of several nodes, and it might be difficult to work with all those nodes directly.
     * Note that this is a simple implementation; it works only if the hyperlink code and name
     * each consist of one Run only.
     * <p>
     * [FieldStart][Run - field code][FieldSeparator][Run - field result][FieldEnd]
     * <p>
     * The field code contains a String in one of these formats:
     * HYPERLINK "url"
     * HYPERLINK \l "bookmark name"
     * <p>
     * The field result contains text that is displayed to the user.
     */
    class Hyperlink {
        Hyperlink(final FieldStart fieldStart) throws Exception {
            if (fieldStart == null) {
                throw new IllegalArgumentException("fieldStart");
            }
            if (fieldStart.getFieldType() != FieldType.FIELD_HYPERLINK) {
                throw new IllegalArgumentException("Field start type must be FieldHyperlink.");
            }

            mFieldStart = fieldStart;

            // Find the field separator node.
            mFieldSeparator = findNextSibling(mFieldStart, NodeType.FIELD_SEPARATOR);
            if (mFieldSeparator == null) {
                throw new IllegalStateException("Cannot find field separator.");
            }

            // Find the field end node. Normally field end will always be found, but in the example document
            // there happens to be a paragraph break included in the hyperlink and this puts the field end
            // in the next paragraph. It will be much more complicated to handle fields which span several
            // paragraphs correctly, but in this case allowing field end to be null is enough for our purposes.
            mFieldEnd = findNextSibling(mFieldSeparator, NodeType.FIELD_END);

            // Field code looks something like [ HYPERLINK "http:\\www.myurl.com" ], but it can consist of several runs.
            String fieldCode = getTextSameParent(mFieldStart.getNextSibling(), mFieldSeparator);
            Matcher matcher = G_REGEX.matcher(fieldCode.trim());
            if (!matcher.find()) {
                throw new IllegalStateException("Cannot parse the hyperlink field code.");
            }

            // The link is local if \l is present in the field code.
            mIsLocal = (matcher.group(1) != null) && (matcher.group(1).length() > 0);
            mTarget = matcher.group(2);
        }

        /**
         * Gets or sets the display name of the hyperlink.
         */
        String getName() throws Exception {
            return getTextSameParent(mFieldSeparator, mFieldEnd);
        }

        void setName(final String value) throws Exception {
            // The hyperlink display name is stored in the field result, which is a Run
            // node between the field separator and the field end.
            Run fieldResult = (Run) mFieldSeparator.getNextSibling();
            fieldResult.setText(value);

            // But sometimes the field result can consist of more than one run; delete these runs.
            removeSameParent(fieldResult.getNextSibling(), mFieldEnd);
        }

        /**
         * Gets or sets the target URL or bookmark name of the hyperlink.
         */
        String getTarget() {
            return mTarget;
        }

        void setTarget(final String value) throws Exception {
            mTarget = value;
            updateFieldCode();
        }

        /**
         * True if the hyperlink's target is a bookmark inside the document. False if the hyperlink is a URL.
         */
        boolean isLocal() {
            return mIsLocal;
        }

        void isLocal(final boolean value) throws Exception {
            mIsLocal = value;
            updateFieldCode();
        }

        private void updateFieldCode() throws Exception {
            // Field code is stored in a Run node between field start and field separator.
            Run fieldCode = (Run) mFieldStart.getNextSibling();
            fieldCode.setText(java.text.MessageFormat.format("HYPERLINK {0}\"{1}\"", ((mIsLocal) ? "\\l " : ""), mTarget));

            // But sometimes the field code can consist of more than one run; delete these runs.
            removeSameParent(fieldCode.getNextSibling(), mFieldSeparator);
        }

        /**
         * Goes through siblings starting from the start node until it finds a node of the specified type or null.
         */
        private static Node findNextSibling(final Node startNode, final int nodeType) {
            for (Node node = startNode; node != null; node = node.getNextSibling()) {
                if (node.getNodeType() == nodeType) return node;
            }

            return null;
        }

        /**
         * Retrieves text from start up to but not including the end node.
         * If the end node is null, text is collected up to the last sibling.
         */
        private static String getTextSameParent(final Node startNode, final Node endNode) {
            if ((endNode != null) && (startNode.getParentNode() != endNode.getParentNode())) {
                throw new IllegalArgumentException("Start and end nodes are expected to have the same parent.");
            }

            StringBuilder builder = new StringBuilder();
            // Stop at the end node or when the siblings run out, so a null end node cannot cause a NullPointerException.
            for (Node child = startNode; (child != null) && (child != endNode); child = child.getNextSibling()) {
                builder.append(child.getText());
            }

            return builder.toString();
        }

        /**
         * Removes nodes from start up to but not including the end node.
         * Start and end are assumed to have the same parent.
         */
        private static void removeSameParent(final Node startNode, final Node endNode) {
            if ((endNode != null) && (startNode.getParentNode() != endNode.getParentNode())) {
                throw new IllegalArgumentException("Start and end nodes are expected to have the same parent.");
            }

            Node curChild = startNode;
            while ((curChild != null) && (curChild != endNode)) {
                Node nextChild = curChild.getNextSibling();
                curChild.remove();
                curChild = nextChild;
            }
        }

        private final Node mFieldStart;
        private final Node mFieldSeparator;
        private final Node mFieldEnd;
        private boolean mIsLocal;
        private String mTarget;

        /**
         * Matches a hyperlink field code and captures the optional \l switch (group 1) and the target (group 2).
         */
        private static final Pattern G_REGEX = Pattern.compile(
                "\\S+" +            // one or more non-spaces: HYPERLINK or the equivalent word in other languages
                "\\s+" +            // one or more spaces
                "(?:\"\"\\s+)?" +   // non-capturing optional "" and one or more spaces, found in one of the customers' files
                "(\\\\l\\s+)?" +    // optional \l flag followed by one or more spaces
                "\"" +              // opening quotation mark
                "([^\"]+)" +        // one or more characters except a quotation mark (the hyperlink target)
                "\""                // closing quotation mark
        );
    }
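The two field code formats named in the example above, `HYPERLINK "url"` and `HYPERLINK \l "bookmark name"`, can be parsed and rebuilt with `java.util.regex` alone. The sketch below is an illustration only: the class `HyperlinkCodeSketch` is hypothetical, it uses a simplified pattern of its own rather than the example's `G_REGEX`, and it assumes the keyword is always the literal word HYPERLINK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value class (not an Aspose.Words type) for a parsed HYPERLINK field code. */
final class HyperlinkCodeSketch {
    // The keyword, an optional \l switch, then the quoted target.
    private static final Pattern CODE =
            Pattern.compile("^\\s*HYPERLINK\\s+(\\\\l\\s+)?\"([^\"]+)\"");

    final boolean local;
    final String target;

    private HyperlinkCodeSketch(boolean local, String target) {
        this.local = local;
        this.target = target;
    }

    /** Parses a field code; throws if it does not match either supported format. */
    static HyperlinkCodeSketch parse(String fieldCode) {
        Matcher matcher = CODE.matcher(fieldCode);
        if (!matcher.find()) {
            throw new IllegalArgumentException("Not a HYPERLINK field code: " + fieldCode);
        }
        // Group 1 is present only when the \l switch was found.
        return new HyperlinkCodeSketch(matcher.group(1) != null, matcher.group(2));
    }

    /** Rebuilds the field code text from the parsed parts. */
    String toFieldCode() {
        return "HYPERLINK " + (local ? "\\l " : "") + "\"" + target + "\"";
    }

    public static void main(String[] args) {
        HyperlinkCodeSketch web = parse(" HYPERLINK \"http://www.aspose.com\" ");
        HyperlinkCodeSketch bookmark = parse("HYPERLINK \\l \"Bookmark1\"");
        System.out.println(web.local + " -> " + web.target);
        System.out.println(bookmark.local + " -> " + bookmark.toFieldCode());
    }
}
```

Checking the result of `find()` before reading the groups gives a clear error for a field code that does not match either format.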
Property Getters/Setters Summary

Return type | Member | Description
---|---|---
DocumentBase | getDocument() | Gets the document to which this node belongs.
int | getFieldType() | Returns the type of the field. The value is a FieldType integer constant.
Font | getFont() | Provides access to the font formatting of this object.
boolean | isComposite() | Returns true if this node can contain other nodes.
boolean | isDeleteRevision() | Returns true if this object was deleted in Microsoft Word while change tracking was enabled.
boolean / void | isDirty() / isDirty(boolean value) | Gets or sets whether the current result of the field is no longer correct (stale) due to other modifications made to the document.
boolean | isFormatRevision() | Returns true if formatting of the object was changed in Microsoft Word while change tracking was enabled.
boolean | isInsertRevision() | Returns true if this object was inserted in Microsoft Word while change tracking was enabled.
boolean / void | isLocked() / isLocked(boolean value) | Gets or sets whether the parent field is locked (should not recalculate its result).
boolean | isMoveFromRevision() | Returns true if this object was moved (deleted) in Microsoft Word while change tracking was enabled.
boolean | isMoveToRevision() | Returns true if this object was moved (inserted) in Microsoft Word while change tracking was enabled.
Node | getNextSibling() | Gets the node immediately following this node.
int | getNodeType() | Returns the node type of this node.
CompositeNode | getParentNode() | Gets the immediate parent of this node.
Paragraph | getParentParagraph() | Retrieves the parent paragraph of this node.
Node | getPreviousSibling() | Gets the node immediately preceding this node.
Range | getRange() | Returns a Range object that represents the portion of a document that is contained in this node.
Method Summary

Return type | Method | Description
---|---|---
boolean | accept(DocumentVisitor visitor) | Accepts a visitor.
Node | deepClone(boolean isCloneChildren) |
CompositeNode | getAncestor(int ancestorType) | Gets the first ancestor of the specified node type.
CompositeNode | getAncestor(java.lang.Class ancestorType) | Gets the first ancestor of the specified object type.
Field | getField() | Returns a field for the field char.
java.lang.String | getText() | Gets the special character that this node represents.
Node | nextPreOrder(Node rootNode) | Gets the next node according to the pre-order tree traversal algorithm.
Node | previousPreOrder(Node rootNode) | Gets the previous node according to the pre-order tree traversal algorithm.
void | remove() | Removes itself from the parent.
java.lang.String | toString(SaveOptions saveOptions) | Exports the content of the node into a string using the specified save options.
java.lang.String | toString(int saveFormat) | Exports the content of the node into a string in the specified format.
public DocumentBase getDocument()
The node always belongs to a document even if it has just been created and not yet added to the tree, or if it has been removed from the tree.
Example:
Shows that when you create any node, it requires a document that will own the node.

    // Create a new empty document.
    Document doc = new Document();

    // Creating a new node of any type requires a document passed into the constructor.
    Paragraph para = new Paragraph(doc);

    // The new paragraph node does not yet have a parent.
    System.out.println("Paragraph has no parent node: " + (para.getParentNode() == null));

    // But the paragraph node knows its document.
    System.out.println("Both nodes' documents are the same: " + (para.getDocument() == doc));

    // The fact that a node always belongs to a document allows us to access and modify
    // properties that reference document-wide data such as styles or lists.
    para.getParagraphFormat().setStyleName("Heading 1");

    // Now add the paragraph to the main text of the first section.
    doc.getFirstSection().getBody().appendChild(para);

    // The paragraph node is now a child of the Body node.
    System.out.println("Paragraph has a parent node: " + (para.getParentNode() != null));
public int getFieldType()
Example:
Demonstrates how to retrieve the field class from an existing FieldStart node in the document.

    Document doc = new Document(getMyDir() + "Document.TableOfContents.doc");

    FieldChar fieldStart = (FieldChar) doc.getChild(NodeType.FIELD_START, 0, true);
    Assert.assertEquals(fieldStart.getFieldType(), FieldType.FIELD_TOC);
    Assert.assertEquals(fieldStart.isDirty(), true);
    Assert.assertEquals(fieldStart.isLocked(), false);

    // Retrieve the facade object which represents the field in the document.
    Field field = fieldStart.getField();

    Assert.assertEquals(false, field.isLocked());
    Assert.assertEquals(" TOC \\o \"1-3\" \\h \\z \\u ", field.getFieldCode());

    // This updates only this field in the document.
    field.update();
public Font getFont()
Example:
Creates a simple document from scratch using the Aspose.Words object model.

    // Create an "empty" document. Note that, as in Microsoft Word,
    // the empty document has one section, a body and one paragraph in it.
    Document doc = new Document();

    // This truly makes the document empty. No sections (not possible in Microsoft Word).
    doc.removeAllChildren();

    // Create a new section node.
    // Note that the section has not yet been added to the document,
    // but we have to specify the parent document.
    Section section = new Section(doc);

    // Append the section to the document.
    doc.appendChild(section);

    // Let's set some properties for the section.
    section.getPageSetup().setSectionStart(SectionStart.NEW_PAGE);
    section.getPageSetup().setPaperSize(PaperSize.LETTER);

    // The section that we created is empty, so let's populate it. The section needs at least the Body node.
    Body body = new Body(doc);
    section.appendChild(body);

    // The body needs to have at least one paragraph.
    // Note that the paragraph has not yet been added to the document,
    // but we have to specify the parent document.
    // The parent document is needed so the paragraph can correctly work
    // with styles and other document-wide information.
    Paragraph para = new Paragraph(doc);
    body.appendChild(para);

    // We can set some formatting for the paragraph.
    para.getParagraphFormat().setStyleName("Heading 1");
    para.getParagraphFormat().setAlignment(ParagraphAlignment.CENTER);

    // So far we have one empty paragraph in the document.
    // The document is valid and can be saved, but let's add some text before saving.
    // Create a new run of text and add it to our paragraph.
    Run run = new Run(doc);
    run.setText("Hello World!");
    run.getFont().setColor(Color.RED);
    para.appendChild(run);

    // As a matter of interest, you can retrieve the text of the whole document and
    // see that \x000c is automatically appended. \x000c is the end of section character.
    System.out.println(doc.getText());

    // Save the document.
    doc.save(getArtifactsDir() + "Section.CreateFromScratch.doc");
public boolean isComposite()
Example:
Shows how to efficiently visit all direct and indirect children of a composite node.

    public void recurseAllNodes() throws Exception {
        // Open a document.
        Document doc = new Document(getMyDir() + "Node.RecurseAllNodes.doc");

        // Invoke the recursive function that will walk the tree.
        traverseAllNodes(doc);
    }

    /**
     * A simple function that will walk through all children of a specified node recursively
     * and print the type of each node to the screen.
     */
    @Test(enabled = false)
    public void traverseAllNodes(final CompositeNode parentNode) {
        // This is the most efficient way to loop through immediate children of a node.
        for (Node childNode = parentNode.getFirstChild(); childNode != null; childNode = childNode.getNextSibling()) {
            // Do some useful work.
            System.out.println(Node.nodeTypeToString(childNode.getNodeType()));

            // Recurse into the node if it is a composite node.
            if (childNode.isComposite()) {
                traverseAllNodes((CompositeNode) childNode);
            }
        }
    }
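The first-child / next-sibling loop from the example above can also be tried without a document. The sketch below is an illustration under stated assumptions: `SketchNode` is a hypothetical stand-in for a document node, not an Aspose.Words type, and it models only `getFirstChild()`, `getNextSibling()` and `isComposite()`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a document node (not an Aspose.Words type). */
final class SketchNode {
    private final String name;
    private final boolean composite;
    private SketchNode firstChild;
    private SketchNode lastChild;
    private SketchNode nextSibling;

    SketchNode(String name, boolean composite) {
        this.name = name;
        this.composite = composite;
    }

    /** Appends a child and returns this node so that calls can be chained. */
    SketchNode appendChild(SketchNode child) {
        if (!composite) {
            throw new IllegalStateException(name + " cannot contain child nodes");
        }
        if (firstChild == null) {
            firstChild = child;
        } else {
            lastChild.nextSibling = child;
        }
        lastChild = child;
        return this;
    }

    String getName() { return name; }
    boolean isComposite() { return composite; }
    SketchNode getFirstChild() { return firstChild; }
    SketchNode getNextSibling() { return nextSibling; }

    /** Collects the names of all descendants of parent in pre-order. */
    static void collect(SketchNode parent, List<String> out) {
        for (SketchNode child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            out.add(child.getName());
            // Only composite nodes can have children, so only they are recursed into.
            if (child.isComposite()) {
                collect(child, out);
            }
        }
    }

    public static void main(String[] args) {
        SketchNode body = new SketchNode("Body", true)
                .appendChild(new SketchNode("Paragraph", true)
                        .appendChild(new SketchNode("Run", false))
                        .appendChild(new SketchNode("FieldStart", false)))
                .appendChild(new SketchNode("Table", true));
        List<String> names = new ArrayList<>();
        collect(body, names);
        System.out.println(names);
    }
}
```

In this model a field start node is a leaf, which matches `isComposite()` being described as true only for nodes that can contain other nodes.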
public boolean isDeleteRevision()
Example:
Shows how to process revision-related properties of Inline nodes.

    Document doc = new Document(getMyDir() + "Inline.Revisions.docx");

    // This document has 6 revisions.
    Assert.assertEquals(doc.getRevisions().getCount(), 6);

    // The parent node of a revision is the run that the revision concerns, which is an Inline node.
    Run run = (Run) doc.getRevisions().get(0).getParentNode();

    // Get the parent paragraph.
    Paragraph firstParagraph = run.getParentParagraph();
    RunCollection runs = firstParagraph.getRuns();

    Assert.assertEquals(runs.getCount(), 6);

    // The text in the run at index #2 was typed after revisions were tracked, so it will count as an insert revision.
    // The font was changed, so it will also be a format revision.
    Assert.assertTrue(runs.get(2).isInsertRevision());
    Assert.assertTrue(runs.get(2).isFormatRevision());

    // If one node was moved from one place to another while changes were tracked,
    // the node will be placed at the arrival location as a "move to revision",
    // and a "move from revision" node will be left behind at the origin, in case we want to reject changes.
    // Highlighting text and dragging it to another place with the mouse and cut-and-pasting (but not copy-pasting)
    // both count as "move revisions".
    // The node with the "IsMoveToRevision" flag is the arrival of the move operation,
    // and the node with the "IsMoveFromRevision" flag is the departure point.
    Assert.assertTrue(runs.get(1).isMoveToRevision());
    Assert.assertTrue(runs.get(4).isMoveFromRevision());

    // If an Inline node gets deleted while changes are being tracked, it will leave behind a node
    // with the IsDeleteRevision flag set to true until changes are accepted.
    Assert.assertTrue(runs.get(5).isDeleteRevision());
public boolean isDirty() / public void isDirty(boolean value)
public boolean isFormatRevision()
public boolean isInsertRevision()
public boolean isLocked() / public void isLocked(boolean value)
public boolean isMoveFromRevision()
public boolean isMoveToRevision()
public Node getNextSibling()
Example:
Shows how to enumerate immediate child nodes of a composite node using NextSibling. In this example we enumerate all paragraphs of a section body.

    // Get the section that we want to work on.
    Section section = doc.getSections().get(0);
    Body body = section.getBody();

    // Loop starting from the first child until we reach null.
    for (Node node = body.getFirstChild(); node != null; node = node.getNextSibling()) {
        // Output the types of the nodes that we come across.
        System.out.println(Node.nodeTypeToString(node.getNodeType()));
    }
public int getNodeType()
Example:
Shows how to retrieve the NodeType enumeration of nodes.

    Document doc = new Document(getMyDir() + "Document.doc");

    // Let's pick a node whose type we can't be quite sure of.
    // In this case, let's pick the first node of the first paragraph in the body of the document.
    Node node = doc.getFirstSection().getBody().getFirstParagraph().getFirstChild();
    System.out.println("NodeType of first child: " + Node.nodeTypeToString(node.getNodeType()));

    // This time let's pick nodes whose type we know. Create a new paragraph and a table node.
    Paragraph para = new Paragraph(doc);
    Table table = new Table(doc);

    // Access to NodeType for typed nodes will always return their specific NodeType,
    // i.e. a paragraph node will always return NodeType.Paragraph, a table node will always return NodeType.Table.
    System.out.println("NodeType of Paragraph: " + Node.nodeTypeToString(para.getNodeType()));
    System.out.println("NodeType of Table: " + Node.nodeTypeToString(table.getNodeType()));
public CompositeNode getParentNode()
If a node has just been created and not yet added to the tree, or if it has been removed from the tree, the parent is null.
Example:
Shows how to access the parent node.

    // Create a new empty document. It has one section.
    Document doc = new Document();

    // The section is the first child node of the document.
    Node section = doc.getFirstChild();

    // The section's parent node is the document.
    System.out.println("Section parent is the document: " + (doc == section.getParentNode()));
public Paragraph getParentParagraph()
Example:
Shows how to process revision-related properties of Inline nodes.
Document doc = new Document(getMyDir() + "Inline.Revisions.docx");
// This document has 6 revisions
Assert.assertEquals(doc.getRevisions().getCount(), 6);
// The parent node of a revision is the run that the revision concerns, which is an Inline node
Run run = (Run) doc.getRevisions().get(0).getParentNode();
// Get the parent paragraph
Paragraph firstParagraph = run.getParentParagraph();
RunCollection runs = firstParagraph.getRuns();
Assert.assertEquals(runs.getCount(), 6);
// The text in the run at index 2 was typed while changes were being tracked, so it counts as an insert revision
// The font was also changed, so it is a format revision as well
Assert.assertTrue(runs.get(2).isInsertRevision());
Assert.assertTrue(runs.get(2).isFormatRevision());
// If a node was moved from one place to another while changes were tracked,
// the node is placed at the arrival location as a "move to" revision,
// and a "move from" revision node is left behind at the origin, in case we want to reject the change
// Dragging highlighted text to another place with the mouse and cut-and-pasting (but not copy-pasting) both count as move revisions
// The node with the "IsMoveToRevision" flag is the arrival of the move operation,
// and the node with the "IsMoveFromRevision" flag is the departure point
Assert.assertTrue(runs.get(1).isMoveToRevision());
Assert.assertTrue(runs.get(4).isMoveFromRevision());
// If an Inline node is deleted while changes are being tracked, it leaves behind a node
// with the IsDeleteRevision flag set to true until changes are accepted
Assert.assertTrue(runs.get(5).isDeleteRevision());
public Node getPreviousSibling()
Example:
Demonstrates use of methods of Node and CompositeNode to remove a section before the last section in the document.
// Document is a CompositeNode and LastChild returns the last child node in the Document node.
// Since the Document can contain only Section nodes, the last child is the last section.
Node lastSection = doc.getLastChild();
// Each node knows its next and previous sibling nodes.
// The previous sibling of a section is the section before it.
// If the node is the first child, PreviousSibling will return null.
Node sectionBeforeLast = lastSection.getPreviousSibling();
if (sectionBeforeLast != null) {
    doc.removeChild(sectionBeforeLast);
}
public Range getRange()
Example:
Shows how to delete all characters of a range.
// Open Word document.
Document doc = new Document(getMyDir() + "Range.DeleteSection.doc");
// The document contains two sections. Each section has a paragraph of text.
System.out.println(doc.getText());
// Delete the first section from the document.
doc.getSections().get(0).getRange().delete();
// Check the first section was deleted by looking at the text of the whole document again.
System.out.println(doc.getText());
public boolean accept(DocumentVisitor visitor) throws java.lang.Exception
Calls DocumentVisitor.visitFieldStart(FieldStart).
For more info see the Visitor design pattern.
visitor - The visitor that will visit the node.
Example:
Shows how to work with a document's field collection.
public void fieldCollection() throws Exception {
    // Open a document that has fields
    Document doc = new Document(getMyDir() + "Document.ContainsFields.docx");
    // Get the collection that contains all the fields in a document
    FieldCollection fields = doc.getRange().getFields();
    Assert.assertEquals(fields.getCount(), 6);
    // Iterate over the field collection and print contents and type of every field using a custom visitor implementation
    FieldVisitor fieldVisitor = new FieldVisitor();
    Iterator<Field> fieldEnumerator = fields.iterator();
    while (fieldEnumerator.hasNext()) {
        // Call next() only once per iteration; calling it twice would skip every other field
        Field currentField = fieldEnumerator.next();
        if (currentField != null) {
            currentField.getStart().accept(fieldVisitor);
            if (currentField.getSeparator() != null) {
                currentField.getSeparator().accept(fieldVisitor);
            }
            currentField.getEnd().accept(fieldVisitor);
        } else {
            System.out.println("There are no fields in the document.");
        }
    }
    System.out.println(fieldVisitor.getText());
    // Get a field to remove itself
    fields.get(0).remove();
    Assert.assertEquals(fields.getCount(), 5);
    // Remove a field by reference
    Field lastField = fields.get(3);
    fields.remove(lastField);
    Assert.assertEquals(fields.getCount(), 4);
    // Remove a field by index
    fields.removeAt(2);
    Assert.assertEquals(fields.getCount(), 3);
    // Remove all fields from the document
    fields.clear();
    Assert.assertEquals(fields.getCount(), 0);
}

/**
 * Document visitor implementation that prints field info.
 */
public static class FieldVisitor extends DocumentVisitor {
    public FieldVisitor() {
        mBuilder = new StringBuilder();
    }

    /**
     * Gets the plain text of the document that was accumulated by the visitor.
     */
    public String getText() {
        return mBuilder.toString();
    }

    /**
     * Called when a FieldStart node is encountered in the document.
     */
    public int visitFieldStart(final FieldStart fieldStart) {
        mBuilder.append("Found field: " + fieldStart.getFieldType() + "\r\n");
        mBuilder.append("\tField code: " + fieldStart.getField().getFieldCode() + "\r\n");
        mBuilder.append("\tDisplayed as: " + fieldStart.getField().getResult() + "\r\n");
        return VisitorAction.CONTINUE;
    }

    /**
     * Called when a FieldSeparator node is encountered in the document.
     */
    public int visitFieldSeparator(final FieldSeparator fieldSeparator) {
        mBuilder.append("\tFound separator: " + fieldSeparator.getText() + "\r\n");
        return VisitorAction.CONTINUE;
    }

    /**
     * Called when a FieldEnd node is encountered in the document.
     */
    public int visitFieldEnd(final FieldEnd fieldEnd) {
        mBuilder.append("End of field: " + fieldEnd.getFieldType() + "\r\n");
        return VisitorAction.CONTINUE;
    }

    private StringBuilder mBuilder;
}
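The double dispatch behind accept can be illustrated without the library. The following is a minimal sketch of the Visitor pattern only; every type in it (SketchVisitor, SketchNode, SketchFieldStart, SketchRun, FieldLogger) is a hypothetical stand-in, not an Aspose.Words class, and the boolean return value is a simplification of the VisitorAction constants used in the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT Aspose.Words classes.
interface SketchVisitor {
    boolean visitFieldStart(SketchFieldStart node); // return false to stop visiting
    boolean visitRun(SketchRun node);
}

abstract class SketchNode {
    // Each concrete node forwards to the visitor method that matches its own type.
    abstract boolean accept(SketchVisitor visitor);
}

class SketchFieldStart extends SketchNode {
    final String fieldType;
    SketchFieldStart(String fieldType) { this.fieldType = fieldType; }
    @Override boolean accept(SketchVisitor visitor) { return visitor.visitFieldStart(this); }
}

class SketchRun extends SketchNode {
    final String text;
    SketchRun(String text) { this.text = text; }
    @Override boolean accept(SketchVisitor visitor) { return visitor.visitRun(this); }
}

// A visitor that records what it sees, similar in spirit to the FieldVisitor example.
class FieldLogger implements SketchVisitor {
    final List<String> log = new ArrayList<>();
    @Override public boolean visitFieldStart(SketchFieldStart node) {
        log.add("field:" + node.fieldType);
        return true;
    }
    @Override public boolean visitRun(SketchRun node) {
        log.add("run:" + node.text);
        return true;
    }
}

class VisitorSketch {
    // Visits the nodes in order and stops early if a visit method returns false.
    static List<String> visitAll(List<SketchNode> nodes) {
        FieldLogger logger = new FieldLogger();
        for (SketchNode node : nodes) {
            if (!node.accept(logger)) {
                break;
            }
        }
        return logger.log;
    }

    public static void main(String[] args) {
        System.out.println(visitAll(Arrays.asList(new SketchFieldStart("TOC"), new SketchRun("hello"))));
    }
}
```

The caller never inspects node types itself: each node selects the matching visit method, which is why new operations can be added as new visitors without changing the node classes.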
public Node deepClone(boolean isCloneChildren)
Example:
Shows how to clone composite nodes with and without their child nodes.
// Create a new empty document.
Document doc = new Document();
// Add some text to the first paragraph
Paragraph para = doc.getFirstSection().getBody().getFirstParagraph();
para.appendChild(new Run(doc, "Some text"));
// Clone the paragraph and the child nodes.
Node cloneWithChildren = para.deepClone(true);
// Only clone the paragraph and no child nodes.
Node cloneWithoutChildren = para.deepClone(false);
public CompositeNode getAncestor(int ancestorType)
ancestorType - A NodeType value that specifies the type of the ancestor to retrieve.
Example:
Shows how to find out if a table contains another table or if the table itself is nested inside another table.
public void calculateDepthOfNestedTables() throws Exception {
    Document doc = new Document(getMyDir() + "Table.NestedTables.doc");
    int tableIndex = 0;
    for (Table table : (Iterable<Table>) doc.getChildNodes(NodeType.TABLE, true)) {
        // First, find out whether any cells in the table have tables themselves as children.
        int count = getChildTableCount(table);
        System.out.println(MessageFormat.format("Table #{0} has {1} tables directly within its cells", tableIndex, count));
        // Now the other way around: find out whether the table is nested inside another table and at what depth.
        int tableDepth = getNestedDepthOfTable(table);
        if (tableDepth > 0) {
            System.out.println(MessageFormat.format("Table #{0} is nested inside another table at depth of {1}", tableIndex, tableDepth));
        } else {
            System.out.println(MessageFormat.format("Table #{0} is a non nested table (is not a child of another table)", tableIndex));
        }
        tableIndex++;
    }
}

/**
 * Calculates what level a table is nested inside other tables.
 *
 * @return An integer containing the level the table is nested at.
 * 0 = Table is not nested inside any other table
 * 1 = Table is nested within one parent table
 * 2 = Table is nested within two parent tables etc.
 */
private static int getNestedDepthOfTable(final Table table) {
    int depth = 0;
    // The parent of the table will be a Cell, so instead look for an ancestor that is of type Table
    Node parent = table.getAncestor(table.getNodeType());
    while (parent != null) {
        // Every time we find a table a level up we increase the depth counter and then try to find an
        // ancestor of type table from the parent.
        depth++;
        parent = parent.getAncestor(Table.class);
    }
    return depth;
}

/**
 * Counts the tables that are immediate children of the cells of a table.
 * Does not recursively traverse through those tables to check for further tables.
 *
 * @return The number of tables directly within the cells of the table, or 0 if no cell contains a table.
 */
private static int getChildTableCount(final Table table) {
    int tableCount = 0;
    // Iterate through all child rows in the table
    for (Row row : table.getRows()) {
        // Iterate through all child cells in the row
        for (Cell cell : row.getCells()) {
            // Retrieve the collection of child tables of this cell
            TableCollection childTables = cell.getTables();
            // Add the number of tables this cell contains directly
            tableCount += childTables.getCount();
        }
    }
    return tableCount;
}
public CompositeNode getAncestor(java.lang.Class ancestorType)
The ancestor type matches if it is equal to ancestorType or derived from ancestorType.
ancestorType - The object type of the ancestor to retrieve.
Example:
Shows how to find out if a table contains another table or if the table itself is nested inside another table.
public void calculateDepthOfNestedTables() throws Exception {
    Document doc = new Document(getMyDir() + "Table.NestedTables.doc");
    int tableIndex = 0;
    for (Table table : (Iterable<Table>) doc.getChildNodes(NodeType.TABLE, true)) {
        // First, find out whether any cells in the table have tables themselves as children.
        int count = getChildTableCount(table);
        System.out.println(MessageFormat.format("Table #{0} has {1} tables directly within its cells", tableIndex, count));
        // Now the other way around: find out whether the table is nested inside another table and at what depth.
        int tableDepth = getNestedDepthOfTable(table);
        if (tableDepth > 0) {
            System.out.println(MessageFormat.format("Table #{0} is nested inside another table at depth of {1}", tableIndex, tableDepth));
        } else {
            System.out.println(MessageFormat.format("Table #{0} is a non nested table (is not a child of another table)", tableIndex));
        }
        tableIndex++;
    }
}

/**
 * Calculates what level a table is nested inside other tables.
 *
 * @return An integer containing the level the table is nested at.
 * 0 = Table is not nested inside any other table
 * 1 = Table is nested within one parent table
 * 2 = Table is nested within two parent tables etc.
 */
private static int getNestedDepthOfTable(final Table table) {
    int depth = 0;
    // The parent of the table will be a Cell, so instead look for an ancestor that is of type Table
    Node parent = table.getAncestor(table.getNodeType());
    while (parent != null) {
        // Every time we find a table a level up we increase the depth counter and then try to find an
        // ancestor of type table from the parent.
        depth++;
        parent = parent.getAncestor(Table.class);
    }
    return depth;
}

/**
 * Counts the tables that are immediate children of the cells of a table.
 * Does not recursively traverse through those tables to check for further tables.
 *
 * @return The number of tables directly within the cells of the table, or 0 if no cell contains a table.
 */
private static int getChildTableCount(final Table table) {
    int tableCount = 0;
    // Iterate through all child rows in the table
    for (Row row : table.getRows()) {
        // Iterate through all child cells in the row
        for (Cell cell : row.getCells()) {
            // Retrieve the collection of child tables of this cell
            TableCollection childTables = cell.getTables();
            // Add the number of tables this cell contains directly
            tableCount += childTables.getCount();
        }
    }
    return tableCount;
}
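The matching rule for the Class overload (equal to ancestorType or derived from it) can be sketched as a walk up the parent chain. This is only an illustration under assumptions: AncNode, AncTable and AncCell are hypothetical toy classes, and the search is assumed to start at the node's parent rather than at the node itself, which is how the nested-table example uses it.

```java
// Hypothetical toy node types for illustration; these are NOT Aspose.Words classes.
class AncNode {
    AncNode parent;

    // Attaches a child to this node and returns the child to allow chaining downwards.
    <T extends AncNode> T appendChild(T child) {
        child.parent = this;
        return child;
    }

    // Returns the nearest ancestor whose class is ancestorType or a subclass of it, or null.
    // Assumption: the search starts at the parent, not at this node.
    AncNode getAncestor(Class<? extends AncNode> ancestorType) {
        for (AncNode cur = parent; cur != null; cur = cur.parent) {
            if (ancestorType.isInstance(cur)) {
                return cur;
            }
        }
        return null;
    }
}

class AncTable extends AncNode { }

class AncCell extends AncNode { }

class AncestorSketch {
    // Counts how many tables enclose the given table, mirroring getNestedDepthOfTable above.
    static int nestedDepth(AncTable table) {
        int depth = 0;
        for (AncNode p = table.getAncestor(AncTable.class); p != null; p = p.getAncestor(AncTable.class)) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        AncTable outer = new AncTable();
        AncTable inner = outer.appendChild(new AncCell()).appendChild(new AncTable());
        System.out.println("Depth of inner table: " + nestedDepth(inner));
    }
}
```

Using Class.isInstance rather than comparing classes for equality is what lets a subclass of the requested type count as a match.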
public Field getField()
Example:
Demonstrates how to retrieve the field class from an existing FieldStart node in the document.
Document doc = new Document(getMyDir() + "Document.TableOfContents.doc");
FieldChar fieldStart = (FieldChar) doc.getChild(NodeType.FIELD_START, 0, true);
Assert.assertEquals(fieldStart.getFieldType(), FieldType.FIELD_TOC);
Assert.assertEquals(fieldStart.isDirty(), true);
Assert.assertEquals(fieldStart.isLocked(), false);
// Retrieve the facade object which represents the field in the document.
Field field = fieldStart.getField();
Assert.assertEquals(false, field.isLocked());
Assert.assertEquals(" TOC \\o \"1-3\" \\h \\z \\u ", field.getFieldCode());
// This updates only this field in the document.
field.update();
public java.lang.String getText()
public Node nextPreOrder(Node rootNode)
rootNode - The top node (limit) of traversal.
Example:
Shows how to delete all images from a document using pre-order tree traversal.
Node curNode = doc;
while (curNode != null) {
    // Fetch the next node before the current one is possibly removed
    Node nextNode = curNode.nextPreOrder(doc);
    if (curNode.previousPreOrder(doc) != null && nextNode != null) {
        Assert.assertEquals(curNode, nextNode.previousPreOrder(doc));
    }
    if (curNode.getNodeType() == NodeType.SHAPE) {
        Shape shape = (Shape) curNode;
        // Several shape types can have an image including image shapes and OLE objects.
        if (shape.hasImage()) {
            shape.remove();
        }
    }
    curNode = nextNode;
}
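For readers unfamiliar with pre-order traversal bounded by a root node, here is a minimal sketch on a toy tree. TreeNode and PreOrderSketch are hypothetical classes written for this illustration, and the way the rootNode limit is handled is an assumption about the general technique, not a description of the library's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy tree for illustration; this is NOT the Aspose.Words Node class.
class TreeNode {
    final String name;
    TreeNode parent;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) { this.name = name; }

    // Adds a child and returns this node, so several children can be chained onto one parent.
    TreeNode add(TreeNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    TreeNode nextSibling() {
        if (parent == null) {
            return null;
        }
        int i = parent.children.indexOf(this);
        return i + 1 < parent.children.size() ? parent.children.get(i + 1) : null;
    }

    // Next node in pre-order (node first, then its children), never leaving the subtree under root.
    TreeNode nextPreOrder(TreeNode root) {
        if (!children.isEmpty()) {
            return children.get(0);
        }
        // No children: climb until some node on the way up has a next sibling, stopping at root.
        TreeNode cur = this;
        while (cur != null && cur != root) {
            TreeNode sibling = cur.nextSibling();
            if (sibling != null) {
                return sibling;
            }
            cur = cur.parent;
        }
        return null;
    }
}

class PreOrderSketch {
    // Collects node names in pre-order, starting at root.
    static List<String> walk(TreeNode root) {
        List<String> names = new ArrayList<>();
        for (TreeNode n = root; n != null; n = n.nextPreOrder(root)) {
            names.add(n.name);
        }
        return names;
    }

    public static void main(String[] args) {
        TreeNode section = new TreeNode("section").add(new TreeNode("para1")).add(new TreeNode("para2"));
        TreeNode doc = new TreeNode("doc").add(section).add(new TreeNode("section2"));
        System.out.println(walk(doc));
    }
}
```

Passing a smaller subtree as the root limits the walk to that subtree, which is the role the rootNode parameter plays in the signature above.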
public Node previousPreOrder(Node rootNode)
rootNode - The top node (limit) of traversal.
Example:
Shows how to delete all images from a document using pre-order tree traversal.
Node curNode = doc;
while (curNode != null) {
    // Fetch the next node before the current one is possibly removed
    Node nextNode = curNode.nextPreOrder(doc);
    if (curNode.previousPreOrder(doc) != null && nextNode != null) {
        Assert.assertEquals(curNode, nextNode.previousPreOrder(doc));
    }
    if (curNode.getNodeType() == NodeType.SHAPE) {
        Shape shape = (Shape) curNode;
        // Several shape types can have an image including image shapes and OLE objects.
        if (shape.hasImage()) {
            shape.remove();
        }
    }
    curNode = nextNode;
}
public void remove()
Example:
Shows how to remove all nodes of a specific type from a composite node. In this example we remove tables from a section body.
// Get the section that we want to work on.
Section section = doc.getSections().get(0);
Body body = section.getBody();
// Select the first child node in the body.
Node curNode = body.getFirstChild();
while (curNode != null) {
    // Save the pointer to the next sibling node because if the current
    // node is removed from the parent in the next step, we will have
    // no way of finding the next node to continue the loop.
    Node nextNode = curNode.getNextSibling();
    // A section body can contain Paragraph and Table nodes.
    // If the node is a Table, remove it from the parent.
    if (curNode.getNodeType() == NodeType.TABLE) {
        curNode.remove();
    }
    // Continue going through child nodes until null (no more siblings) is reached.
    curNode = nextNode;
}
Example:
Shows how to delete all images from a document.
// Here we get all shapes from the document node, but you can do this for any smaller
// node too, for example delete shapes from a single section or a paragraph.
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
// We cannot delete shape nodes while we enumerate through the collection.
// One solution is to add the nodes that we want to delete to a temporary list and delete them afterwards.
ArrayList<Shape> shapesToDelete = new ArrayList<Shape>();
for (Shape shape : (Iterable<Shape>) shapes) {
    // Several shape types can have an image including image shapes and OLE objects.
    if (shape.hasImage()) {
        shapesToDelete.add(shape);
    }
}
// Now we can delete shapes.
for (Shape shape : shapesToDelete) {
    shape.remove();
}
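The collect-then-delete approach from the example above applies to ordinary Java collections as well. The sketch below uses only java.util and plain strings as stand-ins for nodes; RemoveSketch and the "img:" prefix are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; strings stand in for document nodes.
class RemoveSketch {
    // Removes every entry that starts with "img:" in two passes:
    // first collect the matches, then remove them once enumeration has finished.
    static List<String> removeImages(List<String> nodes) {
        List<String> toDelete = new ArrayList<>();
        for (String node : nodes) {
            if (node.startsWith("img:")) {
                toDelete.add(node);
            }
        }
        // Safe to modify the list now, because no iteration over it is in progress.
        for (String node : toDelete) {
            nodes.remove(node);
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>(Arrays.asList("img:a", "text", "img:b", "shape"));
        System.out.println(removeImages(nodes));
    }
}
```

Removing elements inside the first loop instead would typically fail with a ConcurrentModificationException on an ArrayList, which is the general-purpose analogue of the restriction the example mentions.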
public java.lang.String toString(SaveOptions saveOptions) throws java.lang.Exception
saveOptions - Specifies the options that control how the node is saved.
Example:
Exports the content of a node to string in HTML format using custom specified options.
Document doc = new Document(getMyDir() + "Document.doc");
// Extract the last paragraph in the document to convert to HTML.
Node node = doc.getLastSection().getBody().getLastParagraph();
// Create an instance of HtmlSaveOptions and set a few options.
HtmlSaveOptions saveOptions = new HtmlSaveOptions();
saveOptions.setExportHeadersFootersMode(ExportHeadersFootersMode.PER_SECTION);
saveOptions.setExportRelativeFontSize(true);
// Convert the node to HTML and return it as a string. Pass the instance of HtmlSaveOptions
// to use the specified options during the conversion.
String nodeAsHtml = node.toString(saveOptions);
public java.lang.String toString(int saveFormat) throws java.lang.Exception
saveFormat - A SaveFormat value.
Example:
Shows how to extract the label of each paragraph in a list as a value or a String.
Document doc = new Document(getMyDir() + "Lists.PrintOutAllLists.doc");
doc.updateListLabels();
int listParaCount = 1;
for (Paragraph paragraph : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true)) {
    // Check whether the paragraph is a list item. In our document the list uses plain Arabic numbers,
    // which start at three and end at six.
    if (paragraph.getListFormat().isListItem()) {
        System.out.println(MessageFormat.format("Paragraph #{0}", listParaCount));
        // This is the text we get when we output this node to text format.
        // The list labels are not included in this text output. Trim any paragraph formatting characters.
        String paragraphText = paragraph.toString(SaveFormat.TEXT).trim();
        System.out.println("Exported Text: " + paragraphText);
        ListLabel label = paragraph.getListLabel();
        // This gets the position of the paragraph in the current level of the list. If the list has multiple levels,
        // this tells us the position on that particular level.
        System.out.println("Numerical Id: " + label.getLabelValue());
        // Combine them together to include the list label with the text in the output.
        System.out.println("List label combined with text: " + label.getLabelString() + " " + paragraphText);
        listParaCount++;
    }
}
Example:
Shows the difference between calling the GetText and ToString methods on a node.
Document doc = new Document();
// Enter a dummy field into the document.
DocumentBuilder builder = new DocumentBuilder(doc);
builder.insertField("MERGEFIELD Field");
// GetText will retrieve all field codes and special characters
System.out.println("GetText() Result: " + doc.getText());
// ToString will export the node to the specified format. When converted to text it will not retrieve field codes
// or special characters, but will still contain some natural formatting characters such as paragraph markers.
// This is the same as "viewing" the document as if it was opened in a text editor.
System.out.println("ToString() Result: " + doc.toString(SaveFormat.TEXT));
Example:
Exports the content of a node to string in HTML format using default options.
Document doc = new Document(getMyDir() + "Document.doc");
// Extract the last paragraph in the document to convert to HTML.
Node node = doc.getLastSection().getBody().getLastParagraph();
// When ToString is called using the SaveFormat overload, the conversion is executed using default save options.
// When saving to HTML using default options the following settings are set:
// ExportImagesAsBase64 = true
// CssStyleSheetType = CssStyleSheetType.Inline
// ExportFontResources = false
String nodeAsHtml = node.toString(SaveFormat.HTML);