3.37.  XFA Data

Overview

The XML Forms Architecture (XFA) extends the PDF structure with XML data. Its goal is to integrate PDF forms better into workflow processes.

XFA forms are not compatible with AcroForms. Therefore, tests for AcroForms cannot be used for XFA data. Tests for XFA data are mainly based on XPath.

// Methods around XFA data:
.hasXFAData()
.hasXFAData().matchingXPath(..) 
.hasXFAData().withNode(..)

.hasNoXFAData()

Existence and Absence of XFA

The first test focuses on the existence of XFA data:

@Test
public void hasXFAData() throws Exception {
  String filename = "documentUnderTest.pdf";

  AssertThat.document(filename)
            .hasXFAData()
  ;
}

You can also check that a PDF document does not contain XFA data:

@Test
public void hasNoXFAData() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .hasNoXFAData()
  ;
}

Validate Single XML Tags

The next examples use the following XFA data (extract):

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xdp:xdp xmlns:xdp="http://ns.adobe.com/xdp/">
  ...
  <x:xmpmeta xmlns:x="adobe:ns:meta/"
             x:xmptk="Adobe XMP Core 4.2.1-c041 52.337767, 2008/04/13-15:41:00"
  >
    <config xmlns="http://www.xfa.org/schema/xci/2.6/">
      ...
      <log xmlns="http://www.xfa.org/schema/xci/2.6/">
        <to>memory</to>
        <mode>overwrite</mode>
      </log>
      ...
    </config>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      ...
      <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" >
        <xmp:MetadataDate>2009-12-03T17:50:52Z</xmp:MetadataDate>
      </rdf:Description>
      ...
    </rdf:RDF>
  </x:xmpmeta>
 ...
</xdp:xdp>

To test a particular node, create an instance of com.pdfunit.XMLNode. The constructor takes an XPath expression and the expected value of the node:

@Test
public void hasXFAData_WithNode() throws Exception {
  String filename = "documentUnderTest.pdf";
  XMLNode xmpNode = new XMLNode("xmp:MetadataDate", "2009-12-03T17:50:52Z");  // (1)
  
  AssertThat.document(filename)
            .hasXFAData()
            .withNode(xmpNode)
  ;
}

(1) PDFUnit analyzes the XFA data from the current PDF document and determines the namespaces automatically. Only the default namespace has to be specified.

When processing the XPath expression, PDFUnit internally prepends the path element "//" to it. For this reason, the expression need not start at the document root "/".

If the XPath expression evaluates to a node set, the first node is used.

If the XMLNode instance contains an expected value, PDFUnit compares it against the actual node value. If the XMLNode instance does not have an expected value, PDFUnit only checks that the node exists in the XFA data.
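The lookup rules described above (prepended "//", first node of a node set, optional expected value) can be illustrated with plain JAXP. This is only a sketch of the described behaviour, not PDFUnit's implementation: the class NodeLookupSketch, its methods, and the simplified, namespace-free XML are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of PDFUnit.
class NodeLookupSketch {

  /** Returns the text of the first node matching "//" + relativePath, or null if nothing matches. */
  static String firstNodeText(String xml, String relativePath) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    // Prepend "//" so that the expression is searched anywhere in the document.
    String expression = "//" + relativePath;
    NodeList nodes = (NodeList) xpath.evaluate(
        expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
    // If the expression yields a node set, only the first node is used.
    return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
  }

  /** An expectedValue of null means: check only that the node exists. */
  static boolean matches(String xml, String relativePath, String expectedValue) throws Exception {
    String actual = firstNodeText(xml, relativePath);
    if (actual == null) {
      return false;
    }
    return expectedValue == null || expectedValue.equals(actual);
  }

  public static void main(String[] args) throws Exception {
    String xml = "<config><log><to>memory</to><mode>overwrite</mode></log>"
               + "<log><to>file</to></log></config>";
    System.out.println(firstNodeText(xml, "log/to"));   // memory (first of two matches)
    System.out.println(matches(xml, "log/mode", null)); // true (existence check only)
    System.out.println(matches(xml, "log/to", "file")); // false (first node is "memory")
  }
}
```

Because only the first node of a node set is compared, an expected value that occurs only in a later node does not match. A more specific XPath expression (for example with a predicate) avoids that ambiguity.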

Attribute nodes can be tested as well:

@Test
public void hasXFAData_WithNode_NamespaceDD() throws Exception {
  String filename = "documentUnderTest.pdf";
  XMLNode ddNode = new XMLNode("dd:dataDescription/@dd:name", "movie");
  
  AssertThat.document(filename)
            .hasXFAData()
            .withNode(ddNode)
  ;
}

XPath-Based XFA Tests

To take advantage of the full power of XPath, the method matchingXPath(..) is provided. The following two examples give an idea of what is possible:

@Test
public void hasXFAData_MatchingXPath_FunctionStartsWith() throws Exception {
  String filename = "documentUnderTest.pdf";
  String xpathString = "starts-with(//dd:dataDescription/@dd:name, 'mov')";
  XPathExpression expressionWithFunction = new XPathExpression(xpathString);
  
  AssertThat.document(filename)
            .hasXFAData()
            .matchingXPath(expressionWithFunction) 
  ;
}

@Test
public void hasXFAData_MatchingXPath_FunctionCount_MultipleInvocation() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  String xpathProducer = "//pdf:Producer[. ='Adobe LiveCycle Designer ES 8.2']";
  String xpathPI       = "count(//processing-instruction()) = 30";

  XPathExpression exprPI       = new XPathExpression(xpathPI);
  XPathExpression exprProducer = new XPathExpression(xpathProducer);
  
  AssertThat.document(filename)
            .hasXFAData()
            .matchingXPath(exprProducer)
            .matchingXPath(exprPI)
  ;
  
  // The same test in a different style:
  AssertThat.document(filename)
            .hasXFAData().matchingXPath(exprProducer)
            .hasXFAData().matchingXPath(exprPI)
  ;
}

One limitation has to be mentioned: the evaluation of XPath expressions depends on the features implemented by the XPath engine you are using. By default, PDFUnit uses the JAXP implementation of your JDK, so XPath compatibility also depends on the version of your JDK.

Chapter 13.12: “JAXP-Configuration” explains how to use another XPath engine, for example from the Xerces project.
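To check in advance whether the XPath engine of your JDK supports a particular expression, the expression can be evaluated directly with JAXP, outside of PDFUnit. The following is a minimal sketch under that assumption: the class XPathBooleanSketch and the small, namespace-free XML sample are hypothetical and only mirror the function-style expressions used above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of PDFUnit.
class XPathBooleanSketch {

  /** Evaluates an XPath expression against the XML and converts the result to a boolean. */
  static boolean holds(String xml, String expression) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    return (Boolean) xpath.evaluate(
        expression, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
  }

  public static void main(String[] args) throws Exception {
    String xml = "<root><?pi-one a?><dataDescription name=\"movie\"/><?pi-two b?></root>";

    // Shows which XPath engine the default JAXP lookup selects on this JDK.
    System.out.println(XPathFactory.newInstance().getClass().getName());

    System.out.println(holds(xml, "starts-with(//dataDescription/@name, 'mov')")); // true
    System.out.println(holds(xml, "count(//processing-instruction()) = 2"));       // true
  }
}
```

An expression that the engine does not support typically fails with an XPathExpressionException, which makes engine limitations visible before they show up as failing PDF tests.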

Default Namespaces in XPath

XML namespaces are detected automatically, but the default namespace has to be declared explicitly using an instance of DefaultNamespace. Within the XPath expression, the default namespace must be addressed with a prefix. Any value can be chosen for the prefix:

@Test
public void hasXFAData_WithDefaultNamespace_XPathExpression() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  String namespaceURI  = "http://www.xfa.org/schema/xfa-template/2.6/";
  String xpathSubform  = "count(//default:subform[@name ='movie']//default:field) = 5";

  DefaultNamespace defaultNS   = new DefaultNamespace(namespaceURI);
  XPathExpression exprSubform  = new XPathExpression(xpathSubform, defaultNS);
  
  AssertThat.document(filename)
            .hasXFAData()
            .matchingXPath(exprSubform)
  ;
}

The default namespace must be used not only with the class XPathExpression, but also with the class XMLNode:

/**
 * The default namespace has to be declared, 
 * but any alias can be used for it.
 */
@Test
public void hasXFAData_WithDefaultNamespace_XMLNode() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  String namespaceXCI = "http://www.xfa.org/schema/xci/2.6/";
  DefaultNamespace defaultNS = new DefaultNamespace(namespaceXCI);
  XMLNode aliasFoo = new XMLNode("foo:log/foo:to", "memory", defaultNS);

  AssertThat.document(filename)
            .hasXFAData()
            .withNode(aliasFoo)
  ;
}
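
The need for a prefix follows from XPath 1.0 itself: an unprefixed element name in an XPath expression selects only elements that belong to no namespace, so elements in a default namespace must be addressed through a prefix bound to the namespace URI. The following plain-JAXP sketch shows this; it is not PDFUnit code, and the class DefaultNamespaceSketch with its evaluate method is a hypothetical helper.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of PDFUnit.
class DefaultNamespaceSketch {

  /** Evaluates the expression as a string, with the given prefix bound to the namespace URI. */
  static String evaluate(String xml, String expression, String prefix, String uri)
      throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true); // required, otherwise namespaces are ignored
    Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

    XPath xpath = XPathFactory.newInstance().newXPath();
    xpath.setNamespaceContext(new NamespaceContext() {
      @Override
      public String getNamespaceURI(String p) {
        return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
      }
      @Override
      public String getPrefix(String namespaceURI) {
        return uri.equals(namespaceURI) ? prefix : null;
      }
      @Override
      public Iterator<String> getPrefixes(String namespaceURI) {
        return uri.equals(namespaceURI)
            ? Collections.singletonList(prefix).iterator()
            : Collections.<String>emptyIterator();
      }
    });
    return xpath.evaluate(expression, doc);
  }

  public static void main(String[] args) throws Exception {
    String uri = "http://www.xfa.org/schema/xci/2.6/";
    String xml = "<config xmlns=\"" + uri + "\"><log><to>memory</to></log></config>";

    // Any alias works, as long as it is bound to the default namespace URI.
    System.out.println(evaluate(xml, "//foo:log/foo:to", "foo", uri)); // memory
    // Without a prefix nothing matches, so the result is the empty string.
    System.out.println(evaluate(xml, "//log/to", "foo", uri).isEmpty()); // true
  }
}
```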