Ant is a build utility produced as part of the Apache Jakarta project. It is broadly equivalent in function to make under Linux/Unix, or nmake under Windows. These make utilities work by comparing the date of an output file to the date of the input files required to build it. If any of the input files is newer than the output file, the output file needs to be rebuilt. This is a simple rule, but one that generally produces the right results.
Unlike traditional make utilities, Ant is written in Java, so Ant is a good cross-platform solution for controlling the automatic building of files. That is good news for anyone developing cross-platform XSLT scripts, because you only need to target the one build environment. Anyone who has tried writing and maintaining equivalent Windows and Linux/Unix batch scripts knows how hard it is to get the same behaviour across different platforms.
So why would you use Ant and XSLT together? If all you are doing is applying a single XSLT stylesheet to a single XML input file, using a single XSLT engine, then there is probably nothing to be gained. However, if either
you need to apply one or more XSLT scripts to one or more XML input files in some sequence, in order to build your final output file(s); or
you need to run multiple XSLT engines on the same XML input file(s) as part of your regression or integration testing,
then Ant is a good, quick way to implement the workflow you need to transform your input(s) into your output(s).
Using Ant for a simple "1 input, 1 stylesheet, 1 output" transformation may be overkill, but it is a good way to learn how to use Ant. Assume that the input is input.xml, the stylesheet is transform.xsl, and the output is output.html. Then a matching Ant 1.5 project file build.xml is
<project default="do-it">
  <target name="do-it">
    <xslt processor="trax" in="input.xml" style="transform.xsl" out="output.html"/>
  </target>
</project>
The root element of an Ant build file is project. It can contain a number of target elements. Its default attribute contains the name of the target to build if no targets are given on the command line. Since the example project file defaults to building the target do-it, the output file could be built equally using any of the following command lines:
$ ant
$ ant do-it
$ ant -buildfile build.xml
$ ant -buildfile build.xml do-it
Unlike the Unix make and its clones, which can use filenames for targets, Ant only uses target names defined in the build file. So every target must have a unique name.
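To illustrate how target names work, here is a minimal sketch that extends the build file above with two extra targets. The names clean and rebuild are chosen purely for illustration and are not part of the original example; the depends attribute lists the targets that Ant should run first.

```xml
<project default="do-it">
  <!-- Built by a plain "ant", because it is the default target -->
  <target name="do-it">
    <xslt processor="trax" in="input.xml" style="transform.xsl" out="output.html"/>
  </target>

  <!-- Hypothetical target: removes the generated file -->
  <target name="clean">
    <delete file="output.html"/>
  </target>

  <!-- Hypothetical target: runs clean, then do-it, and does nothing else itself -->
  <target name="rebuild" depends="clean,do-it"/>
</project>
```

With this file, "ant clean" deletes output.html and "ant rebuild" forces a fresh transformation, while a plain "ant" still builds do-it.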
Within a target, any number of tasks can be performed. The xslt task comes by default with Ant 1.5. With the processor attribute set to trax, the xslt task uses the default JAXP/TraX XSLT engine on your computer to perform the transformation.
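The xslt task is not limited to a single input file. As a rough sketch, and assuming hypothetical directories src/xml (inputs) and build/html (outputs) that are not part of the examples in this article, the basedir, destdir, includes and extension attributes can be used to apply one stylesheet to every matching file in a directory:

```xml
<project default="all-pages">
  <target name="all-pages">
    <!-- Directory names are assumptions made for this sketch -->
    <xslt processor="trax"
          basedir="src/xml"
          destdir="build/html"
          includes="*.xml"
          extension=".html"
          style="transform.xsl"/>
  </target>
</project>
```

Each src/xml/NAME.xml would be written to build/html/NAME.html. Check the task description in the Ant documentation for your Ant version before relying on the details.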
Now consider a more complicated XSLT workflow. There are 3 input files (in1.xml, in2.xml and in3.xml). Each of these has the same kind of information, but the formats are different. So, they are normalized to a common format by 3 separate stylesheets (norm1.xsl, norm2.xsl and norm3.xsl respectively). A standard merging stylesheet exists, merge.xsl, but it only merges two inputs (the usual input plus a filename passed as a parameter to the stylesheet). So it has to be used twice in order to merge the 3 normalized files. The merged sum of the 3 is sorted to produce the final output file, out.xml.
A matching Ant build file (explained in detail afterwards) is
<project default="sort">
  <target name="normalize">
    <xslt processor="trax" in="in1.xml" style="norm1.xsl" out="nm1.xml"/>
    <xslt processor="trax" in="in2.xml" style="norm2.xsl" out="nm2.xml"/>
    <xslt processor="trax" in="in3.xml" style="norm3.xsl" out="nm3.xml"/>
  </target>
  <target name="check12">
    <uptodate property="skip.merge12" targetfile="m12.xml">
      <srcfiles dir=".">
        <include name="nm1.xml"/>
        <include name="nm2.xml"/>
        <include name="merge.xsl"/>
      </srcfiles>
    </uptodate>
  </target>
  <target name="merge12" depends="normalize,check12" unless="skip.merge12">
    <xslt processor="trax" in="nm1.xml" style="merge.xsl" out="m12.xml" force="true">
      <param name="source2" expression="nm2.xml"/>
    </xslt>
  </target>
  <target name="check123">
    <uptodate property="skip.merge123" targetfile="123.xml">
      <srcfiles dir=".">
        <include name="m12.xml"/>
        <include name="nm3.xml"/>
        <include name="merge.xsl"/>
      </srcfiles>
    </uptodate>
  </target>
  <target name="merge123" depends="normalize,merge12,check123" unless="skip.merge123">
    <xslt processor="trax" in="m12.xml" style="merge.xsl" out="123.xml" force="true">
      <param name="source2" expression="nm3.xml"/>
    </xslt>
  </target>
  <target name="sort" depends="merge123">
    <xslt processor="trax" in="123.xml" style="sort.xsl" out="out.xml"/>
  </target>
  <target name="clean">
    <delete>
      <fileset dir=".">
        <include name="output.html"/>
        <include name="nm*.xml"/>
        <include name="m12.xml"/>
        <include name="123.xml"/>
        <include name="out.xml"/>
      </fileset>
    </delete>
  </target>
</project>
Note that Ant takes account of timestamps on files, just like the Unix make. It will not re-run a transformation unless either the input file or the stylesheet is newer than the output file (which usually means that the input file or the stylesheet has been modified since the last build). So if in1.xml is modified, nm2.xml and nm3.xml will not be rebuilt. Alternatively, if in3.xml is modified, m12.xml will not be rebuilt. This can save a lot of development time in situations where one of the transformations takes much longer than the others.
The workings of this Ant project file are as follows:
The default target is sort. Sorting is the last thing that needs to be done, so making sort the default target means that the whole build process is carried out by default.
The normalize target is used to run the 3 normalization stylesheets. Although you could use 3 separate targets, there is no need, since each xslt task only runs when its output (nm1.xml, nm2.xml or nm3.xml) actually needs to be rebuilt.
The merge.xsl stylesheet is special, because one of its input filenames is passed as a stylesheet parameter. There is no way that the standard xslt task can know this, so it is necessary to manually tell Ant when a rebuild is or is not required. The check12 target uses Ant's uptodate task to check whether m12.xml is newer than all 3 of nm1.xml, nm2.xml and merge.xsl. The result is stored in the Ant property skip.merge12, which is set only when m12.xml is up to date.
The merge12 target is used to merge nm1.xml and nm2.xml, but it only runs when skip.merge12 is not set, i.e. when m12.xml is not up to date. The xslt task which runs merge.xsl has an extra attribute force which is set to true, to override the default checking of whether a rebuild is necessary. Note that this complexity comes about purely because a filename has been passed to the stylesheet as a parameter. It is a special case, but one which is nonetheless not too difficult to solve. The same logic applies to the targets check123 and merge123.
Finally, the sort target, which is the default target for the project, applies sort.xsl to 123.xml to produce the end result, out.xml.
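To see why Ant cannot track the second input of the merge step by itself, it helps to picture what a stylesheet such as merge.xsl might look like. The following is only a hypothetical sketch, not the merge.xsl shipped with the examples: the root element name records is an assumption, and the only detail taken from the build file is the parameter name source2.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch of a two-input merge stylesheet -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Filename of the second input, supplied by
       <param name="source2" expression="..."/> in the build file -->
  <xsl:param name="source2"/>

  <!-- "records" is an assumed root element name -->
  <xsl:template match="/records">
    <records>
      <!-- Entries from the usual input (the "in" attribute) -->
      <xsl:copy-of select="*"/>
      <!-- Entries from the second file, loaded at run time -->
      <xsl:copy-of select="document($source2)/records/*"/>
    </records>
  </xsl:template>

</xsl:stylesheet>
```

The second file is only opened by the document() call while the stylesheet runs, so the xslt task sees nothing but an opaque string parameter. Note also that a relative filename passed this way is resolved against the location of the stylesheet, which works here because all of the files sit in the same directory.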
The files for this project are provided with the zipped examples.
You now know everything you need to start using the standard Ant xslt task in your own projects. However, you should also take the time to read the full description of this task in the Ant documentation.
XSLT stylesheets can provide a good cross-platform solution for manipulating XML,
but your different platforms may use different XSLT engines. Sites that are using the Apache
Web server often use Apache Xalan. Sites that are using PHP are likely to use Sablotron.
Oracle sites often use the Oracle XDK (as this may be the only XSLT engine that the operations people will allow).
A lot of XML consultants use and recommend Saxon. Microsoft sites generally use MSXML.
Although these XSLT engines behave similarly, there are still some differences, so you need
to plan to test with all of the XSLT engines that are likely to be used with your XSLT stylesheets.
For this article, we will focus on the Java XSLT engines, since they are the ones supported natively by the Ant xslt task.
When testing with multiple engines, it is useful to be able to run the same test using each XSLT engine from within the one Ant build file. However, there is a problem. The JAXP/TraX interface uses the Java system property javax.xml.transform.TransformerFactory to define which class should be instantiated as a factory for creating XSLT engines. So, in order to use the XSLT engine of your choice, this property needs to be set appropriately. However, there is no easy way to do that within Ant, and hence no easy way to change XSLT engines within a single Ant build file (the best you can do is to launch a separate Java process and then call Ant from within that new process). To overcome this problem, the best solution is to create a new XSLT task for Ant, one which makes it easy to select the desired XSLT TransformerFactory.
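For comparison, the separate-process workaround mentioned above could look roughly like the following sketch. It uses the standard java task to start a second copy of Ant in its own JVM with the system property set. The target name and the location lib/xalan.jar are assumptions made for illustration, and the sketch assumes that the do-it target from the first example is defined in build.xml.

```xml
<target name="do-it-with-xalan">
  <!-- Fork a new JVM so that the system property applies to it alone -->
  <java classname="org.apache.tools.ant.Main" fork="true" failonerror="true">
    <!-- Tell JAXP/TraX which factory class to instantiate -->
    <sysproperty key="javax.xml.transform.TransformerFactory"
                 value="org.apache.xalan.processor.TransformerFactoryImpl"/>
    <classpath>
      <!-- Ant's own JARs, needed to run Ant in the new JVM -->
      <fileset dir="${ant.home}/lib" includes="*.jar"/>
      <!-- Assumed location of the Xalan JAR -->
      <pathelement location="lib/xalan.jar"/>
    </classpath>
    <!-- Equivalent to: ant -buildfile build.xml do-it -->
    <arg value="-buildfile"/>
    <arg value="build.xml"/>
    <arg value="do-it"/>
  </java>
</target>
```

This works, but every engine needs its own forked Ant run with its own classpath, which is the clumsiness that a dedicated task avoids.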
mtxslt (short for “multi-XSLT”) is an Ant task that makes it easy to select the Java XSLT engine(s) of choice within an Ant build file. mtxslt extends the standard Ant xslt task so that it maintains full compatibility with the standard task. Anything that works with the xslt task also works with mtxslt. With mtxslt, it is possible to ignore the value of the Java javax.xml.transform.TransformerFactory property and simply load a particular XSLT engine directly. At the time of writing, mtxslt supports Xalan 2, Saxon 6/7 and Oracle XDK 9.
The easiest way to explain how to use mtxslt is via an example. This example uses a few new Ant elements. A taskdef is required to associate the task name mtxslt with the Java class which implements it. Actually, you can call mtxslt anything you want just by changing the name in the taskdef. The choice is yours.
The property definitions are used to define values that can be retrieved by name throughout the build file. This is no different from defining a string variable in a programming language. Here, property definitions are used to define short names for qualified Java class names and for file paths, since both of these tend to be long, and both reduce the readability and maintainability of the build file if repeated throughout it.
In this example, different XSLT engines are used to apply the same stylesheet transform.xsl to the same input input.xml. The resultant HTML files can then be compared. The targets in this example build file are explained in detail afterwards.
<project name="test" default="all">
  <taskdef name="mtxslt" classname="org.xmLP.ant.taskdefs.xslt.XSLTProcess"/>
  <property name="trax" value="org.xmLP.ant.taskdefs.optional.TraXLiaison"/>
  <property name="xalan2" value="org.xmLP.ant.taskdefs.optional.Xalan2Liaison"/>
  <property name="xalan2.classpath" value="D:\home\tony\XSLT\xalan-j_2_4_0\bin\xalan.jar"/>
  <property name="saxon6" value="org.xmLP.ant.taskdefs.optional.Saxon6Liaison"/>
  <property name="saxon6.classpath" value="D:\home\tony\XSLT\Saxon-6.5.2\saxon.jar"/>
  <property name="saxon7" value="org.xmLP.ant.taskdefs.optional.Saxon7Liaison"/>
  <property name="saxon7.classpath" value="D:\home\tony\XSLT\Saxon-7.1\saxon7.jar"/>
  <property name="oracle9" value="org.xmLP.ant.taskdefs.optional.Oracle9Liaison"/>
  <property name="oracle9.classpath" value="D:\home\tony\XSLT\xdk_java_9_2_0_3_0\lib\xmlparserv2.jar"/>
  <target name="all" depends="trax1,trax2,trax3,trax4,xalan2,saxon6,saxon7,oracle9"/>
  <target name="trax1">
    <xslt processor="trax" in="input.xml" style="transform.xsl" out="trax1.html">
      <param name="target" expression="trax1"/>
    </xslt>
  </target>
  <target name="trax2">
    <mtxslt processor="trax" in="input.xml" style="transform.xsl" out="trax2.html">
      <param name="target" expression="trax2"/>
    </mtxslt>
  </target>
  <target name="trax3">
    <xslt processor="${trax}" in="input.xml" style="transform.xsl" out="trax3.html">
      <param name="target" expression="trax3"/>
    </xslt>
  </target>
  <target name="trax4">
    <mtxslt processor="${trax}" in="input.xml" style="transform.xsl" out="trax4.html">
      <param name="target" expression="trax4"/>
    </mtxslt>
  </target>
  <target name="xalan2">
    <mtxslt processor="${xalan2}" in="input.xml" style="transform.xsl" out="xalan2.html" classpath="${xalan2.classpath}">
      <param name="target" expression="xalan2"/>
    </mtxslt>
  </target>
  <target name="saxon6">
    <mtxslt processor="${saxon6}" in="input.xml" style="transform.xsl" out="saxon6.html" classpath="${saxon6.classpath}">
      <param name="target" expression="saxon6"/>
    </mtxslt>
  </target>
  <target name="saxon7">
    <mtxslt processor="${saxon7}" in="input.xml" style="transform.xsl" out="saxon7.html" classpath="${saxon7.classpath}">
      <param name="target" expression="saxon7"/>
    </mtxslt>
  </target>
  <target name="oracle9">
    <mtxslt processor="${oracle9}" in="input.xml" style="transform.xsl" out="oracle9.html" classpath="${oracle9.classpath}">
      <param name="target" expression="oracle9"/>
    </mtxslt>
  </target>
  <target name="clean">
    <delete>
      <fileset dir="." includes="*.html"/>
    </delete>
  </target>
</project>
The target trax1 simply uses the standard xslt task to transform the input file, as in the earlier examples.
The target trax2 is identical to trax1 except that it uses mtxslt instead of xslt. This is to demonstrate that mtxslt implements the standard behaviour of the xslt task.
The target trax3 is similar to trax1, except that the value of the processor attribute is the value of the property trax, i.e. org.xmLP.ant.taskdefs.optional.TraXLiaison. This is a feature of the xslt task that only becomes apparent when you look at the Ant source code. The processor attribute can optionally be a qualified class name for an Ant XSLT liaison class. This is the mechanism that mtxslt exploits to support multiple XSLT engines. This particular XSLT liaison class connects with the default JAXP/TraX XSLT engine, so the result is identical to that produced by the target trax1.
The target trax4 is identical to trax3 except that it uses mtxslt instead of xslt.
The targets xalan2, saxon6, saxon7 and oracle9 use mtxslt to directly call Xalan 2, Saxon 6, Saxon 7 and Oracle XDK 9 respectively. Once the appropriate properties have been defined, mtxslt attributes look almost identical to standard xslt attributes, which was one of the design goals. However, notice the addition of a classpath attribute, which is required so that Ant loads the correct JAR archive for each XSLT engine.
Note that the target parameter that is passed to the stylesheet is purely so that the Ant target name can be embedded in each HTML product file, to make identification of the files easier. It serves no other purpose.
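On the stylesheet side, receiving such a parameter takes only a top-level xsl:param declaration. The fragment below is a hypothetical sketch of how a stylesheet like transform.xsl might embed the value; the HTML layout and the default value 'unknown' are assumptions, and only the parameter name target comes from the build file.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch: embedding the Ant target name in the output -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Set by <param name="target" expression="..."/> in the build file;
       the default applies when no value is passed -->
  <xsl:param name="target" select="'unknown'"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>Built by target: <xsl:value-of select="$target"/></title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

With something along these lines, opening trax1.html and saxon6.html side by side shows immediately which engine produced which file.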
That is all there is to it. You now not only know how to use Ant to control XSLT, you also know how to use mtxslt to control which XSLT engines are used within an Ant build. Note that all of the example files from this article can be downloaded as a ZIP archive.
Ant is a powerful cross-platform tool for controlling build processes, and is ideal for controlling multi-file builds involving XSLT stylesheets. Using mtxslt, you can go further and invoke multiple Java XSLT engines during a single build, which is ideal for portability testing.
People expect authors of technical articles to “eat their own dog food”. So it is worth noting that this article was written using an extended version of DocBook 4.2, and then converted to XHTML (the preferred format of the XML.com editors) using an XSLT stylesheet, with the process controlled by an Ant build file. As well as building the article, Ant controlled the extraction of the Ant build file code out of the DocBook source and into the example build files, and also the regression testing of the examples. It really works!
Ant;
Apache Jakarta project;
JAXP/TraX API from JDK 1.4.1.